Corpus: spa_news_2009_300K

2.2.11 Repetitions

Typical repetitions within words

Subword Length 2 - most frequent words
Subword Word Frequency
En tienen 3626
en tienen 3626
Is crisis 3208
is crisis 3208
Es meses 2705
es meses 2705
an mañana 1583
An mañana 1583
na mañana 1583
Na mañana 1583
Subword Length 2 - Most frequent subwords
Subword Count
Ra 169
os 126
Os 126
da 99
Da 99
99
es 98
Es 98
vi 79
Vi 79
Proportion of words containing repeated subwords of length 2 - per mille
Per mille
12.8456
Subword Length 3 - most frequent words
Subword Word Frequency
Hua Chihuahua 71
tan instantánea 37
Tan instantánea 37
bar Bárbara 36
Bar Bárbara 36
Cha muchacha 34
cha muchacha 34
chá muchacha 34
mis mismísimo 32
bar barbaridad 32
Subword Length 3 - Most frequent subwords
Subword Count
bar 10
Bar 10
and 7
And 7
tan 5
Tan 5
chi 3
Che 3
Cha 3
che 3
Proportion of words containing repeated subwords of length 3 - per mille
Per mille
0.6322
Subword Length 4 - most frequent words
Subword Word Frequency
jaja jajajaja 5
Jaja jajajaja 5
Jajá jajajaja 5
jaja jajajajaja 2
jaja jajajajajaja 2
Jaja jajajajaja 2
Jaja jajajajajaja 2
Jajá jajajajaja 2
Jajá jajajajajaja 2
Subword Length 4 - Most frequent subwords
Subword Count
jaja 3
Jaja 3
Jajá 3
Proportion of words containing repeated subwords of length 4 - per mille
Per mille
0.0562
Subword Length 5 - most frequent words
Subword Word Frequency
mente vehementemente 3
Subword Length 5 - Most frequent subwords
Subword Count
mente 1
Proportion of words containing repeated subwords of length 5 - per mille
Per mille
0.0359
Subword Length 6 - most frequent words
Subword Word Frequency
jajaja jajajajajaja 2
Jajaja jajajajajaja 2
Subword Length 6 - Most frequent subwords
Subword Count
jajaja 1
Jajaja 1
Proportion of words containing repeated subwords of length 6 - per mille
Per mille
0.0885
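The tables above do not state how repetitions are detected. The entries (for example "an" in "mañana" or "tan" in "instantánea") suggest that a subword counts as repeated when it occurs twice in immediate succession, compared without case or accents. The following is a minimal sketch under that assumption; the function names `fold`, `repeated_subwords` and `per_mille` are hypothetical and not part of the original tool.

```python
import unicodedata


def fold(word):
    """Lowercase and strip diacritics (assumed normalisation: 'mañana' -> 'manana')."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def repeated_subwords(word, n):
    """Return all subwords of length n that are immediately followed by themselves."""
    folded = fold(word)
    found = set()
    for i in range(len(folded) - 2 * n + 1):
        unit = folded[i:i + n]
        # Only purely alphabetic units count; hyphens are not handled here.
        if unit.isalpha() and unit == folded[i + n:i + 2 * n]:
            found.add(unit)
    return found


def per_mille(words, n):
    """Share of words (per thousand) containing at least one repeated subword of length n."""
    words = list(words)
    if not words:
        return 0.0
    hits = sum(1 for w in words if repeated_subwords(w, n))
    return 1000.0 * hits / len(words)


if __name__ == "__main__":
    for w, n in [("tienen", 2), ("mañana", 2), ("Chihuahua", 3), ("jajajaja", 4)]:
        print(w, n, sorted(repeated_subwords(w, n)))
```

This sketch can report more subwords than the tables do: for "vehementemente" with n = 5 it finds both "mente" and the overlapping "ement", while the table lists only "mente". The original tool presumably applies a further selection rule that cannot be recovered from this page.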
Subword Length 2 - most frequent words with hyphen
Subword Word Frequency
la Castilla-La 76
la Castilla-La Mancha 76
La Castilla-La 76
La Castilla-La Mancha 76
Castilla-La 76
Castilla-La Mancha 76
Castilla-La 76
Castilla-La Mancha 76
ni ni-ni 4
Ni ni-ni 4
Subword Length 2 - Most frequent subwords in words with hyphen
Subword Count
yo 2
Yo 2
la 2
La 2
2
2
ni 1
Ni 1
re 1
Re 1
Proportion of words with hyphen containing repeated subwords of length 2 - per mille
Per mille
0.0631
Subword Length 3 - most frequent words with hyphen
Subword Word Frequency
sur Sur-Sur 6
Sur Sur-Sur 6
Subword Length 3 - Most frequent subwords in words with hyphen
Subword Count
sur 1
Sur 1
Proportion of words with hyphen containing repeated subwords of length 3 - per mille
Per mille
0.0126
Subword Length 4 - most frequent words with hyphen
Subword Word Frequency
Colo Colo-Colo 5
coló Colo-Colo 5
verá primavera-verano 4
Vera primavera-verano 4
vera primavera-verano 4
Verá primavera-verano 4
mata mata-mata 2
Mata mata-mata 2
Subword Length 4 - Most frequent subwords in words with hyphen
Subword Count
Colo 1
coló 1
verá 1
Vera 1
vera 1
Verá 1
mata 1
Mata 1
Proportion of words with hyphen containing repeated subwords of length 4 - per mille
Per mille
0.0562
Subword Length 5 - most frequent words with hyphen
Subword Word Frequency
doble doble-doble 7
Doble doble-doble 7
Subword Length 5 - Most frequent subwords in words with hyphen
Subword Count
doble 1
Doble 1
Proportion of words with hyphen containing repeated subwords of length 5 - per mille
Per mille
0.0359
Proportion of words with hyphen containing repeated subwords of length 6 - per mille
Per mille
0.0000
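For the hyphenated entries, the listed pairs (for example "la" in "Castilla-La", "vera" in "primavera-verano") are consistent with comparing the n characters directly before a hyphen with the n characters directly after it, again ignoring case and accents. The sketch below illustrates that reading only; it is an assumption inferred from the examples, and `fold` and `hyphen_repeats` are hypothetical names.

```python
import unicodedata


def fold(word):
    """Lowercase and strip diacritics (assumed normalisation)."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def hyphen_repeats(token, n):
    """Return subwords of length n that appear on both sides of a hyphen, adjacent to it."""
    folded = fold(token)
    found = set()
    for pos, ch in enumerate(folded):
        if ch != "-":
            continue
        left = folded[max(0, pos - n):pos]      # n characters before the hyphen
        right = folded[pos + 1:pos + 1 + n]     # n characters after the hyphen
        if len(left) == n and left == right and left.isalpha():
            found.add(left)
    return found


if __name__ == "__main__":
    for t, n in [("Castilla-La", 2), ("Sur-Sur", 3), ("primavera-verano", 4), ("doble-doble", 5)]:
        print(t, n, sorted(hyphen_repeats(t, n)))
```

Under this reading a token such as "Sur-Sur" matches only at length 3 and "mata-mata" only at length 4, which agrees with where those tokens appear in the tables.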
Computation time: 1163644 msec, finished at 2018-03-24 22:59