Corpus: cat_newscrawl_2011_100K

2.2.11 Repetitions

Typical repetitions within words

Subword Length 2 - most frequent words
Subword Word Frequency
En tenen 1170
en tenen 1170
És empreses 567
es empreses 567
Es empreses 567
ès empreses 567
és empreses 567
Re darrere 205
ar declarar 167
Re enrere 153
Subword Length 2 - Most frequent subwords
Subword Count
es 142
és 142
És 142
Es 142
ès 142
ra 129
Ra 129
ós 106
os 106
Os 106
Words containing repeated subwords of length 2: 11.4953 per mille
Subword Length 3 - most frequent words
Subword Word Frequency
ten intenten 34
Ten intenten 34
ret retret 12
ret retrets 12
tan instantània 10
bar Bàrbara 10
Tan instantània 10
Bar Bàrbara 10
bar barbaritats 8
Bar barbaritats 8
Subword Length 3 - Most frequent subwords
Subword Count
bar 13
Bar 13
pos 8
ten 5
Ten 5
tan 5
Tan 5
Txe 3
mur 3
Mur 3
Words containing repeated subwords of length 3: 0.7236 per mille
Subword Length 4 - most frequent words
Subword Word Frequency
Cien consciencien 2
date DateDate 1
Subword Length 4 - Most frequent subwords
Subword Count
Cien 1
date 1
Words containing repeated subwords of length 4: 0.0376 per mille
Subword Length 5 - most frequent words
Subword Word Frequency
germà GermàGermà 1
Germà GermàGermà 1
Subword Length 5 - Most frequent subwords
Subword Count
germà 1
Germà 1
Words containing repeated subwords of length 5: 0.0355 per mille
Words containing repeated subwords of length 6: 0.0000 per mille
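
The rows above suggest that a "repeated subword" is a unit of n letters that is immediately followed by itself, compared without regard to case or accents (for example "Bar" and "bar" are both listed for "Bàrbara"). The following is a minimal sketch under that assumption, not the tool that produced this report; `fold` and `repeated_subwords` are hypothetical names, and the `sep` parameter is an assumed way to cover the hyphenated case, where the two copies are separated by a hyphen.

```python
import unicodedata


def fold(text):
    """Lowercase and drop combining marks, so 'À', 'à' and 'a' compare equal."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def repeated_subwords(word, n, sep=""):
    """Return folded units of length n that are immediately followed by
    sep and then by the same unit again (hypothetical helper)."""
    folded = fold(word)
    width = 2 * n + len(sep)  # unit + separator + unit
    found = []
    for i in range(len(folded) - width + 1):
        unit = folded[i:i + n]
        middle = folded[i + n:i + n + len(sep)]
        again = folded[i + n + len(sep):i + width]
        if unit.isalpha() and middle == sep and again == unit and unit not in found:
            found.append(unit)
    return found


print(repeated_subwords("tenen", 2))               # ['en']
print(repeated_subwords("Bàrbara", 3))             # ['bar']
print(repeated_subwords("Bell-lloc", 2, sep="-"))  # ['ll']
print(repeated_subwords("intenten", 3))            # ['nte', 'ten']
```

The last call returns two overlapping units, whereas the report lists only "ten" for "intenten", so the original tool evidently applies a further selection rule that is not visible in this output.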
Subword Length 2 - most frequent words with hyphen
Subword Word Frequency
Ll Bell-lloc 10
ll Bell-lloc 10
ni ni-ni 9
Ni ni-ni 9
la Castella-la 5
La Castella-la 5
Castella-la 5
Ll Vall-llobera 2
Ll Vall-llobrega 2
ll Vall-llobera 2
Subword Length 2 - Most frequent subwords in words with hyphen
Subword Count
Ll 6
ll 6
la 4
La 4
4
Co 3
co 3
ço 3
ni 2
Ni 2
Words with hyphen containing repeated subwords of length 2: 0.2195 per mille
Subword Length 3 - most frequent words with hyphen
Subword Word Frequency
Xup xup-xup 2
sud Sud-Sudan 1
Sud Sud-Sudan 1
Pio Pio-pio 1
Pío Pio-pio 1
Subword Length 3 - Most frequent subwords in words with hyphen
Subword Count
Xup 1
Pio 1
Pío 1
sud 1
Sud 1
Words with hyphen containing repeated subwords of length 3: 0.0381 per mille
Subword Length 4 - most frequent words with hyphen
Subword Word Frequency
pica pica-pica 4
Pica pica-pica 4
Vall Afrivall-Vall 1
vall Afrivall-Vall 1
hora hora-hora 1
Hora hora-hora 1
Inde in-inde-independència 1
inde in-inde-independència 1
Subword Length 4 - Most frequent subwords in words with hyphen
Subword Count
pica 1
Pica 1
Vall 1
vall 1
hora 1
Hora 1
Inde 1
inde 1
Words with hyphen containing repeated subwords of length 4: 0.0752 per mille
Subword Length 5 - most frequent words with hyphen
Subword Word Frequency
doble doble-doble 5
Doble doble-doble 5
Berga Berga-Berga 2
bisbe arquebisbe-bisbe 1
Bisbe arquebisbe-bisbe 1
Subword Length 5 - Most frequent subwords in words with hyphen
Subword Count
doble 1
Doble 1
Berga 1
bisbe 1
Bisbe 1
Words with hyphen containing repeated subwords of length 5: 0.1064 per mille
Subword Length 6 - most frequent words with hyphen
Subword Word Frequency
Ferràs FERRAS-FERRAS 1
Subword Length 6 - Most frequent subwords in words with hyphen
Subword Count
Ferràs 1
Words with hyphen containing repeated subwords of length 6: 0.0832 per mille
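
A per-mille figure of this kind is 1000 times the number of matching words divided by the size of the reference set. The report does not say which reference set it uses (word types or tokens, all words or only hyphenated ones), so the sketch below makes an explicit assumption: it counts distinct word forms and, for the hyphen variant, divides by the hyphenated forms only. `per_mille_with_repeat` is a hypothetical name, the word list is toy data rather than corpus content, and the regular expression ignores case but, unlike the listed rows, does not ignore accents.

```python
import re


def per_mille_with_repeat(words, n, hyphen=False):
    """Per-mille share of word forms containing an n-letter unit that is
    immediately repeated (optionally across a hyphen). Assumed denominator:
    all given forms, or only the hyphenated ones when hyphen=True."""
    sep = "-" if hyphen else ""
    # [^\W\d_] matches a single letter; \1 must repeat the captured unit.
    pattern = re.compile(r"([^\W\d_]{%d})%s\1" % (n, sep), re.IGNORECASE)
    pool = [w for w in words if "-" in w] if hyphen else list(words)
    if not pool:
        return 0.0
    hits = sum(1 for w in pool if pattern.search(w))
    return 1000.0 * hits / len(pool)


toy_words = ["tenen", "casa", "ni-ni", "porta", "Sud-Sudan"]
print(per_mille_with_repeat(toy_words, 2))               # 200.0
print(per_mille_with_repeat(toy_words, 2, hyphen=True))  # 500.0
print(per_mille_with_repeat(toy_words, 3, hyphen=True))  # 500.0
```

Weighting each form by its corpus frequency instead would only require summing frequencies in place of counting forms; which of the two the report uses cannot be determined from the figures shown here.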
1097540 msec needed at 2018-02-05 03:23