Corpus: cat_newscrawl_2015_10K

2.2.11 Repetitions

Typical repetitions within words
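
The page does not say how a repetition is detected. As a rough sketch only, the following assumes that a "repeated subword" is a run of n letters immediately followed by itself (as in "tenen" or "murmuri"), that matching ignores case and accents (suggested by rows such as es / és / Es / És sharing one count), and that the hyphenated variant allows a single hyphen between the two copies (as in "Bell-lloc"). The names `fold` and `repeated_subword` are hypothetical and not taken from the corpus tooling.

```python
import re
import unicodedata
from typing import Optional


def fold(word: str) -> str:
    """Lowercase the word and strip combining accents (assumed normalisation)."""
    decomposed = unicodedata.normalize("NFD", word.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def repeated_subword(word: str, n: int, hyphen: bool = False) -> Optional[str]:
    """Return the first run of n letters that is immediately repeated, else None.

    With hyphen=True the two copies must be separated by exactly one hyphen.
    """
    separator = "-" if hyphen else ""
    # [^\W\d_] matches any letter; \1 requires the same n letters again.
    pattern = rf"([^\W\d_]{{{n}}}){separator}\1"
    match = re.search(pattern, fold(word))
    return match.group(1) if match else None


print(repeated_subword("tenen", 2))                   # en
print(repeated_subword("Bàrbara", 3))                 # bar
print(repeated_subword("Bell-lloc", 2, hyphen=True))  # ll
```

Under these assumptions the sketch reproduces the subwords listed for the sample words in the tables below; the actual detection rule used for this report may differ.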

Subword Length 2 - most frequent words
Subword Word Frequency
en tenen 95
En tenen 95
És empreses 60
és empreses 60
Es empreses 60
es empreses 60
És despeses 11
és despeses 11
Es despeses 11
es despeses 11
Subword Length 2 - Most frequent subwords
Subword Count
es 42
és 42
És 42
Es 42
en 36
En 36
da 17
vi 11
Vi 11
al 9
Number of words containing repeated subwords of length 2 - per mille
Per mille
5.7986
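
A per-mille figure of this kind is the share of matching words multiplied by 1000. The sketch below illustrates the arithmetic on a made-up word list. It assumes an immediately repeated run of n word characters counts as a match, and it does not know whether the report counts word types or tokens. The function name `per_mille_with_repeats` and the sample list are hypothetical.

```python
import re


def per_mille_with_repeats(words, n):
    """Share of words (per 1000) containing an immediately repeated n-character run."""
    if not words:
        return 0.0
    pattern = re.compile(rf"(\w{{{n}}})\1", re.IGNORECASE)
    hits = sum(1 for word in words if pattern.search(word))
    return 1000.0 * hits / len(words)


# Illustrative list only, not drawn from the corpus counts.
sample = ["tenen", "casa", "empreses", "gat"]
print(per_mille_with_repeats(sample, 2))  # 500.0 (2 of 4 words match)
print(per_mille_with_repeats(sample, 3))  # 0.0
```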
Subword Length 3 - most frequent words
Subword Word Frequency
bar Bàrbara 1
bar Santa Bàrbara 1
tan instantanis 1
tan instantània 1
mur murmuri 1
Tan instantanis 1
Tan instantània 1
Subword Length 3 - Most frequent subwords
Subword Count
bar 2
tan 2
Tan 2
mur 1
Number of words containing repeated subwords of length 3 - per mille
Per mille
0.2068
Number of words containing repeated subwords of length 4 - per mille
Per mille
0.0000
Number of words containing repeated subwords of length 5 - per mille
Per mille
0.0000
Number of words containing repeated subwords of length 6 - per mille
Per mille
0.0000
Subword Length 2 - most frequent words with hyphen
Subword Word Frequency
Ll Bell-lloc 2
Ll Vall-llobrega 2
la Castella-la 1
La Castella-la 1
si sí-sí 1
Si sí-sí 1
sí-sí 1
sí-sí 1
Subword Length 2 - Most frequent subwords in words with hyphen
Subword Count
Ll 2
la 1
La 1
si 1
Si 1
1
1
Number of words with hyphen containing repeated subwords of length 2 - per mille
Per mille
0.1372
Number of words with hyphen containing repeated subwords of length 3 - per mille
Per mille
0.0000
Number of words with hyphen containing repeated subwords of length 4 - per mille
Per mille
0.0000
Number of words with hyphen containing repeated subwords of length 5 - per mille
Per mille
0.0000
Number of words with hyphen containing repeated subwords of length 6 - per mille
Per mille
0.0000
105495 msec needed at 2018-02-05 08:24