Corpus: pol_wikipedia_2007_300K

2.2.11 Repetitions

Typical repetitions within words

Subword Length 2 - most frequent words
Subword  Word          Frequency
o?       po?o?ona      7043
o?       po?o?one      567
na       znana         425
Na       znana         425
a?       dzia?a?       417
o?       po?o?ony      362
a?       dzia?a?       274
tu       Instytutu     229
Tu       Instytutu     229
Ci       właścicielem  188

Subword Length 2 - most frequent subwords
Subword  Count
o?       32
a?       32
Li       28
li       28
ni       23
Ni       23
ci       18
Ci       18
na       17
Na       17

Share of words containing repeated subwords of length 2 (per mille): 3.7653

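The rows above suggest that a "repeated subword" is a substring of the given length that is immediately followed by itself, compared without regard to case (for example "na" in "znana", "tu" in "Instytutu"). A minimal Python sketch under that assumption follows. The function names are hypothetical, and the per-mille share is computed over a plain list of words, since the report does not say whether word types or running tokens are counted.

```python
def repeated_subwords(word, n):
    """Return the set of length-n substrings that are immediately
    followed by themselves, ignoring case (assumed definition)."""
    w = word.lower()
    return {
        w[i:i + n]
        for i in range(len(w) - 2 * n + 1)
        if w[i:i + n] == w[i + n:i + 2 * n]
    }


def per_mille(words, n):
    """Share of words (per mille) with at least one repeated subword of length n."""
    if not words:
        return 0.0
    hits = sum(1 for w in words if repeated_subwords(w, n))
    return 1000.0 * hits / len(words)


if __name__ == "__main__":
    print(repeated_subwords("znana", 2))      # {'na'}
    print(repeated_subwords("Barbary", 3))    # {'bar'}
    print(per_mille(["znana", "Instytutu", "dom", "kot"], 2))  # 500.0
```

Lower-casing before comparison is what lets "Bar" and "bar" count as one repetition in "Barbary". It would also explain why the tables list each subword in two casings with identical counts, although that is a guess about the original tool.
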
Subword Length 3 - most frequent words
Subword  Word               Frequency
nie      Ujednoznacznienie  983
Nie      Ujednoznacznienie  983
nie      istnienie          193
Nie      istnienie          193
nie      ciśnienie          94
Nie      ciśnienie          94
Bar      Barbary            55
bar      Barbary            55
Bar      Barbara            54
bar      Barbara            54

Subword Length 3 - most frequent subwords
Subword  Count
nie      90
Nie      90
Bar      10
bar      10
\to      3
Cho      3
nó?      3
Nó?      3
ach      2
ani      2

Share of words containing repeated subwords of length 3 (per mille): 1.4627
Share of words containing repeated subwords of length 4 (per mille): 0.0000
Share of words containing repeated subwords of length 5 (per mille): 0.0000
Share of words containing repeated subwords of length 6 (per mille): 0.0000

Subword Length 2 - most frequent words with hyphen
Subword  Word                      Frequency
ko       kędzierzyńsko-kozielskim  12
Xi       XI-XII                    7
Ka       Skarżyska-Kamiennej       5
ka       Skarżyska-Kamiennej       5
ko       kaszubsko-kociewskiego    4
Pa       Pa-Pa-Pa-Pa-Puffy         3

Subword Length 2 - most frequent subwords in words with hyphen
Subword  Count
ko       2
Xi       1
Ka       1
ka       1
Pa       1

Share of words with hyphen containing repeated subwords of length 2 (per mille): 0.0524
Share of words with hyphen containing repeated subwords of length 3 (per mille): 0.0000
Share of words with hyphen containing repeated subwords of length 4 (per mille): 0.0000

Subword Length 5 - most frequent words with hyphen
Subword  Word              Frequency
Baden    Baden-Baden       5
kilku    kilku-kilkunastu  4
Kilku    kilku-kilkunastu  4

Subword Length 5 - most frequent subwords in words with hyphen
Subword  Count
Baden    1
kilku    1
Kilku    1

Share of words with hyphen containing repeated subwords of length 5 (per mille): 0.0673

Subword Length 6 - most frequent words with hyphen
Subword  Word           Frequency
węgiel   węgiel-węgiel  8
Węgiel   węgiel-węgiel  8

Subword Length 6 - most frequent subwords in words with hyphen
Subword  Count
węgiel   1
Węgiel   1

Share of words with hyphen containing repeated subwords of length 6 (per mille): 0.0830

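In the hyphenated rows the repetition appears to straddle the hyphen: the characters directly before the hyphen equal the characters directly after it, ignoring case ("Baden-Baden", "XI-XII", "kilku-kilkunastu"). The sketch below checks exactly that. It is an assumption inferred from the listed examples, not a description of the tool that produced the report, and the function name is hypothetical.

```python
def repeated_across_hyphen(word, n):
    """Return the length-n subwords that occur directly before and
    directly after a hyphen, ignoring case (assumed definition)."""
    w = word.lower()
    found = set()
    for i, ch in enumerate(w):
        if ch != "-" or i < n:
            continue
        left = w[i - n:i]           # n characters before the hyphen
        right = w[i + 1:i + 1 + n]  # n characters after the hyphen
        if len(right) == n and left == right and "-" not in left:
            found.add(left)
    return found


if __name__ == "__main__":
    print(repeated_across_hyphen("Baden-Baden", 5))       # {'baden'}
    print(repeated_across_hyphen("kilku-kilkunastu", 5))  # {'kilku'}
    print(repeated_across_hyphen("XI-XII", 2))            # {'xi'}
```
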
2002837 msec needed at 2018-01-13 09:28