Corpus: nld_newscrawl_2019_1M

2.2.11 Repetitions

Typical repetitions within words

Subword Length 2 - most frequent words
Subword Word Frequency
ge gegeven 2437
Ge gegeven 2437
Is crisis 1281
ís crisis 1281
is crisis 1281
de deden 1174
Subword Length 2 - Most frequent subwords
Subword Count
en 250
En 250
én 250
èn 250
Én 250
re 58
Re 58
ge 54
Ge 54
Amount of words containing repeated subwords of length 2 - per mille
Per mille
7.7521
Subword Length 3 - most frequent words
Subword Word Frequency
bar Barbara 159
Bar Barbara 159
oor vooroordelen 138
Oor vooroordelen 138
ten tenten 127
Ten tenten 127
Tom TomTom 68
ten assistenten 55
Ten assistenten 55
eed meedeed 55
Subword Length 3 - Most frequent subwords
Subword Count
ver 10
ten 10
Ver 10
Ten 10
ing 8
Tis 7
oor 5
Oor 5
bar 4
Bar 4
Amount of words containing repeated subwords of length 3 - per mille
Per mille
1.1052
Subword Length 4 - most frequent words
Subword Word Frequency
maat klimaatmaatregelen 29
Maat klimaatmaatregelen 29
koek Koekkoek 11
Koek Koekkoek 11
Cher rechercheren 6
haha hahahaha 5
Haha hahahaha 5
Subword Length 4 - Most frequent subwords
Subword Count
maat 1
Maat 1
koek 1
Koek 1
Cher 1
haha 1
Haha 1
Amount of words containing repeated subwords of length 4 - per mille
Per mille
0.0696
Subword Length 5 - most frequent words
Subword Word Frequency
Fifty fiftyfifty 7
rende renderende 6
Subword Length 5 - Most frequent subwords
Subword Count
Fifty 1
rende 1
Amount of words containing repeated subwords of length 5 - per mille
Per mille
0.0561
Amount of words containing repeated subwords of length 6 - per mille
Per mille
0.0000
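
The listed words suggest that a "repeated subword" is a substring of the given length that is immediately followed by itself, compared without regard to case (ge+ge in gegeven, bar+bar in Barbara). The following is a minimal sketch under that assumption, not the report's actual implementation. The function names `repeated_subwords` and `per_mille` are hypothetical, and the denominator used here (all words in the supplied list) is also an assumption, since the report does not say whether it counts word types or tokens.

```python
def repeated_subwords(word, n):
    """Return the subwords of length n that are directly followed by themselves.

    Assumption: matching is case-insensitive and only alphabetic
    subwords count, which is inferred from the tables, not documented.
    """
    w = word.lower()
    found = []
    # i is the start of the first copy; the second copy starts at i + n.
    for i in range(len(w) - 2 * n + 1):
        first = w[i:i + n]
        second = w[i + n:i + 2 * n]
        if first.isalpha() and first == second and first not in found:
            found.append(first)
    return found


def per_mille(words, n):
    """Share (per mille) of the given words that contain a repeated subword."""
    if not words:
        return 0.0
    hits = sum(1 for w in words if repeated_subwords(w, n))
    return 1000 * hits / len(words)


if __name__ == "__main__":
    for word, n in [("gegeven", 2), ("Barbara", 3), ("klimaatmaatregelen", 4)]:
        print(word, n, repeated_subwords(word, n))
```

Calling `per_mille` on a full word list with n = 2 to 6 would produce one figure per subword length, in the same shape as the "per mille" rows above, though the values depend on the assumed denominator.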
Subword Length 2 - most frequent words with hyphen
Subword Word Frequency
to Cito-toets 14
To Cito-toets 14
Co Arco-coöperanten 9
co Arco-coöperanten 9
No no-nonsense 8
no no-nonsense 8
am Amsterdam-Amstelland 7
Am Amsterdam-Amstelland 7
pa Europa-Park 4
Pa Europa-Park 4
Subword Length 2 - Most frequent subwords with hyphen
Subword Count
to 1
To 1
Co 1
co 1
No 1
no 1
am 1
Am 1
pa 1
Pa 1
Amount of words with hyphen containing repeated subwords of length 2 - per mille
Per mille
0.0633
Subword Length 3 - most frequent words with hyphen
Subword Word Frequency
Pro VPRO-programma 13
pro VPRO-programma 13
PrO VPRO-programma 13
win win-winsituatie 12
Win win-winsituatie 12
win win-win 10
Win win-win 10
via via-via 5
Via via-via 5
win win-win-situatie 5
Subword Length 3 - Most frequent subwords with hyphen
Subword Count
win 3
Win 3
Pro 1
pro 1
PrO 1
via 1
Via 1
Amount of words with hyphen containing repeated subwords of length 3 - per mille
Per mille
0.0628
Amount of words with hyphen containing repeated subwords of length 4 - per mille
Per mille
0.0000
Subword Length 5 - most frequent words with hyphen
Subword Word Frequency
Fifty fifty-fifty 6
Subword Length 5 - Most frequent subwords with hyphen
Subword Count
Fifty 1
Amount of words with hyphen containing repeated subwords of length 5 - per mille
Per mille
0.0280
Amount of words with hyphen containing repeated subwords of length 6 - per mille
Per mille
0.0000
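
In the hyphenated entries, the two copies of the subword sit on either side of a hyphen and may differ in case (to-to in Cito-toets, PRO-pro in VPRO-programma). A minimal sketch of that check follows, assuming case-insensitive comparison and exactly one hyphen between the copies. The function name `hyphen_repeats` is hypothetical, and the rule itself is inferred from the listed words rather than taken from the report.

```python
def hyphen_repeats(word, n):
    """Return subwords of length n that are repeated across a hyphen.

    Assumption: the pattern is <subword>-<subword>, compared
    case-insensitively; this is inferred from the tables.
    """
    w = word.lower()
    found = []
    # i is the start of the left copy; the hyphen sits at i + n.
    for i in range(len(w) - 2 * n):
        left = w[i:i + n]
        right = w[i + n + 1:i + 2 * n + 1]
        if w[i + n] == "-" and left.isalpha() and left == right and left not in found:
            found.append(left)
    return found


if __name__ == "__main__":
    for word, n in [("Cito-toets", 2), ("VPRO-programma", 3), ("fifty-fifty", 5)]:
        print(word, n, hyphen_repeats(word, n))
```

A word such as win-win-situatie contains the pattern twice in a row; the sketch reports the subword once per word, which matches how the subword tables above count one entry per word.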
1301817 msec needed at 2022-01-09 19:22