Corpus: ind_wikipedia_2007_100K

2.2.11 Repetitions

Typical repetitions within words

Subword Length 2 - most frequent words
Subword Word Frequency
Be beberapa 2382
be beberapa 2382
Du penduduk 1464
du penduduk 1464
Di didirikan 568
di didirikan 568
Di pendidikan 542
di pendidikan 542
Be Beberapa 523
be Beberapa 523
Subword Length 2 - Most frequent subwords
Subword Count
an 225
An 225
si 72
Si 72
Me 53
me 53
Ja 52
Di 48
di 48
is 45
Number of words containing repeated subwords of length 2 - per mille
Per mille
18.7304
Subword Length 3 - most frequent words
Subword Word Frequency
Gan perdagangan 259
Din dinding 101
kan menekankan 70
Kan menekankan 70
Gin menginginkan 68
Cin cincin 42
Gan Perdagangan 40
tan tantangan 38
Tan tantangan 38
Gan tegangan 37
Subword Length 3 - Most frequent subwords
Subword Count
Gan 16
Jun 7
bar 7
Bar 7
Bär 7
ton 6
Ton 6
Din 6
Cin 4
Mar 4
Number of words containing repeated subwords of length 3 - per mille
Per mille
1.7200
Subword Length 4 - most frequent words
Subword Word Frequency
Gang ganggang 7
gang ganggang 7
Huan Huanhuan 3
Suma Kusumasumantri 2
Jing Jingjing 2
Ying Yingying 2
jing Jingjing 2
Jìng Jingjing 2
bang Bangbang 1
tapi Tapitapitak 1
Subword Length 4 - Most frequent subwords
Subword Count
Gang 2
gang 2
Jìng 1
Tapi 1
Suma 1
pita 1
Ying 1
Pita 1
Bang 1
bang 1
Number of words containing repeated subwords of length 4 - per mille
Per mille
0.2569
Subword Length 5 - most frequent words
Subword Word Frequency
Ngoro Ngorongoro 1
ngoro Ngorongoro 1
Subword Length 5 - Most frequent subwords
Subword Count
Ngoro 1
ngoro 1
Number of words containing repeated subwords of length 5 - per mille
Per mille
0.0394
Number of words containing repeated subwords of length 6 - per mille
Per mille
0.0000
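
The tables above can be approximated with a short script. The entries suggest that a "repeated subword" is a character sequence of the given length that is immediately followed by itself, ignoring case (for example "be" in "beberapa" or "ngoro" in "Ngorongoro"). That reading is an inference from the listed data, not a documented definition. The function names and the sample word list below are hypothetical, and the source does not state whether its per-mille values are computed over running words or over distinct word forms, so the sketch simply counts over whatever list it is given.

```python
import re


def repeated_subwords(word, n):
    """Return the lowercased subwords of length n that are immediately
    followed by themselves in `word`, compared case-insensitively.

    A lookahead is used so that overlapping repetitions are all reported.
    """
    pattern = re.compile(r"(?=(\w{%d})\1)" % n, re.IGNORECASE)
    return [m.group(1).lower() for m in pattern.finditer(word)]


def per_mille(words, n):
    """Share of entries in `words` with at least one repeated subword
    of length n, expressed per mille (parts per thousand)."""
    if not words:
        return 0.0
    hits = sum(1 for w in words if repeated_subwords(w, n))
    return 1000.0 * hits / len(words)


# Hypothetical four-word sample, not taken from the corpus counts.
sample = ["beberapa", "penduduk", "kota", "dan"]
print(repeated_subwords("beberapa", 2))
print(per_mille(sample, 2))
```

Note that `\w` also matches digits and underscores; a stricter letter-only class may be closer to what the original tool does.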
Subword Length 2 - most frequent words with hyphen
Subword Word Frequency
Bu bumbu-bumbu 10
bu bumbu-bumbu 10
Al al-Albani 5
al al-Albani 5
Yo Yo-Yo 4
yo Yo-Yo 4
as asas-asas 3
As asas-asas 3
gi gigi-gigi 3
Gi gigi-gigi 3
Subword Length 2 - Most frequent subwords in words with hyphen
Subword Count
Sy 3
si 3
Al 3
Si 3
al 3
yo 2
Ka 2
ka 2
Bu 2
bu 2
Number of words with hyphen containing repeated subwords of length 2 - per mille
Per mille
0.3502
Subword Length 3 - most frequent words with hyphen
Subword Word Frequency
hal hal-hal 99
Hal hal-hal 99
Hak hak-hak 72
hak hak-hak 72
Sel sel-sel 47
sel sel-sel 47
Apa apa-apa 38
apa apa-apa 38
Abu abu-abu 37
abu abu-abu 37
Subword Length 3 - Most frequent subwords in words with hyphen
Subword Count
dua 5
Dua 5
api 5
Api 5
hak 4
Hak 4
Sel 3
Abu 3
abu 3
Abū 3
Number of words with hyphen containing repeated subwords of length 3 - per mille
Per mille
1.2149
Subword Length 4 - most frequent words with hyphen
Subword Word Frequency
Anak anak-anak 247
anak anak-anak 247
Satu satu-satunya 230
satu satu-satunya 230
Laki laki-laki 184
laki laki-laki 184
sama bersama-sama 162
Sama bersama-sama 162
Samā bersama-sama 162
Lain lain-lain 159
Subword Length 4 - Most frequent subwords in words with hyphen
Subword Count
Hari 8
hari 8
kata 7
Kata 7
anak 6
cita 6
Anak 6
Cita 6
Laki 5
Daun 5
Number of words with hyphen containing repeated subwords of length 4 - per mille
Per mille
8.4769
Subword Length 5 - most frequent words with hyphen
Subword Word Frequency
Orang orang-orang 598
orang orang-orang 598
Benar benar-benar 134
benar benar-benar 134
Besar besar-besaran 86
besar besar-besaran 86
Tahun tahun-tahun 82
tahun tahun-tahun 82
karya karya-karya 77
Karya karya-karya 77
Subword Length 5 - Most frequent subwords in words with hyphen
Subword Count
lebih 7
Lebih 7
kisah 5
tahun 5
bulan 5
Kisah 5
Tahun 5
Bulan 5
murid 5
Murid 5
Number of words with hyphen containing repeated subwords of length 5 - per mille
Per mille
21.4812
Subword Length 6 - most frequent words with hyphen
Subword Word Frequency
Masing masing-masing 399
masing masing-masing 399
Negara negara-negara 258
negara negara-negara 258
undang undang-undang 112
Undang undang-undang 112
kadang kadang-kadang 102
Kadang kadang-kadang 102
Bahasa bahasa-bahasa 81
bahasa bahasa-bahasa 81
Subword Length 6 - Most frequent subwords in words with hyphen
Subword Count
Undang 5
Tengah 5
tengah 5
undang 5
negara 4
cerita 4
Negara 4
Cerita 4
negeri 3
sketsa 3
Number of words with hyphen containing repeated subwords of length 6 - per mille
Per mille
27.4746
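
For the hyphenated tables, the listed entries (for example "bu" in "bumbu-bumbu", "sama" in "bersama-sama", "satu" in "satu-satunya") suggest that the subword must stand directly before the hyphen and be repeated directly after it, again ignoring case. The following is a minimal sketch under that assumption; the function name is hypothetical and the matching rule is inferred from the data rather than taken from the tool that produced this report.

```python
import re


def hyphen_repeats(word, n):
    """Return the lowercased subwords of length n that occur directly
    before a hyphen and are repeated directly after it (case-insensitive)."""
    pattern = re.compile(r"(?=(\w{%d})-\1)" % n, re.IGNORECASE)
    return [m.group(1).lower() for m in pattern.finditer(word)]


# Words taken from the tables above.
for word, n in [("bumbu-bumbu", 2), ("hal-hal", 3),
                ("bersama-sama", 4), ("orang-orang", 5)]:
    print(word, n, hyphen_repeats(word, n))
```

Under this rule a full reduplication such as "orang-orang" is reported at every length up to the length of the repeated part, which would be consistent with the per-mille values for hyphenated words growing with subword length in this corpus.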
2413782 msec needed at 2017-12-25 11:18