Corpus: eus_newscrawl_2012_100K

2.2.11 Repetitions

Typical repetitions within words

Subword Length 2 - most frequent words
Subword Word Frequency
Ar euskararen 304
ar euskararen 304
en lehenengo 284
En lehenengo 284
et betetzen 212
Et betetzen 212
go gogorra 184
Go gogorra 184
go gogoan 167
Go gogoan 167
Subword Length 2 - most frequent subwords
Subword Count
en 176
En 176
go 127
Go 127
ar 88
Ar 88
et 82
Et 82
ko 67
Ko 67
Share of words containing repeated subwords of length 2 (per mille): 11.3692
Subword Length 3 - most frequent words
Subword Word Frequency
ber berbera 44
Ber berbera 44
ber berberak 33
Ber berberak 33
ari kazetariari 19
ari lehendakariari 19
Ari kazetariari 19
Ari lehendakariari 19
ori ondoriorik 15
Bar Barbara 11
Subword Length 3 - most frequent subwords
Subword Count
ari 29
Ari 29
ber 13
Ber 13
are 12
Are 12
Bar 10
bar 10
txo 8
Txo 8
Share of words containing repeated subwords of length 3 (per mille): 1.2708
Subword Length 4 - most frequent words
Subword Word Frequency
aren garenaren 8
bete betebetean 4
Bete betebetean 4
ETAk metaketak 2
ekin etekinekin 2
Ekin etekinekin 2
Etak metaketak 2
aren arruntarenaren 1
aren bankuarenarena 1
aren Erregearenaren 1
Subword Length 4 - most frequent subwords
Subword Count
aren 9
oren 2
Oren 2
bake 1
Bete 1
Bake 1
ekin 1
Ekin 1
ETAk 1
Etak 1
Share of words containing repeated subwords of length 4 (per mille): 0.2606
Share of words containing repeated subwords of length 5 (per mille): 0.0000
Share of words containing repeated subwords of length 6 (per mille): 0.0000
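The words listed above suggest that a "repeated subword" is a character sequence of the given length that occurs twice in direct succession, ignoring case (for example ar+ar in euskararen, or ber+ber in berbera). The page does not say how the figures were computed, so the Python sketch below only illustrates that reading. The function names repeated_subwords and per_mille are hypothetical, and taking all supplied words as the denominator is also an assumption.

```python
def repeated_subwords(word, n):
    """Return every length-n subword that is immediately followed by itself.

    Matching ignores case; chunks containing a hyphen are skipped.
    """
    w = word.lower()
    found = []
    for i in range(len(w) - 2 * n + 1):
        chunk = w[i:i + n]
        if "-" not in chunk and chunk == w[i + n:i + 2 * n]:
            found.append(chunk)
    return found


def per_mille(words, n):
    """Share of the given words (per 1000) containing at least one repetition."""
    if not words:
        return 0.0
    hits = sum(1 for w in words if repeated_subwords(w, n))
    return 1000.0 * hits / len(words)


print(repeated_subwords("euskararen", 2))  # ['ar']
print(repeated_subwords("Barbara", 3))     # ['bar']
```

Lowercasing before comparison is what lets "Barbara" count as a repetition of "bar", in line with the paired upper- and lower-case rows in the tables.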
Subword Length 2 - most frequent words with hyphen
Subword Word Frequency
ia ia-ia 39
Ia ia-ia 39
ia Ia-ia 6
Ia Ia-ia 6
bi bi-biak 5
Bi bi-biak 5
bi Bi-biak 2
Bi Bi-biak 2
bi bi-biek 2
Bi bi-biek 2
Subword Length 2 - most frequent subwords with hyphen
Subword Count
en 7
En 7
bi 3
Bi 3
ia 3
Ia 3
Et 2
Ye 2
et 2
re 1
Share of words with hyphen containing repeated subwords of length 2 (per mille): 0.2149
Subword Length 3 - most frequent words with hyphen
Subword Word Frequency
bat bat-batean 58
Bat bat-batean 58
den den-dena 18
Den den-dena 18
bat bat-bateko 17
Bat bat-bateko 17
bat Bat-batean 15
oso oso-oso 15
Oso oso-oso 15
Bat Bat-batean 15
Subword Length 3 - most frequent subwords with hyphen
Subword Count
bat 8
Bat 8
den 5
Den 5
oso 3
Oso 3
doi 3
adi 2
Adi 2
era 2
Share of words with hyphen containing repeated subwords of length 3 (per mille): 0.3778
Subword Length 4 - most frequent words with hyphen
Subword Word Frequency
bete bete-betean 45
Bete bete-betean 45
erdi erdi-erdian 12
Erdi erdi-erdian 12
bizi bizi-bizi 9
Bizi bizi-bizi 9
soil soil-soilik 9
goxo goxo-goxo 8
Goxo goxo-goxo 8
bero bero-bero 7
Subword Length 4 - most frequent subwords with hyphen
Subword Count
bizi 5
Bizi 5
gaur 4
Gaur 4
bete 4
Bete 4
ariñ 2
Beñe 2
goiz 2
beti 2
Share of words with hyphen containing repeated subwords of length 4 (per mille): 0.8890
Subword Length 5 - most frequent words with hyphen
Subword Word Frequency
behin behin-behineko 19
Behin behin-behineko 19
Zuzen zuzen-zuzenean 15
zuzen zuzen-zuzenean 15
banan banan-banan 14
bañan banan-banan 14
Banan banan-banan 14
Bakar bakar-bakarrik 12
bakar bakar-bakarrik 12
barra barra-barra 11
Subword Length 5 - most frequent subwords with hyphen
Subword Count
Behar 5
barru 5
guzti 5
Guzti 5
behin 5
behar 5
Behin 5
zuzen 4
txiki 4
Zuzen 4
Share of words with hyphen containing repeated subwords of length 5 (per mille): 2.3047
Subword Length 6 - most frequent words with hyphen
Subword Word Frequency
Poliki poliki-poliki 24
poliki poliki-poliki 24
apurka apurka-apurka 11
berdin berdin-berdin 11
Apurka apurka-apurka 11
Berdin berdin-berdin 11
Gehien gehien-gehienak 10
gehien gehien-gehienak 10
poliki Poliki-poliki 9
urtero urtero-urtero 9
Subword Length 6 - most frequent subwords with hyphen
Subword Count
Barren 4
barren 4
berdin 3
Berdin 3
gehien 3
Gehien 3
aldian 2
apurka 2
Aldian 2
Apurka 2
Share of words with hyphen containing repeated subwords of length 6 (per mille): 2.0115
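In the hyphenated entries the two copies of the subword sit on either side of a hyphen (bat-batean, poliki-poliki). A regular expression with a backreference is one way to detect this pattern. The sketch below is an illustration only: hyphen_repetition and count_subwords are hypothetical names, and counting each distinct lower-cased word once per subword is an assumption about how a subword tally might be built, not a description of how this page was generated.

```python
import re
from collections import Counter


def hyphen_repetition(word, n):
    """Return the length-n subword repeated across a hyphen, or None."""
    # ([^-]{n}) captures n non-hyphen characters; \1 requires the same
    # characters again directly after the hyphen.
    pattern = r"([^-]{%d})-\1" % n
    match = re.search(pattern, word.lower())
    return match.group(1) if match else None


def count_subwords(words, n):
    """Count, per subword, the distinct words that repeat it across a hyphen."""
    counts = Counter()
    for word in set(w.lower() for w in words):
        sub = hyphen_repetition(word, n)
        if sub is not None:
            counts[sub] += 1
    return counts


print(hyphen_repetition("bat-batean", 3))     # bat
print(hyphen_repetition("Poliki-poliki", 6))  # poliki
```

Because the pattern fixes the length n, bat-batean matches only for n = 3, consistent with each word appearing under a single subword length in the tables.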
998329 msec needed at 2018-02-28 08:00