Corpus: als-sqi_news_2015_300K

2.2.11 Repetitions

Typical repetitions within words

Subword Length 2 - most frequent words
Subword Word Frequency
Te vertete 2855
vertete 2855
te vertete 2855
it vitit 1253
It vitit 1253
et mbetet 991
Is ISIS 516
is ISIS 516
He kthehet 485
he kthehet 485
Subword Length 2 - Most frequent subwords
Subword Count
te 227
Te 227
227
et 118
je 84
Je 84
mi 80
Mi 80
he 71
He 71
Amount of words containing repeated subwords of length 2 - per mille
Per mille
15.5439
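The tables above appear to list subwords that are immediately followed by a copy of themselves, ignoring case (for example "te" in "vertete" or "is" in "ISIS"). A minimal sketch of how such a check could be written is shown here. The function names `repeated_subwords` and `per_mille` are hypothetical, and whether the report computes its per-mille value over distinct word forms or over running tokens is not stated in the source, so the denominator used here is an assumption.

```python
import re

def repeated_subwords(word, n):
    """Return each length-n subword that is immediately followed by itself."""
    # A lookahead is used so that overlapping hits (as in "hahaha") are all found.
    # IGNORECASE makes the backreference match regardless of letter case.
    pattern = re.compile(r"(?=(\w{%d})\1)" % n, re.IGNORECASE)
    return [m.group(1).lower() for m in pattern.finditer(word)]

def per_mille(words, n):
    """Share (per mille) of the given words with at least one repetition.

    Assumption: the denominator is simply the length of the list passed in.
    """
    hits = sum(1 for w in words if repeated_subwords(w, n))
    return 1000.0 * hits / len(words)

# Illustrative sample only; the first three words are taken from the table above.
sample = ["vertete", "vitit", "ISIS", "shtet"]
for w in sample:
    print(w, repeated_subwords(w, 2))
print(per_mille(sample, 2))
```

Because `\w` does not match a hyphen, this sketch only finds repetitions inside an unbroken run of letters; hyphenated words need a separate pattern.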
Subword Length 3 - most frequent words
Subword Word Frequency
Gje pergjegjesi 358
gje pergjegjesi 358
Gje pergjegjes 135
gje pergjegjes 135
Bar barbare 105
bar barbare 105
vet vetveten 82
Vet vetveten 82
vet vetvete 73
Vet vetvete 73
Subword Length 3 - Most frequent subwords
Subword Count
gje 51
Gje 51
bar 36
Bar 36
all 19
All 19
Ash 10
ash 10
she 9
She 9
Amount of words containing repeated subwords of length 3 - per mille
Per mille
2.7169
Subword Length 4 - most frequent words
Subword Word Frequency
Haha Hahahaha 33
haha Hahahaha 33
Haha hahahaha 21
haha hahahaha 21
Haha hahahahaha 16
haha hahahahaha 16
vete veteveten 13
Vete veteveten 13
Haha hahahahah 9
haha hahahahah 9
Subword Length 4 - Most frequent subwords
Subword Count
Haha 15
haha 15
vete 3
Vete 3
Pupu 2
para 1
Para 1
etj. 1
Bobo 1
bobo 1
Amount of words containing repeated subwords of length 4 - per mille
Per mille
0.4852
Amount of words containing repeated subwords of length 5 - per mille
Per mille
0.0000
Subword Length 6 - most frequent words
Subword Word Frequency
Hahaha hahahahahaha 7
hahaha hahahahahaha 7
hahaha hahahahahahahaha 3
Hahaha Hahahahahaha 3
Hahaha hahahahahahahaha 3
ahahah hahahahahahahaha 3
hahaha Hahahahahaha 3
hahaha hahahahahahahahahahahahaha 2
Hahaha hahahahahahahahahahahahaha 2
ahahah hahahahahahahahahahahahaha 2
Subword Length 6 - Most frequent subwords
Subword Count
hahaha 4
Hahaha 4
ahahah 2
Amount of words containing repeated subwords of length 6 - per mille
Per mille
0.6263
Subword Length 2 - most frequent words with hyphen
Subword Word Frequency
sh ish-shefi 3
Sh ish-shefi 3
ko tragjiko-komik 3
Ko tragjiko-komik 3
sh Ish-shefi 2
qi Turqi-Qipro 2
in Putin-in 2
Al Al-Albani 2
Sh Ish-shefi 2
al Al-Albani 2
Subword Length 2 - Most frequent subwords in words with hyphen
Subword Count
bo 3
sh 3
Bo 3
Sh 3
Al 1
al 1
in 1
In 1
qi 1
Qi 1
Amount of words with hyphen containing repeated subwords of length 2 - per mille
Per mille
0.1043
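In the hyphenated examples above, the repeated subword seems to sit on both sides of the hyphen (for example "sh" in "ish-shefi" or "qi" in "Turqi-Qipro"). The following is a small sketch of that reading, not the report's actual implementation. The function name `repeated_across_hyphen` is hypothetical, and the rule "subword, hyphen, same subword" is an assumption inferred from the listed words.

```python
import re

def repeated_across_hyphen(word, n):
    """Return the length-n subword repeated directly across a hyphen, or None."""
    # Pattern: n word characters, a hyphen, then the same n characters again.
    # IGNORECASE lets "qi" match "Qi" on the other side of the hyphen.
    pattern = re.compile(r"(\w{%d})-\1" % n, re.IGNORECASE)
    match = pattern.search(word)
    return match.group(1).lower() if match else None

# Words taken from the hyphen tables in this report.
for word, n in [("ish-shefi", 2), ("Turqi-Qipro", 2), ("lloj-lloj", 4)]:
    print(word, n, repeated_across_hyphen(word, n))
```

Under this reading, a full reduplication such as "lloj-lloj" is reported only at the length of the whole repeated part (4), not at shorter lengths, which is consistent with how the tables group the words.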
Subword Length 3 - most frequent words with hyphen
Subword Word Frequency
gam gam-gam 9
Gam gam-gam 9
Her her-her 5
her her-her 5
Ham ham-ham 4
ham ham-ham 4
bla bla-bla 3
Bla bla-bla 3
etj etj-etj 2
Etj etj-etj 2
Subword Length 3 - Most frequent subwords in words with hyphen
Subword Count
ham 1
Ham 1
bla 1
Bla 1
etj 1
Etj 1
gam 1
Gam 1
her 1
Her 1
Amount of words with hyphen containing repeated subwords of length 3 - per mille
Per mille
0.0647
Subword Length 4 - most frequent words with hyphen
Subword Word Frequency
lloj lloj-lloj 26
Lloj lloj-lloj 26
Cope cope-cope 7
cope cope-cope 7
Copa copa-copa 5
copa copa-copa 5
shum shum-shum 3
gati gati-gati 3
Shum shum-shum 3
vija vija-vija 3
Subword Length 4 - Most frequent subwords in words with hyphen
Subword Count
Vija 1
copa 1
fare 1
Copa 1
Fare 1
gati 1
shte 1
Gati 1
llap 1
lara 1
Amount of words with hyphen containing repeated subwords of length 4 - per mille
Per mille
0.2109
Subword Length 5 - most frequent words with hyphen
Subword Word Frequency
shume shume-shume 5
Shume shume-shume 5
SHume shume-shume 5
shumë shume-shume 5
vende vende-vende 3
Vende vende-vende 3
shtet pushtet-shteti 2
fundi fundi-fundit 2
shqip shqip-shqip 2
grupe grupe-grupe 2
Subword Length 5 - Most frequent subwords in words with hyphen
Subword Count
SHtet 1
vende 1
shqip 1
Vende 1
Shqip 1
avash 1
Avash 1
fundi 1
Fundi 1
grupe 1
Amount of words with hyphen containing repeated subwords of length 5 - per mille
Per mille
0.3307
Subword Length 6 - most frequent words with hyphen
Subword Word Frequency
ngjyra ngjyra-ngjyra 2
ngadal ngadal-ngadal 2
serbes serbes-serbes 2
Ngjyra ngjyra-ngjyra 2
Subword Length 6 - Most frequent subwords in words with hyphen
Subword Count
ngadal 1
ngjyra 1
Ngjyra 1
serbes 1
Amount of words with hyphen containing repeated subwords of length 6 - per mille
Per mille
0.4697
Generated on 2018-01-30 22:48 (processing time: 1130056 ms)