Corpus: por-pt_web_2014_1M

2.2.11 Repetitions

Typical repetitions within words
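
The listings below appear to report subwords of a fixed length that occur twice in a row inside a word, matched without regard to case or diacritics (for example "es" in "portugueses", "ca" in "comunicação"). A minimal sketch of that check follows. It is inferred from the listings, not taken from the corpus tooling; the names `fold` and `repeated_subwords` are hypothetical, and the source may apply further filters (for instance, only reporting subwords that are themselves corpus words) that this sketch does not.

```python
import unicodedata


def fold(word: str) -> str:
    """Lowercase a word and strip diacritics (assumed normalization)."""
    decomposed = unicodedata.normalize("NFD", word.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def repeated_subwords(word: str, n: int) -> set[str]:
    """Return folded subwords of length n that are immediately repeated."""
    folded = fold(word)
    return {
        folded[i:i + n]
        for i in range(len(folded) - 2 * n + 1)
        if folded[i:i + n] == folded[i + n:i + 2 * n]
    }


print(repeated_subwords("portugueses", 2))   # {'es'}
print(repeated_subwords("comunicação", 2))   # {'ca'}
print(repeated_subwords("Bárbara", 3))       # {'bar'}
```

The sketch returns only the folded form of each subword, whereas the tables list every surface variant (such as "És", "es", "Es", "és") on its own row.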

Subword Length 2 - most frequent words
Subword Word Frequency
És portugueses 5944
es portugueses 5944
Es portugueses 5944
és portugueses 5944
És meses 5134
es meses 5134
Es meses 5134
és meses 5134
er querer 3091
Ca comunicação 2913
Subword Length 2 - Most frequent subwords
Subword Count
165
165
ca 165
Ca 165
os 162
Os 162
149
Da 121
121
121
Amount of words containing repeated subwords of length 2 - per mille
Per mille
14.8661
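
The per mille figures express a share as parts per thousand. The sketch below shows the arithmetic only; the source does not state whether the denominator is word types or word tokens, so the counts used here are hypothetical and `per_mille` is an illustrative name.

```python
def per_mille(matching: int, total: int) -> float:
    """Express matching/total as parts per thousand."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 1000 * matching / total


# Hypothetical counts chosen for illustration only, not corpus figures.
print(round(per_mille(15, 1000), 4))  # 15.0
```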
Subword Length 3 - most frequent words
Subword Word Frequency
End dependendo 307
end dependendo 307
End atendendo 184
end atendendo 184
End defendendo 146
Ass assassinato 146
end defendendo 146
ass assassinato 146
bar Bárbara 142
Bar Bárbara 142
Subword Length 3 - Most frequent subwords
Subword Count
Ass 28
ass 28
End 24
end 24
bar 13
Bar 13
tar 5
Tan 5
and 4
And 4
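
The Count column plausibly gives, for each subword, the number of distinct words in which it is immediately repeated. That reading is an assumption, as are the helper names `doubled` and `subword_counts` and the tiny word list; the sketch lowercases only and does not strip diacritics.

```python
from collections import Counter


def doubled(word: str, n: int) -> set[str]:
    """Subwords of length n immediately followed by themselves (lowercased)."""
    w = word.lower()
    return {
        w[i:i + n]
        for i in range(len(w) - 2 * n + 1)
        if w[i:i + n] == w[i + n:i + 2 * n]
    }


def subword_counts(words, n: int) -> Counter:
    """Count, per subword, how many distinct words contain it doubled."""
    counts = Counter()
    for word in set(words):
        counts.update(doubled(word, n))
    return counts


# Hypothetical mini word list, not the corpus vocabulary.
sample = ["dependendo", "atendendo", "defendendo", "assassinato", "casa"]
print(subword_counts(sample, 3).most_common())  # [('end', 3), ('ass', 1)]
```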
Amount of words containing repeated subwords of length 3 - per mille
Per mille
1.2803
Subword Length 4 - most frequent words
Subword Word Frequency
Blue Bluebluesky 41
blue Bluebluesky 41
ahah ahahahah 40
Ahah ahahahah 40
eheh eheheheh 35
Eheh eheheheh 35
hehe hehehehe 29
ahah Ahahahah 29
Ahah Ahahahah 29
Hehe hehehehe 29
Subword Length 4 - Most frequent subwords
Subword Count
ahah 11
Ahah 11
haha 9
Haha 9
eheh 5
Eheh 5
hehe 5
Hehe 5
Blue 1
rsrs 1
Amount of words containing repeated subwords of length 4 - per mille
Per mille
0.4188
Subword Length 5 - most frequent words
Subword Word Frequency
mente veementemente 35
Mente veementemente 35
altos MinisTremocos&SaltosAltos 20
salto MinisTremocos&SaltosAltos 20
Altos MinisTremocos&SaltosAltos 20
Salto MinisTremocos&SaltosAltos 20
lenga lengalenga 8
Subword Length 5 - Most frequent subwords
Subword Count
mente 1
Mente 1
altos 1
salto 1
Altos 1
Salto 1
lenga 1
Amount of words containing repeated subwords of length 5 - per mille
Per mille
0.1147
Subword Length 6 - most frequent words
Subword Word Frequency
ahahah ahahahahahah 9
Ahahah ahahahahahah 9
Subword Length 6 - Most frequent subwords
Subword Count
ahahah 1
Ahahah 1
Amount of words containing repeated subwords of length 6 - per mille
Per mille
0.1068
Subword Length 2 - most frequent words with hyphen
Subword Word Frequency
te Diverte-te 25
Te Diverte-te 25
Diverte-te 25
Diverte-te 25
Diverte-te 25
te diverte-te 15
Te diverte-te 15
diverte-te 15
diverte-te 15
diverte-te 15
Subword Length 2 - Most frequent subwords in words with hyphen
Subword Count
te 4
Te 4
4
4
4
Re 4
4
4
re 4
se 2
Amount of words with hyphen containing repeated subwords of length 2 - per mille
Per mille
0.1805
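
In the hyphenated listings the repetition seems to straddle the hyphen ("te" in "Diverte-te", "vera" in "Primavera-Verão"). A minimal sketch of that variant follows, assuming the same case- and diacritic-insensitive comparison; `fold` and `repeated_across_hyphen` are hypothetical names, and the actual criterion used for the corpus may differ.

```python
import unicodedata


def fold(word: str) -> str:
    """Lowercase a word and strip diacritics (assumed normalization)."""
    decomposed = unicodedata.normalize("NFD", word.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def repeated_across_hyphen(word: str, n: int) -> set[str]:
    """Return folded subwords of length n repeated on both sides of a hyphen."""
    folded = fold(word)
    return {
        folded[i:i + n]
        for i in range(len(folded) - 2 * n)
        if folded[i + n] == "-"
        and folded[i:i + n] == folded[i + n + 1:i + 2 * n + 1]
    }


print(repeated_across_hyphen("Diverte-te", 2))       # {'te'}
print(repeated_across_hyphen("Primavera-Verão", 4))  # {'vera'}
```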
Subword Length 3 - most frequent words with hyphen
Subword Word Frequency
cai cai-cai 7
caí cai-cai 7
Cai cai-cai 7
Caí cai-cai 7
tau tau-tau 7
Tau tau-tau 7
blá blá-blá 5
bla blá-blá 5
Blá blá-blá 5
Subword Length 3 - Most frequent subwords in words with hyphen
Subword Count
cai 1
caí 1
Cai 1
Caí 1
tau 1
Tau 1
blá 1
bla 1
Blá 1
Amount of words with hyphen containing repeated subwords of length 3 - per mille
Per mille
0.0388
Subword Length 4 - most frequent words with hyphen
Subword Word Frequency
Piri piri-piri 15
Vera Primavera-Verão 11
verá Primavera-Verão 11
vera Primavera-Verão 11
Verá Primavera-Verão 11
Vera primavera-verão 8
verá primavera-verão 8
vera primavera-verão 8
Verá primavera-verão 8
Subword Length 4 - Most frequent subwords in words with hyphen
Subword Count
Vera 2
verá 2
vera 2
Verá 2
Piri 1
Amount of words with hyphen containing repeated subwords of length 4 - per mille
Per mille
0.0571
Subword Length 5 - most frequent words with hyphen
Subword Word Frequency
assim assim-assim 30
Assim assim-assim 30
lenga lenga-lenga 22
corre corre-corre 5
Corre corre-corre 5
Chupa chupa-chupas 5
chupa chupa-chupas 5
Subword Length 5 - Most frequent subwords in words with hyphen
Subword Count
assim 1
Assim 1
lenga 1
Chupa 1
chupa 1
corre 1
Corre 1
Amount of words with hyphen containing repeated subwords of length 5 - per mille
Per mille
0.1529
Amount of words with hyphen containing repeated subwords of length 6 - per mille
Per mille
0.0000
1826251 msec needed at 2018-06-04 03:20