Corpus: koi_wikipedia_2021

1.1 Summary

Values for some general parameters

Parameter                                                     Value
Number of sentences                                           9152
Average sentence length in characters                         114.5800
Average sentence length in words                              9.5378
Number of distinct word forms                                 22964
Number of distinct word forms (without multiwords)            22964
Percentage of lower case word forms                           66.8612
Number of multi word units                                    0
Percentage of multi word units                                0.0000
Number of running word forms                                  84356
Number of running word forms (without multiwords)             84356
Percentage of lower case running words                        77.8617
Average word form length                                      14.3050
Average running word length                                   11.17002940
Percentage of word forms with frequency=1                     66.3517
Number of sentence based co-occurrences                       17928
- minimal likelihood ratio                                    6.63
- maximal likelihood ratio                                    3749.15
Number of neighbour based co-occurrences                      2037
- minimal likelihood ratio                                    3.85
- maximal likelihood ratio                                    3782.01
Average number of sentence based co-occurrences per sentence  13.5039
Average number of neighbour co-occurrences per sentence       1.4789
Most frequent word                                            да
Most frequent word's frequency                                2092
Generated in 1114 msec on 2021-07-24 at 13:00.
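
As a rough illustration of how parameters like those in the summary could be derived, the following Python sketch computes a few of them from a list of sentences. It is not the tool that produced this report. The function name corpus_summary, the dictionary keys, and the plain whitespace tokenization are all assumptions made for the example, so its results on the real corpus would likely differ from the values listed here.

```python
from collections import Counter


def corpus_summary(sentences):
    """Compute a few summary parameters for a list of sentence strings.

    Assumption: tokens are obtained by splitting on whitespace. The
    tokenizer actually used for the corpus is not documented in this
    report, so this is only an approximation.
    """
    tokenized = [s.split() for s in sentences]
    tokens = [t for sent in tokenized for t in sent]
    freq = Counter(tokens)

    n_sent = len(sentences)
    n_tokens = len(tokens)   # running word forms
    n_types = len(freq)      # distinct word forms
    # Word forms that occur exactly once in the corpus.
    n_freq_one = sum(1 for c in freq.values() if c == 1)
    top_word, top_freq = freq.most_common(1)[0] if freq else (None, 0)

    return {
        "sentences": n_sent,
        "running_word_forms": n_tokens,
        "distinct_word_forms": n_types,
        "avg_sentence_length_chars":
            sum(len(s) for s in sentences) / n_sent if n_sent else 0.0,
        "avg_sentence_length_words":
            n_tokens / n_sent if n_sent else 0.0,
        "pct_word_forms_frequency_one":
            100.0 * n_freq_one / n_types if n_types else 0.0,
        "most_frequent_word": top_word,
        "most_frequent_word_frequency": top_freq,
    }


if __name__ == "__main__":
    # Tiny made-up input, not taken from the corpus.
    sample = ["да а б", "да в"]
    for name, value in corpus_summary(sample).items():
        print(name, value)
```

The co-occurrence counts and likelihood ratios are left out of the sketch, since the report does not state how they were calculated.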