Corpus: cor_wikipedia_2018

1.1 Summary

Values for some general parameters

Parameter Value
Number of sentences 3844
Average sentence length in characters 90.7159
Average sentence length in words 15.6860
Number of distinct word forms 12287
Number of distinct word forms (without multiwords) 12144
Percentage of lower case word forms 57.7684
Number of multi word units 143
Percentage of multi word units 1.1638
Number of running word forms 60103
Number of running word forms (without multiwords) 59757
Percentage of lower case running words 74.7483
Average word form length 6.8160
Average running word length 4.73141222
Percentage of word forms with frequency=1 59.9333
Number of sentence based co-occurrences 12892
- minimal likelihood ratio 6.63
- maximal likelihood ratio 1063.02
Number of neighbour based co-occurrences 2356
- minimal likelihood ratio 3.87
- maximal likelihood ratio 1826.89
Average number of sentence based co-occurrences per sentence 24.0744
Average number of neighbour co-occurrences per sentence 3.8652
Most frequent word an
Most frequent word's frequency 3006
963 msec needed at 2019-02-11 08:00
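
The table reports likelihood ratios for co-occurrences but does not say how they are computed. A minimal sketch, assuming the common log-likelihood ratio (Dunning's G^2) over a 2x2 contingency table of sentence counts, is given below. The function names and the example counts are hypothetical; only the sentence total (3844) and the reported minimum of 6.63 come from the table. The reported minimum would be consistent with a significance cutoff at the chi-square critical value for p = 0.01 with one degree of freedom (about 6.63), but that reading is an assumption, not something the page states.

```python
import math


def _xlogx(x):
    """Return x * ln(x), with the usual convention 0 * ln(0) = 0."""
    return x * math.log(x) if x > 0 else 0.0


def log_likelihood_ratio(k11, k12, k21, k22):
    """Dunning's G^2 statistic for a 2x2 contingency table.

    k11: units containing both words
    k12: units containing word A but not word B
    k21: units containing word B but not word A
    k22: units containing neither word
    """
    n = k11 + k12 + k21 + k22
    cells = _xlogx(k11) + _xlogx(k12) + _xlogx(k21) + _xlogx(k22)
    rows = _xlogx(k11 + k12) + _xlogx(k21 + k22)
    cols = _xlogx(k11 + k21) + _xlogx(k12 + k22)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 2.0 * (cells + _xlogx(n) - rows - cols))


def sentence_cooccurrence_llr(n_ab, n_a, n_b, n_sentences):
    """G^2 for two words, given sentence counts (hypothetical helper).

    n_ab: sentences containing both words
    n_a, n_b: sentences containing word A / word B
    n_sentences: total number of sentences in the corpus
    """
    return log_likelihood_ratio(
        n_ab,
        n_a - n_ab,
        n_b - n_ab,
        n_sentences - n_a - n_b + n_ab,
    )


# Hypothetical counts in a corpus of 3844 sentences (total taken from the table).
score = sentence_cooccurrence_llr(n_ab=15, n_a=20, n_b=20, n_sentences=3844)
print(score > 6.63)  # whether the pair would pass the assumed p = 0.01 cutoff
```

Under this assumption, a word pair would only be listed as a sentence based co-occurrence if its score reaches the reported minimum. The neighbour based minimum of 3.87 lies just above 3.84, the corresponding critical value for p = 0.05, which suggests a looser cutoff there; again, this is an inference from the numbers rather than documented behaviour.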