Corpus: afr_newscrawl_2013_10K

1.1 Summary

Values for some general parameters

Parameter Value
Number of sentences 10000
Average sentence length in characters 106.6786
Average sentence length in words 17.9022
Number of distinct word forms 26610
Number of distinct word forms (without multiwords) 26256
Percentage of lower case word forms 67.7715
Number of multi word units 354
Percentage of multi word units 1.3303
Number of running word forms 179151
Number of running word forms (without multiwords) 178589
Percentage of lower case running words 87.2945
Average word form length 8.6360
Average running word length 4.85934744
Percentage of word forms with frequency=1 64.8628
Number of sentence based co-occurrences 22494
- minimal likelihood ratio 6.63
- maximal likelihood ratio 5947.41
Number of neighbour based co-occurrences 5857
- minimal likelihood ratio 3.84
- maximal likelihood ratio 3540.78
Average number of sentence based co-occurrences per sentence 36.6230
Average number of neighbour co-occurrences per sentence 4.4033
Most frequent word die
Frequency of the most frequent word 11494
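
Some of the figures in the table can be cross-checked against each other. The following is a minimal sketch, assuming that the percentage of multi word units is taken relative to the number of distinct word forms; the source does not state the formula, and the variable names are illustrative only. The share of running words covered by "die" is not a table value; it is computed here purely as an illustration.

```python
# Values copied from the summary table.
distinct_forms = 26610          # Number of distinct word forms
distinct_forms_no_mwu = 26256   # ... without multiwords
multiword_units = 354           # Number of multi word units
running_forms = 179151          # Number of running word forms
top_word_freq = 11494           # Frequency of the most frequent word, "die"

# Distinct forms with and without multiwords should differ by the
# number of multi word units.
mwu_from_difference = distinct_forms - distinct_forms_no_mwu

# Assumption: the percentage is relative to all distinct word forms.
mwu_percentage = 100 * multiword_units / distinct_forms

# Illustration only (not a table value): share of all running word
# forms accounted for by the most frequent word.
top_word_share = 100 * top_word_freq / running_forms

print(mwu_from_difference)
print(round(mwu_percentage, 4))
print(round(top_word_share, 2))
```

Under that assumption, the difference and the percentage both agree with the reported table values.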
Generated in 201 ms at 2018-01-30 20:00
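
The minimal likelihood ratios reported for the co-occurrences, 6.63 and 3.84, coincide with the chi-square critical values for one degree of freedom at p = 0.01 and p = 0.05. This suggests, although the source does not say so, that they act as significance thresholds. The sketch below shows one common way to compute a log-likelihood ratio for a word pair from a 2x2 contingency table and to convert it to a p-value. The function names and the example counts are hypothetical, and the corpus may have used a different formula.

```python
import math


def log_likelihood_ratio(k11, k12, k21, k22):
    """G^2 statistic for a 2x2 contingency table.

    k11: units containing both words, k12: only the first word,
    k21: only the second word,       k22: neither word.
    """
    total = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = rows[i] * cols[j] / total
                g2 += o * math.log(o / expected)
    return 2.0 * g2


def p_value_1df(statistic):
    """Upper-tail probability of a chi-square variable with 1 d.o.f."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical counts, for illustration only.
example = log_likelihood_ratio(30, 70, 120, 9780)
print(example, p_value_1df(example))

# The two minimal values from the table map to about 0.05 and 0.01.
print(p_value_1df(3.84), p_value_1df(6.63))
```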