Corpus: deu_news_2014_300K

1.1 Summary

Values for some general parameters

Parameter                                                      Value
Number of sentences                                            296094
Average sentence length in characters                          115.2564
Average sentence length in words                               16.2973
Number of distinct word forms                                  385413
Number of distinct word forms (without multiwords)             348731
Percentage of lower case word forms                            22.8072
Number of multi word units                                     36682
Percentage of multi word units                                 9.5176
Number of running word forms                                   5020676
Number of running word forms (without multiwords)              4807793
Percentage of lower case running words                         61.9541
Average word form length                                       11.3938
Average running word length                                    6.00151878
Percentage of word forms with frequency=1                      62.4953
Number of sentence based co-occurrences                        874790
  - minimal likelihood ratio                                   6.63
  - maximal likelihood ratio                                   24491.44
Number of neighbour based co-occurrences                       162060
  - minimal likelihood ratio                                   3.84
  - maximal likelihood ratio                                   39482.59
Average number of sentence based co-occurrences per sentence   63.4594
Average number of neighbour co-occurrences per sentence        6.5946
Most frequent word                                             der
Frequency of the most frequent word                            133113
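
Some rows in the table can be derived from others, which gives a quick consistency check. The sketch below uses only the values listed above. The variable names are made up for illustration, and reading the difference between the two running-word counts as "running occurrences of multi word units" is an assumption, not something the page states.

```python
# Values copied from the summary table.
distinct_forms = 385413        # Number of distinct word forms
distinct_forms_no_mw = 348731  # ... without multiwords
multiword_units = 36682        # Number of multi word units
running_forms = 5020676        # Number of running word forms
running_forms_no_mw = 4807793  # ... without multiwords

# Distinct forms split exactly into single-word forms and multi word units.
assert distinct_forms - multiword_units == distinct_forms_no_mw

# Share of multi word units among all distinct word forms, in percent.
pct_multiword = 100 * multiword_units / distinct_forms
print(round(pct_multiword, 4))

# Assumption: the gap between the two running counts is the number of
# running occurrences of multi word units (not listed in the table).
running_multiword = running_forms - running_forms_no_mw
print(running_multiword)
```

The computed percentage rounds to the 9.5176 given in the table.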
2694 msec needed at 2018-02-17 20:23
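
The minimal likelihood ratios in the table, 6.63 and 3.84, coincide with the chi-square critical values for one degree of freedom at p = 0.01 and p = 0.05. That suggests, but the page does not confirm, that co-occurrences are filtered with a log-likelihood significance test. The following is a minimal sketch of one common variant, Dunning's G² statistic on a 2x2 contingency table. The function name, the argument names, and the example counts are hypothetical and not taken from the corpus.

```python
import math


def log_likelihood_ratio(k11, k12, k21, k22):
    """Dunning's G^2 for a 2x2 contingency table of counts.

    k11: units containing both words A and B
    k12: units containing A but not B
    k21: units containing B but not A
    k22: units containing neither
    """
    def xlogx(x):
        # Convention: 0 * log(0) = 0.
        return x * math.log(x) if x > 0 else 0.0

    n = k11 + k12 + k21 + k22
    cells = xlogx(k11) + xlogx(k12) + xlogx(k21) + xlogx(k22)
    rows = xlogx(k11 + k12) + xlogx(k21 + k22)
    cols = xlogx(k11 + k21) + xlogx(k12 + k22)
    # Equivalent to 2 * sum(O * ln(O / E)) with E = row_total * col_total / n.
    return 2.0 * (cells + xlogx(n) - rows - cols)


# Hypothetical counts: two words that almost always appear together.
strong = log_likelihood_ratio(100, 10, 10, 10000)
# Hypothetical counts: observed values equal the expected ones (independence).
independent = log_likelihood_ratio(10, 20, 30, 60)

SENTENCE_THRESHOLD = 6.63   # minimal ratio listed for sentence based co-occurrences
NEIGHBOUR_THRESHOLD = 3.84  # minimal ratio listed for neighbour based co-occurrences

print(strong > SENTENCE_THRESHOLD)
print(independent > NEIGHBOUR_THRESHOLD)
```

Under this reading, a word pair would be kept as a sentence based co-occurrence only if its score reaches 6.63, and as a neighbour based co-occurrence only if it reaches 3.84.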