Values for some general parameters
Parameter | Value |
Number of sentences | 1000000 |
Average sentence length in characters | 107.4255 |
Average sentence length in words | 15.2731 |
Number of distinct word forms | 731650 |
Number of distinct word forms (without multiwords) | 662004 |
Percentage of lower case word forms | 18.4400 |
Number of multi word units | 69646 |
Percentage of multi word units | 9.5190 |
Number of running word forms | 15632893 |
Number of running word forms (without multiwords) | 15232219 |
Percentage of lower case running words | 62.9358 |
Average word form length | 12.0337 |
Average running word length | 5.94330051 |
Percentage of word forms with frequency=1 | 60.9613 |
Number of sentence based co-occurrences | 2543058 |
- minimal likelihood ratio | 6.63 |
- maximal likelihood ratio | 71264.16 |
Number of neighbour based co-occurrences | 438564 |
- minimal likelihood ratio | 3.84 |
- maximal likelihood ratio | 134814.38 |
Average number of sentence based co-occurrences per sentence | 75.3051 |
Average number of neighbour co-occurrences per sentence | 7.2511 |
Most frequent word | der |
Most frequent word's frequency | 440550 |
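Some of the values above can be derived from others, which gives a quick consistency check. The following is a minimal sketch in Python that recomputes the percentage of multi word units and the gap between the two distinct word form counts. The variable names and the `percentage` helper are illustrative assumptions, not part of the original tooling. Reading the difference between the two running word form counts as the number of multi word occurrences is likewise an assumption.

```python
# Values copied from the table above.
distinct_forms = 731650          # Number of distinct word forms
distinct_forms_no_mw = 662004    # ... without multiwords
multiword_units = 69646          # Number of multi word units
running_forms = 15632893         # Number of running word forms
running_forms_no_mw = 15232219   # ... without multiwords


def percentage(part, whole):
    """Return part/whole as a percentage rounded to four decimals (hypothetical helper)."""
    return round(100.0 * part / whole, 4)


# Share of multi word units among all distinct word forms.
multiword_pct = percentage(multiword_units, distinct_forms)

# Distinct word forms that disappear when multiwords are excluded.
forms_gap = distinct_forms - distinct_forms_no_mw

# Assumption: this difference counts running occurrences of multi word units.
running_multiwords = running_forms - running_forms_no_mw

print(multiword_pct, forms_gap, running_multiwords)
```

The recomputed percentage and the gap in distinct word forms can then be compared against the "Percentage of multi word units" and "Number of multi word units" rows.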
Computation took 13384 msec (2021-05-11 20:00).
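The minimal likelihood ratios listed for the co-occurrences (6.63 and 3.84) coincide with the chi-square critical values for one degree of freedom at p = 0.01 and p = 0.05. The source does not state how the scores were computed. The sketch below assumes a Dunning-style log-likelihood ratio over a 2x2 contingency table; the function names and the example counts are hypothetical and serve only as illustration.

```python
import math

# Thresholds taken from the table; their use as cut-offs is an assumption.
SENTENCE_MIN_LLR = 6.63    # sentence based co-occurrences
NEIGHBOUR_MIN_LLR = 3.84   # neighbour based co-occurrences


def _xlogx(x):
    """x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return x * math.log(x) if x > 0 else 0.0


def log_likelihood_ratio(k11, k12, k21, k22):
    """G^2 statistic for a 2x2 contingency table.

    k11: units containing both words
    k12: units containing only the first word
    k21: units containing only the second word
    k22: units containing neither word
    """
    n = k11 + k12 + k21 + k22
    cells = _xlogx(k11) + _xlogx(k12) + _xlogx(k21) + _xlogx(k22)
    rows = _xlogx(k11 + k12) + _xlogx(k21 + k22)
    cols = _xlogx(k11 + k21) + _xlogx(k12 + k22)
    return 2.0 * (cells + _xlogx(n) - rows - cols)


def passes_threshold(score, threshold):
    """True if a co-occurrence score reaches the given minimal ratio."""
    return score >= threshold


# Hypothetical counts: two words that co-occur far more often than chance.
associated = log_likelihood_ratio(50, 5, 5, 940)
# Hypothetical counts: two words that are statistically independent.
independent = log_likelihood_ratio(10, 10, 10, 10)

print(passes_threshold(associated, SENTENCE_MIN_LLR))
print(passes_threshold(independent, NEIGHBOUR_MIN_LLR))
```

Under this assumption, a lower threshold for neighbour based co-occurrences admits weaker associations than the sentence based threshold does.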