Values for some general parameters
Parameter | Value |
Number of sentences | 999983 |
Average sentence length in characters | 122.6220 |
Average sentence length in words | 19.4077 |
Number of distinct word forms | 450219 |
Number of distinct word forms (without multiwords) | 387646 |
Percentage of lower case word forms | 45.7204 |
Number of multi word units | 62573 |
Percentage of multi word units | 13.8983 |
Number of running word forms | 19654982 |
Number of running word forms (without multiwords) | 19281598 |
Percentage of lower case running words | 85.7500 |
Average word form length | 8.7574 |
Average running word length | 5.25100243 |
Percentage of word forms with frequency=1 | 52.1520 |
Number of sentence based co-occurrences | 3903920 |
- minimal likelihood ratio | 6.63 |
- maximal likelihood ratio | 168150.66 |
Number of neighbour based co-occurrences | 507146 |
- minimal likelihood ratio | 3.84 |
- maximal likelihood ratio | 307273.12 |
Average number of sentence based co-occurrences per sentence | 142.2964 |
Average number of neighbour co-occurrences per sentence | 12.0286 |
Most frequent word | de |
Most frequent word's frequency | 1019896 |
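
Some of the percentages in the table can be recomputed from the raw counts. The short Python sketch below copies the counts from the table and checks two relationships: the distinct word forms split into single-word forms plus multi word units, and the percentage of multi word units is taken over all distinct word forms. The variable names are illustrative, and the choice of denominator for the share of the most frequent word is an assumption, since the table does not give that figure.

```python
# Counts copied from the table; variable names are illustrative.
distinct_forms = 450219
distinct_forms_no_mw = 387646
multiword_units = 62573
running_forms = 19654982
most_frequent_count = 1019896  # frequency of "de"

# Distinct word forms = single-word forms + multi word units.
assert distinct_forms - distinct_forms_no_mw == multiword_units

# Percentage of multi word units among all distinct word forms.
pct_multiword = 100 * multiword_units / distinct_forms
print(f"multi word units: {pct_multiword:.4f} %")

# Share of the most frequent word among all running word forms.
# Assumption: the denominator includes multiwords; the table does not say.
pct_most_frequent = 100 * most_frequent_count / running_forms
print(f"share of 'de': {pct_most_frequent:.2f} %")
```

The recomputed percentage of multi word units agrees with the listed 13.8983 to four decimal places.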
13548 msec needed at 2024-09-13 13:00
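
The two minimal likelihood ratios listed in the table, 6.63 and 3.84, coincide with the critical values of a chi-square distribution with one degree of freedom at the 1 % and 5 % significance levels. The sketch below assumes that the co-occurrence measure is the log-likelihood ratio G² computed on a 2x2 contingency table, which the table itself does not state. The function names and the example counts are hypothetical.

```python
import math


def log_likelihood_ratio(k11, k12, k21, k22):
    """G^2 statistic for a 2x2 contingency table of co-occurrence counts.

    k11: units containing both words, k12: only the first word,
    k21: only the second word, k22: neither word.
    """
    n = k11 + k12 + k21 + k22
    row1, row2 = k11 + k12, k21 + k22
    col1, col2 = k11 + k21, k12 + k22
    cells = (
        (k11, row1, col1),
        (k12, row1, col2),
        (k21, row2, col1),
        (k22, row2, col2),
    )
    total = 0.0
    for observed, row, col in cells:
        if observed > 0:  # a zero cell contributes nothing to the sum
            expected = row * col / n
            total += observed * math.log(observed / expected)
    return 2.0 * total


def chi2_tail_1df(x):
    """P(X > x) for a chi-square variable with one degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# The thresholds from the table map to roughly 1 % and 5 %.
for threshold in (6.63, 3.84):
    print(threshold, round(chi2_tail_1df(threshold), 4))

# Hypothetical counts: two words that always occur together.
print(log_likelihood_ratio(10, 0, 0, 10))
```

Under this reading, a word pair is kept as a sentence based co-occurrence only if its score reaches 6.63, and as a neighbour based co-occurrence only if it reaches 3.84.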