Values for some general parameters
Parameter | Value |
Number of sentences | 100000 |
Average sentence length in characters | 120.7688 |
Average sentence length in words | 16.7821 |
Number of distinct word forms | 100764 |
Number of distinct word forms (without multiwords) | 97343 |
Percentage of lower case word forms | 46.9741 |
Number of multi word units | 3421 |
Percentage of multi word units | 3.3951 |
Number of running word forms | 1687123 |
Number of running word forms (without multiwords) | 1676462 |
Percentage of lower case running words | 77.2791 |
Average word form length | 7.6034 |
Average running word length | 6.08923972 |
Percentage of word forms with frequency=1 | 55.0365 |
Number of sentence based co-occurrences | 403720 |
- minimal likelihood ratio | 6.63 |
- maximal likelihood ratio | 7668.77 |
Number of neighbour based co-occurrences | 68396 |
- minimal likelihood ratio | 3.84 |
- maximal likelihood ratio | 12220.21 |
Average number of sentence based co-occurrences per sentence | 46.8170 |
Average number of neighbour co-occurrences per sentence | 6.1807 |
Most frequent word | yang |
Most frequent word's frequency | 43487 |
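
The minimal likelihood ratios listed above (6.63 and 3.84) match the chi-square critical values for one degree of freedom at p = 0.01 and p = 0.05. This suggests, though the source does not say so, that co-occurrences were filtered with a log-likelihood ratio test. The following is a minimal sketch of such a test under that assumption. The function names, the 2x2 count layout, and the example counts are hypothetical and are not taken from the corpus tooling.

```python
import math

# Thresholds copied from the table; their use as significance cut-offs
# is an assumption, not something the source states.
SENTENCE_THRESHOLD = 6.63   # chi-square critical value, 1 d.f., p = 0.01
NEIGHBOUR_THRESHOLD = 3.84  # chi-square critical value, 1 d.f., p = 0.05


def log_likelihood_ratio(k11, k12, k21, k22):
    """Return the G^2 statistic for a 2x2 contingency table.

    k11: units (sentences or neighbour pairs) containing both words
    k12: units containing word A but not word B
    k21: units containing word B but not word A
    k22: units containing neither word
    """
    total = k11 + k12 + k21 + k22
    row_sums = (k11 + k12, k21 + k22)
    col_sums = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))

    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = row_sums[i] * col_sums[j] / total
                g2 += o * math.log(o / expected)
    return 2.0 * g2


def is_significant(score, threshold):
    """Keep a co-occurrence only if its score reaches the threshold."""
    return score >= threshold


# Hypothetical counts: two words that appear together far more often
# than independence would predict.
score = log_likelihood_ratio(50, 5, 5, 940)
print(is_significant(score, SENTENCE_THRESHOLD))
```

With counts like these the score lies well above both thresholds, while a table whose cells match the independence expectation exactly scores 0 and would be discarded.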
Generation time: 1037 ms (2024-10-20 02:00)