Values for some general parameters
Parameter | Value
Number of sentences | 10000
Average sentence length in characters | 114.7498
Average sentence length in words | 14.7308
Number of distinct word forms | 35258
Number of distinct word forms (without multiwords) | 35145
Percentage of lower case word forms | 78.0617
Number of multi word units | 113
Percentage of multi word units | 0.3205
Number of running word forms | 147427
Number of running word forms (without multiwords) | 147235
Percentage of lower case running words | 86.1362
Average word form length | 8.9513
Average running word length | 6.67242843
Percentage of word forms with frequency=1 | 68.1179
Number of sentence based co-occurrences | 27922
- minimal likelihood ratio | 6.63
- maximal likelihood ratio | 5315.47
Number of neighbour based co-occurrences | 3873
- minimal likelihood ratio | 3.85
- maximal likelihood ratio | 16296.71
Average number of sentence based co-occurrences per sentence | 27.8206
Average number of neighbour co-occurrences per sentence | 3.3340
Most frequent word | kuti
Frequency of the most frequent word | 3337
385 msec needed at 2018-06-16 09:50
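
As an illustration of how several of the parameters above relate to one another, the following is a minimal sketch that derives them from a list of sentences. It is not the tool that produced this page: the function name `corpus_stats`, the dictionary keys, and the whitespace tokenization are assumptions made for the example, and multiword units and co-occurrence counts are not handled.

```python
from collections import Counter


def corpus_stats(sentences):
    """Compute a few of the listed parameters for a list of sentence strings.

    Assumption: tokens are whitespace-separated. The tokenizer and the
    multiword handling behind the published figures are not known here,
    so results on the same corpus may differ from the table.
    """
    tokens_per_sentence = [s.split() for s in sentences]
    # "Running word forms" are all tokens; "distinct word forms" are the types.
    running = [tok for toks in tokens_per_sentence for tok in toks]
    freq = Counter(running)

    n_sent = len(sentences)
    n_running = len(running)
    n_forms = len(freq)
    top_word, top_freq = freq.most_common(1)[0]

    return {
        "sentences": n_sent,
        "avg_sentence_length_chars": sum(len(s) for s in sentences) / n_sent,
        "avg_sentence_length_words": n_running / n_sent,
        "distinct_word_forms": n_forms,
        "running_word_forms": n_running,
        "pct_lower_case_word_forms": 100 * sum(w.islower() for w in freq) / n_forms,
        "pct_lower_case_running_words": 100 * sum(w.islower() for w in running) / n_running,
        # Averaged over types, so rare long words weigh as much as frequent short ones.
        "avg_word_form_length": sum(len(w) for w in freq) / n_forms,
        # Averaged over tokens, so frequent short words pull the value down.
        "avg_running_word_length": sum(len(w) for w in running) / n_running,
        "pct_forms_with_frequency_1": 100 * sum(1 for c in freq.values() if c == 1) / n_forms,
        "most_frequent_word": top_word,
        "most_frequent_word_frequency": top_freq,
    }
```

The sketch assumes a non-empty corpus; an empty list would need a guard before the divisions. The type-versus-token distinction in the two average-length entries is one plausible reading of why the average word form length in the table (8.9513) is larger than the average running word length (6.67242843).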