Values for some general parameters
parameter | value
number of sentences | 100000
average sentence length in characters | 154.3983
average sentence length in words | 14.7836
number of distinct word forms | 401029
percentage of lower case word forms | 94.0862
percentage of multi word units | 0.1588
number of running word forms | 1892991
percentage of lower case running words | 97.6969
average word form length | 12.4843
average running word length | 9.38975678
percentage of word forms with frequency=1 | 71.7444
number of sentence based co-occurrences | 121814
minimal likelihood ratio (sentence based co-occurrences) | 6.63
maximal likelihood ratio (sentence based co-occurrences) | 2190.04
number of neighbour based co-occurrences | 22832
minimal likelihood ratio (neighbour based co-occurrences) | 3.84
maximal likelihood ratio (neighbour based co-occurrences) | 9697.99
average number of sentence based co-occurrences per sentence | 8.3532
average number of neighbour based co-occurrences per sentence | 1.5121
most frequent word | 있다
most frequent word's frequency | 14206
3278 msec needed at 2017-12-30 01:32
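
The table does not say how the likelihood ratios are computed. The two minimal values, 6.63 and 3.84, coincide with the chi-square critical values for one degree of freedom at p = 0.01 and p = 0.05, which suggests that a co-occurrence is kept only if its score passes a significance threshold. The following is a minimal sketch under that assumption, using Dunning's log-likelihood ratio over a 2x2 contingency table. The function names and the choice of this particular statistic are illustrative assumptions, not something the table confirms.

```python
import math

# Thresholds taken from the "minimal likelihood ratio" rows of the table.
SENTENCE_THRESHOLD = 6.63   # sentence based co-occurrences
NEIGHBOUR_THRESHOLD = 3.84  # neighbour based co-occurrences


def log_likelihood_ratio(k11, k12, k21, k22):
    """Dunning's G^2 statistic for a 2x2 contingency table (assumed measure).

    k11: units containing both word A and word B
    k12: units containing A but not B
    k21: units containing B but not A
    k22: units containing neither
    """
    def xlogx_sum(*counts):
        # Sum of c * ln(c), treating 0 * ln(0) as 0.
        return sum(c * math.log(c) for c in counts if c > 0)

    n = k11 + k12 + k21 + k22
    if n == 0:
        return 0.0
    cells = xlogx_sum(k11, k12, k21, k22)
    rows = xlogx_sum(k11 + k12, k21 + k22)
    cols = xlogx_sum(k11 + k21, k12 + k22)
    # G^2 = 2 * sum(O * ln(O / E)), expanded into sums of x * ln(x).
    g2 = 2.0 * (cells - rows - cols + n * math.log(n))
    return max(0.0, g2)  # clamp tiny negative rounding errors


def is_significant(score, threshold=SENTENCE_THRESHOLD):
    """Hypothetical filter: keep a co-occurrence only at or above the threshold."""
    return score >= threshold
```

With this sketch, a pair of words whose counts are statistically independent scores close to 0 and is dropped, while a pair that always appears together scores far above either threshold.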