Use your program to generate the following information:
1. The number of word tokens in the database;
2. The number of unique words in the database;
3. The number of words that occur only once in the database;
4. For the 30 most frequent words in the database, provide TF (term frequency), IDF (inverse document frequency), TF*IDF, and probability;
5. The average number of word tokens per document.
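
The five items above can be sketched in Python as follows. This is only a minimal illustration under several assumptions that the assignment does not state: the database is available as a list of document strings, a word token is a lowercased run of letters, digits, or apostrophes, TF is the total count of a word over the whole database, IDF is log10(N / df) where N is the number of documents and df is the number of documents containing the word, and the probability of a word is its count divided by the total number of tokens. The names tokenize and corpus_stats are hypothetical; adjust the definitions if your course uses different ones.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: a token is a lowercased run of letters, digits, or apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def corpus_stats(documents, top_n=30):
    """Compute the five requested statistics for a list of document strings."""
    tokenized = [tokenize(doc) for doc in documents]

    counts = Counter()    # word -> occurrences over the whole database (TF)
    doc_freq = Counter()  # word -> number of documents containing the word
    for tokens in tokenized:
        counts.update(tokens)
        doc_freq.update(set(tokens))

    total_tokens = sum(counts.values())
    n_docs = len(tokenized)

    top_words = []
    for word, tf in counts.most_common(top_n):
        # Assumed definition: IDF = log10(N / df).
        idf = math.log10(n_docs / doc_freq[word])
        top_words.append({
            "word": word,
            "tf": tf,
            "idf": idf,
            "tf_idf": tf * idf,
            # Assumed definition: relative frequency of the word.
            "probability": tf / total_tokens,
        })

    return {
        "total_tokens": total_tokens,                                   # item 1
        "unique_words": len(counts),                                    # item 2
        "words_occurring_once": sum(1 for c in counts.values() if c == 1),  # item 3
        "top_words": top_words,                                         # item 4
        "avg_tokens_per_doc": total_tokens / n_docs if n_docs else 0.0,  # item 5
    }
```

To use it, load each document of the database into a string, pass the list to corpus_stats, and print the returned values; for example, corpus_stats(["the cat sat", "the dog sat down", "a bird"]) reports 9 tokens in total and 7 unique words.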