Imagine a document data set in which the class label is generated by the following hidden function (which is unknown to the analyst and therefore has to be learned by a supervised learner): if a term contains an odd number of consonants, the term is of type 1; otherwise, it is of type 2. The class label of a document is type 1 if the majority of the tokens in it are of type 1; otherwise, the class label is type 2. For a document collection of this type, would you prefer to use (1) a Bernoulli naïve Bayes classifier, (2) a multinomial naïve Bayes classifier, (3) a nearest-neighbor classifier, or (4) a univariate decision tree? What is the impact of the lexicon size and the average document size on the various classifiers?
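To make the setup concrete, the following is a minimal sketch of the hidden labeling function applied to synthetic data. The helper names (`term_type`, `document_label`), the randomly generated lexicon, and all size parameters are illustrative assumptions introduced here, not part of the exercise:

```python
import random
import string

def term_type(term):
    # Type 1 if the term contains an odd number of consonants, else type 2.
    consonants = sum(1 for c in term.lower()
                     if c.isalpha() and c not in "aeiou")
    return 1 if consonants % 2 == 1 else 2

def document_label(tokens):
    # Class 1 if a strict majority of tokens are of type 1, else class 2.
    type1_count = sum(1 for t in tokens if term_type(t) == 1)
    return 1 if type1_count > len(tokens) / 2 else 2

# Illustrative synthetic corpus: a random lexicon of 1,000 terms and
# 100 documents of 50 tokens each (sizes chosen arbitrarily).
random.seed(0)
lexicon = ["".join(random.choices(string.ascii_lowercase,
                                  k=random.randint(3, 8)))
           for _ in range(1000)]
docs = [random.choices(lexicon, k=50) for _ in range(100)]
labels = [document_label(doc) for doc in docs]
print(labels[:10])
```

Varying the lexicon size (1000) and the document length (50) in this sketch is one way to probe the second part of the question empirically.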