CS 5630, Data Mining

Spring 2020 Homework 6

Use the OJ data set from the ISLR package, and set the random seed to 123 in every part that involves randomness.

(a) [10pts] Create a training set containing a random sample of 800 observations, and a test set containing the remaining observations.

(b) [10pts] Fit a Naive Bayes classifier to the training data with Purchase as the response and the other variables as predictors. What are the training error rate and test error rate?

(c) [10pts] Fit a support vector classifier to the training data in part (a) using cost=0.01 with Purchase as the response and the other variables as predictors. What are the training and test error rates?

(d) [10pts] Use the tune() function to select an optimal cost. Consider cost values 0.001, 0.01, 0.1, 1, 10, 100.

(e) [10pts] Compute the training and test error rates using the best model from part (d).

(f) [10pts] Repeat part (c) using a support vector machine with a radial kernel. Use the default value of gamma.

(g) [10pts] Repeat part (f) using a support vector machine with a polynomial kernel with degree=2.

(h) [10pts] Considering parts (e), (f), and (g), which approach has the lowest test error rate?
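
Note: the assignment does not define "error rate". The sketch below assumes the usual misclassification rate, that is, the fraction of observations whose predicted Purchase class differs from the observed one. The symbols n, y_i, and \hat{y}_i are notation introduced here for illustration and do not appear in the assignment.

```latex
% Assumed definition: misclassification rate on a set of n observations
% y_i       : observed Purchase class of observation i
% \hat{y}_i : class predicted by the fitted model for observation i
\[
  \text{error rate} \;=\; \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{\hat{y}_i \neq y_i\}
\]
```

Under this reading, n = 800 for the training error rate, and for the test error rate n is the number of observations left over after the split in part (a).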
