Data Mining, Machine Learning and Text Analytics

Part 1: Data Mining, Machine Learning and Text Analytics (50%)

If a solution is too difficult to describe, provide screenshot(s) to demonstrate your solution.

Exercise 1: Data Mining and Machine Learning (30%)

You work with the PVA_PARTITION data set in this exercise. It contains data that represent charitable donations made to a veterans' organization. The data represent the results of a mail campaign to solicit donations. Solicitations involve sending a small gift to an individual and include a request for a donation. The data set contains the following information:
• a flag to indicate respondents to the appeal (Target Gift Flag) and the dollar amount of their donations (Target Gift Amount)
• respondents' PVA promotion and giving history
• demographic data of the respondents

1. Using SAS Visual Analytics
a. Sign in to SAS Visual Analytics.
b. Select Explore and Visualize Data to begin accessing and exploring the data.
c. Select the PVA_PARTITION data source.
d. Select the Data pane on the left of the canvas (if it is not open).
1) Which level of the Status Category 96NK variable has the highest count? _____________
2) Does the variable Age contain any missing values? If so, how many? ____________________________
3) What is the average of Target Gift Amount? _________________________________
e. Change Target Gift Flag from a measure to a category. It is a binary indicator that represents a response to a mailing, where 1 indicates that the customer responded.
1) How are responders and non-responders distributed in the data? __________________
2) How many females responded to the campaign? ________________
f. Save the report. Click (Menu) and select Save As. Save the report in My Folder → Analytics Toolbox with the name Exercise 1. Click Save.

Continue to work with the PVA_PARTITION data set to train a neural network model. The model aims to classify those customers who made a donation.

2. Training a Neural Network Model in SAS Visual Data Mining and Machine Learning
a. Open your saved report, Exercise 1, which you created in step 1.
b. Select the Data pane on the left of the canvas and open the PVA_PARTITION data source. If you have not done so already, in the Measure column, right-click Target Gift Flag and select Category.
c. Create a new page.
d. Add a neural network to the canvas.
e. Disable auto-refresh on the menu bar (if not done already).
f. Add Target Gift Flag as the response.
g. Under Predictors, click Add. In the Add Data Items window, select all predictor variables except for these five:
• Control Number
• Demographic Cluster
• Partition
• Target Gift Amount
• Target Gift Amount with Zero
(In all, you add 24 predictors.)
h. Create the neural network model by clicking Refresh or enabling auto-refresh.
• How many observations are used by the algorithm?
• Why are not all observations used by the algorithm?
• What is the misclassification rate for the model created with default settings?
i. Select the Options pane on the right and change Optimization Method to SGD. Do you see any improvement in the misclassification rate?
j. Perform honest assessment and examine the results.
1) Select the Data pane on the left of the canvas and set the Partition variable as a new partition.
2) Select the Roles pane on the right of the canvas and assign the Partition variable under the Partition ID role. Refresh the model and note the validation misclassification rate.
3) Select the Options pane and change the L2 regularization parameter value to 0.001. Under Hidden Layers, change the Number of Hidden Layers property to 2. Do these changes result in any improvement in the validation misclassification rate?
4) Examine the validation cumulative lift chart. What can you determine about the top 10% (percentile) of the data? How does this model compare to the Best model?
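
The two assessment statistics that step 2 asks about, the misclassification rate and the cumulative lift at a given percentile, can be illustrated outside the tool. The following is a minimal Python sketch of the usual textbook definitions, not of how SAS Visual Analytics computes them internally. The function names, the 0.5 classification cutoff, and the rounding of the top-percentile size are assumptions made for illustration only.

```python
def misclassification_rate(actual, predicted_prob, cutoff=0.5):
    """Share of observations whose predicted class differs from the actual class.

    actual: list of 0/1 outcomes (for example, Target Gift Flag).
    predicted_prob: list of predicted probabilities that the outcome is 1.
    cutoff: assumed classification threshold (hypothetical default of 0.5).
    """
    if len(actual) != len(predicted_prob) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    wrong = sum(
        1
        for y, p in zip(actual, predicted_prob)
        if (1 if p >= cutoff else 0) != y
    )
    return wrong / len(actual)


def cumulative_lift(actual, predicted_prob, depth=0.10):
    """Response rate in the top `depth` share of ranked cases over the overall rate.

    Cases are ranked by predicted probability, highest first. A lift of 2.0 at
    depth 0.10 means the top 10% contains twice as many responders as a random
    10% sample would be expected to contain.
    """
    if len(actual) != len(predicted_prob) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    overall_rate = sum(actual) / len(actual)
    if overall_rate == 0:
        raise ValueError("lift is undefined when there are no responders")
    ranked = sorted(zip(predicted_prob, actual), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(len(ranked) * depth))  # assumed rounding rule
    top_rate = sum(y for _, y in ranked[:n_top]) / n_top
    return top_rate / overall_rate
```

To mirror the honest assessment in step j, the same functions would be applied only to the rows that the Partition variable assigns to validation, so that the statistics describe data the model was not trained on.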

