This problem uses an expanded version of the Boston Housing data, which contains five predictor variables for MEDV, the median value of owner-occupied homes (in $1000s). The five predictors are:
CRIM—crime rate per 1000 persons
NOX—nitric oxide concentration (parts per 10 million)
RM—average number of rooms per dwelling
PTRATIO—pupil–teacher ratio by town
LSTAT—% lower status of the population
(a) Make a scatterplot of MEDV versus CRIM. What do you see? On the basis of your scatterplot, does CRIM appear helpful in predicting MEDV?
(b) Run a regression of MEDV as a function of CRIM. Report a p-value and interpret it. From this, does CRIM appear helpful in predicting MEDV?
(c) Make a scatterplot of MEDV versus LSTAT. What do you see? On the basis of your scatterplot, does LSTAT appear helpful in predicting MEDV?
(d) Run a regression of MEDV as a function of LSTAT. Report a p-value and interpret it. From this, does LSTAT appear helpful in predicting MEDV?
(e) So far, we have analyzed one predictor variable at a time, so we do not have a good idea of how the predictors work together to affect MEDV. Run a multiple regression of MEDV versus all five of the predictor variables. Report the regression equation and the p-value for each predictor variable from the computer output. From this, does the regression appear helpful in predicting MEDV?
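One possible way to obtain the coefficients and p-values asked for in parts (b), (d), and (e) is sketched below in Python with NumPy and SciPy. This is only an illustration, not the software the problem assumes. The helper name ols_with_pvalues and the variables x1, x2, and y are hypothetical, and the data are randomly generated placeholders, not the Boston Housing values. With the real data, the columns CRIM, NOX, RM, PTRATIO, and LSTAT would be passed in place of the placeholders.

```python
import numpy as np
from scipy import stats


def ols_with_pvalues(X, y):
    """Least-squares fit of y on X with an intercept.

    Returns (beta, p): the coefficients, intercept first, and the
    two-sided t-test p-value for each coefficient being zero.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:                      # a single predictor, as in (b) and (d)
        X = X[:, None]
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])  # design matrix with intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = n - (k + 1)                    # residual degrees of freedom
    sigma2 = resid @ resid / dof         # estimated error variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    t_stat = beta / se
    p = 2 * stats.t.sf(np.abs(t_stat), dof)
    return beta, p


# Placeholder data (NOT the Boston Housing data): two made-up predictors
# and a made-up response, used only to show the calls.
rng = np.random.default_rng(0)
n = 200
x1 = rng.uniform(0, 30, n)
x2 = rng.uniform(4, 9, n)
y = 30.0 - 0.9 * x1 + 4.0 * x2 + rng.normal(0, 1.0, n)

# Simple regression on one predictor, as in parts (b) and (d).
beta_simple, p_simple = ols_with_pvalues(x1, y)

# Multiple regression on several predictors at once, as in part (e).
beta_multi, p_multi = ols_with_pvalues(np.column_stack([x1, x2]), y)

print("simple:  ", beta_simple, p_simple)
print("multiple:", beta_multi, p_multi)
```

A small p-value for a slope indicates that a slope this far from zero would be unlikely if the predictor had no linear relationship with the response. In the multiple regression, each p-value refers to that predictor's contribution given that the other predictors are already in the model.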