Insurance companies base their premiums on many factors, but basically all the factors are variables that predict life expectancy. Life expectancy varies from state to state. Here’s a regression that models Life Expectancy in terms of other demographic variables. The variables are Murder rate per 100,000, HighSchool Graduation rate in %, Income per capita in dollars, Illiteracy rate per 1000, and Life Expectancy in years.
a) The state with the highest leverage and largest Cook’s Distance is Alaska. It is plotted with an x in the residuals plot. Here are a scatterplot of the residuals, a Normal probability plot of the leverage values, and a histogram of Cook’s Distance values. What evidence do you have from these diagnostic plots that Alaska might be an influential point?
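To make the quantities behind these plots concrete, here is a minimal Python sketch that computes leverage values and Cook's Distance for a least-squares fit. It runs on synthetic data with one deliberately unusual case, not on the state data from this exercise, and the function name `leverage_and_cooks` is a hypothetical helper introduced only for illustration.

```python
import numpy as np

def leverage_and_cooks(X, y):
    """Leverages and Cook's Distances for a least-squares fit of y on X.

    X must already contain a column of ones for the intercept.
    """
    n, p = X.shape
    # Hat matrix H = X (X'X)^-1 X'; its diagonal holds the leverages.
    H = X @ np.linalg.inv(X.T @ X) @ X.T
    h = np.diag(H)
    resid = y - H @ y                  # ordinary residuals
    s2 = resid @ resid / (n - p)       # residual variance estimate
    # Cook's Distance combines residual size with leverage.
    cooks = (resid**2 / (p * s2)) * h / (1 - h) ** 2
    return h, cooks

# Synthetic data (an assumption for illustration, not the state data).
rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=(n, 2))
y = 1.0 + x @ np.array([2.0, -1.0]) + rng.normal(scale=0.5, size=n)

# Make the last case unusual in both its predictors and its response.
x[-1] = [6.0, 6.0]
y[-1] = 20.0

X = np.column_stack([np.ones(n), x])
h, cooks = leverage_and_cooks(X, y)
print("highest leverage at case", int(np.argmax(h)))
print("largest Cook's Distance at case", int(np.argmax(cooks)))
```

A case that stands apart from the rest on both the leverage and the Cook's Distance displays, as the planted case does here, is a candidate influential point.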
Here’s another regression with a dummy variable for Alaska added to the regression model.
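As a hedged illustration of what a dummy variable for a single case does, the sketch below fits a regression on synthetic data with and without an indicator for one unusual case. The data, the choice of case 0, and all variable names are assumptions made for this example. It shows two general facts about least squares: the flagged case gets a residual of zero, and the remaining coefficients match those from a fit with that case deleted.

```python
import numpy as np

# Synthetic data (an assumption for illustration, not the state data).
rng = np.random.default_rng(1)
n = 50
x = rng.normal(size=n)
y = 3.0 + 1.5 * x + rng.normal(scale=0.5, size=n)
x[0], y[0] = 5.0, 0.0          # hypothetical unusual case

# Design matrix with intercept, plus an indicator that is 1 only for case 0.
X = np.column_stack([np.ones(n), x])
dummy = np.zeros(n)
dummy[0] = 1.0
X_dummy = np.column_stack([X, dummy])

# Fit with the dummy variable, and separately with case 0 removed.
beta_dummy, *_ = np.linalg.lstsq(X_dummy, y, rcond=None)
beta_deleted, *_ = np.linalg.lstsq(X[1:], y[1:], rcond=None)

resid_dummy = y - X_dummy @ beta_dummy
print("intercept and slope with dummy:  ", beta_dummy[:2])
print("intercept and slope, case deleted:", beta_deleted)
print("residual of flagged case:", resid_dummy[0])
```

Because the dummy absorbs the flagged case entirely, its coefficient measures how far that case lies from the model fit to all the other cases.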