## What parameter of the regression model is being estimated by a Root MSE?

The questions are based on Lecture Notes 2 and Lecture Notes 1. Please write in the first person singular (“I”). Show active engagement with the lecture notes. Submit all text, graphs and SAS output in a single Word document. Strive for readability; imagine that someone might actually read what you have written.

1. The following output was shown in Problem #2 of LN2.C GroupWork 2:

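| Obs | Group | _MODEL_ | _TYPE_ | _DEPVAR_ | _RMSE_ | Intercept | Count | NetWt |
|-----|-------|---------|--------|----------|---------|-----------|---------|-------|
| 1 | 2 | MODEL1 | PARMS | NetWt | 1.52202 | 25.2154 | 1.28176 | -1 |
| 2 | 3 | MODEL1 | PARMS | NetWt | 0.94023 | 27.2769 | 1.16531 | -1 |
| 3 | 5 | MODEL1 | PARMS | NetWt | 0.96081 | 17.6571 | 1.65238 | -1 |
| 4 | 6 | MODEL1 | PARMS | NetWt | 1.01435 | 19.1121 | 1.59741 | -1 |
| 5 | 7 | MODEL1 | PARMS | NetWt | 1.53226 | 26.1459 | 1.22875 | -1 |
| 6 | 8 | MODEL1 | PARMS | NetWt | 1.09972 | 28.7744 | 1.11778 | -1 |
| 7 | 9 | MODEL1 | PARMS | NetWt | 0.99709 | 22.1760 | 1.42708 | -1 |
| 8 | 10 | MODEL1 | PARMS | NetWt | 1.10568 | 26.5912 | 1.18456 | -1 |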

(a) Note the presence of the column named _RMSE_. RMSE stands for Root MSE. What is Root MSE and how is it computed?

(b) What parameter of the regression model is being estimated by a Root MSE?

(c) From what daughter population do the 8 values in the _RMSE_ column come?

(d) Surprisingly, the grand average of the daughter population in the previous question does not equal the corresponding population parameter. We say, then, that the RMSE estimator has what characteristic?

(e) Recall that the Root MSE is the square root of something. What is that thing called? Where is its numeric value found in the regression output on page 11? What parameter does it estimate? Under what conditions would we say this estimator is unbiased? (The general topic of “unbiased” is discussed in the univariate case for the sample average on page 5 of LN1.A. Here, you are not dealing with the univariate case or the sample average!)
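As a companion to Problem 1, here is a minimal Python sketch of how a Root MSE can be computed by hand for a simple linear regression with an intercept and one predictor (the same shape as the Intercept and Count columns above). The data and the function name `root_mse` are made up for illustration; they are not the lecture-note data, and the course itself uses SAS rather than Python.

```python
import math

def root_mse(x, y):
    """Root MSE for a least-squares line y = b0 + b1*x.

    Assumes one predictor plus an intercept, so the error
    degrees of freedom are n - 2.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx               # slope estimate
    b0 = y_bar - b1 * x_bar      # intercept estimate
    # Sum of squared residuals (SSE)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)          # mean squared error
    return math.sqrt(mse)        # Root MSE

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
print(round(root_mse(x, y), 4))
```

The divisor is n - 2 rather than n because two coefficients are estimated from the same data before the residuals are squared and summed.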

2. Go to the Student Stock Assignment file and find your stock. Following the provided instructions, download the daily stock history for 1999 for your stock and for the S&P500 (ticker symbol ^GSPC). Compute the daily percentage return as shown. Using my SAS program 2.4 from page 10 of LN2.A as your template, regress the daily returns for 2019 for your stock on the daily returns for the S&P500. (Warning: it is not safe to put special characters into a SAS program, so don’t put the symbol “&” in your program. The & symbol is actually a powerful tool in SAS.)

(a) Show the graphical output.

(b) Insert a vertical line for an arbitrary x-value. That line cuts the upper and lower 95% prediction lines provided by SAS in the graph. What are the approximate values of the lower and upper endpoints of the 95% prediction interval?

(c) What is the y being “predicted” above? The units are percent, but percent of what?

(d) Your line also crossed the darker region, creating a shorter interval. What are the approximate endpoints of that interval? What is the name of that interval?

(e) What is being estimated by that interval?

(f) The vertical line also crossed the y-hat line. What is the value of y-hat at that intersection?

(g) What population characteristic is being estimated by the y-hat value, above?
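To check the values read off the SAS graph in Problem 2, the sketch below computes daily percentage returns, fits the regression line, and evaluates y-hat with its two intervals at a chosen x-value. Several pieces are assumptions rather than the course's method: the return formula 100 * (P_t - P_{t-1}) / P_{t-1} (the handout's own formula is not reproduced here), the function names, the made-up prices, and the t critical value, which the caller must supply for the appropriate degrees of freedom.

```python
import math

def pct_returns(closes):
    """Daily percentage returns, ASSUMING 100 * (P_t - P_{t-1}) / P_{t-1}."""
    return [100.0 * (b - a) / a for a, b in zip(closes, closes[1:])]

def fit_line(x, y):
    """Least-squares fit of y on x; returns the pieces needed for intervals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    rmse = math.sqrt(sse / (n - 2))
    return {"n": n, "x_bar": x_bar, "sxx": sxx,
            "intercept": intercept, "slope": slope, "rmse": rmse}

def intervals_at(x0, fit, t_crit):
    """y-hat at x0, the interval for the mean response, and the prediction interval.

    t_crit is the two-sided t critical value for n - 2 degrees of freedom,
    supplied by the caller.
    """
    y_hat = fit["intercept"] + fit["slope"] * x0
    core = 1.0 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"]
    half_mean = t_crit * fit["rmse"] * math.sqrt(core)        # mean response
    half_pred = t_crit * fit["rmse"] * math.sqrt(1.0 + core)  # new observation
    return (y_hat,
            (y_hat - half_mean, y_hat + half_mean),
            (y_hat - half_pred, y_hat + half_pred))

# Hypothetical closing prices, for illustration only
index_closes = [1000.0, 1010.0, 1005.0, 1020.0, 1015.0, 1030.0]
stock_closes = [50.0, 51.0, 50.5, 52.0, 51.2, 52.5]

fit = fit_line(pct_returns(index_closes), pct_returns(stock_closes))
# 3.182 is the 97.5th percentile of t with 3 degrees of freedom (5 returns - 2)
y_hat, mean_ci, pred_pi = intervals_at(0.5, fit, t_crit=3.182)
print(y_hat, mean_ci, pred_pi)
```

The prediction interval is always the wider of the two because it adds the variability of a single new observation (the extra 1 under the square root) to the uncertainty in the fitted line.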

