Exercises for Chapter 3 (The school data is in the attachment)
This exercise utilizes the data set schools-a.sav, which can be downloaded from this website:
www.routledge.com/9781138289734
1. You are interested in investigating whether being above or below the median income (medloinc) affects ACT means (act94) for schools. Complete the necessary steps to examine univariate grouped data in order to respond to the questions below. Although deletions and/or transformations may be implied from your examination, all steps will examine original variables.
a. How many participants have missing values for medloinc and act94?
b. Is there a severe split in frequencies between groups?
c. What are the cutoff values for outliers in each group?
d. Which outlying cases should be deleted for each group?
e. Analyzing histograms, normal Q-Q plots, and tests of normality, what is your conclusion regarding normality? If a transformation is necessary, which one would you use?
f. Do the results from Levene’s test for equal variances indicate homogeneity of variance? Explain.
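The first screening steps in Exercise 1 (missing values, group frequencies, and outlier cutoffs) can also be sketched outside SPSS. The Python sketch below is an illustration only, under stated assumptions: the records are invented rather than taken from schools-a.sav, the helper names are hypothetical, and the outlier cutoffs are assumed to follow the boxplot rule of 1.5 times the interquartile range beyond the quartiles. Quartile definitions differ between packages, so SPSS may report slightly different cutoffs.

```python
from statistics import quantiles

# Hypothetical (medloinc, act94) records; None marks a missing value.
# These numbers are made up for illustration, not drawn from schools-a.sav.
records = [
    (0, 19.5), (0, 20.0), (0, 20.5), (0, 21.0), (0, 27.0),
    (1, 17.5), (1, 18.0), (1, 18.5), (1, 19.0), (1, 12.0),
    (1, None), (None, 20.0),
]

def count_missing(rows):
    """Count missing values separately for the grouping and outcome variables."""
    missing_group = sum(1 for g, _ in rows if g is None)
    missing_outcome = sum(1 for _, y in rows if y is None)
    return missing_group, missing_outcome

def split_by_group(rows):
    """Collect outcome values per group, skipping rows with any missing value."""
    groups = {}
    for g, y in rows:
        if g is not None and y is not None:
            groups.setdefault(g, []).append(y)
    return groups

def iqr_cutoffs(values, k=1.5):
    """Assumed boxplot rule: (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

missing = count_missing(records)
groups = split_by_group(records)
sizes = {g: len(v) for g, v in groups.items()}
cutoffs = {g: iqr_cutoffs(v) for g, v in groups.items()}
# Cases falling outside the cutoffs of their own group.
flagged = {
    g: [y for y in v if y < cutoffs[g][0] or y > cutoffs[g][1]]
    for g, v in groups.items()
}

print("missing (medloinc, act94):", missing)
print("group sizes:", sizes)
for g in sorted(groups):
    print("group", g, "cutoffs:", cutoffs[g], "flagged:", flagged[g])
```

Levene's test is left out of this sketch because its p-value requires an F distribution, which the Python standard library does not provide.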
2. Examination of the variable scienc93 indicates a substantial to severe positively skewed distribution. Transform this variable using the two most appropriate methods. After examining the distributions of the transformed variables, which transformation produced the better result?
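As a rough illustration of how two transformations can be compared, the Python sketch below computes skewness before and after transforming a small set of invented, positively skewed scores. It assumes the two candidates are the base-10 logarithm and the inverse (1/x), which are common choices for substantial and severe positive skew; the data, the function name, and that choice of candidates are assumptions, not results from schools-a.sav.

```python
import math

def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical positively skewed scores (all > 0); not taken from schools-a.sav.
raw = [1, 1, 2, 2, 3, 4, 6, 10, 20, 50]

logged = [math.log10(v) for v in raw]  # assumed candidate for substantial skew
inverse = [1.0 / v for v in raw]       # assumed candidate for severe skew

for name, data in [("raw", raw), ("log10", logged), ("inverse", inverse)]:
    print(f"{name:8s} skewness = {skewness(data):.2f}")
```

Both transformations require strictly positive values, so a constant is usually added first when a variable contains zeros. The inverse also reverses the order of the cases, and SPSS reports a small-sample-adjusted skewness, so its values will differ slightly from this moment-based version.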
3. You are interested in studying predictors (math94me, loinc93, and read94me) of the percentage graduating in 1994 (grad94).
a. Examine univariate normality for each variable. What are your conclusions about the distributions? What transformations should be conducted?
b. After making the necessary transformations, examine multivariate outliers using Mahalanobis distance. Which cases should be deleted?
c. After deleting the multivariate outliers, examine multivariate normality and linearity by creating a Scatterplot Matrix.
d. Examine the variables for homoscedasticity by creating a residuals plot (standardized vs. predicted values). What are your conclusions about homoscedasticity?
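The idea behind the Mahalanobis distance screening in Exercise 3 can be sketched in a few lines of Python. To stay within the standard library, this sketch is restricted to two hypothetical variables, so the covariance matrix can be inverted in closed form and the chi-square critical value for df = 2 has the exact form -2 ln(alpha). The data, the function name, and the p < .001 criterion are assumptions made for illustration; with more variables, a general matrix inverse and a chi-square critical value with df equal to the number of variables would be needed.

```python
import math

def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each (x, y) point from the centroid."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Sample variances and covariance (denominator n - 1).
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Closed form of d' S^-1 d for a 2x2 covariance matrix S.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances

# Hypothetical data: 30 cases close to the line y = x, plus one case far off it.
points = [(float(i), float(i + (i * 7) % 5 - 2)) for i in range(30)]
points.append((15.0, 45.0))

# Assumed criterion: chi-square at p < .001. For df = 2 only, the critical
# value is exactly -2 * ln(alpha).
alpha = 0.001
critical = -2.0 * math.log(alpha)

d2 = mahalanobis_sq_2d(points)
flagged = [i for i, d in enumerate(d2) if d > critical]

print(f"critical value (df = 2): {critical:.3f}")
print("flagged case indices:", flagged)
```

Note that the case at x = 15 is unremarkable on x alone; it is extreme only in its combination of values, which is what a multivariate distance is meant to detect.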
Notes
1 The mathematical equation for kurtosis gives a value of 3 when the distribution is normal, but statistical packages subtract 3 before printing so that the expected value is equal to zero.
2 By default, SPSS displays variable labels. To display variable names as shown here, go to the Edit, Options, General tab and, under Variable Lists, check Display names. Click on the Output Labels tab, then select Names, both from Variables in item labels shown as in the Outline Labeling box and from Variables in labels shown as in the Pivot Table Labeling box.
3 The steps illustrated in Figure 3.7 are for illustrative purposes only and are not performed in our example. Continue working on our example from outliers on the next page.
4 At this point, a new data set named career-b.sav was created, reflecting all transformations and recoding of variables performed thus far in this chapter upon the original career-a.sav data set.
5 At this point, the data set career-c.sav was created and posted on the website for reference.
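The subtraction described in Note 1 can be checked numerically. The Python sketch below computes the fourth standardized moment and then subtracts 3; it uses evenly spaced quantiles of the standard normal as a deterministic stand-in for a large normal sample. That stand-in and the function name are illustrative assumptions, and statistical packages typically also apply a small-sample adjustment that is not shown here.

```python
from statistics import NormalDist

def raw_kurtosis(values):
    """Fourth standardized moment m4 / m2**2; equals 3 for a normal distribution."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2

# Evenly spaced quantiles of the standard normal: a deterministic stand-in
# for a large normal sample (an illustrative choice, not from the text).
n = 10_000
normal_like = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]

kurt = raw_kurtosis(normal_like)
excess = kurt - 3.0  # the value a package prints after subtracting 3

print(f"raw kurtosis    = {kurt:.3f}")
print(f"excess kurtosis = {excess:.3f}")
```

The raw value comes out close to 3 and the excess value close to 0, with small deviations because a finite set of quantiles truncates the tails.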