University Rankings. The dataset on American college and university rankings
( ) contains information on 1302 American colleges and universities
offering an undergraduate program. For each university there are 17 measurements
that include continuous measurements (e.g., tuition and graduation rate) and categorical
measurements (e.g., location by state and whether it is a private or a public
school).
a. Make sure the variables are coded correctly in JMP (Nominal, Ordinal, or Continuous),
then use the Columns Viewer to summarize the data. Are there any missing
values? How many Nominal columns are there?
b. Conduct a principal components analysis on the data and comment on the results.
Recall that, by default, JMP will conduct the analysis on correlations rather than
covariances. Is this necessary? Do the data need to be normalized in this case?
Discuss key considerations in this decision.
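
As an aid to thinking about part (b), the following minimal sketch contrasts a principal components analysis on the covariance matrix with one on the correlation matrix. It uses made-up toy data rather than the rankings file, and the names `tuition`, `grad_rate`, and `pca_variance_shares` are illustrative assumptions. It is written in Python with NumPy, not in JMP.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (made up for illustration): two columns on very different scales.
tuition = rng.normal(20000, 5000, size=200)   # dollars
grad_rate = rng.normal(60, 15, size=200)      # percent
X = np.column_stack([tuition, grad_rate])


def pca_variance_shares(matrix):
    """Return the share of total variance carried by each principal
    component, largest first, given a symmetric covariance or
    correlation matrix."""
    eigvals = np.linalg.eigvalsh(matrix)[::-1]  # eigvalsh sorts ascending
    return eigvals / eigvals.sum()


# PCA on covariances: the large-scale column dominates the first component.
cov_shares = pca_variance_shares(np.cov(X, rowvar=False))

# PCA on correlations: each column contributes on an equal footing.
corr_shares = pca_variance_shares(np.corrcoef(X, rowvar=False))

# Standardizing the columns first and then using covariances gives the
# same result as working with correlations directly.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
z_cov_shares = pca_variance_shares(np.cov(Z, rowvar=False))

print("covariance-based shares: ", cov_shares)
print("correlation-based shares:", corr_shares)
print("standardized covariance: ", z_cov_shares)
```

In this toy setting the covariance-based analysis is driven almost entirely by the column with the larger units, which is the kind of consideration the exercise asks you to discuss.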
4.4 Sales of Toyota Corolla Cars. The file contains data on
used cars (Toyota Corollas) on sale during late summer of 2004 in the Netherlands.
The data table has 1436 records containing details on 38 attributes, including Price,
Age, Kilometers, HP, and some categorical and dummy-coded variables. The ultimate
goal will be to predict the price of a used Toyota Corolla based on its specifications.
Although special coding of categorical variables in JMP is generally not required, in
this exercise we explore how to create dummy variables (and why).
a. Identify the categorical variables.
b. Which variables have been dummy coded?
c. Consider the variable Fuel_Type. How many binary dummy variables are required
to capture the information for this variable?
d. Use the Make Indicator Variables option under Cols > Utilities to convert
Fuel_Type into dummy variables, then change the Modeling Types for these
dummy variables to Continuous. Explain in words the values in the derived binary
dummies.
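
For readers who want to see what dummy coding produces outside JMP, here is a minimal sketch in Python with pandas. The records are made up, and the level names `Petrol`, `Diesel`, and `CNG` are placeholders assumed for illustration, not values read from the data file.

```python
import pandas as pd

# Toy records (made up); the level names are illustrative placeholders.
cars = pd.DataFrame(
    {"Fuel_Type": ["Petrol", "Diesel", "CNG", "Petrol", "Diesel"]}
)

# One 0/1 indicator column per level: a 1 marks the level the record has.
dummies = pd.get_dummies(cars["Fuel_Type"], prefix="Fuel_Type", dtype=int)

# Dropping one level leaves k - 1 columns for k levels. The dropped level
# is implied when all remaining indicators in a row are 0.
reduced = pd.get_dummies(
    cars["Fuel_Type"], prefix="Fuel_Type", drop_first=True, dtype=int
)

print(dummies)
print(reduced)
```

Each row of `dummies` contains exactly one 1, so any one column can be recovered from the others. That redundancy is why a categorical variable with k levels needs only k - 1 dummies in a model that includes an intercept.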