Predicting Flight Delays (Bootstrap Forest and Boosted Trees). We return to the
flight delays data for this exercise, and fit both a bootstrap forest and a boosted tree
to the data. Use scheduled departure time (CRS.DEP TIME) rather than the binned
version for these models.
a. Fit a bootstrap forest with the default settings. Save the formula for this model
to the data table.
i. Look at the column contributions report. Which variables were involved in
the most splits?
ii. What is the error rate on the test set?
b. Fit a boosted tree to the flight delays data, again with the default settings. Save the
formula to the data table.
i. Which variables were involved in the most splits? Is this similar to what you
observed with the bootstrap forest model?
ii. What is the error rate on the test set for this model?
c. Use the Model Comparison platform to compare these models to the final reduced
model found earlier (again, put the validation column in the Group field in the
Model Comparison dialog).
i. Which model has the lowest overall error rate on the test set?
ii. Explain why this model might perform better than the other models
you fit.
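
For readers who want to check the test-set error rates in parts a.ii, b.ii, and c.i outside the Model Comparison platform, the calculation can be sketched in Python. This is a minimal illustration, not the platform's own procedure: the function name error_rates_by_group, the model labels model_a and model_b, and the toy values are all hypothetical, and it assumes the saved predictions have been exported as class labels alongside the validation column.

```python
from collections import defaultdict


def error_rates_by_group(actual, predictions, groups):
    """Return {model_name: {group: misclassification rate}}.

    actual      -- list of true class labels
    predictions -- dict mapping a model name to its list of predicted labels
    groups      -- list of partition labels (e.g. Training/Validation/Test)
    """
    rates = {}
    for model_name, predicted in predictions.items():
        wrong = defaultdict(int)
        total = defaultdict(int)
        for y, y_hat, g in zip(actual, predicted, groups):
            total[g] += 1
            if y != y_hat:
                wrong[g] += 1
        rates[model_name] = {g: wrong[g] / total[g] for g in total}
    return rates


# Made-up toy data, NOT taken from the flight delays table.
actual = ["delayed", "ontime", "ontime", "delayed", "ontime", "ontime"]
groups = ["Training", "Training", "Validation", "Test", "Test", "Test"]
predictions = {
    "model_a": ["delayed", "ontime", "ontime", "ontime", "ontime", "ontime"],
    "model_b": ["delayed", "ontime", "delayed", "delayed", "ontime", "ontime"],
}

rates = error_rates_by_group(actual, predictions, groups)

# Pick the model with the lowest misclassification rate on the test partition.
best = min(rates, key=lambda name: rates[name]["Test"])

for name, by_group in rates.items():
    print(name, by_group)
print("Lowest test error:", best)
```

Grouping by the validation column mirrors putting that column in the Group field: each model gets a separate error rate per partition, and only the test partition is used to choose among models.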