1. Discuss the advantages of rule-based learners over decision trees when the amount of data is limited.
2. Discuss how one might integrate domain knowledge with rule-based learners.
3. The bias variable is often handled in least-squares classification and regression by appending a column of 1s to the data matrix. Discuss how this approach differs from using an explicit bias term when regularized forms of the model are used.
4. Write the optimization formulation for least-squares regression of the form y = W · X + b with a bias term b. Do not use regularization. Show that the optimal value of the bias term b always evaluates to 0 when the data matrix D and response variable vector y are both mean-centered.
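The claim in Exercise 4 can be checked numerically before attempting the proof. The sketch below is only a sanity check, not a derivation. It assumes NumPy is available, uses synthetic data generated purely for illustration, and introduces a hypothetical helper, `fit_least_squares_with_bias`, that fits the unregularized model by appending a column of 1s (the device mentioned in Exercise 3).

```python
import numpy as np

def fit_least_squares_with_bias(D, y):
    """Minimize sum_i (W . X_i + b - y_i)^2 without regularization.

    Hypothetical helper: the bias b is fitted as the coefficient
    of an appended column of 1s.
    """
    n = D.shape[0]
    D_aug = np.hstack([D, np.ones((n, 1))])
    coef, *_ = np.linalg.lstsq(D_aug, y, rcond=None)
    return coef[:-1], coef[-1]  # (W, b)

# Synthetic data (an assumption for illustration): features with a
# nonzero mean and a response with a true offset of 7.
rng = np.random.default_rng(0)
D = rng.normal(loc=3.0, scale=2.0, size=(50, 4))
true_W = np.array([1.5, -2.0, 0.5, 4.0])
y = D @ true_W + 7.0 + rng.normal(scale=0.1, size=50)

# Fit on the raw data: the bias is generally nonzero.
W_raw, b_raw = fit_least_squares_with_bias(D, y)

# Mean-center each column of D and the response vector y, then refit.
D_c = D - D.mean(axis=0)
y_c = y - y.mean()
W_c, b_c = fit_least_squares_with_bias(D_c, y_c)

print("bias on raw data:     ", b_raw)
print("bias on centered data:", b_c)
print("same coefficients W:  ", np.allclose(W_raw, W_c))
```

On the centered data the fitted bias should vanish up to floating-point error, while the coefficient vector W is unchanged by centering. The analytical argument follows the same pattern: setting the derivative of the objective with respect to b to zero expresses b in terms of the means of y and of the columns of D, both of which are zero after centering.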