1. Construct an example showing that the $0{-}1$ loss function may suffer from local minima; namely, construct a training sample $S \in (\mathcal{X} \times \{\pm 1\})^m$ (say, for $\mathcal{X} = \mathbb{R}^2$) for which there exist a vector $\mathbf{w}$ and some $\epsilon > 0$ such that the two conditions below hold (a numerical checker for candidate constructions is sketched after this exercise):
    1. For any $\mathbf{w}'$ such that $\|\mathbf{w} - \mathbf{w}'\| \le \epsilon$ we have $L_S(\mathbf{w}) \le L_S(\mathbf{w}')$ (where the loss here is the $0{-}1$ loss). This means that $\mathbf{w}$ is a local minimum of $L_S$.
    2. There exists some $\mathbf{w}^\star$ such that $L_S(\mathbf{w}^\star) < L_S(\mathbf{w})$. This means that $\mathbf{w}$ is not a global minimum of $L_S$.
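This checker is not part of the exercise, but a minimal sketch for testing a candidate answer. It assumes halfspace predictors of the form $\mathrm{sign}(\langle \mathbf{w}, \mathbf{x} \rangle)$ and the empirical $0{-}1$ loss; the sample `S`, the candidate `w`, and the radius `eps` are placeholders you supply from your own construction.

```python
import math
import random

def zero_one_loss(w, S):
    """Empirical 0-1 loss of the halfspace sign(<w, x>) on the sample S."""
    errs = sum(1 for (x, y) in S
               if (1 if w[0] * x[0] + w[1] * x[1] > 0 else -1) != y)
    return errs / len(S)

def check_condition_1(w, S, eps, trials=10_000):
    """Condition 1: no w' in the eps-ball around w beats L_S(w)."""
    base = zero_one_loss(w, S)
    for _ in range(trials):
        d = [random.gauss(0, 1), random.gauss(0, 1)]
        norm = math.hypot(d[0], d[1])
        r = eps * random.random()  # random radius within the eps-ball
        w2 = [w[0] + r * d[0] / norm, w[1] + r * d[1] / norm]
        if zero_one_loss(w2, S) < base:
            return False  # found a strictly better nearby vector
    return True

def check_condition_2(w, S, trials=10_000):
    """Condition 2: search for some w* with strictly smaller loss."""
    base = zero_one_loss(w, S)
    for _ in range(trials):
        w_star = [random.uniform(-10, 10), random.uniform(-10, 10)]
        if zero_one_loss(w_star, S) < base:
            return w_star
    return None
```

Since random search over a piecewise-constant loss is only a heuristic, `check_condition_1` returning `True` and `check_condition_2` returning a vector give evidence, not proof, that the two conditions hold.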
2. Consider the learning problem of logistic regression: Let $\mathcal{H} = \mathcal{X} = \{\mathbf{x} \in \mathbb{R}^d : \|\mathbf{x}\| \le B\}$, for some scalar $B > 0$, let $Y = \{\pm 1\}$, and let the loss function be defined as $\ell(\mathbf{w}, (\mathbf{x}, y)) = \log\left(1 + \exp(-y \langle \mathbf{w}, \mathbf{x} \rangle)\right)$. Show that the resulting learning problem is both convex-Lipschitz-bounded and convex-smooth-bounded. Specify the parameters of Lipschitzness and smoothness.
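Again as a sketch rather than part of the exercise, the snippet below estimates, by random sampling over the ball of radius $B$, how fast the loss and its gradient can change; the observed ratios should stay at or below the parameters you derive. The dimension `d`, the radius `B`, and the trial count are arbitrary choices; the gradient formula is the standard one for this loss.

```python
import numpy as np

rng = np.random.default_rng(0)
d, B = 5, 3.0

def rand_ball(radius):
    """A random point in the ball of the given radius."""
    v = rng.standard_normal(d)
    return radius * rng.random() * v / np.linalg.norm(v)

def loss(w, x, y):
    return np.log1p(np.exp(-y * (w @ x)))

def grad(w, x, y):
    # d/dw log(1 + exp(-y<w,x>)) = -y x * sigmoid(-y<w,x>)
    m = -y * (w @ x)
    return -y * x / (1.0 + np.exp(-m))

lip, smooth = 0.0, 0.0
for _ in range(20_000):
    x = rand_ball(B)                   # instance with ||x|| <= B
    y = rng.choice([-1.0, 1.0])
    w, u = rand_ball(B), rand_ball(B)  # two hypotheses in H
    dwu = np.linalg.norm(w - u)
    if dwu < 1e-9:
        continue
    lip = max(lip, abs(loss(w, x, y) - loss(u, x, y)) / dwu)
    smooth = max(smooth, np.linalg.norm(grad(w, x, y) - grad(u, x, y)) / dwu)

print("max loss ratio (lower bound on Lipschitz parameter):", round(lip, 3))
print("max gradient ratio (lower bound on smoothness parameter):", round(smooth, 3))
```

Because sampling only lower-bounds the true constants, the printed values should sit at or below the Lipschitz and smoothness parameters your proof yields; both depend only on $B$.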