1. Show that the resulting learning problem is convex-Lipschitz-bounded.
2. Show that no computable algorithm can learn the problem.
3. From Bounded Expected Risk to Agnostic PAC Learning: Let $A$ be an algorithm that guarantees the following: if $m \ge m_{\mathcal{H}}(\epsilon)$, then for every distribution $\mathcal{D}$ it holds that
   \[ \mathbb{E}_{S \sim \mathcal{D}^m}\bigl[L_{\mathcal{D}}(A(S))\bigr] \;\le\; \min_{h \in \mathcal{H}} L_{\mathcal{D}}(h) + \epsilon. \]
   • Show that for every $\delta \in (0, 1)$, if $m \ge m_{\mathcal{H}}(\epsilon\delta)$ then with probability of at least $1 - \delta$ it holds that $L_{\mathcal{D}}(A(S)) \le \min_{h \in \mathcal{H}} L_{\mathcal{D}}(h) + \epsilon$.
     Hint: Observe that the random variable $L_{\mathcal{D}}(A(S)) - \min_{h \in \mathcal{H}} L_{\mathcal{D}}(h)$ is nonnegative and rely on Markov's inequality. (A short derivation sketch is given after this exercise.)
   • For every $\delta \in (0, 1)$ let
     \[ m_{\mathcal{H}}(\epsilon, \delta) \;=\; m_{\mathcal{H}}(\epsilon/2)\,\bigl\lceil \log_2(1/\delta) \bigr\rceil \;+\; \left\lceil \frac{\log(4/\delta) + \log\bigl(\lceil \log_2(1/\delta) \rceil\bigr)}{\epsilon^2} \right\rceil. \]
     Suggest a procedure that agnostic PAC learns the problem with sample complexity of $m_{\mathcal{H}}(\epsilon, \delta)$, assuming that the loss function is bounded by 1.
     Hint: Let $k = \lceil \log_2(1/\delta) \rceil$. Divide the data into $k + 1$ chunks, where each of the first $k$ chunks is of size $m_{\mathcal{H}}(\epsilon/2)$ examples. Train the first $k$ chunks using $A$. On the basis of the previous question, argue that the probability that for all of these chunks we have $L_{\mathcal{D}}(A(S)) > \min_{h \in \mathcal{H}} L_{\mathcal{D}}(h) + \epsilon$ is at most $2^{-k} \le \delta/2$. Finally, use the last chunk as a validation set (see the procedure sketch below).
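For the first part of Exercise 3, the following is a minimal write-up of the Markov argument, using only the expected-risk guarantee assumed in the exercise (one possible sketch, not necessarily the intended solution): set $\theta = L_{\mathcal{D}}(A(S)) - \min_{h \in \mathcal{H}} L_{\mathcal{D}}(h)$, which is nonnegative. If $m \ge m_{\mathcal{H}}(\epsilon\delta)$, the guarantee applied with accuracy parameter $\epsilon\delta$ gives $\mathbb{E}_{S \sim \mathcal{D}^m}[\theta] \le \epsilon\delta$, so Markov's inequality yields
\[ \mathbb{P}\bigl[\theta > \epsilon\bigr] \;\le\; \frac{\mathbb{E}[\theta]}{\epsilon} \;\le\; \frac{\epsilon\delta}{\epsilon} \;=\; \delta, \]
i.e. with probability at least $1 - \delta$ we have $L_{\mathcal{D}}(A(S)) \le \min_{h \in \mathcal{H}} L_{\mathcal{D}}(h) + \epsilon$.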
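For the second part, here is a minimal Python sketch of the chunk-and-validate procedure described in the hint. The callables `A` (the assumed expected-risk learner) and `loss` (a loss bounded by 1), and the chunk size `m_eps_half` $= m_{\mathcal{H}}(\epsilon/2)$, are hypothetical placeholders; the sketch only illustrates the data splitting and the validation step.

    import math

    def chunk_and_validate(S, A, loss, delta, m_eps_half):
        """Confidence-boosting wrapper: train A on k chunks, pick the best on a validation chunk.

        S           -- i.i.d. sample of size at least m_H(eps, delta) (a Python list)
        A           -- learner with the expected-risk guarantee from the exercise (hypothetical callable)
        loss        -- loss(h, example) in [0, 1] (hypothetical callable)
        m_eps_half  -- m_H(eps/2), the chunk size required by A's guarantee
        """
        k = math.ceil(math.log2(1.0 / delta))

        # Train A on each of the first k disjoint chunks of size m_H(eps/2).
        hypotheses = [A(S[i * m_eps_half:(i + 1) * m_eps_half]) for i in range(k)]

        # The remaining examples form the validation set; return the hypothesis
        # with the smallest empirical loss on it.
        validation = S[k * m_eps_half:]
        def empirical_loss(h):
            return sum(loss(h, z) for z in validation) / len(validation)
        return min(hypotheses, key=empirical_loss)

The hint's analysis attaches to exactly these two stages: with probability at least $1 - 2^{-k} \ge 1 - \delta/2$ at least one of the $k$ trained hypotheses is close to optimal, and the last chunk (whose size is the second term of $m_{\mathcal{H}}(\epsilon, \delta)$) is used only to select among the $k$ candidates.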