1. On the basis of the preceding, prove that for any k ≥ 3, the ERM_{H^n_k} problem is NP-hard.
2. In this exercise we show that hardness of solving the ERM problem is equivalent to hardness of proper PAC learning. Recall that by "properness" of the algorithm we mean that it must output a hypothesis from the hypothesis class. To formalize this statement, we first need the following definition.

Definition 8.2. The complexity class Randomized Polynomial (RP) time is the class of all decision problems (that is, problems in which on any instance one has to determine whether the answer is YES or NO) for which there exists a probabilistic algorithm (namely, the algorithm is allowed to flip random coins while it is running) with these properties:
- On any input instance the algorithm runs in time polynomial in the input size.
- If the correct answer is NO, the algorithm must return NO.
- If the correct answer is YES, the algorithm returns YES with probability a ≥ 1/2 and returns NO with probability 1 − a.
Clearly the class RP contains the class P. It is also known that RP is contained in the class NP. It is not known whether any equality holds among these three complexity classes, but it is widely believed that NP is strictly larger than RP. In particular, it is believed that NP-hard problems cannot be solved by a randomized polynomial time algorithm.
- Show that if a class H is properly PAC learnable by a polynomial time algorithm, then the ERM_H problem is in the class RP. In particular, this implies that whenever the ERM_H problem is NP-hard (for example, the class of intersections of halfspaces discussed in the previous exercise), then, unless NP = RP, there exists no polynomial time proper PAC learning algorithm for H.

Hint: Assume you have an algorithm A that properly PAC learns a class H in time polynomial in some class parameter n as well as in 1/ε and 1/δ. Your goal is to use that algorithm as a subroutine to construct an algorithm B for solving the ERM_H problem in randomized polynomial time. Given a training set S ∈ (X × {±1})^m and some h ∈ H whose error on S is zero, apply the PAC learning algorithm to the uniform distribution over S, and run it so that with probability ≥ 0.3 it finds a function h ∈ H that has error less than ε = 1/|S| (with respect to that uniform distribution). Show that the algorithm just described satisfies the requirements for being an RP solver for ERM_H.
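The reduction in the hint can be sketched in code. This is only an illustrative sketch: the callables `pac_learner` (playing the role of the proper PAC learner A) and `sample_size` (its assumed sample-complexity bound) are hypothetical interfaces, not part of the exercise. The key point the sketch makes concrete is that, under the uniform distribution over S, error less than 1/|S| forces zero mistakes on S, and that the final verification step guarantees the algorithm never answers YES on a NO instance, as RP requires.

```python
import random

def rp_erm_solver(S, pac_learner, sample_size):
    """Sketch of the algorithm B from the hint (assumed interfaces, for illustration).

    S: list of (x, y) pairs with y in {+1, -1}.
    pac_learner(sample, epsilon, delta): hypothetical proper PAC learner A;
        returns a hypothesis h (a callable x -> {+1, -1}) from the class H.
    sample_size(epsilon, delta): hypothetical sample-complexity bound for A.
    Returns "YES" if a hypothesis with zero error on S was found, else "NO".
    """
    m = len(S)
    epsilon = 1.0 / m   # error < 1/m under uniform(S) means zero mistakes on S
    delta = 0.7         # so A succeeds with probability >= 0.3, as in the hint
    n = sample_size(epsilon, delta)
    # Simulate the "unknown" distribution: draw an i.i.d. sample from uniform over S.
    sample = [random.choice(S) for _ in range(n)]
    h = pac_learner(sample, epsilon, delta)
    # Verification step: answer YES only if h truly has zero empirical error on S.
    # This guarantees a NO instance is never answered YES (the RP requirement),
    # while a YES (realizable) instance is answered YES with probability >= 0.3,
    # which can then be amplified by repetition.
    if all(h(x) == y for (x, y) in S):
        return "YES"
    return "NO"
```

Note the one-sided error structure: soundness (never YES on a NO instance) comes from the explicit check against S, while completeness (YES with constant probability on a YES instance) comes from the PAC guarantee of A; amplifying the 0.3 success probability above 1/2 takes only a constant number of independent repetitions.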