The original and most widely studied PAC model for learning assumes a passive learner, in the sense that the learner plays no role in obtaining information about the unknown concept: the samples are simply drawn independently from some probability distribution. Some work has been done on more powerful oracles and how they affect learnability. To bound the improvement that can be expected from using oracles, we consider active learning in the sense that the learner has complete choice in the information received. Specifically, we allow the learner to ask arbitrary yes/no questions. We consider both active learning under a fixed distribution and distribution-free active learning. In the case of active learning, the underlying probability distribution is used only to measure distance between concepts. For learnability with respect to a fixed distribution, active learning does not enlarge the set of learnable concept classes, but it can improve the sample complexity. For distribution-free learning, it is shown that a concept class is actively learnable if and only if it is finite, so that active learning is in fact less powerful than the usual passive learning model. We also consider a form of distribution-free learning in which the learner knows the distribution being used, so that 'distribution-free' refers only to the requirement that a bound on the number of queries can be obtained uniformly over all distributions. Even with the side information of the distribution being used, a concept class is actively learnable if and only if it has finite VC dimension, so that active learning with the side information still does not enlarge the set of learnable concept classes.*

*This work was supported by the U.S. Army Research Office under Contract DAAL03-86-K1-0171, by the Department of the Navy under Air Force Contract F119628-9U-C-0002, and by the National Science Foundation under contract ECS-8552419.
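To illustrate what arbitrary yes/no questions allow for a finite concept class, the following minimal sketch identifies an unknown concept by repeatedly asking whether the target lies in one half of the remaining candidates. This is the standard halving (binary search) argument and is not necessarily the construction used in this work. The names `identify_concept` and `make_oracle` are hypothetical, concepts are represented as plain hashable labels, and the oracle is assumed to answer truthfully. Under these assumptions, at most ceil(log2 |C|) queries suffice for exact identification.

```python
import math
from typing import Callable, FrozenSet, Hashable, Sequence, Tuple


def identify_concept(
    concepts: Sequence[Hashable],
    oracle: Callable[[FrozenSet[Hashable]], bool],
) -> Tuple[Hashable, int]:
    """Identify the target among finitely many concepts using yes/no queries.

    `oracle(S)` is assumed to answer truthfully whether the target is in S.
    Returns the identified concept and the number of queries asked.
    """
    candidates = list(concepts)
    if not candidates:
        raise ValueError("concept class must be non-empty")
    queries = 0
    while len(candidates) > 1:
        # Ask about the first half of the remaining candidates.
        half = candidates[: len(candidates) // 2]
        queries += 1
        if oracle(frozenset(half)):
            candidates = half
        else:
            candidates = candidates[len(half):]
    return candidates[0], queries


def make_oracle(target: Hashable) -> Callable[[FrozenSet[Hashable]], bool]:
    """Hypothetical truthful oracle for a fixed target concept."""
    return lambda subset: target in subset


if __name__ == "__main__":
    concept_class = list(range(10))  # ten abstract concept labels
    found, used = identify_concept(concept_class, make_oracle(7))
    print(found, used, math.ceil(math.log2(len(concept_class))))
```

Exact identification is a stronger goal than the approximate learning considered above, where the distribution serves only to measure distance between concepts; the sketch shows only how each binary query can halve a finite set of candidates.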