Valiant's learnability model is extended to learning classes of concepts defined by regions in Euclidean space E". The methods in this paper lead to a unified treatment of some of Valiant's results, along with previous results on distribution-free convergence of certain pattern recognition algorithms. It is shown that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension, a simple combinatorial parameter of the class of concepts to be learned. Using this parameter, the complexity and closure properties of learnable classes are analyzed, and the necessary and sufftcient conditions are provided for feasible learnability.
Abstract. Given a finite set of texts S = (wi , *.., wk) over some fixed finite alphabet 2, a complete inverted tile for S is an abstract data type that provides the functionsfind( which returns the longest prefix of w that occurs (as a subword of a word) in S, freq(w), which returns the number of times w occurs in S, and locations(w), which returns the set of positions where w occurs in S. A data structure. that implements a complete inverted file for S that occupies linear space and can be built in linear time, using the uniform-cost RAM model, is given. Using this data structure, the time for each of the above query functions is optimal. To accomplish this, techniques from the theory of finite automata and the work on suffix trees are used to build a deterministic finite automaton that recognizes the set of all subwords of the set S. This automaton is then annotated with additional information and compacted to facilitate the desired query functions. The result is a data structure that is smaller and more flexible than the s&ix tree.
Modularity analysis is a powerful tool for studying the design of biological networks, offering potential clues for relating the biochemical function(s) of a network with the 'wiring' of its components. Relatively little work has been done to examine whether the modularity of a network depends on the physiological perturbations that influence its biochemical state. Here, we present a novel modularity analysis algorithm based on edge-betweenness centrality, which facilitates the use of directional information and measurable biochemical data.
We extend Valiant's learnability model to learning classes of concepts defined by regions in Euclidean space E". Our methods lead to a unified treatment of some of Valiant's results, along with previous results of Pearl and Devroye and Wagner on distribution-free convergence of certain pattern recognition algorithms. We show that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension, a simple combinatorial parameter of the class of concepts to be learned. Using this parameter, we analyze the complexity and closure properties of learnable classes.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.