n f n(x) = l:>k¢;(ak. x + bk) + Co (1) k = 1 parameterized by ak E Rd and bk, Ck E R, where a. x denotes the inner product of vectors in Rd. The total number of parameters of the network is (el + 2)'11, + 1.
Abstract-We review the principles of Minimum Description Length and Stochastic Complexity as used in data compression and statistical modeling. Stochastic complexity is formulated as the solution to optimum universal coding problems extending Shannon's basic source coding theorem. The normalized maximized likelihood, mixture, and predictive codings are each shown to achieve the stochastic complexity to within asymptotically vanishing terms. We assess the performance of the minimum description length criterion both from the vantage point of quality of data compression and accuracy of statistical inference. Context tree modeling, density estimation, and model selection in Gaussian linear regression serve as examples.
Performance bounds for criteria for model selection are developed using recent theory for sieves. The model selection criteria are based on an empirical loss or contrast function with an added penalty term motivated by empirical process theory and roughly proportional to the number of parameters needed to describe the model divided by the number of observations. Most of our examples involve density or regression estimation settings and we focus on the problem of estimating the unknown density or regression function. We show that the quadratic risk of the minimum penalized empirical contrast estimator is bounded by an index of the accuracy of the sieve. This accuracy index quantifies the trade-off among the candidate models between the approximation error and parameter dimension relative to sample size.If we choose a list of models which exhibit good approximation properties with respect to different classes of smoothness, the estimator can be simultaneously minimax rate optimal in each of those classes. This is what is usually called adaptation. The type of classes of smoothness in which one gets adaptation depends heavily on the list of models. If too many models are involved in order to get accurate approximation of many wide classes of functions simultaneously, it may happen that the estimator is only approxWork supported in part by the NSF grant ECS-9410760, and URA CNRS 1321 "Statistique et modèles aléatoires", and URA CNRS 743 "Modélisation stochastique et Statistique".A. Barron et al. imately adaptive (typically up to a slowly varying function of the sample size).We shall provide various illustrations of our method such as penalized maximum likelihood, projection or least squares estimation. The models will involve commonly used finite dimensional expansions such as piecewise polynomials with fixed or variable knots, trigonometric polynomials, wavelets, neural nets and related nonlinear expansions defined by superposition of ridge functions.
Staphylococcus aureus is an important nosocomial and communityacquired pathogen. Its genetic plasticity has facilitated the evolution of many virulent and drug-resistant strains, presenting a major and constantly changing clinical challenge. We sequenced the Ϸ2.8-Mbp genomes of two disease-causing S. aureus strains isolated from distinct clinical settings: a recent hospital-acquired representative of the epidemic methicillin-resistant S. aureus EMRSA-16 clone (MRSA252), a clinically important and globally prevalent lineage; and a representative of an invasive communityacquired methicillin-susceptible S. aureus clone (MSSA476). A comparative-genomics approach was used to explore the mechanisms of evolution of clinically important S. aureus genomes and to identify regions affecting virulence and drug resistance. The genome sequences of MRSA252 and MSSA476 have a well conserved core region but differ markedly in their accessory genetic elements. MRSA252 is the most genetically diverse S. aureus strain sequenced to date: Ϸ6% of the genome is novel compared with other published genomes, and it contains several unique genetic elements. MSSA476 is methicillin-susceptible, but it contains a novel Staphylococcal chromosomal cassette (SCC) mec-like element (designated SCC 476), which is integrated at the same site on the chromosome as SCCmec elements in MRSA strains but encodes a putative fusidic acid resistance protein. The crucial role that accessory elements play in the rapid evolution of S. aureus is clearly illustrated by comparing the MSSA476 genome with that of an extremely closely related MRSA community-acquired strain; the differential distribution of large mobile elements carrying virulence and drug-resistance determinants may be responsible for the clinically important phenotypic differences in these strains.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.