Abstract. The existing methods of predicting with confidence give good accuracy and confidence values, but quite often are computationally inefficient. Some partial solutions have been suggested in the past. Both the original method and these solutions were based on transductive inference. In this paper we make a radical step of replacing transductive inference with inductive inference and define what we call the Inductive Confidence Machine (ICM); our main concern in this paper is the use of ICM in regression problems. The algorithm proposed in this paper is based on the Ridge Regression procedure (which is usually used for outputting bare predictions) and is much faster than the existing transductive techniques. The inductive approach described in this paper may be the only option available when dealing with large data sets.
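The inductive idea in the abstract above can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the split into a proper training set and a calibration set, the absolute-residual nonconformity score, and the names `ridge_fit`, `icm_interval` and the ridge parameter `a` are all assumptions made for the example.

```python
import numpy as np

def ridge_fit(X, y, a=1.0):
    """Ridge Regression weights w = (X'X + aI)^{-1} X'y (no intercept, for brevity)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + a * np.eye(d), X.T @ y)

def icm_interval(X_train, y_train, X_cal, y_cal, x_new, confidence=0.9, a=1.0):
    """Predictive interval for x_new from an inductive (split) conformal scheme.

    Assumed nonconformity score: absolute residual |y - prediction|.
    """
    # The rule is fitted once on the proper training set; this is what makes
    # the inductive scheme cheap compared with refitting for every test example.
    w = ridge_fit(X_train, y_train, a)

    # Nonconformity scores of the calibration examples, sorted ascending.
    scores = np.sort(np.abs(y_cal - X_cal @ w))
    q = len(scores)

    # Smallest rank k with k / (q + 1) >= confidence.
    k = int(np.ceil(confidence * (q + 1)))
    if k > q:
        # Too few calibration examples for this confidence level.
        return -np.inf, np.inf

    half_width = scores[k - 1]
    prediction = float(x_new @ w)
    return prediction - half_width, prediction + half_width
```

Because the weights depend only on the proper training set, they could be computed once and reused for every new example; the function refits on each call only to stay self-contained.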
PlantProm DB, a plant promoter database, is an annotated, non-redundant collection of proximal promoter sequences for RNA polymerase II with experimentally determined transcription start site(s), TSS, from various plant species. The first release (2002.01) of PlantProm DB contains 305 entries including 71, 220 and 14 promoters from monocot, dicot and other plants, respectively. It provides DNA sequence of the promoter regions (-200 : +51) with TSS on the fixed position +201, taxonomic/promoter type classification of promoters and Nucleotide Frequency Matrices (NFM) for promoter elements: TATA-box, CCAAT-box and TSS-motif (Inr). Analysis of TSS-motifs revealed that their composition is different in dicots and monocots, as well as for TATA and TATA-less promoters. The database serves as a learning set in developing plant promoter prediction programs. One such program (TSSP) based on discriminant analysis has been created by Softberry Inc. and the application of a support vector machine approach for promoter identification is under development. PlantProm DB is available at http://mendel.cs.rhul.ac.uk/ and http://www.softberry.com/.
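As a rough illustration of what a Nucleotide Frequency Matrix holds, the sketch below computes per-position base frequencies from a set of aligned sites. The function name `nucleotide_frequency_matrix` and the toy sequences are hypothetical and are not taken from PlantProm DB; the database's own matrices may be built or normalised differently.

```python
from collections import Counter

def nucleotide_frequency_matrix(sites):
    """Per-position relative frequencies of A, C, G, T over equal-length aligned sites."""
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")

    matrix = []
    for pos in range(length):
        # Count the bases observed at this alignment column.
        counts = Counter(site[pos].upper() for site in sites)
        matrix.append({base: counts.get(base, 0) / len(sites) for base in "ACGT"})
    return matrix

# Toy, made-up TATA-like sites used only to show the shape of the result.
example_sites = ["TATAAA", "TATATA", "TATAAA", "CATAAA"]
example_nfm = nucleotide_frequency_matrix(example_sites)
```

Each row of the result is one alignment position, so `example_nfm[0]` gives the base frequencies at the first position of the motif.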
Introduction: Detection of serum biomarkers for early diagnosis of breast cancer remains an important goal. Changes in the structure of O-linked glycans occur in all breast cancers, resulting in the expression of glycoproteins that are antigenically distinct. Indeed, the serum assay widely used for monitoring disease progression in breast cancer (CA15.3) detects a glycoprotein (MUC1), but elevated levels of the antigen cannot be detected in early stage patients. However, since the immune system acts to amplify the antigenic signal, antibodies can be detected in sera long before the antigen. We have exploited the change in O-glycosylation to measure autoantibody responses to cancer-associated glycoforms of MUC1 in sera from early stage breast cancer patients.
Methods: We used a microarray platform of 60mer MUC1 glycopeptides to confirm the presence of autoantibodies to cancer-associated glycoforms of MUC1 in a proportion of early breast cancer patients (54/198). Five positive sera were selected for detailed definition of the reactive epitopes using on-chip glycosylation technology and a panel of glycopeptides based on a single MUC1 tandem repeat carrying specific glycans at specific sites. Based on these results, larger amounts of an extended repertoire of defined MUC1 glycopeptides were synthesised, printed on microarrays, and screened with sera from a large cohort of breast cancer patients (n = 395), patients with benign breast disease (n = 108) and healthy controls (n = 99). All sera were collected in the 1970s and 1980s and complete clinical follow-up of breast cancer patients is available.
Results: The presence and level of autoantibodies was significantly higher in the sera from cancer patients compared with the controls, and a highly significant correlation with age was observed. High levels of a subset of autoantibodies to the core3MUC1 (GlcNAcβ1-3GalNAc-MUC1) and STnMUC1 (NeuAcα2,6GalNAc-MUC1) glycoforms were significantly associated with reduced incidence and increased time to metastasis.
Conclusions: Autoantibodies to specific cancer-associated glycoforms of MUC1 are found more frequently and at higher levels in early stage breast cancer patients than in women with benign breast disease or healthy women. Association of strong antibody response with reduced rate and delay in metastases suggests that autoantibodies can affect disease progression.
In this paper we apply Conformal Prediction (CP) to the k-Nearest Neighbours Regression (k-NNR) algorithm and propose ways of extending the typical nonconformity measure used for regression so far. Unlike traditional regression methods, which produce point predictions, Conformal Predictors output predictive regions that satisfy a given confidence level. The regions produced by any Conformal Predictor are automatically valid; however, their tightness, and therefore their usefulness, depends on the nonconformity measure used by each CP. In effect, a nonconformity measure evaluates how strange a given example is compared to a set of other examples, based on some traditional machine learning algorithm. We define six novel nonconformity measures based on the k-Nearest Neighbours Regression algorithm and develop the corresponding CPs following both the original (transductive) and the inductive CP approaches. A comparison of the predictive regions produced by our measures with those of the typical regression measure suggests that a major improvement in terms of predictive region tightness is achieved by the new measures.
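To make the role of the nonconformity measure concrete, here is a minimal sketch of an inductive conformal predictor on top of k-NN regression. The normalised score used here, the absolute residual divided by `gamma` plus the mean distance to the k nearest neighbours, is an assumption chosen for illustration and is not claimed to be one of the six measures defined in the paper; `knn_predict`, `knn_icp_interval` and `gamma` are hypothetical names.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k):
    """k-NN regression prediction for x, plus the mean distance to its k neighbours."""
    dist = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dist)[:k]
    return y_train[nearest].mean(), dist[nearest].mean()

def knn_icp_interval(X_train, y_train, X_cal, y_cal, x_new,
                     k=3, confidence=0.9, gamma=0.5):
    """Predictive interval from an inductive CP with a distance-normalised score.

    Assumed score: |y - prediction| / (gamma + mean neighbour distance).
    """
    scores = []
    for x, y in zip(X_cal, y_cal):
        pred, d = knn_predict(X_train, y_train, x, k)
        scores.append(abs(y - pred) / (gamma + d))
    scores = np.sort(np.array(scores))
    q = len(scores)

    # Smallest rank with rank / (q + 1) >= confidence.
    rank = int(np.ceil(confidence * (q + 1)))
    if rank > q:
        # Too few calibration examples for this confidence level.
        return -np.inf, np.inf

    pred, d = knn_predict(X_train, y_train, x_new, k)
    # Undo the normalisation: examples far from their neighbours get wider regions.
    half_width = scores[rank - 1] * (gamma + d)
    return pred - half_width, pred + half_width
```

With the plain absolute-residual score every interval has the same width; dividing by a local difficulty estimate is one way regions can become tighter where the training data are dense.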