Nineteen teams presented results for the Gene Mention Task at the BioCreative II Workshop. In this task participants designed systems to identify substrings in sentences corresponding to gene name mentions. A variety of methods were used and the results varied, with the highest F1 score reaching 0.8721. Here we present brief descriptions of all the methods used and a statistical analysis of the results. We also demonstrate that, by combining the results from all submissions, an F1 score of 0.9066 is feasible, and furthermore that the best combined result makes use of the lowest-scoring submissions.
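For readers unfamiliar with the metric, the F score quoted above is the harmonic mean of precision and recall. The sketch below shows how it could be computed for mention spans. The function names, the `(sentence_id, start, end)` span representation, and the exact-match criterion are assumptions made for illustration; the task's actual matching rules are not described here.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Balanced F score (F1): harmonic mean of precision and recall."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def score_mentions(gold: set, predicted: set) -> float:
    """Score predicted spans against gold spans.

    Assumption: a prediction counts as correct only if it matches a gold
    span exactly. Spans are hypothetical (sentence_id, start, end) tuples.
    """
    tp = len(gold & predicted)   # spans found in both sets
    fp = len(predicted - gold)   # predicted spans with no gold match
    fn = len(gold - predicted)   # gold spans the system missed
    return f1_score(tp, fp, fn)
```

Under these assumptions, a system that finds one of two gold mentions and also emits one spurious span has precision 0.5 and recall 0.5, and therefore an F1 of 0.5.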
Background: The goal of the gene normalization task is to link genes or gene products mentioned in the literature to biological databases. This is a key step in an accurate search of the biological literature. It is a challenging task, even for the human expert; genes are often described rather than referred to by gene symbol and, confusingly, one gene name may refer to different genes (often from different organisms). For BioCreative II, the task was to list the Entrez Gene identifiers for human genes or gene products mentioned in PubMed/MEDLINE abstracts. We selected abstracts associated with articles previously curated for human genes. We provided 281 expert-annotated abstracts containing 684 gene identifiers for training, and a blind test set of 262 documents containing 785 identifiers, with a gold standard created by expert annotators. Inter-annotator agreement was measured at over 90%.
This report investigates the mechanisms by which mammalian cells coordinate DNA replication with transcription and chromatin assembly. In yeast, DNA replication initiates within nucleosome-free regions, but studies in mammalian cells have not revealed a similar relationship. Here, we have used genome-wide massively parallel sequencing to map replication initiation events, thereby creating a database of all replication initiation sites within nonrepetitive DNA in two human cell lines. Mining this database revealed that genomic regions transcribed at moderate levels were generally associated with high replication initiation frequency. In genomic regions with high rates of transcription, very few replication initiation events were detected. High-resolution mapping of replication initiation sites showed that replication initiation events were absent from transcription start sites but were highly enriched in adjacent, downstream sequences. Methylation of CpG sequences strongly affected the location of replication initiation events, whereas histone modifications had minimal effects. These observations suggest that high levels of transcription interfere with formation of pre-replication protein complexes. Data presented here identify replication initiation sites throughout the genome, providing a foundation for further analyses of DNA-replication dynamics and cell-cycle progression.
As part of the Spotlight on Molecular Profiling series, we present here new profiling studies of mRNA and microRNA expression for the 60 cell lines of the National Cancer Institute (NCI) Developmental Therapeutics Program (DTP) drug screen (NCI-60) using the 41,000-probe Agilent Whole Human Genome Oligo Microarray and the 15,000-feature Agilent Human microRNA Microarray V2. The expression levels of ∼21,000 genes and 723 human microRNAs were measured. These profiling studies include quadruplicate technical replicates for six and eight cell lines for mRNA and microRNA, respectively, and duplicates for the remaining cell lines. The resulting data sets are freely available and searchable online in our CellMiner database. The results indicate high reproducibility for both platforms and an essential biological similarity across the various cell types. The mRNA and microRNA expression levels were integrated with our previously published 1,429-compound database of anticancer activity obtained from the NCI DTP drug screen. Large blocks of both mRNAs and microRNAs were identified with predominantly unidirectional correlations to ∼1,300 drugs, including 121 drugs with known mechanisms of action. The data sets presented here will facilitate the identification of groups of mRNAs, microRNAs, and drugs that potentially affect and interact with one another. Mol Cancer Ther; 9(5); 1080-91. ©2010 AACR.