In response to a need for a general catalog of genome variation to address the large-scale sampling designs required by association studies, gene mapping and evolutionary biology, the National Center for Biotechnology Information (NCBI) has established the dbSNP database [S.T. Sherry, M. Ward and K. Sirotkin (1999) Genome Res., 9, 677-679]. Submissions to dbSNP will be integrated with other sources of information at NCBI such as GenBank, PubMed, LocusLink and the Human Genome Project data. The complete contents of dbSNP are available to the public at http://www.ncbi.nlm.nih.gov/SNP and can also be downloaded in multiple formats via anonymous FTP at ftp://ncbi.nlm.nih.gov/snp/.
Strapline
The National Center for Biotechnology Information has created the dbGaP public repository for individual-level phenotype, exposure, genotype, and sequence data, and the associations between them. dbGaP assigns stable, unique identifiers to studies and subsets of information from those studies, including documents, individual phenotypic variables, tables of trait data, sets of genotype data, computed phenotype-genotype associations and groups of study subjects who have given similar consents for use of their data.
Introduction
The technical advances and declining costs for high-throughput genotyping afford investigators fresh opportunities to do increasingly complex analyses of genetic associations with phenotypic and disease characteristics. The leading candidates for such genome-wide association studies (GWAS) are existing large-scale cohort and clinical studies that collected rich sets of phenotype data. To support investigator access to data from these initiatives at the National Institutes of Health (NIH) and elsewhere, the National Center for Biotechnology Information (NCBI) has created a database of Genotypes and Phenotypes (dbGaP) with stable identifiers that make it possible for published studies to discuss or cite the primary data in a specific and uniform way.
dbGaP provides unprecedented access to the large-scale genetic and phenotypic datasets required for GWAS designs, including public access to study documents linked to summary data on specific phenotype variables, statistical overviews of the genetic information, position of published associations on the genome, and authorized access to individual-level data.
The purposes of this description of dbGaP are three-fold: (1) to describe dbGaP's functionality for users and submitters; (2) to describe dbGaP's design and operational processes for database methodologists to emulate or improve upon; and (3) to reassure the lay and scientific public that individual-level phenotype and genotype data are securely and responsibly managed.
dbGaP accommodates studies of varying design. It contains four basic types of data: (1) Study documentation, including study descriptions, protocol documents, and data collection instruments, such as questionnaires; (2) Phenotypic data for each variable assessed, at both an individual level and in summary form; (3) Genetic data, including study subjects' individual genotypes, pedigree information, fine mapping results, and resequencing traces; and (4) Statistical results, including association and linkage analyses, when available.
Address editorial correspondence to: Stephen Sherry, PhD, National Center for Biotechnology Information, 8600 Rockville Pike, MSC 3804, Bethesda, MD 20894-3804, phone: 301-435-7799, fax: 301-480-5789, e-mail: sherry@ncbi.nlm
To protect the confidentiality of study subjects, dbGaP accepts only de-identified data and requires investigators to go through an authorization process in order to access individual-level phenotype and genotype datasets. Summary phenotype and genotype data, as well as stu...
PubMed is a free search engine for biomedical literature accessed by millions of users from around the world each day. With the rapid growth of biomedical literature (about two articles are added every minute on average), finding and retrieving the most relevant papers for a given query is increasingly challenging. We present Best Match, a new relevance search algorithm for PubMed that leverages the intelligence of our users and cutting-edge machine-learning technology as an alternative to the traditional date sort order. The Best Match algorithm is trained on past user searches using dozens of relevance-ranking signals (factors), the most important being the past usage of an article, publication date, relevance score, and type of article. This new algorithm demonstrates state-of-the-art retrieval performance in benchmarking experiments as well as an improved user experience in real-world testing (over 20% increase in user click-through rate). Since its deployment in June 2017, we have observed a significant increase (60%) in PubMed searches with relevance sort order: it now assists millions of PubMed searches each week. In this work, we hope to increase the awareness and transparency of this new relevance sort option for PubMed users, enabling them to retrieve information more effectively.
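To make the contrast between a signal-based relevance order and the traditional date order concrete, the sketch below combines the four signals named above into a single score and compares the two orderings. It is only an illustration under stated assumptions: Best Match is a trained machine-learning model, whereas the hand-set linear weights here are a stand-in, and the `Article` fields, the `WEIGHTS` values, and the sample data are all hypothetical and do not come from PubMed.

```python
from dataclasses import dataclass

# Hypothetical weights: the abstract names these signals but gives no values.
WEIGHTS = {"usage": 0.4, "recency": 0.3, "relevance": 0.2, "type": 0.1}


@dataclass(frozen=True)
class Article:
    pmid: str          # illustrative identifier, not a real record
    past_usage: float  # assumed normalized to [0, 1]
    recency: float     # assumed normalized to [0, 1]; newer is higher
    relevance: float   # assumed normalized query-match score in [0, 1]
    type_prior: float  # assumed normalized prior for the article type


def score(article: Article) -> float:
    """Weighted sum of the four signals (a stand-in for a learned ranker)."""
    return (
        WEIGHTS["usage"] * article.past_usage
        + WEIGHTS["recency"] * article.recency
        + WEIGHTS["relevance"] * article.relevance
        + WEIGHTS["type"] * article.type_prior
    )


def relevance_sort(articles: list[Article]) -> list[Article]:
    """Order articles by combined score, highest first."""
    return sorted(articles, key=score, reverse=True)


def date_sort(articles: list[Article]) -> list[Article]:
    """Order articles by recency only, newest first."""
    return sorted(articles, key=lambda a: a.recency, reverse=True)


# Made-up sample data for illustration.
EXAMPLE = [
    Article("A", past_usage=0.9, recency=0.2, relevance=0.8, type_prior=0.5),
    Article("B", past_usage=0.1, recency=0.9, relevance=0.3, type_prior=0.5),
    Article("C", past_usage=0.6, recency=0.5, relevance=0.9, type_prior=1.0),
]

if __name__ == "__main__":
    print([a.pmid for a in relevance_sort(EXAMPLE)])
    print([a.pmid for a in date_sort(EXAMPLE)])
```

With this sample, the newest article (B) leads the date order but falls to last in the combined order because its usage and query-match signals are weak. A learned ranker would fit the combination from past searches rather than fixing it by hand.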