Arabidopsis thaliana is the most widely studied plant today. The concerted efforts of over 11 000 researchers and 4000 organizations around the world are generating a rich diversity and quantity of information and materials. This information is made available through a comprehensive on-line resource called the Arabidopsis Information Resource (TAIR) (http://arabidopsis.org), which is accessible via commonly used web browsers and can be searched and downloaded in a number of ways. In the last two years, efforts have been focused on increasing data content and diversity, functionally annotating genes and gene products with controlled vocabularies, and improving data retrieval, analysis and visualization tools. New information includes sequence polymorphisms including alleles, germplasms and phenotypes, Gene Ontology annotations, gene families, protein information, metabolic pathways, gene expression data from microarray experiments, and seed and DNA stocks. New data visualization and analysis tools include SeqViewer, which interactively displays the genome from the whole chromosome down to 10 kb of nucleotide sequence, and AraCyc, a metabolic pathway database and map tool that allows overlaying expression data onto the pathway diagrams. Finally, we have recently incorporated seed and DNA stock information from the Arabidopsis Biological Resource Center (ABRC) and implemented a shopping-cart style on-line ordering system.
Controlled vocabularies are increasingly used by databases to describe genes and gene products because they facilitate identification of similar genes within an organism or among different organisms. One of The Arabidopsis Information Resource's goals is to associate all Arabidopsis genes with terms developed by the Gene Ontology Consortium that describe the molecular function, biological process, and subcellular location of a gene product. We have also developed terms describing Arabidopsis anatomy and developmental stages and use these to annotate published gene expression data. As of March 2004, we used computational and manual annotation methods to make 85,666 annotations representing 26,624 unique loci. We focus on associating genes to controlled vocabulary terms based on experimental data from the literature and use The Arabidopsis Information Resource-developed PubSearch software to facilitate this process. Each annotation is tagged with a combination of evidence codes, evidence descriptions, and references that provide a robust means to assess data quality. Annotation of all Arabidopsis genes will allow quantitative comparisons between sets of genes derived from sources such as microarray experiments. The Arabidopsis annotation data will also facilitate annotation of newly sequenced plant genomes by using sequence similarity to transfer annotations to homologous genes. In addition, complete and up-to-date annotations will make unknown genes easy to identify and target for experimentation. Here, we describe the process of Arabidopsis functional annotation using a variety of data sources and illustrate several ways in which this information can be accessed and used to infer knowledge about Arabidopsis and other plant species.
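The annotation record described in the abstract above (a locus linked to a controlled-vocabulary term and tagged with an evidence code, an evidence description, and a reference) might be modeled roughly as follows. This is only an illustrative sketch, not TAIR's or PubSearch's actual schema: the class name, field names, helper functions, and all sample identifiers (`LOCUS_A`, `TERM_1`, `REF_1`, and so on) are hypothetical. The evidence codes themselves (IDA, IMP, IGI, IPI, IEP, ISS, IEA) are standard Gene Ontology codes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One gene-to-term association (hypothetical schema for illustration)."""
    locus: str                  # gene locus identifier (placeholder values below)
    term: str                   # controlled-vocabulary term (placeholder values below)
    aspect: str                 # molecular function, biological process, or cellular component
    evidence_code: str          # Gene Ontology evidence code, e.g. "IDA" or "IEA"
    evidence_description: str   # free-text description of the supporting evidence
    reference: str              # literature or database reference (placeholder values below)


# Gene Ontology evidence codes that denote direct experimental support.
EXPERIMENTAL_CODES = {"IDA", "IMP", "IGI", "IPI", "IEP"}


def unique_loci(annotations):
    """Return the set of distinct loci that carry at least one annotation."""
    return {a.locus for a in annotations}


def experimentally_supported(annotations):
    """Keep only the annotations backed by an experimental evidence code."""
    return [a for a in annotations if a.evidence_code in EXPERIMENTAL_CODES]


# Made-up sample records, used only to exercise the helpers.
sample = [
    Annotation("LOCUS_A", "TERM_1", "molecular function", "IDA",
               "enzyme assay", "REF_1"),
    Annotation("LOCUS_A", "TERM_2", "biological process", "IEA",
               "computational prediction", "REF_2"),
    Annotation("LOCUS_B", "TERM_3", "cellular component", "ISS",
               "sequence similarity", "REF_3"),
]

print(len(sample), "annotations covering", len(unique_loci(sample)), "unique loci")
print(len(experimentally_supported(sample)), "annotation(s) with experimental evidence")
```

Keeping the evidence code on every record is what makes a quality filter like `experimentally_supported` possible, and counting distinct loci separately from total annotations mirrors the distinction the abstract draws between annotations and unique loci.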
The Arabidopsis Information Resource (TAIR; http://arabidopsis.org) provides an integrated view of genomic data for Arabidopsis thaliana. The information is obtained from a battery of sources, including the Arabidopsis user community, the literature, and the major genome centers. Currently TAIR provides information about genes, markers, polymorphisms, maps, sequences, clones, DNA and seed stocks, gene families and proteins. In addition, users can find Arabidopsis publications and information about Arabidopsis researchers. Our emphasis is now on incorporating functional annotations of genes and gene products, genome-wide expression, and biochemical pathway data. Among the tools developed at TAIR, the most notable is the Sequence Viewer, which displays gene annotation, clones, transcripts, markers and polymorphisms on the Arabidopsis genome, and allows zooming in to the nucleotide level. A tool recently released is AraCyc, which is designed for visualization of biochemical pathways. We are also developing tools to extract information from the literature in a systematic way, and building controlled vocabularies to describe biological concepts in collaboration with other database groups. A significant new feature is the integration of the ABRC database functions and stock ordering system, which allows users to place orders for seed and DNA stocks directly from the TAIR site.
Genome-wide association studies have linked common variation in ZNF804A with an increased risk of schizophrenia. However, little is known about the biology of ZNF804A and its role in schizophrenia. Here, we investigate the function of ZNF804A using a variety of complementary molecular techniques. We show that ZNF804A is a nuclear protein that interacts with neuronal RNA splicing factors and RNA-binding proteins including RBFOX1, which is also associated with schizophrenia, CELF3/4, components of the ubiquitin-proteasome system and the ZNF804A paralog, GPATCH8. GPATCH8 also interacts with splicing factors and is localized to nuclear speckles indicative of a role in pre-messenger RNA (mRNA) processing. Sequence analysis showed that GPATCH8 contains ultraconserved, alternatively spliced poison exons that are also regulated by RBFOX proteins. ZNF804A knockdown in SH-SY5Y cells resulted in robust changes in gene expression and pre-mRNA splicing converging on pathways associated with nervous system development, synaptic contact, and cell adhesion. We observed enrichment (P = 1.66 × 10^-9) for differentially spliced genes in ZNF804A-depleted cells among genes that contain RBFOX-dependent alternatively spliced exons. Differentially spliced genes in ZNF804A-depleted cells were also enriched for genes harboring de novo loss-of-function mutations in autism spectrum disorder (P = 6.25 × 10^-7, enrichment 2.16) and common variant alleles associated with schizophrenia (P = .014), bipolar disorder and schizophrenia (P = .003), and autism spectrum disorder (P = .005). These data suggest that ZNF804A and its paralogs may interact with neuronal splicing factors and RNA-binding proteins to regulate the expression of a subset of synaptic and neurodevelopmental genes.