It has been suggested that insertions and deletions (indels) have contributed more to the sequence divergence between the human and chimpanzee genomes than nucleotide changes have (3% vs. 1.2%). However, although large indels between the two genomes have been studied, no systematic analysis of small indels (i.e., indels ≤ 100 bp) has been published. In this study, we first estimated that the false-positive rate of small indels inferred from human-chimpanzee pairwise sequence alignments is quite high, suggesting that the chimpanzee genome draft is not sufficiently accurate for our purpose. We have therefore inferred only human-specific indels using multiple sequence alignments of mammalian genomes. We identified >840,000 "small" indels, which affect >7000 UCSC-annotated human genes (>11,000 transcripts). These indels, however, amount to only ∼0.21% sequence change in the human lineage for the regions compared, whereas in pseudogenes indels contribute to a sequence divergence of 1.40%, suggesting that most of the indels that occurred in genic regions have been eliminated. Functional analysis reveals that the genes whose coding exons have been affected by human-specific indels are enriched in transcription and translation regulatory activities but are underrepresented in catalytic and transporter activities, cellular and physiological processes, and extracellular region/matrix. This functional bias suggests that human-specific indels might have contributed to human-unique traits by causing changes at the RNA and protein level. [Supplemental material is available online at www.genome.org.]
The recent publication of the chimpanzee genome draft (The Chimpanzee Genome Sequencing and Analysis Consortium [TCGSAC] 2005) has brought unprecedented opportunities for investigating the genetic basis of the morphological and behavioral differences between human and chimpanzee, the closest living relative of humans.
Three molecular mechanisms have been proposed to explain human-specific traits: amino acid substitutions, exon deletions, and substitutions in regulatory regions (Li and Saunders 2005). The TCGSAC draft confirmed the previously estimated ∼1.2% Homo-Pan divergence due to nucleotide substitution (Chen and Li 2001; Ebersberger et al. 2002; Clark et al. 2003; Frazer et al. 2003; Watanabe et al. 2004). The nucleotide substitutions in coding exons result in an average of two amino acid substitutions, one per lineage, between Homo-Pan orthologous genes. Although recent studies (Clark et al. 2003; Nielsen et al. 2005) suggested that certain functional categories of genes show evidence of positive selection in the human lineage, the implicated genes did not appear to be directly related to human-unique traits. Moreover, the relationship between promoter region divergence and expression divergence between human and chimpanzee remains unclear (Heissig et al. 2005), although there is substantial expression divergence between the two species (Marvanova et al. 2003; Khaitovich et al. 2005). To have a better understanding of the genetic difference...
Insertion and deletion (indel) events usually have dramatic effects on genome structure and gene function. Species-specific indels have been demonstrated to be associated with species-unique traits. Currently, indel identification relies mainly on pair-wise sequence alignments (yielding 'pair-wise indels'), which can neither assign an indel to the species in which it occurred nor distinguish insertions from deletions. Also, there is no freely accessible web server for genome-wide identification of indels. We therefore developed a web server, INDELSCAN, to identify four types of indels using multiple sequence alignments that include sequences from one target, one subject and ≥1 out-group species. The four types of indels identified are target species-specific, subject species-specific, non-species-specific and target-subject pair-wise indels. Insertions and deletions are discriminated with reference to out-group sequences. The genomic locations (5′UTR, intron, CDS, 3′UTR and intergenic region) of these indels are also provided for functional analysis. INDELSCAN provides genomic sequences and gene annotations from a wide spectrum of taxa for users to select from, including nine target species (human (Homo sapiens), mouse (Mus musculus), rat (Rattus norvegicus), dog (Canis familiaris), opossum (Monodelphis domestica), chicken (Gallus gallus), zebrafish (Danio rerio), fly (Drosophila melanogaster) and yeast (Saccharomyces cerevisiae)) and >35 subject/out-group species, ranging from yeasts to mammals. The server also provides analytic figures and supports indel identification from user-uploaded alignments/annotations. INDELSCAN is freely accessible at http://indelscan.genomics.sinica.edu.tw/IndelScan/.
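The role of the out-group in separating insertions from deletions can be illustrated with a minimal Python sketch. It is not the INDELSCAN implementation: the function name `classify_indels`, the label strings, and the use of a single out-group row are assumptions made for illustration. Only gap patterns that can be assigned to the target or subject lineage by simple parsimony are reported. All other patterns (for example, a gap shared by target and subject) are skipped, because the abstract does not define how the server treats them.

```python
from itertools import groupby

GAP = "-"

# Gap pattern (target, subject, outgroup) -> label; True means "gap in that row".
# Assumption: the out-group state is taken as ancestral (simple parsimony).
LABELS = {
    (True, False, False): "target-specific deletion",
    (False, True, True): "target-specific insertion",
    (False, True, False): "subject-specific deletion",
    (True, False, True): "subject-specific insertion",
}


def classify_indels(target, subject, outgroup):
    """Return (start_column, length, label) for each lineage-assignable gap run.

    The three arguments are rows of one multiple alignment, so they must
    have equal length. Columns are 0-based alignment coordinates.
    """
    if not (len(target) == len(subject) == len(outgroup)):
        raise ValueError("aligned rows must have equal length")

    # Per-column gap pattern across the three rows.
    patterns = [
        (t == GAP, s == GAP, o == GAP)
        for t, s, o in zip(target, subject, outgroup)
    ]

    events = []
    col = 0
    # Consecutive columns sharing one gap pattern form a single candidate event.
    for pattern, run in groupby(patterns):
        length = len(list(run))
        if pattern in LABELS:
            events.append((col, length, LABELS[pattern]))
        col += length
    return events


if __name__ == "__main__":
    # Toy alignment rows, invented for this example.
    for event in classify_indels("ACG--TACGTAC", "ACGTTTA--TAC", "ACGTTTACGTAC"):
        print(event)
```

With only the target and subject rows, the first gap run in the toy alignment could be either a deletion in the target or an insertion in the subject. The out-group row is what resolves the direction and the lineage.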