The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.
The compact genome of Fugu rubripes has been sequenced to over 95% coverage, and more than 80% of the assembly is in multigene-sized scaffolds. In this 365-megabase vertebrate genome, repetitive DNA accounts for less than one-sixth of the sequence, and gene loci occupy about one-third of the genome. As with the human genome, gene loci are not evenly distributed, but are clustered into sparse and dense regions. Some "giant" genes were observed that had average coding sequence sizes but were spread over genomic lengths significantly larger than those of their human orthologs. Although three-quarters of predicted human proteins have a strong match to Fugu, approximately a quarter of the human proteins had highly diverged from or had no pufferfish homologs, highlighting the extent of protein evolution in the 450 million years since teleosts and mammals diverged. Conserved linkages between Fugu and human genes indicate the preservation of chromosomal segments from the common vertebrate ancestor, but with considerable scrambling of gene order.
We analyzed the whole genome sequences of a family of four, consisting of two siblings and their parents. Family-based sequencing allowed us to delineate recombination sites precisely, identify 70% of the sequencing errors, and identify very rare SNVs. We also directly estimated a human intergeneration mutation rate of ~1.1×10⁻⁸ per position per haploid genome. Both offspring in this family have two recessive disorders: Miller syndrome, for which the gene was concurrently identified, and primary ciliary dyskinesia, for which causative genes have been previously identified. Family-based genome analysis enabled us to narrow the candidate genes for both of these Mendelian disorders to only four. Our results demonstrate the unique value of complete genome sequencing in families.
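As a rough worked example of what the reported rate implies, the sketch below converts a per-position, per-haploid-genome mutation rate into an expected number of new single-nucleotide mutations in one child. The haploid genome size of 3.1×10⁹ positions and the function name are assumptions made for illustration; neither comes from the abstract.

```python
def expected_de_novo_mutations(rate_per_site: float, haploid_genome_size: float) -> float:
    """Expected number of new mutations in one diploid offspring.

    A child inherits two haploid genomes (one from each parent), so the
    per-haploid-genome expectation is doubled.
    """
    return 2 * rate_per_site * haploid_genome_size


# Rate reported in the abstract: about 1.1e-8 per position per haploid genome.
MUTATION_RATE = 1.1e-8

# ASSUMPTION (not from the abstract): haploid genome of about 3.1e9 positions.
ASSUMED_HAPLOID_GENOME_SIZE = 3.1e9

if __name__ == "__main__":
    expected = expected_de_novo_mutations(MUTATION_RATE, ASSUMED_HAPLOID_GENOME_SIZE)
    print(f"Expected new mutations per child: about {expected:.0f}")
```

Under these assumptions the expectation is on the order of 70 new mutations per child. The figure scales linearly with whatever genome size is substituted.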
The complete sequences of Takifugu Toll-like receptor (TLR) loci and gene predictions from many draft genomes enable comprehensive molecular phylogenetic analysis. Strong selective pressure for recognition of and response to pathogen-associated molecular patterns has maintained a largely unchanging TLR recognition in all vertebrates. There are six major families of vertebrate TLRs. This repertoire is distinct from that of invertebrates. TLRs within a family recognize a general class of pathogen-associated molecular patterns. Most vertebrates have exactly one gene ortholog for each TLR family. The family including TLR1 has more species-specific adaptations than other families. A major family including TLR11 is represented in humans only by a pseudogene. Coincidental evolution plays a minor role in TLR evolution. The sequencing phase of this study produced finished genomic sequences for the 12 Takifugu rubripes TLRs. In addition, we have produced >70 gene models, including sequences from the opossum, chicken, frog, dog, sea urchin, and sea squirt.

coincidental evolution | multigene family | concerted evolution

The Toll-like receptor (TLR) multigene family encodes important recognition receptors of the innate immune system that have been conserved in both the invertebrate and vertebrate lineages (1, 2). TLRs recognize a variety of endogenous and exogenous ligands; many of the latter are conserved molecules essential for pathogen survival. TLR genes have been recognized in a number of vertebrate genomes, and many partial and full-length sequences are available. Recent additions include draft predictions from the Japanese pufferfish Takifugu rubripes (3), the zebrafish Danio rerio (4-6), and the chicken Gallus gallus (7), and partially or fully sequenced mRNAs, including one from the goldfish Carassius auratus (8), several from the Japanese flounder Paralichthys olivaceus (9), and several from the rainbow trout Oncorhynchus mykiss (10).
These papers provide incremental molecular phylogenetic analyses, and several reviews are available (11-13). Additionally, the draft genomes of the frog Xenopus tropicalis, chicken G. gallus, and opossum Monodelphis domestica are now available. We present a complete molecular phylogenetic analysis of the known vertebrate TLR genes in the context of the complete genomic sequences of the T. rubripes TLRs.

Methods

Sequencing and Assembly. A draft genome sequence of T. rubripes was obtained by pairwise shotgun sequencing (14) through the efforts of an international collaboration (15). Sequence finishing was performed in part as described (16), with additional details provided in Supporting Text, which is published as supporting information on the PNAS web site.

Bioinformatics. TLRs were identified as genes coding for both an N-terminal leucine-rich repeat (LRR) domain and a C-terminal Toll-IL-resistance (TIR) domain. To form the basis of our study, vertebrate sequences from the nonredundant DDBJ/EMBL/NCBI database (GenBank) were identified by similarity to known TLRs (Data Set 1, which is...
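
The domain-based definition of a TLR given in the Bioinformatics paragraph (an N-terminal LRR domain together with a C-terminal TIR domain) can be illustrated with a small filter over domain annotations. This is a minimal sketch, not the authors' pipeline: it assumes domain hits have already been produced by some external domain-detection tool, it uses the simplifying rule that every LRR hit must end before the first TIR hit begins, and the protein names and coordinates are invented for illustration.

```python
from typing import Dict, List, NamedTuple


class Domain(NamedTuple):
    name: str   # domain label, e.g. "LRR" or "TIR"
    start: int  # 1-based start position in the protein
    end: int    # 1-based end position in the protein


def looks_like_tlr(domains: List[Domain]) -> bool:
    """Return True if all LRR hits lie N-terminal of all TIR hits.

    Simplifying assumption: "N-terminal LRR, C-terminal TIR" is taken to mean
    that the last LRR hit ends before the first TIR hit starts.
    """
    lrr_ends = [d.end for d in domains if d.name == "LRR"]
    tir_starts = [d.start for d in domains if d.name == "TIR"]
    if not lrr_ends or not tir_starts:
        return False  # both domain types are required
    return max(lrr_ends) < min(tir_starts)


def select_tlr_candidates(annotations: Dict[str, List[Domain]]) -> List[str]:
    """Return the sorted names of proteins that pass the domain filter."""
    return sorted(name for name, doms in annotations.items() if looks_like_tlr(doms))


# HYPOTHETICAL annotations: names and coordinates are made up for this example.
EXAMPLE_ANNOTATIONS = {
    "hypothetical_protein_A": [Domain("LRR", 30, 550), Domain("TIR", 640, 780)],
    "hypothetical_protein_B": [Domain("TIR", 10, 150)],
    "hypothetical_protein_C": [Domain("LRR", 20, 400)],
    "hypothetical_protein_D": [Domain("TIR", 5, 140), Domain("LRR", 200, 600)],
}

if __name__ == "__main__":
    print(select_tlr_candidates(EXAMPLE_ANNOTATIONS))
```

Only the first hypothetical protein carries both domains in the required order, so it is the only one the filter keeps. A real analysis would also need thresholds on hit quality, which this sketch leaves out.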