We have produced a draft sequence of the rice genome for the most widely cultivated subspecies in China, Oryza sativa L. ssp. indica, by whole-genome shotgun sequencing. The genome was 466 megabases in size, with an estimated 46,022 to 55,615 genes. Functional coverage in the assembled sequences was 92.0%. About 42.2% of the genome was in exact 20-nucleotide oligomer repeats, and most of the transposons were in the intergenic regions. Although 80.6% of predicted Arabidopsis thaliana genes had a homolog in rice, only 49.4% of predicted rice genes had a homolog in A. thaliana. The large proportion of rice genes with no recognizable homologs is due to a gradient in the GC content of rice coding sequences.
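The idea of measuring how much of a sequence lies in exact 20-nucleotide repeats can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `repeat_fraction` is hypothetical, only the forward strand is considered, and the whole sequence is assumed to fit in memory.

```python
from collections import Counter


def repeat_fraction(seq: str, k: int = 20) -> float:
    """Fraction of bases covered by a k-mer that occurs more than once.

    Simplifying assumptions (not taken from the paper): forward strand
    only, exact matches only, no masking of ambiguous bases.
    """
    seq = seq.upper()
    n = len(seq)
    if n < k:
        return 0.0

    # Count every overlapping k-mer in the sequence.
    counts = Counter(seq[i:i + k] for i in range(n - k + 1))

    # Mark each base that falls inside at least one repeated k-mer.
    covered = [False] * n
    for i in range(n - k + 1):
        if counts[seq[i:i + k]] > 1:
            for j in range(i, i + k):
                covered[j] = True

    return sum(covered) / n
```

For example, with `k=3` the toy sequence `AAACCCAAAGGG` contains `AAA` twice, so 6 of its 12 bases are covered. A genome-scale analysis would need a memory-efficient k-mer counter rather than an in-memory `Counter`.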
It is a challenge to classify transcripts as protein-coding or non-coding, especially those reconstructed from high-throughput sequencing data of poorly annotated species. This study developed and evaluated a powerful signature tool, Coding-Non-Coding Index (CNCI), which profiles adjoining nucleotide triplets to effectively distinguish protein-coding and non-coding sequences independent of known annotations. CNCI is effective for classifying incomplete transcripts and sense–antisense pairs. The implementation of CNCI offered highly accurate classification of transcripts assembled from whole-transcriptome sequencing data in a cross-species manner, which demonstrated gene evolutionary divergence between vertebrates and invertebrates, or between plants, and provided a long non-coding RNA catalog of orangutan. CNCI software is available at http://www.bioinfo.org/software/cnci.
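To make "profiling adjoining nucleotide triplets" concrete, the sketch below counts pairs of adjacent codons in one reading frame and scores a sequence by a log-ratio of coding to non-coding pair frequencies. It is a simplified illustration and not the CNCI implementation: the function names, the pseudocount, and the frequency tables passed in are all hypothetical.

```python
import math
from collections import Counter


def adjoining_triplets(seq: str, frame: int = 0) -> Counter:
    """Count pairs of adjacent nucleotide triplets in the given frame."""
    seq = seq.upper()
    # Split into complete, non-overlapping triplets starting at `frame`.
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    # Pair each triplet with the one immediately following it.
    return Counter(zip(codons, codons[1:]))


def log_ratio_score(seq, coding_freq, noncoding_freq, frame=0, pseudo=1e-6):
    """Mean log2(coding / non-coding) frequency over adjoining triplet pairs.

    `coding_freq` and `noncoding_freq` are hypothetical dicts mapping
    (triplet, triplet) pairs to frequencies estimated from training data.
    `pseudo` is an assumed pseudocount that avoids division by zero.
    """
    pairs = adjoining_triplets(seq, frame)
    total = sum(pairs.values())
    if total == 0:
        return 0.0
    score = 0.0
    for pair, n in pairs.items():
        c = coding_freq.get(pair, 0.0) + pseudo
        nc = noncoding_freq.get(pair, 0.0) + pseudo
        score += n * math.log2(c / nc)
    return score / total
```

Under these assumptions a positive score means the sequence's triplet pairs are more typical of the coding table. Evaluating all frames and picking the best-scoring one would be a natural extension, though how CNCI itself handles frames and incomplete transcripts is not described in this abstract.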
The mammalian Retinoblastoma (RB) family including pRB, p107, and p130 represses E2F target genes through mechanisms that are not fully understood. In D. melanogaster, RB-dependent repression is mediated in part by the multisubunit protein complex Drosophila RBF, E2F, and Myb (dREAM) that contains homologs of the C. elegans synthetic multivulva class B (synMuvB) gene products. Using an integrated approach combining proteomics, genomics, and bioinformatic analyses, we identified a p130 complex termed DP, RB-like, E2F, and MuvB (DREAM) that contains mammalian homologs of synMuvB proteins LIN-9, LIN-37, LIN-52, LIN-54, and LIN-53/RBBP4. DREAM bound to more than 800 human promoters in G0 and was required for repression of E2F target genes. In S phase, MuvB proteins dissociated from p130 and formed a distinct submodule that bound MYB. This work reveals an evolutionarily conserved multisubunit protein complex that contains p130 and E2F4, but not pRB, and mediates the repression of cell cycle-dependent genes in quiescence.