Background: Historically, two categories of computational algorithms (alignment-based and alignment-free) have been applied to sequence comparison, one of the most fundamental issues in bioinformatics. Multiple sequence alignment, although the method predominantly used by biologists, has both fundamental and computational limitations. Consequently, alignment-free methods have been explored as important alternatives for estimating sequence similarity. Among the alignment-free methods, the string composition vector (CV) methods, which use the frequencies of nucleotide or amino acid strings to represent sequence information, show promising results in genome sequence comparison of prokaryotes. The existing CV-based methods, however, suffer from certain statistical problems and thereby underestimate the amount of evolutionary information in genetic sequences.

Results: We show that the existing string composition-based methods have two problems, one related to the Markov model assumption and the other associated with the denominator of the frequency normalization equation. We propose an improved complete composition vector method under the assumption of a uniform and independent model to estimate sequence information contributing to selection for sequence comparison. Phylogenetic analyses using both simulated and experimental data sets demonstrate that our new method is more robust than existing counterparts and comparable in robustness to alignment-based methods.

Conclusion: We observed two problems in the currently used string composition methods and proposed a new, robust method for estimating the evolutionary information of genetic sequences. In addition, we argue that it might not be necessary to use relatively long strings to build a complete composition vector (CCV), because vector strings of variable length overlap. We suggest a practical approach for choosing an optimal string length to construct the CCV.
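To make the basic idea of a composition vector concrete, the following is a minimal sketch of the plain string-frequency representation that the abstract describes, not the improved CCV method the authors propose. The function names `composition_vector` and `cosine_distance`, the DNA alphabet, and the choice of `1 - cosine similarity` as the distance are all assumptions made for illustration; the abstract does not specify a normalization or a distance measure.

```python
from collections import Counter
from itertools import product
from math import sqrt


def composition_vector(seq, k, alphabet="ACGT"):
    """Relative frequencies of all k-mers over `alphabet`, in a fixed order.

    Illustrative only: raw frequencies, with no background-model
    correction of the kind discussed in the abstract.
    """
    seq = seq.upper()
    # Count every overlapping substring of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    # Only k-mers made of alphabet symbols contribute to the total.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def cosine_distance(u, v):
    """1 - cosine similarity (an assumed, commonly used choice)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


if __name__ == "__main__":
    a = composition_vector("ACGTACGTACGT", 2)
    b = composition_vector("ACGTTTTTACGT", 2)
    print(cosine_distance(a, b))
```

A pairwise matrix of such distances could then be passed to a tree-building method such as neighbor joining; how the authors normalize frequencies against a background model is not given in the abstract and is deliberately left out here.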
Background: Seven-transmembrane region-containing receptors (7TMRs) play central roles in eukaryotic signal transduction. Due to their biomedical importance, thorough mining of 7TMRs from diverse genomes has been an active target of bioinformatics and pharmacogenomics research. With the accelerating rate at which diverse sequence information is acquired, the need for new and accurate 7TMR/GPCR prediction tools is paramount. Currently available and widely used protein classification methods (e.g., profile hidden Markov models) are highly accurate at assigning proteins to already known 7TMR subfamilies. However, these alignment-based methods are less effective for identifying remote similarities, e.g., identifying proteins from highly divergent or possibly new 7TMR families. In this regard, more sensitive (e.g., alignment-free) methods are needed to complement the existing protein classification methods. A better strategy would be to combine different classifiers, from more specific to more sensitive methods, to identify a broader spectrum of 7TMR protein candidates.

Description: We developed a Web server, 7TMRmine, by integrating alignment-free and alignment-based classifiers specifically trained to identify candidate 7TMR proteins, as well as transmembrane (TM) prediction methods. This new tool enables researchers to easily assess the distribution of GPCR functionality in diverse genomes or in individual newly discovered proteins. 7TMRmine is easily customized and facilitates exploratory analysis of diverse genomes. Users can integrate various alignment-based, alignment-free, and TM-prediction methods in any combination and in any hierarchical order. Sixteen classifiers (including two TM-prediction methods) are available on the 7TMRmine Web server. The 7TMRmine tool can be used not only for 7TMR mining but also for general TM-protein analysis. Users can submit protein sequences for analysis or explore pre-analyzed results for multiple genomes. The server currently includes prediction results and summary statistics for 68 genomes.

Conclusion: 7TMRmine facilitates the discovery of 7TMR proteins. By combining prediction results from different classifiers in a multi-level filtering process, prioritized sets of 7TMR candidates can be obtained for further investigation. 7TMRmine can also be used as a general TM-protein classifier. Comparisons of TM and 7TMR protein distributions among 68 genomes revealed interesting differences in the evolution of these protein families among major eukaryotic phyla.
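The multi-level filtering idea, in which classifiers are applied in a user-chosen hierarchical order and candidates are prioritized by how far they get, can be sketched as follows. This is not the 7TMRmine implementation: the function `multilevel_filter` and the two toy classifiers (`min_length` and `hydrophobic_rich`) are hypothetical stand-ins for the trained alignment-based, alignment-free, and TM-prediction classifiers that the server provides, and their thresholds are arbitrary.

```python
from typing import Callable, Dict, List

# A classifier maps a protein sequence to accept/reject.
Classifier = Callable[[str], bool]


def min_length(seq: str) -> bool:
    """Toy filter (hypothetical): keep sequences of at least 100 residues."""
    return len(seq) >= 100


def hydrophobic_rich(seq: str) -> bool:
    """Toy filter (hypothetical): at least 40% hydrophobic residues."""
    if not seq:
        return False
    hydrophobic = sum(1 for aa in seq.upper() if aa in "AILMFVWY")
    return hydrophobic / len(seq) >= 0.4


def multilevel_filter(proteins: Dict[str, str],
                      levels: List[Classifier]) -> Dict[str, int]:
    """Apply classifiers in order; record how many consecutive levels
    each protein passes before its first rejection."""
    passed = {}
    for pid, seq in proteins.items():
        n = 0
        for clf in levels:
            if not clf(seq):
                break  # stop at the first level that rejects the protein
            n += 1
        passed[pid] = n
    return passed


if __name__ == "__main__":
    proteins = {"p1": "L" * 300, "p2": "K" * 300, "p3": "L" * 10}
    print(multilevel_filter(proteins, [min_length, hydrophobic_rich]))
```

Proteins that pass every level form the highest-priority candidate set, while those rejected at later levels can still be kept as lower-priority candidates. Reordering the `levels` list changes which proteins are eliminated early.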
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
Bighead carp (Hypophthalmichthys nobilis) and silver carp (H. molitrix), collectively called bigheaded carps, are cyprinids native mainly to China and have been introduced to over 70 countries. Paleontological and molecular phylogenetic analyses demonstrate that bighead and silver carps originated from the Yangtze‐Huanghe River basins and that modern populations may have derived from the secondary contact of geographically isolated fish during the last glacial events. Significant genetic differences are found among populations of native rivers (Yangtze, Pearl, and Amur) as well as introduced/invasive environments (Mississippi R., USA and Danube R., Hungary), suggesting that genetic backgrounds and ecological selection may play a role in population differentiation. Population divergence of bighead carp and silver carp has occurred within their native rivers, whereas within the Mississippi River Basin (MRB), an introduced region, such genetic differentiation is likely taking place at least in silver carp. Interspecific hybridization between silver and bighead carps is rare within their native regions; however, extensive hybridization is observed in the MRB, which may result from a shift to a more homogeneous environment that lacks the reproductive isolation barriers restricting gene flow between species. The wild populations of native bighead and silver carps have experienced dramatic declines; in contrast, the introduced bigheaded carps overpopulate the MRB and are considered invasive species, which strongly suggests that fishing capacity (overfishing and underfishing) is a decisive factor for fishery resource exploitation and management. This review provides not only a global perspective on the evolutionary history and population divergence of bigheaded carps but also a forum that calls for international research collaborations to deal with critical issues related to native population conservation and invasive species control.
Bighead carp (Hypophthalmichthys nobilis) and silver carp (Hypophthalmichthys molitrix), collectively called bigheaded carps, are invasive species in the Mississippi River Basin (MRB). Interspecific hybridization between bigheaded carps has been considered rare within their native rivers in China; however, it is prevalent in the MRB. We conducted de novo transcriptome analysis of pure and hybrid bigheaded carps and obtained 40,759 to 51,706 transcripts for pure, F1 hybrid, and backcross bigheaded carps. The search against protein databases resulted in 20,336–28,133 annotated transcripts (over 50% of the transcriptome) with over 13,000 transcripts mapped to 23 Gene Ontology biological processes and 127 KEGG metabolic pathways. More transcripts were detected in silver carp than in bighead carp; however, comparable numbers of transcripts were annotated. Transcriptomic variation detected between two F1 hybrids may indicate a potential loss of fitness in hybrids. The neighbor‐joining distance tree constructed using over 2,500 one‐to‐one orthologous sequences suggests transcriptomes could be used to infer the history of introgression and hybridization. Moreover, we detected 24,792 candidate SNPs that can be used to identify different species. The transcriptomes, orthologous sequences, and candidate SNPs obtained in this study should provide further knowledge of interspecific hybridization and introgression.