Background: Metagenomics is the study of genetic material recovered directly from complex microbial samples rather than from cultures. One of the crucial steps in metagenomic analysis, referred to as “binning”, is to separate reads into clusters that represent genomes from closely related organisms. Among existing binning methods, unsupervised methods base the classification on features extracted from the reads themselves, which is especially advantageous when the available reference databases are limited. However, their performance, in various respects, is still being investigated in recent theoretical and empirical studies. The work presented in this paper is among those efforts to enhance classification accuracy.
Results: This paper presents an unsupervised algorithm, called BiMeta, for binning reads from different species in a metagenomic dataset. The algorithm consists of two phases. In the first phase, reads are clustered into groups based on overlap information between the reads. The second phase merges the groups using an observation on the l-mer frequency distribution of sets of non-overlapping reads. Experimental results on simulated and real datasets show that BiMeta outperforms three state-of-the-art binning algorithms on both short-read and long-read (≥700 bp) datasets.
Conclusions: This paper developed a novel and efficient algorithm for binning metagenomic reads that does not require any reference database. The software implementing the algorithm and all test datasets mentioned in this paper can be downloaded at http://it.hcmute.edu.vn/bioinfo/bimeta/index.htm.
Electronic supplementary material: The online version of this article (doi:10.1186/s13015-014-0030-4) contains supplementary material, which is available to authorized users.
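The second phase of the algorithm compares groups by their l-mer frequency distributions. The following is a minimal Python sketch of how such a distribution might be computed and compared. It is not BiMeta's implementation: the function names, the default l = 4, the choice to skip l-mers containing non-ACGT characters, and the use of Euclidean distance are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def lmer_frequencies(reads, l=4):
    """Return the normalized l-mer frequency vector of a set of reads.

    The vector has 4**l entries, ordered lexicographically over ACGT.
    l-mers containing other characters (e.g. N) are skipped (an assumption).
    """
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - l + 1):
            lmer = read[i:i + l]
            if all(base in ALPHABET for base in lmer):
                counts[lmer] += 1
    total = sum(counts.values())
    keys = ["".join(p) for p in product(ALPHABET, repeat=l)]
    return [counts[k] / total if total else 0.0 for k in keys]


def euclidean_distance(u, v):
    """Euclidean distance between two frequency vectors of equal length."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5
```

Two groups of reads whose frequency vectors lie close together under such a distance would be candidates for merging; the actual merging criterion is the one defined in the paper.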
This paper describes techniques based on polygon aggregation that reduce the time needed to build a visibility graph when there are many obstacles. Common approaches to path planning include search-based, sampling-based, and combinatorial planning, and the visibility graph is one of the roadmaps used in combinatorial planning. Building the visibility graph is a main phase of the whole process and theoretically takes O(n log n) time. In some practical applications, however, such as those with a large number of obstacles, this phase is very time-consuming. Experimental results show that the computing time is reduced by a factor of approximately one-third when aggregation is used as a preprocessing step before building the visibility graph.
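To make the cost of the construction phase concrete, the following Python sketch builds a visibility graph by brute force. It is illustrative only and does not implement the paper's aggregation techniques. It assumes convex, non-overlapping polygonal obstacles given as vertex lists, treats grazing or collinear contact as non-blocking, and checks every vertex pair against every obstacle edge, so it is far slower than the theoretical bound. All names are hypothetical.

```python
from itertools import combinations


def orient(a, b, c):
    """Signed area test: > 0 if a, b, c turn counter-clockwise, < 0 if clockwise, 0 if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def properly_cross(p, q, a, b):
    """True if segments pq and ab intersect at a point interior to both."""
    d1, d2 = orient(p, q, a), orient(p, q, b)
    d3, d4 = orient(a, b, p), orient(a, b, q)
    return d1 * d2 < 0 and d3 * d4 < 0


def visibility_graph(polygons):
    """Return the visible vertex pairs for a list of convex polygons.

    Each polygon is a list of (x, y) vertices in boundary order.
    """
    nodes = [(i, j) for i, poly in enumerate(polygons) for j in range(len(poly))]
    obstacle_edges = [
        (poly[j], poly[(j + 1) % len(poly)])
        for poly in polygons
        for j in range(len(poly))
    ]
    visible = []
    for (i1, j1), (i2, j2) in combinations(nodes, 2):
        p, q = polygons[i1][j1], polygons[i2][j2]
        if i1 == i2:
            # On a convex obstacle, only adjacent vertices see each other;
            # any other pair is joined by a diagonal through the interior.
            n = len(polygons[i1])
            if (j2 - j1) % n in (1, n - 1):
                visible.append((p, q))
            continue
        if not any(properly_cross(p, q, a, b) for a, b in obstacle_edges):
            visible.append((p, q))
    return visible
```

Because the number of vertex pairs and obstacle edges both grow with the number of obstacles, merging nearby polygons before this step reduces the work, which is the motivation for aggregation as a preprocessing step.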