The ultimate goal of metagenome research projects is to understand the ecological roles and physiological functions of the microbial communities in a given natural environment. The 454 pyrosequencing platform produces the longest reads among the most widely used next-generation sequencing platforms. Because these longer reads provide more information for identifying microbial sequences, the platform is particularly suited to microbial community and population studies. To perform downstream analysis of 454 multiplex datasets accurately, it is necessary to remove the artificially designed sequences located at either end of individual reads and to correct low-quality sequences. We have developed a program called PyroTrimmer that removes barcodes, linkers, and primers, trims sequence regions with low quality scores, and filters out low-quality sequence reads. Although these functions have previously been implemented in other programs, PyroTrimmer is novel in the following respects: i) more sensitive primer detection using Levenshtein distance and global pairwise alignment, ii) the first stand-alone software with a graphical user interface, and iii) various options for trimming and filtering out low-quality sequence reads. PyroTrimmer, written in Java, is compatible with multiple operating systems and can be downloaded free of charge at http://pyrotrimmer.kobic.re.kr.
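The idea of tolerant primer detection with Levenshtein distance can be illustrated with a minimal sketch. This is not PyroTrimmer's implementation (which is written in Java and also uses global pairwise alignment); the function names `levenshtein` and `trim_primer`, the `max_dist` threshold, and the example sequences are all assumptions made for illustration.

```python
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) by dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def trim_primer(read: str, primer: str, max_dist: int = 2) -> Optional[str]:
    """Hypothetical helper: strip a leading primer that matches within max_dist edits.

    Returns the read without the primer, or None when no prefix of the read
    is close enough to the primer.
    """
    best_len, best_dist = None, max_dist + 1
    # Indels can make the primer copy in the read shorter or longer,
    # so try every prefix length within max_dist of the primer length.
    shortest = max(0, len(primer) - max_dist)
    longest = min(len(read), len(primer) + max_dist)
    for length in range(shortest, longest + 1):
        dist = levenshtein(read[:length], primer)
        if dist < best_dist:
            best_len, best_dist = length, dist
    if best_len is None:
        return None
    return read[best_len:]
```

With these assumed inputs, `trim_primer("ACGTTCGTGGGCCC", "ACGTACGT")` tolerates the single substitution in the primer region and returns the remaining read, whereas a read with no primer-like prefix yields `None` and could be filtered out.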
The intermediate result cardinality, the number of objects satisfying a condition given in a query, is an important factor for estimating the cost of the query in query optimization. In this paper we show that an object-oriented query often involves partial participation of classes in a relationship. We then present a new technique for estimating the intermediate result cardinality in such a query. Partial participation has not been considered seriously in existing techniques. Since the proposed technique uses detailed statistics to accommodate partial participation, it estimates the intermediate result cardinality more accurately than existing ones. We also show that these statistics are easily obtained by using inherent properties of object-oriented databases.
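Why partial participation matters for the estimate can be seen in a small sketch. It does not reproduce the paper's technique or its statistics; the `RelationshipStats` fields, both estimator functions, and the numbers are hypothetical, and the predicate on the referenced class is assumed to be independent of which objects participate.

```python
from dataclasses import dataclass


@dataclass
class RelationshipStats:
    """Hypothetical statistics for a reference from a source class to a target class."""
    source_count: int          # all objects in the referencing class
    source_participating: int  # source objects whose reference is non-null


def estimate_full(stats: RelationshipStats, target_selectivity: float) -> float:
    """Estimate that assumes every source object takes part in the relationship."""
    return stats.source_count * target_selectivity


def estimate_partial(stats: RelationshipStats, target_selectivity: float) -> float:
    """Estimate that counts only source objects that actually hold a reference."""
    return stats.source_participating * target_selectivity


# Assumed example: 10,000 vehicles, of which only 4,000 reference an owner,
# and a predicate on the owner class that 25% of owners satisfy.
stats = RelationshipStats(source_count=10_000, source_participating=4_000)
full = estimate_full(stats, 0.25)
partial = estimate_partial(stats, 0.25)
```

Under these assumptions the full-participation estimate is 2.5 times the partial-participation one, which is the kind of gap that can mislead a cost-based optimizer.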