Motivation: Whole metagenome shotgun sequencing is a powerful approach for assaying the functional potential of microbial communities. We currently lack tools that efficiently and accurately align DNA reads against protein references, the technique necessary for constructing a functional profile. Here, we present PALADIN-a novel modification of the Burrows-Wheeler Aligner that provides accurate alignment, robust reporting capabilities and orders-of-magnitude improved efficiency by directly mapping in protein space. Results: We compared the accuracy and efficiency of PALADIN against existing tools that employ nucleotide or protein alignment algorithms. Using simulated reads, PALADIN consistently outperformed the popular DNA read mappers BWA and NovoAlign in detected proteins, percentage of reads mapped and ontological similarity. We also compared PALADIN against four existing protein alignment tools: BLASTX, RAPSearch2, DIAMOND and Lambda, using empirically obtained reads. PALADIN yielded results seven times faster than the best performing alternative, DIAMOND and nearly 8000 times faster than BLASTX. PALADIN's accuracy was comparable to all tested solutions.
Geosmithia morbida is a filamentous ascomycete that causes thousand cankers disease in the eastern black walnut tree. This pathogen is commonly found in the western U.S.; however, recently the disease was also detected in several eastern states where the black walnut lumber industry is concentrated. G. morbida is one of two known phytopathogens within the genus Geosmithia, and it is vectored into the host tree via the walnut twig beetle. We present the first de novo draft genome of G. morbida. It is 26.5 Mbp in length and contains less than 1% repetitive elements. The genome possesses an estimated 6,273 genes, 277 of which are predicted to encode proteins with unknown functions. Approximately 31.5% of the proteins in G. morbida are homologous to proteins involved in pathogenicity, and 5.6% of the proteins contain signal peptides that indicate these proteins are secreted. Several studies have investigated the evolution of pathogenicity in pathogens of agricultural crops; forest fungal pathogens are often neglected because research efforts are focused on food crops. G. morbida is one of the few tree phytopathogens to be sequenced, assembled and annotated. The first draft genome of G. morbida serves as a valuable tool for comprehending the underlying molecular and evolutionary mechanisms behind pathogenesis within the Geosmithia genus.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.