Motivation: The computational tools used for genomic analyses are becoming increasingly sophisticated and complex. While these applications provide more accurate results, they also expose a large number of tunable parameters. The default parameter choices are designed to work well on average across all inputs, but the most interesting experiments are often not "average". Choosing the wrong parameter values for an application can cause significant results to be overlooked or false results to be reported. The problem is exacerbated when applications are chained together in analysis pipelines, where each step introduces errors due to its parameter choices.

Results: We take some first steps towards a truly automated genomic analysis pipeline by developing a method for automatically choosing input-specific parameter values for reference-based transcript assembly. We apply the parameter advising framework, first developed for multiple sequence alignment, to optimize parameter choices for the Scallop transcript assembler. In doing so, we provide the first method for finding advisor sets for applications with large numbers of tunable parameters. The procedure can be parallelized, so it does not add wall time. Choosing parameter values for each input increases the area under the curve (AUC) by 28.9% over using only the default parameter choices on 1595 RNA-Seq samples in the Sequence Read Archive. The approach is general: when applied to StringTie, it increases AUC by 13.1% on a set of 65 RNA-Seq experiments from ENCODE.
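As an illustration of the parameter advising idea described above, the following is a minimal Python sketch, not the authors' implementation. It assumes an advisor set is simply a list of candidate parameter vectors, runs the application once per candidate in parallel, and keeps the candidate whose output scores highest. The names `advise`, `toy_run`, `toy_estimate`, and the `min_coverage` parameter are hypothetical; `toy_run` is a stand-in for invoking an assembler such as Scallop, and `toy_estimate` is a stand-in for an accuracy measure such as AUC.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List, Tuple

Params = Dict[str, float]


def advise(
    sample: Any,
    advisor_set: List[Params],
    run: Callable[[Any, Params], Any],
    estimate: Callable[[Any], float],
    max_workers: int = 4,
) -> Tuple[Params, float]:
    """Run `run(sample, params)` for every candidate in the advisor set
    and return the (params, score) pair with the highest estimated accuracy."""
    if not advisor_set:
        raise ValueError("advisor_set must contain at least one parameter vector")

    def evaluate(params: Params) -> float:
        return estimate(run(sample, params))

    # Candidate runs are independent, so they can execute concurrently.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        scores = list(pool.map(evaluate, advisor_set))

    # Ties are broken in favour of the earliest candidate in the set.
    best = max(range(len(advisor_set)), key=scores.__getitem__)
    return advisor_set[best], scores[best]


# --- Hypothetical stand-ins, for illustration only ---------------------
def toy_run(sample: Dict[str, float], params: Params) -> float:
    # Pretend "assembly": report how far the chosen threshold is from the
    # value that would suit this particular sample.
    return abs(params["min_coverage"] - sample["ideal_min_coverage"])


def toy_estimate(output: float) -> float:
    # Pretend accuracy: a smaller distance means a better result.
    return -output


candidates = [
    {"min_coverage": 1.0},  # assumed "default" vector
    {"min_coverage": 2.5},
    {"min_coverage": 5.0},
]
best_params, best_score = advise(
    {"ideal_min_coverage": 2.0}, candidates, toy_run, toy_estimate
)
print(best_params, best_score)
```

Threads are used here only to keep the sketch self-contained. In practice each candidate would likely be a separate assembler process, and with enough workers the total wall time approaches that of a single run, which is the property the abstract points to.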