Abstract. Many optimization problems cannot be solved by classical mathematical optimization techniques due to their complexity and the size of the solution space. To obtain high-quality solutions nonetheless, heuristic optimization algorithms are frequently used. These algorithms do not claim to find globally optimal solutions, but offer a reasonable tradeoff between runtime and solution quality and are therefore especially suitable for practical applications. In recent decades, the success of heuristic optimization techniques in many different problem domains has encouraged the development of a broad variety of optimization paradigms, which often use natural processes as a source of inspiration (for example evolutionary algorithms, simulated annealing, or ant colony optimization). For the development and application of heuristic optimization algorithms in science and industry, mature, flexible and usable software systems are required. These systems have to support scientists in the development of new algorithms and should also enable users to apply different optimization methods to specific problems easily. The architecture and design of such heuristic optimization software systems pose many challenges for developers due to the diversity of algorithms and problems as well as the heterogeneous requirements of the different user groups. In this chapter the authors describe the architecture and design of their optimization environment HeuristicLab, which aims to provide a comprehensive system for algorithm development, testing, analysis and, more generally, the application of heuristic optimization methods to complex problems.
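As an illustration of the runtime versus solution-quality tradeoff mentioned above, the following is a minimal sketch of simulated annealing, one of the paradigms the abstract names. It is not taken from HeuristicLab; the function names (`simulated_annealing`, `toy_cost`, `toy_neighbor`), the geometric cooling schedule, and all parameter values are assumptions chosen for the example.

```python
import math
import random


def simulated_annealing(cost, neighbor, start, steps=5000,
                        t_start=1.0, t_end=1e-3, seed=0):
    """Minimize `cost` heuristically; returns (best_solution, best_cost)."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    for step in range(steps):
        # Assumed geometric cooling schedule from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept a worse move with
        # probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best, best_cost


def toy_cost(x):
    # Hypothetical multimodal test function with global minimum 0 at x = 0.
    return x * x + 10.0 * (1.0 - math.cos(2.0 * math.pi * x))


def toy_neighbor(x, rng):
    # Small random perturbation of the current solution.
    return x + rng.uniform(-0.5, 0.5)


best_x, best_value = simulated_annealing(toy_cost, toy_neighbor, start=4.0)
```

Raising `steps` spends more runtime in exchange for a better chance of reaching a good solution; the result is never guaranteed to be the global optimum, only no worse than the starting point.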
De novo peptide sequencing algorithms are often tested on relatively small data sets made up of excellent spectra. As more and more tandem mass spectra become available, we have assembled six large, reliable, and diverse (three mass spectrometer types) data sets intended for such tests, and we make them accessible via a web server. To exemplify their use, we investigate the performance of Lutefisk, PepNovo, and PepNovoTag, three well-established peptide de novo sequencing programs.
The Gene Expression Omnibus (GEO) is the largest resource of public gene expression data. While GEO enables data browsing, query and retrieval, additional tools can help realize its potential for aggregating and comparing data across multiple studies and platforms. This paper describes DSGeo - a collection of tools developed for annotating, aggregating, integrating and analyzing data deposited in GEO. The core set of tools includes a Relational Database, a Data Loader, a Data Browser and an Expression Combiner and Analyzer.
The application enables querying for specific sample characteristics and identifying studies containing samples that match the query. The Expression Combiner application enables normalization and aggregation of data from these samples and returns these data to the user after filtering, according to the user’s preferences. The Expression Analyzer allows simple statistical comparisons between groups of data. This seamless integration makes annotated cross-platform data directly available for analysis.
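The abstract does not say how the Expression Combiner normalizes or aggregates data. The sketch below shows one common approach under stated assumptions: each sample is z-scored on its own so that platforms with different intensity scales become comparable, and values are then pooled per gene. The function names, gene names, and sample values are all hypothetical and are not DSGeo's API or data.

```python
from statistics import mean, stdev


def zscore_sample(expression):
    """Standardize one sample (gene -> value) to mean 0 and sample SD 1."""
    values = list(expression.values())
    mu, sigma = mean(values), stdev(values)
    return {gene: (value - mu) / sigma for gene, value in expression.items()}


def aggregate_by_gene(samples):
    """Pool normalized values across samples: gene -> list of values."""
    pooled = {}
    for expression in samples.values():
        for gene, value in expression.items():
            pooled.setdefault(gene, []).append(value)
    return pooled


# Hypothetical samples from two platforms with very different raw scales.
raw_samples = {
    "sample_1": {"GENE_A": 10.0, "GENE_B": 20.0, "GENE_C": 30.0},
    "sample_2": {"GENE_A": 100.0, "GENE_B": 200.0, "GENE_C": 300.0},
}

normalized = {name: zscore_sample(expr) for name, expr in raw_samples.items()}
pooled = aggregate_by_gene(normalized)
gene_means = {gene: mean(values) for gene, values in pooled.items()}
```

After per-sample standardization the two samples carry the same relative profile despite a tenfold difference in raw scale, which is what makes a later group comparison meaningful.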
Background: This study describes a large-scale manual re-annotation of data samples in the Gene Expression Omnibus (GEO), using variables and values derived from the National Cancer Institute thesaurus. A framework is described for creating an annotation scheme for various diseases that is flexible, comprehensive, and scalable. The annotation structure is evaluated by measuring coverage and agreement between annotators.
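The abstract does not state which statistic was used to measure agreement between annotators. As one common choice, a minimal sketch of Cohen's kappa for two annotators follows; the function name and the example labels are hypothetical and not drawn from the study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each annotator labeled independently
    # according to their own label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical annotations of six samples by two annotators.
annotator_a = ["tumor", "normal", "tumor", "tumor", "normal", "normal"]
annotator_b = ["tumor", "normal", "normal", "tumor", "normal", "tumor"]
kappa = cohens_kappa(annotator_a, annotator_b)
```

Here the annotators agree on 4 of 6 samples (observed agreement 2/3) while chance agreement is 1/2, giving a kappa of 1/3.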