Erica Suh scite author profile

Background: The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function. Results: Here, we report on the results of the third CAFA challenge, CAFA3, that featured an expanded analysis over the previous CAFA rounds, both in terms of volume of data analyzed and the types of analysis performed. In a novel and major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa genomes, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster, which we suspected of being involved in long-term memory. Conclusion: We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.

The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens

Zhou, Jiang, Bergquist et al. 2019

Preprint

The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function. Here we report on the results of the third CAFA challenge, CAFA3, that featured an expanded analysis over the previous CAFA rounds, both in terms of volume of data analyzed and the types of analysis performed. In a novel and major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa genomes, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility (P. aeruginosa only). We further performed targeted assays on selected genes in Drosophila melanogaster, which we suspected of being involved in long-term memory. We conclude that, while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. We finally report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bioontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.

Predicting GO terms for a protein (protein-centric) and predicting which proteins are associated with a given function (term-centric) are related but different computational problems: the former is a multi-label classification problem with a structured output, while the latter is a binary classification task. Predicting the results of a genome-wide screen for a single or a small number of functions fits the term-centric formulation. To see how well all participating CAFA methods perform term-centric predictions, we mapped results from the protein-centric CAFA3 methods onto these terms. In addition, we held a separate CAFA challenge, CAFA-π, whose purpose was to attract additional submissions from algorithms that specialize in term-centric tasks. We performed screens for three functions in three species, which we then used to assess protein function prediction. In the bacterium Pseudomonas aeruginosa and the fungus Candida albicans we performed genome-wide screens capable of uncovering genes with two functions, biofilm formation (GO:0042710) and motility (for P. aeruginosa only) (GO:0001539), as described in Methods. In Drosophila melanogaster we performed targeted assays, guided by previous CAFA submissions, of a ...
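The difference between the two formulations can be illustrated with a minimal Python sketch. It is not the CAFA3 assessment code: the protein identifiers, the scores, and the `to_term_centric` helper are hypothetical, and only the two GO identifiers come from the excerpt above. The sketch assumes that protein-centric output is a set of scored GO terms per protein, and that a term-centric view is obtained by ranking every protein by its score for one term.

```python
from typing import Dict, List, Tuple

# Hypothetical protein-centric predictions: protein ID -> {GO term -> score}.
# Protein IDs and scores are invented for illustration only.
protein_centric: Dict[str, Dict[str, float]] = {
    "protA": {"GO:0042710": 0.91, "GO:0001539": 0.10},
    "protB": {"GO:0001539": 0.75},
    "protC": {"GO:0042710": 0.40},
}


def to_term_centric(
    predictions: Dict[str, Dict[str, float]],
    term: str,
    default: float = 0.0,
) -> List[Tuple[str, float]]:
    """Rank all proteins by their score for a single GO term.

    Proteins with no prediction for the term receive `default`
    (an assumption of this sketch, not a rule stated in the source).
    """
    scored = [
        (protein, terms.get(term, default))
        for protein, terms in predictions.items()
    ]
    # Highest score first: a binary "has this function or not" ranking.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Term-centric view for biofilm formation (GO:0042710).
biofilm_ranking = to_term_centric(protein_centric, "GO:0042710")
print(biofilm_ranking)
```

Thresholding such a ranking turns the multi-label output into one binary classification per term, which is the shape of problem a genome-wide screen for a single function poses.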

Developing Student Process Skills in a General Chemistry Laboratory

et al. 2019

Laboratory coursework is widely considered to be an integral part of chemistry undergraduate degree programs, although its impact on students’ chemistry knowledge is largely unsubstantiated. Laboratory experiences provide opportunities to learn skills beyond chemistry content knowledge, such as how to use scientific instrumentation appropriately, how to gather and analyze data, and how to work in a team. The acquisition of process skills, including critical thinking, problem solving, and communication, is an integral part of becoming a scientist and participating in the scientific community. As apprentice scientists, chemistry students interact with each other in a context-rich environment where the need for process skills can arise organically. This study seeks to understand the role of laboratory courses in developing process skills. Students in a first-year chemistry laboratory course used rubrics to assess their own process skills. During the course, the students also received feedback via rubrics from a teaching assistant trained in rubric use. Additionally, students reported their understanding of process skills and their perceived improvements over the course of the semester. Our results suggest that students understand group dynamics process skills such as teamwork and communication better than they understand cognitive process skills such as critical thinking and information processing. While the evidence further suggests that students improved their process skills, and students reported that they improved their process skills, they showed inconsistent abilities to self-assess and provide justification for their assessment using rubrics.

Membrane expression of thymidine kinase 1 and potential clinical relevance in lung, breast, and colorectal malignancies

et al. 2018

Background: Lung, breast, and colorectal malignancies are the leading causes of cancer-related deaths worldwide, causing over 2.8 million deaths yearly. Despite efforts to improve prevention methods, early detection, and treatments, survival rates for advanced-stage lung, breast, and colon cancer remain low, indicating a critical need to identify cancer-specific biomarkers for early detection and treatment. Thymidine kinase 1 (TK1) is a nucleotide salvage pathway enzyme involved in cellular proliferation and considered an important tumor proliferation biomarker in the serum. In this study, we further characterized TK1’s potential as a tumor biomarker and immunotherapeutic target, as well as its clinical relevance. Methods: We assessed TK1 surface localization by flow cytometry and confocal microscopy in lung (NCI-H460, A549), breast (MDA-MB-231, MCF7), and colorectal (HT-29, SW620) cancer cell lines. We also isolated cell surface proteins from HT-29 cells and performed a western blot confirming the presence of TK1 in cell membrane protein fractions. To evaluate TK1’s clinical relevance, we compared TK1 expression levels in normal and malignant tissue through flow cytometry and immunohistochemistry. We also analyzed RNA-Seq data from The Cancer Genome Atlas (TCGA) to assess differential expression of the TK1 gene in lung, breast, and colorectal cancer patients. Results: We found significant expression of TK1 on the surface of NCI-H460, A549, MDA-MB-231, MCF7, and HT-29 cell lines and a strong association of TK1 with the membrane, shown by confocal microscopy and western blot. We found negligible TK1 surface expression in normal healthy tissue and significantly higher TK1 expression in malignant tissues. Patient data from TCGA revealed that TK1 gene expression is upregulated in cancer patients compared to normal healthy patients. Conclusions: Our results show that TK1 localizes on the surface of lung, breast, and colorectal cell lines and is upregulated in malignant tissues and patients compared to healthy tissues and patients. We conclude that TK1 is a potential clinical biomarker for the treatment of lung, breast, and colorectal cancer. Electronic supplementary material: The online version of this article (10.1186/s12935-018-0633-9) contains supplementary material, which is available to authorized users.

ShinyLearner: A containerized benchmarking tool for machine-learning classification of tabular data

Piccolo, Lee, Suh et al. 2020

Background: Classification algorithms assign observations to groups based on patterns in data. The machine-learning community has developed myriad classification algorithms, which are used in diverse life science research domains. Algorithm choice can affect classification accuracy dramatically, so it is crucial that researchers optimize the choice of which algorithm(s) to apply in a given research domain on the basis of empirical evidence. In benchmark studies, multiple algorithms are applied to multiple datasets, and the researcher examines overall trends. In addition, the researcher may evaluate multiple hyperparameter combinations for each algorithm and use feature selection to reduce data dimensionality. Although software implementations of classification algorithms are widely available, robust benchmark comparisons are difficult to perform when researchers wish to compare algorithms that span multiple software packages. Programming interfaces, data formats, and evaluation procedures differ across software packages, and dependency conflicts may arise during installation. Findings: To address these challenges, we created ShinyLearner, an open-source project for integrating machine-learning packages into software containers. ShinyLearner provides a uniform interface for performing classification, irrespective of the library that implements each algorithm, thus facilitating benchmark comparisons. In addition, ShinyLearner enables researchers to optimize hyperparameters and select features via nested cross-validation; it tracks all nested operations and generates output files that make these steps transparent. ShinyLearner includes a Web interface to help users more easily construct the commands necessary to perform benchmark comparisons. ShinyLearner is freely available at https://github.com/srp33/ShinyLearner. Conclusions: This software is a resource to researchers who wish to benchmark multiple classification or feature-selection algorithms on a given dataset. We hope it will serve as an example of combining the benefits of software containerization with a user-friendly approach.
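Nested cross-validation, as mentioned in the abstract above, can be sketched in a few lines of plain Python. This is an illustration of the general technique only, not ShinyLearner's implementation or interface: the toy one-feature dataset, the k-nearest-neighbour classifier, and all function names are assumptions made for this example. The inner loop picks a hyperparameter using only the outer training data, and the outer loop scores that choice on data the inner loop never saw.

```python
from collections import Counter
from statistics import mean

# Toy data (assumed for illustration): one feature, label 1 when feature >= 6.
toy_data = [(float(i), int(i >= 6)) for i in range(12)]


def knn_predict(train, x, k):
    """Majority label among the k training samples closest to x."""
    nearest = sorted(train, key=lambda sample: abs(sample[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train, test, k):
    """Fraction of test samples classified correctly."""
    return mean(int(knn_predict(train, x, k) == y) for x, y in test)


def folds(data, n_folds):
    """Yield (train, test) pairs using interleaved folds."""
    for i in range(n_folds):
        test = data[i::n_folds]
        train = [s for j, s in enumerate(data) if j % n_folds != i]
        yield train, test


def nested_cv(data, k_values, outer=3, inner=2):
    """Return one (chosen k, outer test accuracy) pair per outer fold."""
    results = []
    for outer_train, outer_test in folds(data, outer):
        # Inner loop: select k using the outer training data only.
        best_k = max(
            k_values,
            key=lambda k: mean(
                accuracy(tr, va, k) for tr, va in folds(outer_train, inner)
            ),
        )
        # Outer loop: evaluate the selected k on the held-out outer fold.
        results.append((best_k, accuracy(outer_train, outer_test, best_k)))
    return results


results = nested_cv(toy_data, k_values=[1, 3])
for fold_index, (best_k, score) in enumerate(results):
    print(f"outer fold {fold_index}: chose k={best_k}, accuracy={score:.2f}")
```

Recording the chosen hyperparameter for each outer fold, as `results` does here, is a small-scale analogue of keeping the nested steps transparent.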

