Plectus murrayi is one of the most common and locally abundant invertebrates of continental Antarctic ecosystems. Because it is readily cultured on artificial medium in the laboratory and highly tolerant of an extremely harsh environment, P. murrayi is emerging as a model organism for understanding the evolutionary origin and maintenance of adaptive responses to multiple environmental stressors, including freezing and desiccation. The de novo assembled genome of P. murrayi contains 225.741 million base pairs and a total of 14,689 predicted genes. Compared with Caenorhabditis elegans and other closely related taxa from less harsh environments, the genome architecture of P. murrayi is characterized by fewer protein-coding genes and fewer transposable elements, but more exons. We compared the transcriptomes of lab-reared and wild-caught P. murrayi and found that genes involved in growth and cellular processing were up-regulated in lab-cultured P. murrayi, while a few genes associated with cellular metabolism and freeze tolerance were expressed at relatively lower levels. Preliminary comparative genomic and transcriptomic analyses suggest that the observed constraints on P. murrayi genome architecture and functional gene expression, including genome decay and intron retention, may be an adaptive response to persisting in a biotically simplified, yet consistently physically harsh environment.
In the context of genome assembly, the contig orientation problem is described as the problem of removing sufficient edges from the scaffold graph so that the remaining subgraph assigns a consistent orientation to all sequence nodes in the graph. This problem can also be phrased as a weighted MAX-CUT problem. The performance of MAX-CUT heuristics in this application is untested. We present a greedy heuristic solution to the contig orientation problem and compare its performance to a weighted MAX-CUT semi-definite programming heuristic solution on several graphs. We note that the contig orientation problem can be used to identify inverted repeats and inverted haplotypes, as these represent sequences whose orientation appears ambiguous in the conventional genome assembly framework.
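To make the problem statement concrete, here is a minimal sketch of one possible greedy approach. It is not the heuristic from the paper, whose details the abstract does not give. It assumes each scaffold edge is a hypothetical tuple `(u, v, same_orientation, weight)`, processes edges from heaviest to lightest, and removes any edge that contradicts the orientations implied by the edges already kept. Consistency is tracked with a union-find structure that stores a parity bit per node.

```python
from typing import Dict, List, Tuple

# Hypothetical edge format: (contig_u, contig_v, same_orientation, weight)
Edge = Tuple[str, str, bool, float]


def orient_contigs(nodes: List[str], edges: List[Edge]) -> Tuple[Dict[str, int], List[Edge]]:
    """Greedy sketch: keep heavy edges first, drop edges that conflict.

    Returns (orientation, removed_edges). Orientation values are 0 or 1 and
    are only meaningful relative to other contigs in the same component.
    """
    parent = {n: n for n in nodes}
    parity = {n: 0 for n in nodes}  # orientation of n relative to parent[n]

    def find(x: str) -> Tuple[str, int]:
        # Returns the root of x and the parity of x relative to that root,
        # compressing the path along the way.
        if parent[x] == x:
            return x, 0
        root, p = find(parent[x])
        parity[x] ^= p
        parent[x] = root
        return root, parity[x]

    removed: List[Edge] = []
    for u, v, same, w in sorted(edges, key=lambda e: -e[3]):
        ru, pu = find(u)
        rv, pv = find(v)
        need = 0 if same else 1  # required relative orientation of u and v
        if ru != rv:
            # Merge components so that the edge's constraint is satisfied.
            parent[ru] = rv
            parity[ru] = pu ^ pv ^ need
        elif (pu ^ pv) != need:
            # The edge contradicts the edges already kept, so remove it.
            removed.append((u, v, same, w))

    orientation = {n: find(n)[1] for n in nodes}
    return orientation, removed
```

In this sketch, the removed edges are the ones that would be cut in the MAX-CUT phrasing of the problem. Contigs that attract many removed edges are the kind of sequence the abstract describes as having apparently ambiguous orientation.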
This research employs an exhaustive search of different attribute selection algorithms in order to provide a more structured approach to learning design for prediction of Alzheimer's clinical dementia rating (CDR).
Optical character recognition (OCR) from newspaper page images is susceptible to noise due to degradation of old documents and variation in typesetting. In this report, we present a novel approach to OCR post-correction. We cast error correction as a translation task, and fine-tune BART, a transformer-based sequence-to-sequence language model pretrained to denoise corrupted text. We are the first to use sentence-level transformer models for OCR post-correction, and our best model achieves a 29.4% improvement in character accuracy over the original noisy OCR text. Our results demonstrate the utility of pretrained language models for dealing with noisy text.
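The abstract does not say how character accuracy or its improvement is computed. As an illustration only, the sketch below uses one common definition: one minus the Levenshtein edit distance between the OCR output and the reference text, divided by the reference length. The function names `edit_distance` and `char_accuracy` are hypothetical and are not taken from the paper.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def char_accuracy(hypothesis: str, reference: str) -> float:
    """Assumed definition: 1 - edit_distance / len(reference), floored at 0."""
    if not reference:
        return 1.0 if not hypothesis else 0.0
    return max(0.0, 1.0 - edit_distance(hypothesis, reference) / len(reference))
```

Under this definition, a post-correction model would be evaluated by comparing `char_accuracy` of the noisy OCR text and of the corrected text against the same reference transcription.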