Motivation: Species tree estimation from genes sampled from throughout the whole genome is complicated by gene tree-species tree discordance. Incomplete lineage sorting (ILS), in which alleles can coexist in populations for periods that may span several speciation events, is one of the most frequent causes of this discordance. Quartet-based summary methods for estimating species trees from a collection of gene trees are becoming popular due to their high accuracy and statistical guarantees under ILS. Generating quartets with appropriate weights, where weights correspond to the relative importance of quartets, and subsequently amalgamating the weighted quartets into a single coherent species tree allows for statistically consistent species tree estimation. However, handling weighted quartets is challenging.
Results: We propose wQFM, a highly accurate method for species tree estimation from multi-locus data that extends the quartet FM (QFM) algorithm to a weighted setting. wQFM was assessed on a collection of simulated and real biological datasets, including the avian phylogenomic dataset, which is one of the largest phylogenomic datasets to date. We compared wQFM with wQMC, which is the best alternative method for weighted quartet amalgamation, and with ASTRAL, which is one of the most accurate and widely used coalescent-based species tree estimation methods. Our results suggest that wQFM matches or improves upon the accuracy of wQMC and ASTRAL.
Availability: wQFM is available in open source form at https://github.com/Mahim1997/wQFM-2020
Supplementary information: Supplementary data are available at Bioinformatics online.
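As a rough illustration of what "weighted quartets" can look like, the sketch below assumes that a quartet's weight is the number of gene trees that induce its topology. The abstract only says that weights reflect relative importance, so this frequency-based choice is an assumption, and the helper names `canonical` and `weight_quartets` are hypothetical rather than part of wQFM. The input is assumed to be the quartets already induced by each gene tree.

```python
from collections import Counter

def canonical(quartet):
    """Return an order-independent key for a quartet topology ab|cd.

    A quartet is given as ((a, b), (c, d)); ab|cd, ba|dc and cd|ab
    all map to the same key.
    """
    (a, b), (c, d) = quartet
    return frozenset([frozenset([a, b]), frozenset([c, d])])

def weight_quartets(per_gene_quartets):
    """Count, for each quartet topology, how many gene trees induce it.

    per_gene_quartets: one iterable of quartets per gene tree
    (assumed to be precomputed; tree parsing is out of scope here).
    """
    weights = Counter()
    for quartets in per_gene_quartets:
        # Use a set so a topology counts at most once per gene tree.
        for key in {canonical(q) for q in quartets}:
            weights[key] += 1
    return weights

# Three hypothetical gene trees over taxa a, b, c, d.
genes = [
    [(("a", "b"), ("c", "d"))],
    [(("b", "a"), ("d", "c"))],  # same topology, different ordering
    [(("a", "c"), ("b", "d"))],
]
weights = weight_quartets(genes)
```

Under this assumption, the dominant topology for each set of four taxa receives the largest weight, and an amalgamation method would then try to build a tree consistent with as much total weight as possible.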
This article addresses the class imbalance issue in Bengali, a low-resource language. As a use case, we choose one of the most fundamental NLP tasks, text classification, and use three benchmark text corpora: a fake-news dataset, a sentiment analysis dataset, and a song lyrics dataset. Each of them exhibits a critical class imbalance. We tackle the problem with several strategies, including data augmentation with synthetic samples, produced via text and embedding generation, to increase the proportion of minority samples. Moreover, we ensemble deep learning models trained on subsets of the majority samples. Additionally, we employ the focal loss function for class-imbalanced classification. We also apply outlier detection, data resampling, and hidden feature extraction to improve the minority-class F1 score. All of our experiments focus entirely on textual content analysis and yield a minority-class F1 score above 90% on each of the three tasks, a strong result for such highly class-imbalanced datasets.
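For readers unfamiliar with focal loss, the following is a minimal sketch of the standard binary formulation, FL(p_t) = -α_t (1 - p_t)^γ log(p_t). It is not taken from the article: the defaults `gamma=2.0` and `alpha=0.25` are commonly used values assumed here for illustration, and the minority class is assumed to be the positive label.

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for a single prediction.

    p: predicted probability of the positive class (0 < p < 1)
    y: true label, 1 for positive and 0 for negative
    gamma, alpha: assumed illustrative defaults, not the article's settings
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)        # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p          # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def mean_focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Average focal loss over a batch of predictions."""
    losses = [focal_loss(p, y, gamma, alpha) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)

easy = focal_loss(0.9, 1)   # confident and correct: heavily down-weighted
hard = focal_loss(0.1, 1)   # confident and wrong: keeps most of its loss
```

With `gamma=0` the modulating factor disappears and the expression reduces to α-weighted cross-entropy, which is why larger values of `gamma` shift training effort toward hard, often minority-class, examples.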
Abstract. Carbon nanotubes (CNTs) are considered an ideal material for thermal management in electronic packaging because of their extraordinarily high thermal conductivity. Micro-channels are generally regarded as an effective means of heat transfer in electronic products. In this paper, a series of 2D and 3D CFD simulations of CNT-based micro-channel cooling architectures, built on one- and two-dimensional fin arrays, has been carried out using COMSOL 4.0a software. The influence of various fluids, micro-fin structures, micro-fin arrays, fluid velocities, and heating powers on cooling performance has been simulated and compared. Steady-state thermal stress analyses for forced convection heat transfer have also been performed to determine the maximum allowable stress and deflections for the different types of cooling assembly.
Background: R2 elements are a clade of early branching Long Interspersed Elements (LINEs). LINEs are retrotransposable elements whose replication can have profound effects on the genomes in which they reside. No crystal or EM structures exist for the reverse transcriptase (RT) and linker regions of LINEs.
Results: Using limited proteolysis as a probe for globular domain structure, we show that the protein encoded by the Bombyx mori R2 element has two major globular domains: (1) a small globular domain consisting of the N-terminal zinc finger and Myb motifs, and (2) a large globular domain consisting of the RT, linker, and type II restriction-like endonuclease (RLE). Further digestion of the large globular domain occurred within the RT. Mapping these RT cleavages onto an updated model of the R2Bm RT indicated that the thumb of the RT was largely protected from proteolytic cleavage. The crystal structure of the large globular domain of Prp8, a eukaryotic splicing factor, was a major template used in building the R2Bm RT model, particularly the thumb region. The large fragment of Prp8 consists not only of an RT similar to that of R2Bm, but also an RLE and a linker connecting the two regions. The linker sequences adjacent to the RLE in LINEs and Prp8 share a set of two important α-helices and a (presumptive) knuckle/ββα structural motif that are closely associated with the thumb. The RLEs of LINEs and Prp8 share a unique catalytic core residue spacing as well as other key residues.
Conclusions: The protein encoded by RLE LINEs consists of two major globular domains. The larger of the two globular domains contains the RT, linker, and RLE and is similar to the large fragment of the spliceosomal protein Prp8. The similarities are suggestive of possible common ancestry.
Electronic supplementary material: The online version of this article (10.1186/s13100-017-0097-9) contains supplementary material, which is available to authorized users.