This paper describes the system presented by the LABDA group at SemEval 2017 Task 10 ScienceIE, specifically for the subtasks of identifying and classifying keyphrases in scientific articles. For identification, we use BANNER, a named entity recognition tool that is based on conditional random fields (CRF) and has obtained successful results in the biomedical domain. To classify keyphrases, we study the UMLS Semantic Network and propose a possible linking between the keyphrase types and the UMLS semantic groups. Based on this semantic linking, we create a dictionary for each keyphrase type. A feature indicating whether a token is found in one of these dictionaries is then added to the feature set used by BANNER. The final results on the test dataset show that our system still needs improvement, but that conditional random fields and, consequently, the BANNER system can be used as a first approximation to identify and classify keyphrases.
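The dictionary feature described above can be sketched as follows. This is a minimal illustration in Python, not BANNER's actual API (BANNER is a Java tool): the function names, the dictionary entries, and the extra `word` and `is_capitalized` features are assumptions made for the example. Only the idea of one lookup feature per keyphrase-type dictionary comes from the abstract.

```python
from typing import Dict, List, Set

# Hypothetical dictionaries, one per keyphrase type.
# The entries are illustrative only; the real ones would be built from UMLS semantic groups.
DICTIONARIES: Dict[str, Set[str]] = {
    "Process": {"simulation", "annealing"},
    "Task": {"classification", "segmentation"},
    "Material": {"graphene", "silicon"},
}


def dictionary_features(token: str) -> Dict[str, bool]:
    """Return one boolean feature per dictionary: is the lowercased token listed in it?"""
    lowered = token.lower()
    return {f"in_dict_{name}": lowered in entries for name, entries in DICTIONARIES.items()}


def sentence_features(tokens: List[str]) -> List[Dict[str, object]]:
    """Build a per-token feature dict of the kind a CRF tagger consumes."""
    rows: List[Dict[str, object]] = []
    for token in tokens:
        # Two generic features (assumed for illustration) plus the dictionary lookups.
        row: Dict[str, object] = {
            "word": token.lower(),
            "is_capitalized": token[:1].isupper(),
        }
        row.update(dictionary_features(token))
        rows.append(row)
    return rows
```

Calling `sentence_features(["Graphene", "classification"])` yields one feature dict per token, in which the `in_dict_*` flags tell the tagger which keyphrase type, if any, the token is associated with.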
Leckrone et al. reported in 1999 the presence of doubly ionized thallium (Tl iii) in the atmosphere of the chemically peculiar star χ Lupi. Two spectral lines, at 1332.3 and 1558.6 Å, were detected in its spectrum. Here, we claim that some ions and lines need further study so that an LTE abundance analysis becomes possible once improved atomic data are available. In this sense, Tl iii is on the list of ions that require improved atomic data. To contribute to the solution of this problem, we present an analysis of the atomic parameters of doubly ionized thallium. We calculated transition probabilities to obtain theoretical radiative lifetimes and theoretical Stark broadening parameters (widths and shifts) for 22 spectral lines of Tl iii. The required transition probabilities were obtained from relativistic Hartree–Fock calculations using Cowan’s code, with core polarization effects included. The Griem semiempirical approach was then used to obtain the Stark parameters. Our lifetime values for 10 levels are compared with the experimental ones. We also discuss the temperature dependence of the Stark parameters for three relevant spectral lines and display Stark width values for the isoelectronic sequence Tl iii–Pb iv.
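For reference, the standard relation between transition probabilities and the radiative lifetime of an upper level can be written as below. The symbols are introduced here for illustration and are not taken from the abstract: $\tau_k$ is the lifetime of level $k$, and $A_{ki}$ is the transition probability from $k$ to a lower level $i$.

```latex
\tau_k = \frac{1}{\sum_{i} A_{ki}}
```

The sum runs over all lower levels $i$ reachable from $k$ by radiative decay.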
<div>Sentiment analysis has become a very popular research topic and covers a wide range of domains such as economics, politics and health. In the pharmaceutical field, automated analysis of online user reviews provides information on the effectiveness and potential side effects of drugs, which could be used to improve pharmacovigilance systems. Deep learning approaches have revolutionized the field of Natural Language Processing (NLP), achieving state-of-the-art results in many tasks, including sentiment analysis.</div><div>These methods require large annotated datasets to train their models. However, in most real-world scenarios, obtaining high-quality labeled datasets is expensive and time-consuming. In contrast, unlabeled texts can generally be obtained easily. </div><div>In this work, we propose a semi-supervised approach based on a Semi-Supervised Generative Adversarial Network (SSGAN) to address the lack of labeled data for the sentiment analysis of drug reviews and to improve on the results provided by supervised approaches in this task.</div><div>To evaluate the real contribution of this approach, we present a benchmark comparison between our semi-supervised approach and a supervised approach that uses a similar architecture but without the generative adversarial setting. </div><div>Experimental results show better performance of the semi-supervised approach when annotated reviews make up less than ten percent of the training set, with a significant improvement in the classification of neutral reviews, the class with the fewest examples. To the best of our knowledge, this is the first study that applies an SSGAN to the sentiment classification of drug reviews. Our semi-supervised approach provides promising results for dealing with the shortage of annotated data, but there is still much room for improvement.</div>
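As a rough illustration of how a semi-supervised GAN can use unlabeled reviews, the sketch below shows one common discriminator objective: the classifier outputs one logit per sentiment class plus one extra "generated" class. The abstract does not state which loss the authors use, so this formulation, the three-class setup (negative, neutral, positive), and all names are assumptions.

```python
import math
from typing import List

NUM_CLASSES = 3      # assumed: negative, neutral, positive
FAKE = NUM_CLASSES   # index of the extra "generated" class
EPS = 1e-12          # guards against log(0)


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def supervised_loss(logits: List[float], label: int) -> float:
    """Cross-entropy over the real classes only, for a labeled review."""
    probs = softmax(logits[:NUM_CLASSES])
    return -math.log(probs[label] + EPS)


def unlabeled_real_loss(logits: List[float]) -> float:
    """An unlabeled real review should NOT be assigned to the 'generated' class."""
    p_fake = softmax(logits)[FAKE]
    return -math.log(1.0 - p_fake + EPS)


def generated_loss(logits: List[float]) -> float:
    """A generator sample SHOULD be assigned to the 'generated' class."""
    p_fake = softmax(logits)[FAKE]
    return -math.log(p_fake + EPS)
```

In training, the discriminator would minimize the sum of these three terms over labeled, unlabeled, and generated batches, so unlabeled reviews still shape the decision boundary even though they carry no sentiment label.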