Data classification is an automatic or semi-automatic process that uses artificial intelligence algorithms to learn the relationships between the variables and classes of a dataset, so that they can be applied later to cases whose class is unknown. For many years, work on this topic has aimed at increasing the hit rates of algorithms. However, when the problem is restricted to applications in healthcare, performance is not the only concern: it is also necessary to design algorithms whose results are understandable by the specialists responsible for making the decisions. Among the problems in the field of medicine, a current focus is COVID-19, where AI algorithms may contribute to early diagnosis. Among the available COVID-19 data, the blood test is a typical procedure performed when a patient arrives at the hospital, and using it in the diagnosis reduces the need for other diagnostic tests that can delay detection and add to costs. In this work, we propose using a self-organizing map (SOM) to discover attributes in blood test examinations that are relevant for COVID-19 diagnosis. We applied SOM and an entropy calculation to define a hierarchical, semi-supervised and explainable model named TESSOM (tree-based entropy-structured self-organizing maps), whose main feature is enhancing the investigation of groups of cases with high levels of class overlap with respect to the diagnostic outcome. Framing the TESSOM algorithm in the context of explainable artificial intelligence (XAI) makes it possible to explain the results to an expert in a simplified way. The paper demonstrates that using the TESSOM algorithm to identify attributes of blood tests can help with the identification of COVID-19 cases. It provided a performance increase of 1.489% in multiple scenarios when analyzing 2207 cases from three hospitals in the state of São Paulo, Brazil.
This work is a starting point for researchers to identify relevant attributes of blood tests for COVID-19 and to support the diagnosis of other diseases.
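The idea of using entropy to flag groups of cases with high class overlap can be illustrated with a minimal sketch. This is not the TESSOM implementation; it assumes that each case has already been assigned to a SOM node, and the names `class_entropy`, `overlapping_nodes` and the `threshold` value are hypothetical choices made for illustration only.

```python
import math
from collections import Counter, defaultdict


def class_entropy(labels):
    """Shannon entropy (in bits) of the class labels within one group."""
    counts = Counter(labels)
    total = len(labels)
    # Sum p * log2(1/p) over the classes present in the group.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def overlapping_nodes(node_ids, labels, threshold=0.9):
    """Return {node_id: entropy} for nodes whose class entropy exceeds threshold.

    node_ids[i] is the (assumed, precomputed) SOM node of case i,
    labels[i] is its known diagnostic class.
    """
    groups = defaultdict(list)
    for node, label in zip(node_ids, labels):
        groups[node].append(label)

    flagged = {}
    for node, group_labels in groups.items():
        h = class_entropy(group_labels)
        if h > threshold:  # hypothetical cut-off for "high overlap"
            flagged[node] = h
    return flagged


# Toy data: node 0 holds only positive cases, node 1 is evenly mixed.
node_ids = [0, 0, 0, 0, 1, 1, 1, 1]
labels = ["pos", "pos", "pos", "pos", "pos", "neg", "pos", "neg"]
flagged = overlapping_nodes(node_ids, labels)
```

In this toy example only node 1 is flagged, since its two classes are evenly mixed. In a hierarchical scheme, such a node would be a natural candidate for further investigation, though how TESSOM actually refines these groups is described in the paper itself.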
Abstract. The Programming Language discipline in Computer Science courses has always been a great challenge for students and teachers. Observing this discipline and analyzing the interests of the community can serve as a strategy for improving teaching and learning. This paper proposes using data from the Stack Overflow forum, text mining techniques, and Self-Organizing Map neural networks to gain insight into the doubts of students and professionals who work with programming. This analysis can become a source of information to, among other possibilities, identify specific needs in programming disciplines, the demand for specialization courses, and statistics about language use and doubts, and to serve as a repository for searching information about programming languages.
Resumo. The Programming Language discipline in computing courses has always been a great challenge for students and teachers. Monitoring and evaluating the teaching of this discipline, and identifying its weaknesses, is of broad interest as a strategy for improving teaching and learning. This work proposes using the data available in the Stack Overflow question forum, text mining techniques, and Self-Organizing Map neural networks to obtain insights into the frequent doubts of programming students and professionals. Analysis of this information can, among other possibilities, identify gaps in programming language learning, needs for specialization courses, and a census of language use and of language-specific doubts; in other words, it can serve as a repository of information about programming languages.
The SINDEC complaints database is an important source of information about the sectors and problems that cause complaints in Brazil. By analyzing these data, one can understand the main complaints and whether companies and institutions are working to solve such problems. The objective of this work is to apply data mining techniques, more specifically descriptive analyses and a classification algorithm, to extract knowledge from the data for the period between 2013 and 2017.