Handling missing values is a crucial step in preprocessing data in Machine Learning. Most available algorithms for analyzing datasets in the feature selection process and classification or estimation process analyze complete datasets. Consequently, in many cases, the strategy for dealing with missing values is to use only instances with full data or to replace missing values with a mean, mode, median, or a constant value. Usually, discarding missing samples or replacing missing values by means of fundamental techniques causes bias in subsequent analyzes on datasets. Aim: Demonstrate the positive impact of multivariate imputation in the feature selection process on datasets with missing values. Results: We compared the effects of the feature selection process using complete datasets, incomplete datasets with missingness rates between 5 and 50%, and imputed datasets by basic techniques and multivariate imputation. The feature selection algorithms used are well-known methods. The results showed that the datasets imputed by multivariate imputation obtained the best results in feature selection compared to datasets imputed by basic techniques or non-imputed incomplete datasets. Conclusions: Considering the results obtained in the evaluation, applying multivariate imputation by MICE reduces bias in the feature selection process.
Background: For some time now, the effects of sound, noise, and music on the human body have been studied. However, despite research done through time, it is still not completely clear what influence, interaction, and effects sounds have on human body. That is why it is necessary to conduct new research on this topic. Thus, in this paper, a systematic review is undertaken in order to integrate research related to several types of sound, both pleasant and unpleasant, specifically noise and music. In addition, it includes as much research as possible to give stakeholders a more general vision about relevant elements regarding methodologies, study subjects, stimulus, analysis, and experimental designs in general. This study has been conducted in order to make a genuine contribution to this area and to perhaps to raise the quality of future research about sound and its effects over ECG signals.Methods: This review was carried out by independent researchers, through three search equations, in four different databases, including: engineering, medicine, and psychology. Inclusion and exclusion criteria were applied and studies published between 1999 and 2017 were considered. The selected documents were read and analyzed independently by each group of researchers and subsequently conclusions were established between all of them.Results: Despite the differences between the outcomes of selected studies, some common factors were found among them. Thus, in noise studies where both BP and HR increased or tended to increase, it was noted that HRV (HF and LF/HF) changes with both sound and noise stimuli, whereas GSR changes with sound and musical stimuli. Furthermore, LF also showed changes with exposure to noise.Conclusion: In many cases, samples displayed a limitation in experimental design, and in diverse studies, there was a lack of a control group. There was a lot of variability in the presented stimuli providing a wide overview of the effects they could produce in humans. In the listening sessions, there were numerous examples of good practice in experimental design, such as the use of headphones and comfortable positions for study subjects, while the listening sessions lasted 20 min in most of the studies.
Feature selection (FS) has attracted the attention of many researchers in the last few years due to the increasing sizes of datasets, which contain hundreds or thousands of columns (features). Typically, not all columns represent relevant values. Consequently, the noise or irrelevant columns could confuse the algorithms, leading to a weak performance of machine learning models. Different FS algorithms have been proposed to analyze highly dimensional datasets and determine their subsets of relevant features to overcome this problem. However, very often, FS algorithms are biased by the data. Thus, methods for ensemble feature selection (EFS) algorithms have become an alternative to integrate the advantages of single FS algorithms and compensate for their disadvantages. The objective of this research is to propose a conceptual and implementation framework to understand the main concepts and relationships in the process of aggregating FS algorithms and to demonstrate how to address FS on datasets with high dimensionality. The proposed conceptual framework is validated by deriving an implementation framework, which incorporates a set of Phyton packages with functionalities to support the assembly of feature selection algorithms. The performance of the implementation framework was demonstrated in several experiments discovering relevant features in the Sonar, SPECTF, and WDBC datasets. The experiments contrasted the accuracy of two machine learning classifiers (decision tree and logistic regression), trained with subsets of features generated either by single FS algorithms or the set of features selected by the ensemble feature selection framework. We observed that for the three datasets used (Sonar, SPECTF, and WD), the highest precision percentages (86.95%, 74.73%, and 93.85%, respectively) were obtained when the classifiers were trained with the subset of features generated by our framework. Additionally, the stability of the feature sets generated using our ensemble method was evaluated. The results showed that the method achieved perfect stability for the three datasets used in the evaluation.
It is well-known that age-related macular degeneration is the main cause of visual impairment in industrialised countries affecting around 239,000 people in UK while demonstrating an increasing trend. Fluorescein angiography is the most widely used technique in the diagnosis, prognosis and in following up the development of the disease. The interpretation of fluorescein angiograms depends upon the detection of abnormal fluorescence and clinical assessment of the functional integrity of retinal circulation, which is primarily based on subjective evaluation of the dye transit-time. This study presents a prototype system for quantitative analysis of the retinal haemodynamics. It analyses retinal blood flow based on the estimation of parameters such as mean transit-time, arteriovenous passage time and mean dye velocity. These parameters are estimated using densitometry and analysis of the vascular response. The prototype system consists of three main stages: vessel segmentation, image registration and blood flow estimation. The performance of the system is demonstrated on a comprehensive dataset, which contains images of normal retinas and retinas with pathologies such as wet age-related macular degeneration and branch retinal vein occlusion. The presented framework aims to demonstrate the technical feasibility for automated extraction of retinal blood flow parameters and contributes towards the development of a clinically proven computer vision system for quantitative analysis of fluorescein angiograms to assist NHS clinicians in the early diagnosis of age-related macular degeneration.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.