Consumer wearables and sensors are a rich source of data about patients’ daily disease and symptom burden, particularly in the case of movement disorders like Parkinson’s disease (PD). However, interpreting these complex data into so-called digital biomarkers requires complicated analytical approaches, and validating these biomarkers requires sufficient data and unbiased evaluation methods. Here we describe the use of crowdsourcing to specifically evaluate and benchmark features derived from accelerometer and gyroscope data in two different datasets to predict the presence of PD and severity of three PD symptoms: tremor, dyskinesia, and bradykinesia. Forty teams from around the world submitted features, and achieved drastically improved predictive performance for PD status (best AUROC = 0.87), as well as tremor- (best AUPR = 0.75), dyskinesia- (best AUPR = 0.48) and bradykinesia-severity (best AUPR = 0.95).
Mobile health, the collection of data using wearables and sensors, is a rapidly growing field in health research with many applications. Deriving validated measures of disease and severity that can be used clinically or as outcome measures in clinical trials, referred to as digital biomarkers, has proven difficult, in part because of the complicated analytical approaches necessary to develop these metrics. Here we describe the use of crowdsourcing to specifically evaluate and benchmark features derived from accelerometer and gyroscope data in two different datasets to predict the presence of Parkinson's disease (PD) and the severity of three PD symptoms: tremor, dyskinesia, and bradykinesia. Forty teams from around the world submitted features, and achieved drastically improved predictive performance for PD (best AUROC = 0.87), as well as for severity of tremor (best AUPR = 0.75), dyskinesia (best AUPR = 0.48), and bradykinesia (best AUPR = 0.95).
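The AUROC and AUPR figures quoted above can be made concrete with a small sketch. The abstract does not say how the challenge computed either metric, so the code below is only an illustration: AUROC is computed through the rank-sum (Mann-Whitney U) identity, and AUPR is approximated by average precision, one common estimator that may differ from the one the authors used. The function names `auroc` and `average_precision` and the toy labels and scores are hypothetical.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: list of 0/1 ground-truth values; scores: predicted scores.
    Tied scores receive their average rank.
    """
    pairs = sorted(zip(scores, labels))
    ranks = [0.0] * len(pairs)
    i = 0
    while i < len(pairs):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(pairs) and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def average_precision(labels, scores):
    """Average precision: mean of precision at the rank of each positive.

    Assumption: ties in scores are broken by input order, which is a
    simplification of what evaluation libraries typically do.
    """
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("average precision needs at least one positive")
    order = sorted(range(len(scores)), key=lambda idx: -scores[idx])
    hits = 0
    total = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


if __name__ == "__main__":
    toy_labels = [0, 0, 1, 1]            # hypothetical ground truth
    toy_scores = [0.1, 0.4, 0.35, 0.8]   # hypothetical model scores
    print(auroc(toy_labels, toy_scores))
    print(average_precision(toy_labels, toy_scores))
```

Both metrics are threshold-free. AUROC is insensitive to class balance, whereas a precision-recall summary depends on the proportion of positives, so AUPR values for different symptoms are not directly comparable unless their class balance is similar.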
Secondary structure and solvent accessibility prediction provide valuable information for estimating the three-dimensional structure of a protein. As new feature extraction methods are developed, the dimensionality of the input feature space increases steadily. Reducing the number of dimensions provides several advantages, such as faster model training, faster prediction, and noise elimination. In this work, several dimensionality reduction techniques, including various feature selection methods, autoencoders, and PCA, have been employed for protein secondary structure and solvent accessibility prediction. The reduced feature set is used to train a support vector machine at the second stage of a hybrid classifier. Cross-validation experiments on two difficult benchmarks demonstrate that the dimension of the input space can be reduced substantially while maintaining the prediction accuracy. This will enable the incorporation of additional informative features derived for predicting the structural properties of proteins without reducing the accuracy due to overfitting.
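As an illustration of the feature selection idea mentioned above, the following is a minimal sketch of a univariate filter that keeps the k features most correlated with the label. The abstract does not name the specific selection methods the authors used, so this is a generic stand-in, not their method; `pearson`, `select_top_k`, `project`, and the toy data are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_top_k(X, y, k):
    """Return the sorted column indices of the k features with the largest
    absolute correlation with the label y."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: -scores[j])
    return sorted(ranked[:k])


def project(X, keep):
    """Reduce every sample to the selected columns."""
    return [[row[j] for j in keep] for row in X]


if __name__ == "__main__":
    # Hypothetical toy data: 4 samples, 3 features, binary labels.
    X = [
        [1.0, 5.0, 0.2],
        [2.0, 5.0, 0.4],
        [3.0, 5.0, 0.1],
        [4.0, 5.0, 0.3],
    ]
    y = [0, 0, 1, 1]
    keep = select_top_k(X, y, k=2)
    print(keep)
    print(project(X, keep))
```

In a cross-validation setting, the selected indices should be computed on the training folds only and then applied to the held-out fold. Selecting features on the full dataset leaks label information and inflates the accuracy estimate.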
Comparison of Feature Reduction Methods with Deep Autoencoder Machine Learning in Sentiment Analysis
Abstract: Because the internet is now used intensively by every segment of society, people share their opinions, ideas, and feelings through social networking sites, forums, blogs, and many similar media. However, the number and size of these data grow every day, which makes extracting meaningful information from them manually very laborious and expensive. Sentiment analysis automatically determines whether data contain sentiment and whether that sentiment is positive, negative, or neutral. In sentiment analysis, the complexity of spoken language and the large number and length of the texts being evaluated give rise to large feature vectors containing many unnecessary and noisy features. This situation, known as the dimensionality problem, increases computation time and classification errors. In this study, a deep-learning-based autoencoder model and a denoising autoencoder model are proposed as dimensionality reduction techniques to address these problems and are compared with other dimensionality reduction techniques commonly used in the literature. For all resulting data sets, different models were developed using Support Vector Machines and Artificial Neural Networks as classification algorithms. The analyses show that dimensionality reduction techniques improve the results obtained for sentiment analysis and that the proposed autoencoder models achieve results similar to or better than those of existing techniques.
Keywords: Dimensionality reduction, Autoencoder, Artificial neural networks, Support vector machines, Sentiment analysis, Deep learning