M. Yang scite author profile

M. Yang

3 Publications

8 Citation Statements Received

8 Citation Statements Given

Affiliations

Guangzhou University of Chinese Medicine, Beijing Jiaotong University, IQVIA (France)

Publications

Machine Learning Algorithm Helps Identify Nondiagnosed Prodromal Alzheimer’s Disease Patients in the General Population

et al. 2019

Background: Recruiting patients for clinical trials of potential therapies for Alzheimer’s disease (AD) remains a major challenge, with demand for trial participants at an all-time high. The AD treatment R&D pipeline includes around 112 agents. In the United States alone, 150 clinical trials are seeking 70,000 participants. Most people with early cognitive impairment consult primary care providers, who may lack time, diagnostic skills and awareness of local clinical trials. Machine learning and predictive analytics offer promise to boost enrollment by predicting which patients have prodromal AD, and which will go on to develop AD.

Objectives: The authors set out to develop a machine learning predictive model that identifies prodromal AD patients in the general population, to aid early AD detection by primary care physicians and timely referral to expert sites for biomarker confirmation of diagnosis and clinical trial enrollment.

Design: The authors use a classification machine learning algorithm to extract patterns within healthcare claims and prescription data three years prior to AD diagnosis/AD drug initiation.

Setting: The study focused on subjects included within proprietary IQVIA US data assets (claims and prescription databases). Patient information was extracted from January 2010 to July 2018, for cohorts aged between 50 and 85 years.

Participants: A total of 88,298,289 subjects aged between 50 and 85 years were identified. For the positive cohort, 667,288 subjects were identified who had 24 months of medical history and at least one record with AD or AD treatment. For the negative cohort, 3,670,254 patients were selected who had a similar length of medical history and who were matched to positive cohort subjects based on the prevalence rate. The scoring cohort was selected based on availability of recent medical data of 2-5 years and included 72,670,283 subjects between the ages of 50 and 85 years.

Intervention (if any): None.

Measurements: A list of clinically relevant and interpretable predictors was generated and extracted from the data sets for each subject, including pharmacological treatments (NDC/product), office/specialist visits (specialty), tests and procedures (HCPCS and CPT), and diagnosis (ICD). The positive cohort was defined as patients who have an AD diagnosis or AD treatment, with a 3-year offset as an estimate for prodromal AD diagnosis. Supervised ML techniques were used to develop algorithms to predict the occurrence of prodromal AD cases. The sample dataset was divided randomly into a training dataset and a test dataset. The classification models were trained and executed in the PySpark framework. Training and evaluation of LogisticRegression, DecisionTreeClassifier, RandomForestClassifier, and GBTClassifier were executed using PySpark’s mllib module. The area under the precision-recall curve (AUCPR) was used to compare the results of the various models.

Results: The AUCPRs are 0.426, 0.157, 0.436, and 0.440 for LogisticRegression, DecisionTreeClassifier, RandomForestClassifier, and GBTClassifier, respectively, meaning that GBTClassifier (Gradient Boosted Tree) outperforms the other three classifiers. The GBT model identified 222,721 subjects in the prodromal AD stage with 80% precision. Some 76% of identified prodromal AD patients were in the primary care setting.

Conclusions: Applying the developed predictive model to 72,670,283 U.S. residents, 222,721 prodromal AD patients were identified, the majority of whom were in the primary care setting. This could drive major advances in AD research by enabling more accurate and earlier prodromal AD diagnosis at the primary care physician level, which would facilitate timely referral to expert sites for in-depth assessment and potential enrollment in clinical trials.
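The model comparison described in the abstract ranks classifiers by AUCPR. As a minimal sketch of that idea, the Python below estimates the area under the precision-recall curve as average precision and picks the model with the highest value. The abstract does not say how the authors computed AUCPR, so the estimator, the function name `average_precision`, the model names and the toy labels and scores are all assumptions made for illustration; none of this is the study's data or code.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Estimate the area under the precision-recall curve as average precision."""
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")
    # Rank subjects from the highest to the lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            # Precision at the rank where recall increases by one step.
            precision_sum += true_positives / rank
    return precision_sum / total_positives


# Toy labels and scores, invented for illustration only.
labels = [1, 0, 1, 0, 0, 1]
model_scores = {
    "model_a": [0.9, 0.8, 0.7, 0.3, 0.2, 0.1],
    "model_b": [0.9, 0.2, 0.8, 0.3, 0.1, 0.7],
}
aucpr = {name: average_precision(labels, s) for name, s in model_scores.items()}
best_model = max(aucpr, key=aucpr.get)
```

A precision-recall metric suits this kind of problem because positives are rare relative to the scored population, so a metric that ignores true negatives is more informative than accuracy.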

Research on the Design of Active Learning Algorithm based on Query-by-Committee for Intelligent Fetal Monitoring

Quan, Yang, et al. 2021

RDF Data Query and Management Method Based on HBase and Structure Index in Railway Sensor Application

Yang, Zhang 2013

Tracing dangerous goods on railways is a typical application of sensor networks. The correlations among sensors, carriages and trains form a graph that can be described using RDF. This requires a data management system that can manage large, ever-growing volumes of RDF data and support semantic access for monitoring the safety state of the environment inside the carriage. To address these problems, this paper proposes an RDF data query and management method based on HBase and a structure index, together with a query engine optimization method. The method rewrites SPARQL statements according to the degree of correlation between them. At query time, a "structure-level" index is used to identify the groups of RDF data, and "data-level" matching then uses the proposed scalable storage mechanism, which is based on hash-oriented multiple-table partitioning by data entity class. Our experiments show that the approach can effectively reduce semantic query time, enhance storage scalability and effectively support multi-criteria queries over sensor data.
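The two-level lookup described in the abstract can be illustrated with a small in-memory sketch: a coarse index first narrows a query to the entity classes that use a predicate, and matching then scans only the hash partition that holds each class. This is a simplification under stated assumptions and not the paper's HBase design; the class `TripleStore`, the function `partition_for`, the partition count, the predicate-to-class index and the sample triples are all hypothetical.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical number of table partitions


def partition_for(entity_class: str) -> int:
    """Map an entity class to a partition with a stable hash."""
    return zlib.crc32(entity_class.encode("utf-8")) % NUM_PARTITIONS


class TripleStore:
    def __init__(self) -> None:
        # partition id -> list of (entity_class, subject, predicate, object)
        self.partitions = defaultdict(list)
        # Assumed "structure-level" index: predicate -> entity classes using it.
        self.structure_index = defaultdict(set)

    def add(self, entity_class: str, subject: str, predicate: str, obj: str) -> None:
        row = (entity_class, subject, predicate, obj)
        self.partitions[partition_for(entity_class)].append(row)
        self.structure_index[predicate].add(entity_class)

    def query(self, predicate: str, obj=None):
        results = []
        # Structure level: keep only classes known to use this predicate.
        for entity_class in sorted(self.structure_index.get(predicate, ())):
            # Data level: scan only the partition that stores that class.
            for cls, s, p, o in self.partitions[partition_for(entity_class)]:
                if cls == entity_class and p == predicate and (obj is None or o == obj):
                    results.append((s, p, o))
        return results


# Sample triples, invented for illustration only.
store = TripleStore()
store.add("Sensor", "sensor:17", "locatedIn", "carriage:3")
store.add("Sensor", "sensor:17", "temperature", "41.5")
store.add("Carriage", "carriage:3", "partOf", "train:K105")
matches = store.query("locatedIn")
```

Hashing on the entity class keeps all rows of one class together, so a query that the index resolves to a single class touches one partition rather than the whole store.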
