We aimed to develop machine learning classifiers as a risk-prevention mechanism that helps medical professionals with little or no knowledge of their patients’ languages predict the likelihood of clinically significant mistakes or incomprehensible MT outputs from features of the English source text given as input to the MT systems. A multinomial Naïve Bayes (MNB) classifier was developed to provide intuitive probabilistic predictions of erroneous health translation outputs based on computational modelling of a small number of optimised features of the original English source texts. The best-performing MNB classifier, using a small number of optimised features (8), achieved a statistically significantly higher AUC (M = 0.760, SD = 0.03) than the classifier using high-dimension natural features (135) (M = 0.631, SD = 0.006, p < 0.0001, SE = 0.004) and the automatically optimised classifier (22) (M = 0.7231, SD = 0.0084, p < 0.0001, SE = 0.004). Furthermore, MNB (8) had statistically significantly higher sensitivity (M = 0.885, SD = 0.100) than the full-feature classifier (135) (M = 0.577, SD = 0.155, p < 0.0001, SE = 0.005) and the automatically optimised classifier (22) (M = 0.731, SD = 0.139, p < 0.0001, SE = 0.0023). Finally, MNB (8) reached statistically significantly higher specificity (M = 0.667, SD = 0.138) than the full-feature classifier (135) (M = 0.567, SD = 0.139, p = 0.0002, SE = 0.026) and the automatically optimised classifier (22) (M = 0.633, SD = 0.141, p = 0.0133, SE = 0.026).
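To make the idea of a multinomial Naïve Bayes classifier over a handful of count features more concrete, the following is a minimal pure-Python sketch. It is not the authors' implementation: the three features, the toy counts, the labels (0 = acceptable output, 1 = erroneous output), and the Laplace smoothing value `alpha=1.0` are all assumptions made for illustration.

```python
import math


def train_mnb(X, y, alpha=1.0):
    """Fit a multinomial Naive Bayes model on count features.

    X is a list of equal-length lists of non-negative counts,
    y is the list of class labels, alpha is the Laplace smoothing term.
    """
    classes = sorted(set(y))
    n_features = len(X[0])
    log_prior = {}
    log_likelihood = {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        log_prior[c] = math.log(len(rows) / len(X))
        # Total count of each feature within class c.
        totals = [sum(col) for col in zip(*rows)]
        denom = sum(totals) + alpha * n_features
        log_likelihood[c] = [math.log((t + alpha) / denom) for t in totals]
    return classes, log_prior, log_likelihood


def predict_proba(model, x):
    """Return a dict mapping each class to its posterior probability for x."""
    classes, log_prior, log_likelihood = model
    scores = [
        log_prior[c] + sum(n * w for n, w in zip(x, log_likelihood[c]))
        for c in classes
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return {c: e / total for c, e in zip(classes, exps)}


# Hypothetical toy data: three source-text feature counts per document.
X_train = [[3, 0, 1], [2, 1, 0], [0, 4, 2], [1, 3, 3]]
y_train = [0, 0, 1, 1]  # 0 = acceptable MT output, 1 = erroneous MT output

model = train_mnb(X_train, y_train)
probs = predict_proba(model, [0, 3, 2])
print(probs)
```

The returned probabilities are what make the classifier usable as a risk indicator: a user can act on the predicted probability of an erroneous translation rather than on a hard label.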
Background: Online mental health information is an important resource for people living with mental health issues. The suitability of mental health information for effective self-care remains understudied, despite the increasing need for more actionable mental health resources, especially among young people. Objective: We aimed to develop Bayesian machine learning classifiers as data-based decision aids for assessing the actionability of credible mental health information for people with mental health issues and diseases. Methods: We collected credible online health information on mental health issues and classified it into generic mental health (GEN) information and patient-specific (PAS) mental health information. GEN and PAS were both patient-oriented health resources developed by health authorities for mental health and public health promotion. GENs were non-classified online health information without indication of targeted readerships; PASs were developed purposefully for specific populations (young people, elderly people, pregnant women, and men) as indicated by their website labels. To ensure the generalisability of our model, we chose to develop a sparse Bayesian machine learning classifier using the Relevance Vector Machine (RVM). Results: Using optimisation and normalisation techniques, we developed a best-performing classifier through joint optimisation of natural language features and min-max normalisation of feature frequencies. The AUC (0.957), sensitivity (0.900), and specificity (0.953) of the best model were statistically significantly higher (p < 0.05) than those of other models using parallel optimisation of structural and semantic features with or without feature normalisation. We subsequently evaluated the diagnostic utility of our model in the clinic by comparing its positive (LR+) and negative (LR−) likelihood ratios and their 95% confidence intervals (95% C.I.) as we adjusted the probability threshold within the range of 0.1 to 0.9.
The best pair of LR+ (18.031, 95% C.I.: 10.992, 29.577) and LR− (0.100, 95% C.I.: 0.068, 0.148) was obtained when the probability threshold was set to 0.45, associated with a sensitivity of 0.905 (95% C.I.: 0.867, 0.942) and a specificity of 0.950 (95% C.I.: 0.925, 0.975). These statistical properties of our model suggested its applicability in the clinic. Conclusion: Our study found that PAS had a significant advantage over GEN mental health information regarding information actionability, engagement, and suitability for specific populations with distinct mental health issues. GEN is more suitable for general mental health information acquisition, whereas PAS can effectively engage patients and provide more effective and needed self-care support. The Bayesian machine learning classifier we developed provides an automatic tool to support decision making in the clinic by identifying more actionable resources that effectively support self-care among different populations.
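The likelihood ratios reported above follow from the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. The short sketch below applies those definitions to the sensitivity and specificity quoted at the 0.45 threshold; the function name is a hypothetical helper, and the confidence intervals are not reproduced because they require the underlying counts, which the abstract does not give.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from a sensitivity and specificity in (0, 1)."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Values quoted in the abstract for the probability threshold of 0.45.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.905, specificity=0.950)
print(f"LR+ = {lr_pos:.3f}, LR- = {lr_neg:.3f}")
```

With the rounded inputs this gives an LR+ of about 18.1 and an LR− of about 0.100. The small gap from the reported LR+ of 18.031 is presumably due to rounding of the published sensitivity and specificity.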
Background: Machine translation (MT) technologies have increasing applications in healthcare. Despite their convenience, cost-effectiveness, and constantly improving accuracy, research shows that the use of MT tools in medical or healthcare settings poses risks to vulnerable populations. Objectives: We aimed to develop machine learning classifiers (MNB and RVM) to forecast nuanced yet significant MT errors of clinical symptoms in Chinese neural MT outputs. Methods: We screened human translations of MSD Manuals for information on self-diagnosis of infectious diseases and produced their matching neural MT outputs for subsequent pairwise quality assessment by trained bilingual health researchers. Different feature optimisation and normalisation techniques were used to identify the best feature set. Results: The RVM classifier using optimised, normalised (L2 normalisation) semantic features achieved the highest sensitivity, specificity, AUC, and accuracy. MNB achieved similarly high performance using the same optimised semantic feature set. The best probability threshold of the best-performing RVM classifier was found at 0.6, with a very high positive likelihood ratio (LR+) of 27.82 (95% CI: 3.99, 193.76) and a low negative likelihood ratio (LR−) of 0.19 (95% CI: 0.08, 0.46). These values suggest the high diagnostic utility of our model for predicting the probability of erroneous MT of disease symptoms, which can help reverse potentially inaccurate self-diagnosis of diseases among vulnerable people without adequate medical knowledge or the ability to ascertain the reliability of MT outputs. Conclusion: Our study demonstrated the viability, flexibility, and efficiency of introducing machine learning models to help promote risk-aware use of MT technologies to achieve optimal, safer digital health outcomes for vulnerable people.
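One way to read the likelihood ratios above is through the usual odds form of Bayes' rule: post-test odds = pre-test odds × likelihood ratio. The sketch below is only an illustration of that arithmetic. The pre-test probability of 0.20 (the assumed share of MT outputs containing a clinically significant error) is a hypothetical value, not a figure from the study, and the function name is likewise an assumption.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a pre-test probability with a likelihood ratio via odds."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre_test_probability must lie strictly between 0 and 1")
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


PRE_TEST = 0.20  # hypothetical prior probability of an erroneous MT output
LR_POS = 27.82   # LR+ reported at the 0.6 threshold
LR_NEG = 0.19    # LR- reported at the 0.6 threshold

p_flagged = post_test_probability(PRE_TEST, LR_POS)  # classifier flags the output
p_cleared = post_test_probability(PRE_TEST, LR_NEG)  # classifier does not flag it
print(f"flagged: {p_flagged:.3f}, not flagged: {p_cleared:.3f}")
```

Under this assumed prior, a flagged output moves from a 20% to roughly an 87% probability of being erroneous, while an unflagged output drops to roughly 5%. A different prior would shift both figures.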
We aimed to develop a quantitative instrument to assist with the automatic evaluation of the actionability of mental healthcare information. We collected and classified two large sets of mental health information from certified mental health websites: generic and patient-specific mental healthcare information. We compared the performance of the optimised classifier with popular readability tools and non-optimised classifiers in predicting mental health information of high actionability for people with mental disorders. The sensitivity of the classifier using both semantic and structural features as variables was statistically significantly higher than that of the binary classifier using either semantic (p < 0.001) or structural features (p = 0.0010). The specificity of the optimised classifier was statistically significantly higher than that of the classifier using structural variables (p = 0.002) and the classifier using semantic variables (p = 0.001). Differences in specificity between the full-variable classifier and the optimised classifier were statistically insignificant (p = 0.687). These findings suggest that the optimised classifier, using as few as 19 semantic-structural variables, was the best-performing classifier. By combining insights from linguistics and statistical analyses, we effectively increased the interpretability and the diagnostic utility of the binary classifiers to guide the development and evaluation of the actionability and usability of mental healthcare information.