Background: Following visible successes on a wide range of predictive tasks, machine learning techniques are attracting substantial interest from medical researchers and clinicians. We address the need for capacity development in this area by providing a conceptual introduction to machine learning alongside a practical guide to developing and evaluating predictive algorithms using freely available open-source software and public domain data.

Methods: We demonstrate the use of machine learning techniques by developing three predictive models for cancer diagnosis using descriptions of nuclei sampled from breast masses. These algorithms include regularized General Linear Model regression (GLMs), Support Vector Machines (SVMs) with a radial basis function kernel, and single-layer Artificial Neural Networks. The publicly available dataset describing the breast mass samples (N=683) was randomly split into evaluation (n=456) and validation (n=227) samples. We trained algorithms on data from the evaluation sample before they were used to predict the diagnostic outcome in the validation dataset. We compared the predictions made on the validation dataset with the real-world diagnostic decisions to calculate the accuracy, sensitivity, and specificity of the three models. We explored the use of averaging and voting ensembles to improve predictive performance. We provide a step-by-step guide to developing algorithms using the open-source R statistical programming environment.

Results: The trained algorithms were able to classify cell nuclei with high accuracy (.94–.96), sensitivity (.97–.99), and specificity (.85–.94). Maximum accuracy (.96) and area under the curve (.97) were achieved using the SVM algorithm. Prediction performance increased marginally (accuracy = .97, sensitivity = .99, specificity = .95) when algorithms were arranged into a voting ensemble.

Conclusions: We use a straightforward example to demonstrate the theory and practice of machine learning for clinicians and medical researchers. The principles which we demonstrate here can be readily applied to other complex tasks including natural language processing and image recognition.

Electronic supplementary material: The online version of this article (10.1186/s12874-019-0681-4) contains supplementary material, which is available to authorized users.
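The evaluation metrics and the voting ensemble described in this abstract can be illustrated with a minimal sketch. The paper itself works in R; the Python below is only an illustration, not the authors' code. The function names (`metrics`, `majority_vote`) and the eight toy labels are hypothetical and chosen so that each model makes different mistakes. Label 1 is assumed to mean a positive (malignant) diagnosis.

```python
def metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of true positives detected
        "specificity": tn / (tn + fp),  # share of true negatives detected
    }


def majority_vote(*model_predictions):
    """Voting ensemble: predict 1 when more than half of the models predict 1."""
    return [
        1 if 2 * sum(votes) > len(votes) else 0
        for votes in zip(*model_predictions)
    ]


# Toy data (invented for illustration, not taken from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
glm_pred = [1, 1, 1, 0, 0, 0, 0, 1]
svm_pred = [1, 1, 0, 1, 0, 0, 1, 0]
ann_pred = [1, 0, 1, 1, 0, 1, 0, 0]

ensemble_pred = majority_vote(glm_pred, svm_pred, ann_pred)

for name, pred in [("GLM", glm_pred), ("SVM", svm_pred),
                   ("ANN", ann_pred), ("vote", ensemble_pred)]:
    print(name, metrics(y_true, pred))
```

In this toy case each individual model scores 0.75 on all three metrics, while the vote recovers every label because no two models err on the same sample. Real models tend to make correlated errors, which is consistent with the marginal gain the abstract reports.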
The lean blow-off (LBO) limits and structure of turbulent premixed flames were investigated with vapourised liquid fuels stabilized by a bluff-body burner. Ethanol, heptane, and two kerosenes were used. To facilitate comparisons with gaseous-fueled flames, results were also obtained from methane flames. The measured LBO limits indicate that, for this burner, the ethanol and heptane flames are more resilient to blow-off than the kerosene flames. Furthermore, a correlation based on a Damköhler number (Da), which is proportional to the laminar flame speed, does not successfully collapse the data for the different fuels, indicating that Da correlations based on laminar flame speed are not applicable. Average OH* chemiluminescence images of the ethanol and heptane flames are qualitatively similar to those from methane: the flame brushes of both exhibit an M-shape when close to blow-off. In contrast, the OH* signal in the kerosene flames is concentrated primarily in regions further downstream of the bluff body. Ultimately, the results of this effort highlight the influence that fuel type has on the LBO of bluff-body stabilized flames. Moreover, this work indicates that the LBO behavior of flames produced with complex hydrocarbon fuels cannot be fully understood via high-temperature chemistry concepts such as the laminar flame speed.

Keywords: turbulent premixed combustion, blow-off scaling, vapourised kerosene
This work analyses the existence of critical phenomena in MILD combustion systems through an exploration of classical results from high-energy asymptotics theory for extinction conditions of non-premixed flames and well-stirred reactors. Using an expression linking burning rate to Damköhler number, the criteria for a folded S-shaped curve, representative of a combustion system with sudden extinction and ignition behavior, were derived. This theory is discussed in detail, with particular focus on the limitations of the global chemistry it presents. The conditions reported by various previously published numerical and experimental investigations are then discussed in the context of this theory. Of these investigations, those with the highest level of preheat and dilution had monotonic rather than folded S-shaped curves, indicating a lack of sudden extinction phenomena. This suggests that MILD combustion systems are those which lack sudden ignition and extinction behavior, therefore exhibiting a smooth, stretched S-shaped curve rather than a folded one with inflection points. The results suggest that the distinction between folded and monotonic S-shaped curves may provide a useful alternative definition of MILD combustion.
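The difference between a folded and a monotonic S-shaped curve can be illustrated with a minimal sketch. It assumes a textbook well-stirred reactor with one-step, first-order Arrhenius chemistry, which is not necessarily the expression derived in the paper. Under that assumption the steady state satisfies Da = θ exp(T_a/T) / (1 − θ), where θ is the reduced temperature rise and T = T_in + θ ΔT. The curve is folded when Da is non-monotonic in θ. All names and numerical values below are hypothetical and chosen only for illustration.

```python
import math


def log_damkohler(theta, t_in, delta_t, t_act):
    """ln(Da) needed to hold the steady state at reduced temperature theta.

    Assumed model: Da = theta * exp(t_act / T) / (1 - theta),
    with T = t_in + theta * delta_t (temperatures in kelvin).
    """
    temp = t_in + theta * delta_t
    return math.log(theta) - math.log(1.0 - theta) + t_act / temp


def is_folded(t_in, delta_t, t_act, n=2000):
    """True if Da(theta) decreases anywhere on (0, 1), i.e. the curve folds back."""
    thetas = [(i + 1) / (n + 1) for i in range(n)]
    log_da = [log_damkohler(th, t_in, delta_t, t_act) for th in thetas]
    return any(later < earlier for earlier, later in zip(log_da, log_da[1:]))


# Hypothetical conditions: cold inlet with a large temperature rise,
# versus strong preheat and dilution with a small temperature rise.
conventional = is_folded(t_in=300.0, delta_t=1900.0, t_act=20000.0)
mild_like = is_folded(t_in=1300.0, delta_t=300.0, t_act=20000.0)

print("cold inlet, large temperature rise -> folded:", conventional)
print("high preheat, high dilution        -> folded:", mild_like)
```

With these illustrative values the cold-inlet case folds and the preheated, diluted case stays monotonic, mirroring the trend the abstract describes for the investigations with the highest preheat and dilution.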