Abstract-A new approach to support vector machine (SVM) classification is proposed wherein each of two data sets are proximal to one of two distinct planes that are not parallel to each other. Each plane is generated such that it is closest to one of the two data sets and as far as possible from the other data set. Each of the two nonparallel proximal planes is obtained by a single MATLAB command as the eigenvector corresponding to a smallest eigenvalue of a generalized eigenvalue problem. Classification by proximity to two distinct nonlinear surfaces generated by a nonlinear kernel also leads to two simple generalized eigenvalue problems. The effectiveness of the proposed method is demonstrated by tests on simple examples as well as on a number of public data sets. These examples show the advantages of the proposed approach in both computation time and test set correctness.
Two medical applications of linear programming are described in this paper. Specifically, linear programming-based machine learning techniques are used to increase the accuracy and objectivity of breast cancer diagnosis and prognosis. The first application to breast cancer diagnosis utilizes characteristics of individual cells, obtained from a minimally invasive fine needle aspirate, to discriminate benign from malignant breast lumps. This allows an accurate diagnosis without the need for a surgical biopsy. The diagnostic system in current operation at University of Wisconsin Hospitals was trained on samples from 569 patients and has had 100% chronological correctness in diagnosing 131 subsequent patients. The second application, recently put into clinical practice, is a method that constructs a surface that predicts when breast cancer is likely to recur in patients that have had their cancers excised. This gives the physician and the patient better information with which to plan treatment, and may eliminate the need for a prognostic surgical procedure. The novel feature of the predictive approach is the ability to handle cases for which cancer has not recurred (censored data) as well as cases for which cancer has recurred at a specific time. The prognostic system has an expected error of 13.9 to 18.3 months, which is better than prognosis correctness by other available techniques.
Multisurface pattern separation is a mathematical method for distinguishing between elements of two pattern sets. Each element of the pattern sets is comprised of various scalar observations. In this paper, we use the diagnosis of breast cytology to demonstrate the applicability of this method to medical diagnosis and decision making. Each of 11 cytological characteristics of breast fine-needle aspirates reported to differ between benign and malignant samples was graded 1 to 10 at the time of sample collection. Nine characteristics were found to differ significantly between benign and malignant samples. Mathematically, these values for each sample were represented by a point in a nine-dimensional space of real variables. Benign points were separated from malignant ones by planes determined by linear programming. Correct separation was accomplished in 369 of 370 samples (201 benign and 169 malignant). In the one misclassified malignant case, the fine-needle aspirate cytology was so definitely benign and the cytology of the excised cancer so definitely malignant that we believe the tumor was missed on aspiration. Our mathematical method is applicable to other medical diagnostic and decisionmaking problems.
A new approach to support vector machine (SVM) classification is proposed wherein each of two data sets are proximal to one of two distinct planes that are not parallel to each other. Each plane is generated such that it is closest to one of the two data sets and as far as possible from the other data set. Each of the two nonparallel proximal planes is obtained by a single MATLAB command as the eigenvector corresponding to a smallest eigenvalue of a generalized eigenvalue problem. Classification by proximity to two distinct nonlinear surfaces generated by a nonlinear kernel also leads to two simple generalized eigenvalue problems. The effectiveness of the proposed method is demonstrated by tests on simple examples as well as on a number of public data sets. These examples show the advantages of the proposed approach in both computation time and test set correctness.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.