2020
DOI: 10.1177/1176935120917955

Preprocessing Breast Cancer Data to Improve the Data Quality, Diagnosis Procedure, and Medical Care Services

Abstract: In recent years, due to an increase in the incidence of different cancers, various data sources are available in this field. Consequently, many researchers have become interested in the discovery of useful knowledge from available data to assist faster decision-making by doctors and reduce the negative consequences of such diseases. Data mining includes a set of useful techniques in the discovery of knowledge from the data: detecting hidden patterns and finding unknown relations. However, these techniques face…

Cited by 14 publications (6 citation statements)
References 26 publications
“…Consequently, data preprocessing before applying the prediction process is inevitable to improve data quality. Moreover, data preprocessing is a stage in the knowledge discovery process that may require about 60% to 90% of the time necessary for knowledge discovery and contribute to 75% to 90% of the success of data mining cases [7]. Missing values may be expressed in the data as NaNs, blanks, undefined, or nulls.…”
Section: Data Preprocessing
Citation type: mentioning
confidence: 99%
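
The statement above lists several ways a missing value can appear in raw data. A minimal sketch of collapsing them into a single marker before further preprocessing might look like the following. It is not taken from the cited paper: the token set `MISSING_TOKENS` and the helper names `is_missing`, `normalize_missing`, and `missing_counts` are hypothetical and would need adjusting for a real dataset.

```python
import math

# Hypothetical set of string markers treated as missing (compared after
# stripping whitespace and lowercasing); extend it for a specific dataset.
MISSING_TOKENS = {"", "nan", "null", "undefined"}


def is_missing(value):
    """Return True if a raw cell value should be treated as missing."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    if isinstance(value, str):
        return value.strip().lower() in MISSING_TOKENS
    return False


def normalize_missing(rows):
    """Replace every missing-looking cell with None so later steps see one marker."""
    return [[None if is_missing(cell) else cell for cell in row] for row in rows]


def missing_counts(rows):
    """Count missing cells per column position."""
    width = max((len(row) for row in rows), default=0)
    counts = [0] * width
    for row in rows:
        for i, cell in enumerate(row):
            if is_missing(cell):
                counts[i] += 1
    return counts
```

Counting missing cells per column first is one way to decide whether a column is worth imputing or should be dropped; that choice depends on the dataset and is not prescribed here.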
“…where f(x) is the objective function, n is the total number of data samples, ∇f(x) is the gradient of the objective function, and ∇f_i(x) is the gradient computed on a limited number of randomly selected data samples. Once the gradients are computed for the selected data samples, the weights are updated using the following equation [32]:…”
Section: Stochastic Gradient Descent (SGD)
Citation type: mentioning
confidence: 99%
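
The excerpt above cuts off before the update equation, so the cited paper's exact formula is not reproduced here. As a rough illustration of the general idea only, the sketch below applies the textbook per-sample SGD update to a one-variable least-squares model. The function name `sgd_linear_regression`, the squared-error objective, and the default learning rate and epoch count are all assumptions made for this example.

```python
import random


def sgd_linear_regression(xs, ys, lr=0.05, epochs=500, seed=0):
    """Fit y ~ w * x + b by stochastic gradient descent, one sample per step.

    Assumed per-sample objective: f_i(w, b) = 0.5 * (w * x_i + b - y_i) ** 2.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in a fresh random order
        for i in indices:
            error = (w * xs[i] + b) - ys[i]
            # Gradient of f_i with respect to (w, b) is (error * x_i, error);
            # step against it, scaled by the learning rate.
            w -= lr * error * xs[i]
            b -= lr * error
    return w, b
```

On noise-free data such as `xs = [0, 1, 2, 3, 4]` and `ys = [1, 3, 5, 7, 9]`, the returned pair should approach a slope of 2 and an intercept of 1. A learning rate that is too large for the scale of the inputs would make the updates diverge.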
“…SMO is used for solving quadratic programming problems. It is used to train the SVM classifier using a Gaussian or polynomial kernel [32]. It converts the attributes or features into binary values and also replaces missing values in the input features.…”
Section: Sequential Minimal Optimization (SMO)
Citation type: mentioning
confidence: 99%
“…They make it possible for large numbers of professionals who agree to participate to contribute anonymised epidemiological data collected from their patients to a systematised database that organises, hierarchises and facilitates analysis of that data. Systematising data entry also makes it possible to determine the reliability of data and assess biases and the feasibility of obtaining relevant evidence [3, 4].…”
Section: Introduction
Citation type: mentioning
confidence: 99%