2017
DOI: 10.1007/978-3-319-58965-7_19

On Feature Weighting and Selection for Medical Document Classification

Cited by 19 publications (17 citation statements)
References 28 publications
“…It has long been recognized that clinical reports are beneficial for secondary use. A number of researchers has deployed clinical data mining to mine useful information (such as medical concepts or medical entity) from clinical reports [27]. Various applications of clinical data mining include clinical information extraction [28], clinical relation extraction [29], clinical document clustering [30] and classification of clinical documents [7].…”
Section: Results (mentioning)
confidence: 99%
“…Another challenge is the dataset imbalance especially when trying to encode all 4-character codes as finding sufficient representatives for all classes is very difficult. Applying extreme text classification approaches to this task like splitting feature spaces or compressing label dimension helps reduce the imbalance effects [18].…”
Section: Related Work (mentioning)
confidence: 99%
“…The text classification corpus consists of 345,591 diagnoses and their corresponding ICD-10 codes. We apply data pre-processing on the diagnosis text by stemming using bulstempy and removing the stop words using the BTB list. We split the dataset using stratification and assign 80% for training, 10% for hyper-parameter tuning and 10% for testing using scikit-learn library.…”
Section: Fine-tuning BERT for ICD-10 Classification Task (mentioning)
confidence: 99%
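The stratified 80/10/10 split described in the statement above can be sketched as follows. This is a minimal illustration, not the cited authors' code: the function name `stratified_split`, the toy diagnoses, the two example labels, and the random seed are all assumptions. Only the split proportions, the use of stratification, and the use of scikit-learn come from the quoted text.

```python
from collections import Counter

from sklearn.model_selection import train_test_split


def stratified_split(texts, labels, seed=0):
    """Split into 80% train, 10% tuning, 10% test, preserving label ratios.

    Hypothetical helper for illustration; not taken from the cited paper.
    """
    # First cut: hold out 20% of the data, stratified by label.
    x_train, x_rest, y_train, y_rest = train_test_split(
        texts, labels, test_size=0.2, stratify=labels, random_state=seed
    )
    # Second cut: halve the held-out part into tuning and test sets.
    x_val, x_test, y_val, y_test = train_test_split(
        x_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=seed
    )
    return (x_train, y_train), (x_val, y_val), (x_test, y_test)


# Toy data (assumed): 100 placeholder diagnoses with two ICD-10-like labels.
texts = [f"diagnosis {i}" for i in range(100)]
labels = ["A00"] * 60 + ["B00"] * 40

train, val, test = stratified_split(texts, labels)
print(len(train[0]), len(val[0]), len(test[0]))
print(Counter(train[1]), Counter(val[1]), Counter(test[1]))
```

Stratification requires every class to have enough members to appear in each part, so very rare codes would need to be merged or handled separately before a split like this one.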
“…Generally speaking the effectiveness of the classifier fully depends on the characteristics of the data to be categorized. Application of classification varies across different dimensions in the arena of engineering and science such as: text categorization [1], biological classification [2], natural language processing [3], document classification [4], internet search engine [5], pattern recognition [6], medical imaging [7], handwritten character recognition [8], micro-array classification [9], voice classification [10], gene expression classification [11]. Major classifiers available in the literature are Bayesian classifier [12], Support vector machine [13], K-nearest neighbor [14], Decision tree [15] and Artificial Neural Network [16].…”
Section: Introduction (mentioning)
confidence: 99%