In recent years, the number of complex documents and texts has grown exponentially, requiring a deeper understanding of machine learning methods to classify texts accurately in many applications. Many machine learning approaches have achieved remarkable results in natural language processing. The success of these learning algorithms relies on their capacity to capture complex, non-linear relationships within data. However, finding suitable structures, architectures, and techniques for text classification remains a challenge for researchers. In this paper, a brief overview of text classification algorithms is presented. This overview covers different text feature extraction methods, dimensionality reduction methods, existing algorithms and techniques, and evaluation methods. Finally, the limitations of each technique and their applications to real-world problems are discussed.

Spelling Correction

Spelling correction is an optional pre-processing step. Typos (short for typographical errors) are commonly present in texts and documents, especially in social media data sets (e.g., Twitter). Many algorithms, techniques, and methods have addressed this problem in NLP [49]. Techniques available to researchers include hashing-based and context-sensitive spelling correction [50], as well as spelling correction using a Trie and the Damerau-Levenshtein distance bigram [51].
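As an illustration of distance-based correction, the sketch below implements the restricted Damerau-Levenshtein (optimal string alignment) distance and a naive corrector that scans a vocabulary for the closest word. The `correct` helper and its vocabulary are illustrative assumptions, not a method from the cited works:

```python
def osa_distance(a, b):
    # Restricted Damerau-Levenshtein (optimal string alignment) distance:
    # allowed edits are insertion, deletion, substitution, and
    # transposition of two adjacent characters.
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
            # adjacent transposition (e.g., "teh" -> "the")
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[len(a)][len(b)]

def correct(word, vocab):
    # Naive candidate scan: return the vocabulary word with the
    # smallest edit distance to the (possibly misspelled) input.
    return min(vocab, key=lambda v: osa_distance(word, v))
```

A Trie, as in [51], would replace the linear scan over the vocabulary with a pruned prefix-tree traversal; the distance computation itself stays the same.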
Stemming

In NLP, one word can appear in different forms (e.g., singular and plural noun forms) while the semantic meaning of each form is the same [52]. One method for consolidating different forms of a word into the same feature space is stemming. Text stemming reduces variant word forms produced by linguistic processes such as affixation (the addition of affixes) to a common stem [53,54]. For example, the stem of the word "studying" is "study".
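A toy suffix-stripping stemmer can make the idea concrete. The suffix rules below are illustrative assumptions, not the Porter algorithm or any published stemmer:

```python
def simple_stem(word):
    # Minimal suffix-stripping sketch: strip a few common English
    # suffixes, keeping at least a 3-character stem. "ying"/"ies"
    # are restored to "y" so that "studying" -> "study".
    for suffix in ("ying", "ing", "ies", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            if suffix in ("ying", "ies"):
                stem += "y"
            return stem
    return word
```

Production stemmers (e.g., Porter's) use many more rules and may produce stems that are not dictionary words ("studi" rather than "study").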
Lemmatization

Lemmatization is an NLP process that replaces the suffix of a word with a different one, or removes the suffix entirely, to obtain the basic word form (lemma) [54,55,56].
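Unlike stemming, lemmatization maps a word to a valid dictionary form, which in practice requires a lexicon. The sketch below uses a tiny hand-built lookup table as a stand-in for such a lexicon; the table entries are illustrative assumptions:

```python
# Toy lemma dictionary standing in for a real lexicon (e.g., WordNet).
LEMMA_TABLE = {
    "went": "go",       # irregular verb
    "better": "good",   # irregular comparative
    "studies": "study",
    "mice": "mouse",    # irregular plural
}

def lemmatize(word):
    # Look the word up in the lemma table; unknown words pass through.
    return LEMMA_TABLE.get(word.lower(), word)
```

Note that forms like "went" -> "go" cannot be recovered by suffix stripping alone, which is exactly where lemmatization differs from stemming.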
Syntactic Word Representation

Many researchers have worked on text feature extraction techniques to address the loss of syntactic and semantic relations between words. Novel techniques have been proposed for solving this problem, but many of them still have limitations. In [57], a model was introduced that demonstrates the usefulness of including syntactic and semantic knowledge in the text representation for the selection of sentences from technical genomic texts. Another solution to the syntactic problem is to use the n-gram technique for feature extraction.
N-Gram

The n-gram technique extracts sequences of n words that occur "in that order" in a text. An n-gram is not by itself a representation of a text, but it can be used as a feature to represent a text.

BOW is a representation of a text using its words (1-grams) that loses their order (syntax). This model is very easy to obtain, and the text can be represented through a vector, generally of a manageable size. On the ...
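The two representations above can be sketched in a few lines: n-grams keep local word order inside each window, while the BOW counter discards order entirely and keeps only frequencies. The function names are illustrative:

```python
from collections import Counter

def ngrams(tokens, n):
    # All contiguous n-word sequences; order is preserved within each n-gram.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bag_of_words(tokens):
    # 1-gram counts: word order is lost, only frequencies remain.
    return Counter(tokens)

tokens = "the cat sat on the mat".split()
bigrams = ngrams(tokens, 2)   # e.g., ('the', 'cat'), ('cat', 'sat'), ...
bow = bag_of_words(tokens)    # e.g., 'the' occurs twice
```

Note that "the cat sat" and "sat the cat" yield the same BOW vector but different bigram sets, which is why n-grams partially recover the syntactic information that BOW loses.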