Proceedings of the 11th International Conference on Enterprise Information 2009
DOI: 10.5220/0001982901300137

Term Weighting: Novel Fuzzy Logic Based Method vs. Classical Tf-Idf Method for Web Information Extraction

Abstract: Solving the Term Weighting problem is one of the most important tasks in Information Retrieval and Information Extraction. Typically, the TF-IDF method has been widely used to determine the weight of a term. In this paper, we propose a novel alternative method based on fuzzy logic. The main advantage of the proposed method is that it obtains better results, especially in extracting not only the most suitable information but also related information. This method will be used for the design of a Web Int…
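
For orientation, the classical TF-IDF baseline named in the abstract can be sketched as follows. This is a minimal sketch assuming one common variant (term frequency normalized by document length, natural-log inverse document frequency); the function name `tf_idf` and the toy documents are illustrative assumptions, not taken from the paper, and the paper's fuzzy-logic method is not reproduced here.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, for illustration only.
docs = [
    "fuzzy logic term weighting".split(),
    "classical term weighting".split(),
    "web information extraction".split(),
]
weights = tf_idf(docs)
```

In this toy corpus, a term that occurs in only one document ("fuzzy") receives a higher weight than one shared by two documents ("term"), and a term present in every document would receive weight zero under this variant.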

Cited by 5 publications
(6 citation statements)
References 12 publications
“…Following, the basic formula TF * IDF (Bounabi et al., 2017) has known the additional factors, e.g. the length of document N. for normalization reasons and to adjust the formula according to the case study (Ropero et al., 2009). Different versions of TF-IDF were adopted in the following, and the probabilistic TF-IDF was mentioned in several works (Bounabi et al., 2018; Aizawa, 2003), which marked a significant improvement in the text classification field.…”
Section: Related Work | Citation type: mentioning
confidence: 99%
“…Information extraction (IE) is the task of transformation a document collection into easier to analyze information [2], it tries to get relevant facts from documents. Whereas, Information retrieval (IR) deals with the representation, storage and organization of, and access to information items [3].…”
Section: Relate Work | Citation type: mentioning
confidence: 99%
“…In particular, a considerable amount of research has been devoted to discriminating between the representativeness of keywords, which represents the importance of the extracted keyword, thereby attaching a weight to the keyword [5–8]. The use of keywords with weight led to association rule mining being performed [9–11]; moreover, some research was performed to compare a fuzzy logic-based weight technique and TF-IDF (Term Frequency–Inverse Document Frequency) [12–15]. In addition, other research applied the clustering method with fuzzy weighting to symbolic interval data [16].…”
Section: Related Work | Citation type: mentioning
confidence: 99%