2019
DOI: 10.22266/ijies2019.0831.23

A Novel Technique Using Multiple K-Shingling Based Weighted Dissimilarity Score for Web Content Outlier Mining

Abstract: The technological evolution of the Internet and the Web, along with their many applications, leads to the problem of redundancy, as documents are unanimously forwarded and stored on several servers and platforms. Recently, not only duplicate documents but also near-duplicate documents have affected the performance of search results. The main objective of this paper is to provide significant documents by eliminating the redundant and near-redundant documents present in web search results. The proposed model compri…

Cited by 1 publication (2 citation statements)
References 17 publications
“…Figure Followed by SURF local feature matching, pair of blocks containing matched SURF feature points is obtained and CNN features are extracted for corresponding blocks. In order to extract CNN features for the blocks, we used VGG19 [3], a popular pre-trained neural network model. It is currently the most preferred choice in the community for extracting features from images.…”
Section: Proposed Approach (mentioning)
confidence: 99%
“…Many duplicate or near duplicate documents exist on web. Search results are affected due to such redundancies exists among documents on the web [3]. In some cases focus may be to Matching neighbour region helps us to consider the portion of image that does not contain any local features.…”
Section: Introduction (mentioning)
confidence: 99%