Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval 2017
DOI: 10.1145/3077136.3080816
Variational Deep Semantic Hashing for Text Documents

Abstract: As the amount of textual data has been rapidly increasing over the past decade, efficient similarity search methods have become a crucial component of large-scale information retrieval systems. A popular strategy is to represent original data samples by compact binary codes through hashing. A spectrum of machine learning methods have been utilized, but they often lack expressiveness and flexibility in modeling to learn effective representations. The recent advances of deep learning in a wide range of applications…
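As a rough illustration of the hashing strategy the abstract describes (not the paper's actual variational model), a common baseline heuristic binarizes a real-valued document embedding by thresholding each dimension at its corpus-wide median. The embeddings below are random placeholders; only the thresholding idea is the point:

```python
import numpy as np

def to_binary_code(embeddings: np.ndarray) -> np.ndarray:
    """Binarize real-valued embeddings into compact codes by
    thresholding each dimension at its median across the corpus
    (a simple heuristic, not the paper's learned model)."""
    thresholds = np.median(embeddings, axis=0)
    return (embeddings > thresholds).astype(np.uint8)

# Toy corpus: 4 documents, each with an 8-dimensional embedding.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(4, 8))
codes = to_binary_code(embeddings)
print(codes.shape)  # (4, 8) -- one 8-bit binary code per document
```

Median thresholding balances the bits (roughly half the corpus gets a 1 in each dimension), which is why it is a common baseline before learned hashing models.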

Cited by 60 publications (108 citation statements) | References 27 publications
“…This phenomenon is especially obvious for VDSH, in which the precisions on all three datasets drop by a significant margin. This interesting phenomenon has been reported in previous works (Chaidaroon and Fang, 2017; Wang et al., 2013; Liu et al., 2012), and the reason could be overfitting, since a model with long hashing codes is more likely to overfit (Chaidaroon and Fang, 2017). However, it can be seen that our model is more robust to the number of hashing bits.…”
Section: Performance Evaluation of Unsupervised (supporting)
confidence: 82%
“…Let x ∈ Z₊^{|V|} denote the bag-of-words representation of a document and x_i ∈ {0, 1}^{|V|} denote the one-hot vector representation of the i-th word of the document, where |V| denotes the vocabulary size. VDSH (Chaidaroon and Fang, 2017) proposed to model a document D, which is defined by a sequence of one-hot word representations…”
Section: Preliminaries on Generative Semantic Hashing (mentioning)
confidence: 99%
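The two representations in the quoted passage can be made concrete with a small sketch; the toy vocabulary and document below are illustrative assumptions, not from the paper:

```python
import numpy as np

vocab = ["deep", "hashing", "text", "semantic"]  # toy vocabulary V
word_to_idx = {w: i for i, w in enumerate(vocab)}

def bag_of_words(doc):
    """x in Z_+^{|V|}: count of each vocabulary word in the document."""
    x = np.zeros(len(vocab), dtype=np.int64)
    for w in doc:
        x[word_to_idx[w]] += 1
    return x

def one_hot(word):
    """x_i in {0,1}^{|V|}: indicator vector for a single word."""
    v = np.zeros(len(vocab), dtype=np.int64)
    v[word_to_idx[word]] = 1
    return v

doc = ["deep", "hashing", "deep"]
print(bag_of_words(doc))   # [2 1 0 0]
print(one_hot("hashing"))  # [0 1 0 0]
```

The bag-of-words vector discards word order (it only counts), while the sequence of one-hot vectors preserves it — which is why generative models over documents can be defined over either form.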
“…This can be done using techniques similar to Latent Semantic Indexing [33], spectral clustering [29], or two-step approaches that first create an optimal encoding and then train a classifier to predict it [34]. Recent work has focused on deep learning based methods [4,5,23] to create a generative document model. However, none of these methods directly model the end goal of providing an effective similarity search, i.e., being able to accurately rank documents based on their hash codes; they focus solely on generating document representations.…”
Section: Introduction (mentioning)
confidence: 99%
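The "rank documents based on their hash codes" criterion in the quote above reduces to sorting candidates by Hamming distance to the query's code. A minimal sketch, with made-up 4-bit codes:

```python
import numpy as np

def hamming_rank(query_code: np.ndarray, db_codes: np.ndarray):
    """Rank database documents by Hamming distance (number of
    differing bits) to the query's binary code, closest first."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable"), dists

query = np.array([1, 0, 1, 1], dtype=np.uint8)
db = np.array([[1, 0, 1, 0],   # distance 1
               [0, 1, 0, 0],   # distance 4
               [1, 0, 1, 1]],  # distance 0
              dtype=np.uint8)

order, dists = hamming_rank(query, db)
print(order.tolist())  # [2, 0, 1] -- document 2 is the exact match
print(dists.tolist())  # [1, 4, 0]
```

Because Hamming distance on packed bit codes can be computed with XOR and popcount, this ranking is far cheaper than cosine similarity over dense vectors, which is the efficiency argument behind hashing-based retrieval.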