Maximum Inner Product Search (MIPS) plays an essential role in many applications ranging from information retrieval, recommender systems to natural language processing. However, exhaustive MIPS is often expensive and impractical when there are a large number of candidate items.
The state-of-the-art quantization method of approximated MIPS is product quantization with a score-aware loss, developed by assuming that queries are uniformly distributed in the unit sphere. However, in real-world datasets, the above assumption about queries does not necessarily hold.
To this end, we propose a quantization method based on the distribution of queries combined with sampled softmax.
Further, we introduce a general framework encompassing the proposed method and multiple quantization methods, and we develop an effective optimization for the proposed general framework. The proposed method is evaluated on three real-world datasets. The experimental results show that it outperforms the state-of-the-art baselines.