Summary
Standard information retrieval systems extract documents mainly by keyword matching. This approach is often ineffective, because related information can only be identified by extracting the semantics present in the text. To overcome this problem, this article formulates a semantic‐based query retrieval model for query processing, identification, extraction, expansion, sorting, and filtering. A keyword expansion approach using the hybrid spatial bound whale optimization algorithm‐binary moth flame optimization algorithm is proposed, which overcomes the complexity associated with insufficient query information. The efficiency of the proposed model in extracting the top‐k relevant information is evaluated through experiments conducted on TREC data using performance metrics such as precision@k, recall@k, mean reciprocal rank@k, mean average precision, and normalized discounted cumulative gain (NDCG@k). The proposed model achieves improved outcomes compared with state‐of‐the‐art techniques such as i‐Dataquest, fuzzy logic, ontological framework for information extraction, and personal knowledge management in terms of precision@k, recall@k, mean reciprocal rank@k, exact match, F1‐score, and NDCG@k. The proposed model gives mean reciprocal rank@k scores of 0.8901, 0.8947, 0.8958, and 0.9014 for k = 1, 3, 5, and 10, respectively. The MAP@k score for the top‐10 retrieved result suggestions is 0.39056. The exact match scores of the proposed model on the newsgroup, SQuAD 1.0, and SQuAD 2.0 datasets are 93.25, 94.65, and 95.25, respectively.
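For readers unfamiliar with the ranking metrics named above, the following is a minimal sketch of their common textbook definitions for a single query. It is not the authors' evaluation code: the function names, the binary-relevance assumption for precision, recall, reciprocal rank, and average precision, and the choice to normalize average precision by min(number of relevant documents, k) are all assumptions made here for illustration. Mean reciprocal rank@k and MAP@k are then the averages of the per-query values over all queries.

```python
import math


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    return sum(1 for doc in ranked[:k] if doc in relevant) / len(relevant)


def reciprocal_rank_at_k(ranked, relevant, k):
    """1 / rank of the first relevant document in the top k, else 0."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def average_precision_at_k(ranked, relevant, k):
    """Average of precision values at each relevant hit in the top k.

    Assumption: normalized by min(len(relevant), k); other conventions exist.
    """
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    denom = min(len(relevant), k)
    return total / denom if denom else 0.0


def ndcg_at_k(ranked, gains, k):
    """NDCG@k with graded gains given as a {doc_id: gain} mapping."""
    dcg = sum(
        gains.get(doc, 0) / math.log2(rank + 1)
        for rank, doc in enumerate(ranked[:k], start=1)
    )
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg else 0.0


def mean_over_queries(per_query_scores):
    """Average a per-query metric (e.g. to obtain MRR@k or MAP@k)."""
    scores = list(per_query_scores)
    return sum(scores) / len(scores) if scores else 0.0
```

As a usage illustration with hypothetical document identifiers, a ranking `["d3", "d1", "d7", "d2", "d5"]` with relevant set `{"d1", "d2", "d9"}` has its first relevant document at rank 2, so its reciprocal rank@3 is 0.5 and its reciprocal rank@1 is 0.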