The number of Web users accessing information over the Internet is increasing day by day. A huge amount of information is available on the Internet in different languages and can be accessed by anybody at any time. Information Retrieval (IR) deals with finding useful information in a large collection of unstructured, structured, and semi-structured data. IR can be classified into monolingual information retrieval, cross-language information retrieval (CLIR), and multilingual information retrieval (MLIR), among others. At present, the diversity of information and language barriers are serious obstacles to communication and cultural exchange across the world. To overcome such barriers, CLIR systems are now in strong demand. CLIR refers to information retrieval activities in which the query and the documents may appear in different languages. This paper gives an overview of the new application areas of CLIR and reviews the approaches used in CLIR research for query and document translation. Further, based on the available literature, a number of challenges and issues in CLIR are identified and discussed.
Abstract: Cross-Language Information Retrieval (CLIR) is an in-demand research area of Information Retrieval (IR) that deals with retrieving documents written in a language different from the query language. In CLIR, translation is an important activity for retrieving relevant results; its goal is to translate the query or the documents from one language into another. Correct translation of the query is essential because an incorrect translation may reduce the relevance of the retrieved results. The purpose of this paper is to compute the accuracy of query translation using back translation for a Hindi-English CLIR system. For the experimental analysis, we used the FIRE-2011 dataset to select Hindi queries. Our analysis shows that back translation can be effective in improving the accuracy of query translation for the three translators analyzed (Google, Microsoft, and Babylon). Google was found to perform best for this purpose.
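The back-translation check described in this abstract could be sketched roughly as follows. The abstract does not state how the original and back-translated queries are compared, so the token-overlap measure here is an assumption, not the authors' metric. The names `Translator`, `token_overlap`, `back_translation_score`, and `toy_translate` are hypothetical, and the toy word-for-word lexicon merely stands in for a real translation service; no actual Google, Microsoft, or Babylon API is called.

```python
from typing import Callable

# Hypothetical translator interface: (text, source_lang, target_lang) -> text.
Translator = Callable[[str, str, str], str]


def token_overlap(original: str, round_trip: str) -> float:
    """Fraction of the original query's distinct tokens that survive the round trip.

    Assumption: whitespace tokenization and exact token match.
    """
    original_tokens = set(original.split())
    if not original_tokens:
        return 0.0
    return len(original_tokens & set(round_trip.split())) / len(original_tokens)


def back_translation_score(query: str, translate: Translator,
                           src: str = "hi", tgt: str = "en") -> float:
    """Translate src -> tgt -> src and compare the result with the original query."""
    forward = translate(query, src, tgt)
    round_trip = translate(forward, tgt, src)
    return token_overlap(query, round_trip)


# Toy word-for-word lexicons, used only to make the sketch runnable.
HI_EN = {"भारत": "india", "चुनाव": "election", "परिणाम": "result"}
# "result" maps back to a synonym, so the round trip is deliberately lossy.
EN_HI = {"india": "भारत", "election": "चुनाव", "result": "नतीजा"}

# Sample query built from the lexicon keys, in insertion order.
SAMPLE_QUERY = " ".join(HI_EN)


def toy_translate(text: str, src: str, tgt: str) -> str:
    """Stand-in translator: dictionary lookup per token, unknown tokens kept as-is."""
    table = HI_EN if (src, tgt) == ("hi", "en") else EN_HI
    return " ".join(table.get(token, token) for token in text.split())
```

Calling `back_translation_score(SAMPLE_QUERY, toy_translate)` scores one translator on one query. Comparing several translators would amount to averaging this score over the selected queries for each translator, again under the assumed overlap measure.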