This paper describes the participation of the LIG lab in the batch filtering task of the INFILE (INformation FILtering Evaluation) campaign of CLEF 2009. In the online task, the server provides the documents one by one; in the batch task, all documents are provided beforehand, so no feedback is available. We propose a batch algorithm that learns category-specific thresholds in a multiclass setting where a document can belong to more than one class. The algorithm uses the k-nearest neighbor algorithm to filter the 100,000 documents into 50 topics. The experiments were run on the English corpus and yielded a precision of 0.256 and a recall of 0.295. We previously participated in the online task of INFILE 2008, where we used an online algorithm that relied on feedback from the server.
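As an illustration of the general idea of k-nearest-neighbor filtering with one threshold per topic, a minimal sketch follows. It is not the system evaluated in the campaign: the sparse term-weight representation, the cosine similarity, the scoring rule (mean similarity to the k closest topic examples), the F1-based threshold selection on a small labelled held-out set, and all function names are assumptions made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def knn_score(doc, examples, k):
    """Mean similarity of doc to its k most similar examples of one topic."""
    sims = sorted((cosine(doc, e) for e in examples), reverse=True)[:k]
    return sum(sims) / len(sims) if sims else 0.0


def learn_threshold(scores, labels):
    """Pick the score cut-off that maximises F1 on labelled held-out data.

    Assumption: some labelled documents exist for each topic. If no
    cut-off yields a true positive, the threshold stays at infinity and
    the topic is never assigned.
    """
    best_t, best_f1 = float("inf"), 0.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t


def filter_batch(docs, topic_examples, thresholds, k=3):
    """Assign each document to every topic whose threshold it reaches.

    A document may receive zero, one, or several topics.
    """
    return [
        {topic for topic, examples in topic_examples.items()
         if knn_score(doc, examples, k) >= thresholds[topic]}
        for doc in docs
    ]
```

Because each topic is decided independently against its own threshold, a document can be assigned to several topics or to none, which matches the setting in which a document can belong to more than one class.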