This paper explores the use of texts that are related to an image collection, also known as collateral texts, for building thesauri in specialist domains to aid in image retrieval. Corpus linguistic and information extraction methods are used for identifying key terms and conceptual relationships in specialist texts that may be used for query expansion purposes. The specialist domain context imposes certain constraints on the language used in the texts, which makes the texts computationally more tractable. The effectiveness of such an approach is demonstrated through a prototype system that has been developed for the storage and retrieval of images and texts, applied in the forensic science domain.
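As a rough illustration of how a thesaurus derived from collateral texts might drive query expansion, the following minimal sketch expands a keyword query with related terms. The thesaurus entries, the relation names (`synonym`, `narrower`) and the function `expand_query` are hypothetical and chosen only for illustration; the paper's actual thesaurus structure and prototype system are not described here.

```python
# Hypothetical thesaurus: term -> relation -> related terms.
# The entries are illustrative only, not taken from the paper.
THESAURUS = {
    "fingerprint": {"synonym": ["fingermark"], "narrower": ["latent print"]},
    "footwear": {"synonym": ["shoe"], "narrower": ["shoeprint"]},
}


def expand_query(query, thesaurus, relations=("synonym", "narrower")):
    """Return the query terms followed by related thesaurus terms."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        for relation in relations:
            for related in thesaurus.get(term, {}).get(relation, []):
                # Avoid duplicates while preserving insertion order.
                if related not in expanded:
                    expanded.append(related)
    return expanded
```

Under these assumptions, a query for "fingerprint glass" would also match images whose collateral text mentions "fingermark" or "latent print", while the term "glass", which has no thesaurus entry, passes through unchanged.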
In this paper we explore distributing the training of self-organising maps (SOMs) over Grid middleware. We propose a two-level architecture and discuss an experimental methodology comprising ensembles of SOMs distributed over a Grid with periodic averaging of weights. The purpose of the experiments is to begin a systematic assessment of how far a distributed training regime can reduce overall training time, weighed against its impact on precision. Several issues are considered: (i) the optimum number of ensembles; (ii) the impact of different types of training data; and (iii) the appropriate period of averaging. The proposed architecture has been evaluated in a Grid environment, with clock-time performance recorded.
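The idea of training an ensemble of SOMs with periodic weight averaging can be sketched as follows. This is a minimal sketch under several assumptions that the abstract does not state: a 1-D map, a fixed learning rate and neighbourhood width, data split evenly across members, and a plain element-wise mean as the averaging step. The Grid workers are simulated sequentially in one process, and all function names are hypothetical.

```python
import math
import random


def train_epoch(weights, samples, lr, sigma):
    """One pass of online SOM training on a 1-D map; returns new weights."""
    weights = [w[:] for w in weights]  # copy so the caller's map is untouched
    for x in samples:
        # Best-matching unit: the map node closest to the sample.
        bmu = min(
            range(len(weights)),
            key=lambda i: sum((wi - xi) ** 2 for wi, xi in zip(weights[i], x)),
        )
        for i, w in enumerate(weights):
            # Gaussian neighbourhood measured along the 1-D map grid.
            h = math.exp(-((i - bmu) ** 2) / (2 * sigma ** 2))
            for d in range(len(w)):
                w[d] += lr * h * (x[d] - w[d])
    return weights


def average_weights(members):
    """Element-wise mean of the members' weight matrices."""
    n = len(members)
    return [
        [sum(m[i][d] for m in members) / n for d in range(len(members[0][i]))]
        for i in range(len(members[0]))
    ]


def train_ensemble(data, n_members, n_nodes, epochs, period, seed=0):
    """Train n_members SOMs on disjoint shards, averaging every `period` epochs."""
    rng = random.Random(seed)
    dim = len(data[0])
    shared = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    shards = [data[i::n_members] for i in range(n_members)]
    members = [shared] * n_members
    for epoch in range(1, epochs + 1):
        # Each member would run on its own Grid node; here they run in turn.
        members = [
            train_epoch(m, s, lr=0.5, sigma=1.0) for m, s in zip(members, shards)
        ]
        if epoch % period == 0:
            # Synchronisation point: every member restarts from the mean map.
            shared = average_weights(members)
            members = [shared] * n_members
    return average_weights(members)
```

In this sketch, `n_members` and `period` correspond to two of the issues listed in the abstract: the ensemble size and the averaging period. A smaller `period` means more frequent synchronisation, which on a real Grid would cost more communication per epoch.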