The paper describes the ideas and assumptions underlying the development of a new method for the evaluation and testing of interactive information retrieval (IR) systems, and reports on the initial tests of the proposed method. The method is designed to collect different types of empirical data, i.e., cognitive data as well as traditional systems performance data. It is based on the novel concept of a 'simulated work task situation', or scenario, and the involvement of real end users. The method further rests on a mixture of simulated and real information needs, and involves a group of test persons as well as assessments made by individual panel members. The relevance assessments are made with reference to the concepts of topical as well as situational relevance. The method takes into account the dynamic nature of information needs, which are assumed to develop over time for the same user, a variability presumed to be strongly connected to the processes of relevance assessment.
This article introduces the concept of relevance as viewed and applied in the context of IR evaluation, by presenting an overview of the multidimensional and dynamic nature of the concept. The literature on relevance reveals how the concept, especially with regard to its multidimensionality, is many-faceted, and does not just refer to the various relevance criteria users may apply in the process of judging the relevance of retrieved information objects. From our point of view, the multidimensionality of relevance explains why some argue that no consensus has been reached on the relevance concept. Thus, the objective of this article is to present an overview of the many different views and ways in which the concept of relevance is used, leading to a consistent and compatible understanding of the concept. In addition, special attention is paid to situational relevance. Many researchers perceive situational relevance as the most realistic type of user relevance, and it is therefore discussed with reference to its potentially dynamic nature and as a requirement for interactive information retrieval (IIR) evaluation.
This paper presents a set of basic components constituting the experimental setting intended for the evaluation of interactive information retrieval (IIR) systems, the aim of which is to facilitate evaluation of IIR systems in a way that is as close as possible to realistic IR processes. The experimental setting consists of three components: (1) the involvement of potential users as test persons; (2) the application of dynamic and individual information needs; and (3) the use of multidimensional and dynamic relevance judgements. Embedded in the information need component is its central sub-component, the simulated work task situation: the tool that triggers the (simulated) dynamic information needs. This paper also reports on the empirical findings of the meta-evaluation of the application of this sub-component, the purpose of which is to discover whether the application of simulated work task situations to future evaluation of IIR systems can be recommended. Investigations are carried out to determine whether any search behavioural differences exist between test persons' treatment of their own real information needs and simulated information needs. The hypothesis is that if no difference exists, one can legitimately substitute simulated information needs, generated through the application of simulated work task situations, for real information needs. The empirical results of the meta-evaluation provide positive evidence for the application of simulated work task situations to the evaluation of IIR systems. The results also indicate that tailoring the simulated work task situations to the group of test persons is important for motivating them. Furthermore, the results show that different degrees of semantic openness of the simulated situations make no difference to the test persons' search treatment.
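The abstract does not specify the statistical machinery behind the meta-evaluation, so the following is a purely illustrative sketch of how one might test the "no behavioural difference" hypothesis: a paired comparison of a single behavioural measure (here, invented counts of query terms per session) between each test person's real-need and simulated-need sessions. The measure, the data, and the choice of a Wilcoxon signed-rank test are our assumptions, not the paper's actual design.

```python
# Illustrative sketch only: the paper's actual analysis is not given in the
# abstract. We assume each test person contributes one session with a real
# information need and one with a simulated work task situation, and compare
# one behavioural measure (hypothetical query-term counts per session).
import numpy as np
from scipy.stats import wilcoxon

terms_real = np.array([5, 7, 4, 6, 8, 5, 9, 7, 4, 6])       # invented data
terms_simulated = np.array([6, 9, 5, 5, 7, 8, 8, 6, 6, 7])  # invented data

# H0: no difference in search behaviour between real and simulated needs.
# A non-significant result is consistent with (though it does not prove)
# the substitutability of simulated work task situations for real needs.
stat, p_value = wilcoxon(terms_real, terms_simulated)
print(f"Wilcoxon W = {stat}, p = {p_value:.3f}")
```

In practice one would repeat such a comparison across several behavioural measures (terms used, documents viewed, session length), since substitutability should hold across the search process as a whole, not for a single indicator.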
The present two-part article introduces matrix comparison as a formal means of evaluation in informetric studies such as cocitation analysis. In this first part, the motivation behind introducing matrix comparison to informetric studies, as well as two important issues influencing such comparisons, are introduced and discussed. The motivation is spurred by the recent debate on the choice of proximity measures and their potential influence upon clustering and ordination results. The two important issues discussed here are matrix generation and the composition of proximity measures. As demonstrated for the same data set, the approach to matrix generation, i.e., how data are represented and transformed in a matrix, evidently determines the behavior of proximity measures. Two different matrix generation approaches will, in all probability, lead to different proximity rankings of objects, which in turn lead to different ordination and clustering results for the same set of objects. Further, a resemblance in the composition of formulas indicates whether two proximity measures may produce similar ordination and clustering results. However, as shown in the case of the angular correlation and cosine measures, a small deviation in otherwise similar formulas can lead to different rankings depending on the contour of the data matrix transformed. Ultimately, the behavior of proximity measures, that is, whether they produce similar rankings of objects, is more or less data-specific. Consequently, the authors recommend the use of empirical matrix comparison techniques in individual studies to investigate the degree of resemblance between proximity measures or their ordination results. In part two of the article, the authors introduce and demonstrate two related statistical matrix comparison techniques, the Mantel test and Procrustes analysis. These techniques can compare and evaluate the degree of monotonicity between different proximity measures or their ordination results. As such, the Mantel test and Procrustes analysis can be used as statistical validation tools in informetric studies and can thus help in choosing suitable proximity measures.
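To make the "small deviation in otherwise similar formulas" concrete, the sketch below contrasts the cosine with Pearson's r on a toy cocitation matrix. We read the abstract's "angular correlation" as Pearson's r, which is simply the cosine applied to mean-centered vectors; that reading, the invented matrix, and the function names are our assumptions, not the authors' exact setup.

```python
# Minimal sketch: cosine vs. a correlation measure on cocitation profiles.
# Assumption: "angular correlation" is taken here as Pearson's r, i.e., the
# cosine of mean-centered vectors; the single centering step is the "small
# deviation" that can reorder proximity rankings on some matrices.
import numpy as np

def cosine(x, y):
    return x @ y / (np.linalg.norm(x) * np.linalg.norm(y))

def pearson_r(x, y):
    return cosine(x - x.mean(), y - y.mean())  # centered cosine

# Invented author cocitation matrix: rows/columns are authors, entries are
# cocitation counts (the diagonal treatment is itself a matrix-generation
# choice of the kind discussed above).
C = np.array([[0, 8, 3, 1],
              [8, 0, 2, 1],
              [3, 2, 0, 6],
              [1, 1, 6, 0]], dtype=float)

# Proximity of author 0 to every other author under each measure; if the
# two measures rank the authors differently, ordination and clustering
# built on them can differ as well.
for j in range(1, C.shape[0]):
    print(f"author 0 vs {j}: cosine={cosine(C[0], C[j]):.3f}, "
          f"r={pearson_r(C[0], C[j]):.3f}")
```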
The present two-part article introduces matrix comparison as a formal means of evaluation in informetric studies such as cocitation analysis. In the first part, the motivation behind introducing matrix comparison to informetric studies, as well as two important issues influencing such comparisons, matrix generation and the composition of proximity measures, are introduced and discussed. In this second part, the authors introduce and thoroughly demonstrate two related matrix comparison techniques, the Mantel test and Procrustes analysis. These techniques can compare and evaluate the degree of monotonicity between different proximity measures or their ordination results. Common to both techniques is the application of permutation procedures to test hypotheses about matrix resemblance. The choice of technique depends on the validation at hand. In the case of the Mantel test, the degree of resemblance between two measures forecasts their potentially different effect upon ordination and clustering results. In principle, two proximity measures with a very strong resemblance will most likely produce identical results; thus, the choice between the two measures becomes less important. Alternatively, or as a supplement, Procrustes analysis compares the actual ordination results, without investigating the underlying proximity measures, by matching two configurations of the same objects in a multidimensional space. An advantage of Procrustes analysis, though, is the graphical solution provided by the superimposition plot and the resulting decomposition of variance components. Accordingly, Procrustes analysis provides not only a measure of general fit between configurations, but also values for individual objects, enabling more elaborate validations. As such, the Mantel test and Procrustes analysis can be used as statistical validation tools in informetric studies and can thus help in choosing suitable proximity measures.
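As a hedged illustration of the two techniques, the sketch below implements a simple permutation-based Mantel test in NumPy and calls SciPy's Procrustes routine on two ordination configurations of the same objects. The invented matrices, the permutation count, and the use of `scipy.spatial.procrustes` are our assumptions; they demonstrate the general technique, not the authors' exact procedure.

```python
# Sketch of the two validation tools described above; the data are invented
# and the implementation details are assumptions, not the authors' code.
import numpy as np
from scipy.spatial import procrustes

def mantel(D1, D2, n_perm=9999, seed=0):
    """Permutation-based Mantel test: Pearson correlation between the
    off-diagonal entries of two distance matrices, with significance
    assessed by jointly permuting rows and columns of one matrix."""
    rng = np.random.default_rng(seed)
    iu = np.triu_indices_from(D1, k=1)          # upper-triangle entries
    r_obs = np.corrcoef(D1[iu], D2[iu])[0, 1]
    hits = 0
    for _ in range(n_perm):
        p = rng.permutation(D2.shape[0])
        r_p = np.corrcoef(D1[iu], D2[np.ix_(p, p)][iu])[0, 1]
        if abs(r_p) >= abs(r_obs):
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)

# Invented distance matrices, as if derived from two proximity measures
# over the same five objects (symmetric, zero diagonal).
rng = np.random.default_rng(1)
X = rng.random((5, 3))
D1 = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
D2 = D1 + rng.normal(0, 0.05, D1.shape)
D2 = (D2 + D2.T) / 2
np.fill_diagonal(D2, 0.0)

r, p = mantel(D1, D2, n_perm=999)
print(f"Mantel r = {r:.3f}, p = {p:.3f}")

# Procrustes: compare two 2-D ordination configurations of the same objects.
# The disparity is the residual sum of squares after optimal translation,
# scaling, and rotation; near zero means the configurations agree.
conf_a = rng.random((5, 2))
conf_b = conf_a @ np.array([[0.8, -0.6], [0.6, 0.8]]) + 0.1  # rotated, shifted
mtx_a, mtx_b, disparity = procrustes(conf_a, conf_b)
print(f"Procrustes disparity M^2 = {disparity:.4f}")
```

Note the division of labour mirrored in the sketch: the Mantel test compares the proximity (distance) matrices themselves, whereas Procrustes analysis compares the resulting ordination configurations, so the two can be used as complementary checks.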