Literature recommender systems support users in filtering the vast and increasing number of documents in digital libraries and on the Web. For academic literature, research has proven the ability of citation-based document similarity measures, such as Co-Citation (CoCit) or Co-Citation Proximity Analysis (CPA), to improve recommendation quality. In this paper, we report on the first large-scale investigation of the performance of the CPA approach in generating literature recommendations for Wikipedia, which is fundamentally different from the academic literature domain. We analyze links instead of citations to generate article recommendations. We evaluate CPA, CoCit, and the Apache Lucene MoreLikeThis (MLT) function, which represents a traditional text-based similarity measure. We use two datasets of 779,716 and 2.57 million Wikipedia articles, the Big Data processing framework Apache Flink, and a ten-node computing cluster. To enable our large-scale evaluation, we derive two quasi-gold standards from the links in Wikipedia's "See also" sections and a comprehensive Wikipedia clickstream dataset. Our results show that the citation-based measures CPA and CoCit have complementary strengths compared to the text-based MLT measure. While MLT performs well in identifying narrowly similar articles that share similar words and structure, the citation-based measures are better able to identify topically related information, such as information on the city of a certain university or other technical universities in the region. The CPA approach, which consistently outperformed CoCit, is better suited for identifying a broader spectrum of related articles, as well as popular articles that typically exhibit a higher quality. Additional benefits of the CPA approach are its lower runtime requirements and its language independence, which allows for cross-language retrieval of articles. We present a manual analysis of exemplary articles to demonstrate and discuss our findings.
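To make the difference between the two link-based measures concrete, the following is a minimal sketch, not the authors' Flink implementation. It assumes each linking article is given as a list of (link target, word position) pairs, counts a pair of targets as co-cited once per linking article, and uses an inverse-distance proximity weight with a hypothetical exponent `alpha`. The abstract does not state the weighting function, so that choice is an assumption made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def cocit_and_cpa(articles, alpha=1.0):
    """Return (CoCit counts, CPA scores) for pairs of link targets.

    articles: dict mapping a linking article to a list of
              (link_target, word_position) tuples.
    alpha:    hypothetical exponent of the assumed inverse-distance weight.
    """
    cocit = defaultdict(int)
    cpa = defaultdict(float)
    for links in articles.values():
        # Use the first occurrence of each distinct target in this article.
        first_pos = {}
        for target, pos in links:
            first_pos.setdefault(target, pos)
        # Sorting by target name gives every pair a canonical order.
        for (a, pa), (b, pb) in combinations(sorted(first_pos.items()), 2):
            pair = (a, b)
            cocit[pair] += 1                 # CoCit: one count per linking article
            distance = max(abs(pa - pb), 1)  # guard against zero distance
            cpa[pair] += distance ** -alpha  # CPA: closer links weigh more
    return dict(cocit), dict(cpa)


if __name__ == "__main__":
    example = {
        "X": [("A", 0), ("B", 2), ("C", 100)],
        "Y": [("A", 10), ("C", 11)],
    }
    counts, scores = cocit_and_cpa(example)
    for pair in sorted(counts):
        print(pair, counts[pair], round(scores[pair], 4))
```

In this toy input, A and C are linked together by two articles but only once in close proximity, so the two measures rank the pairs differently under the assumed weighting.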
Optical feeder links will become the extension of the terrestrial fiber communications towards space, increasing data throughput in satellite communications by overcoming the spectrum limitations of classical RF-links. The geostationary telecommunication satellite Alphasat and the satellites forming the EDRS-system will become the next generation for high-speed data-relay services. The ESA satellite ARTEMIS, precursor for geostationary orbit (GEO) optical terminals, is still a privileged experiment platform to characterize the turbulent channel and investigate the challenges of free-space optical communication to GEO. In this framework, two measurement campaigns were conducted with the scope of verifying the benefits of transmitter diversity in the uplink. To evaluate this mitigation technique, intensity measurements were carried out at both ends of the link. The scintillation parameter is calculated and compared to theory and, additionally, the Fried parameter is estimated by using a focus camera to monitor the turbulence strength. In this paper, we present the results of two measurement campaigns, carried out during October 2012 and April 2013. The main scope of both campaigns was to analyze the transmitter diversity mitigation effect on the uplink scintillation. Experiments were also carried out to improve the tracking performance, reducing the receiver aperture. These measurement campaigns were possible thanks to the collaboration This paper is focused on characterization of the delay line that is deployed to create an uncorrelated second laser beam out of one common source. It is organized as follows: section II introduces the main theoretical background, section III describes the measurement setup, section IV presents the results and section V discusses the main conclusions.
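As an illustration of the scintillation parameter mentioned above, the following sketch computes the commonly used scintillation index, the normalized variance of received intensity, sigma_I^2 = <I^2> / <I>^2 - 1. The abstract does not describe the campaigns' actual processing chain, so the function name and the plain list of intensity samples are assumptions made for this example.

```python
def scintillation_index(intensity):
    """Normalized intensity variance: <I^2> / <I>^2 - 1.

    intensity: non-empty sequence of received intensity samples
               (any consistent linear unit, not dB).
    """
    n = len(intensity)
    if n == 0:
        raise ValueError("need at least one intensity sample")
    mean = sum(intensity) / n
    if mean == 0:
        raise ValueError("mean intensity must be non-zero")
    mean_sq = sum(x * x for x in intensity) / n
    return mean_sq / (mean * mean) - 1.0


if __name__ == "__main__":
    steady = [1.0, 1.0, 1.0, 1.0]    # no fluctuation at all
    fading = [0.2, 1.8, 0.5, 1.5]    # strongly fluctuating signal
    print(scintillation_index(steady))
    print(scintillation_index(fading))
```

A constant signal yields an index of zero, and stronger fluctuations yield larger values. Comparing the index with and without the second, uncorrelated beam is one way the mitigation effect of transmitter diversity could be quantified.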
We present Citolytics, a novel link-based recommendation system for Wikipedia articles. In a preliminary study, Citolytics achieved promising results compared to the widely used text-based approach of Apache Lucene's MoreLikeThis (MLT). In this demo paper, we describe how we plan to integrate Citolytics into the Wikipedia infrastructure by using Elasticsearch and Apache Flink to serve recommendations for Wikipedia articles. Additionally, we propose a large-scale online evaluation design using the Wikipedia Android app. Working with Wikipedia data has several unique advantages. First, the availability of a very large user sample contributes to statistically significant results. Second, the openness of Wikipedia's architecture allows making our source code and evaluation data public, thus benefiting other researchers. If link-based recommendations show promise in our online evaluation, a deployment of the presented system within Wikipedia would have a far-reaching impact on Wikipedia's more than 30 million users.