COVID-19, an infectious disease caused by the SARS-CoV-2 virus, was declared a pandemic by the World Health Organisation (WHO) in March 2020. By mid-August 2020, more than 21 million people had tested positive worldwide. Infections have been growing rapidly and tremendous efforts are being made to fight the disease. In this paper, we attempt to systematise the various COVID-19 research activities leveraging data science, where we define data science broadly to encompass the various methods and tools (including those from artificial intelligence (AI), machine learning (ML), statistics, modeling, simulation, and data visualization) that can be used to store, process, and extract insights from data.
COVID-19, an infectious disease caused by the SARS-CoV-2 virus, was declared a pandemic by the World Health Organisation (WHO) in March 2020. At the time of writing, more than 2.8 million people have tested positive. Infections have been growing exponentially and tremendous efforts are being made to fight the disease. In this paper, we attempt to systematise ongoing data science activities in this area. As well as reviewing the rapidly growing body of recent research, we survey public datasets and repositories that can be used for further work to track COVID-19 spread and mitigation strategies.
As part of this, we present a bibliometric analysis of the papers produced in this short span of time. Finally, building on these insights, we highlight common challenges and pitfalls observed across the surveyed works.
A content-centric network is one that supports host-to-content routing, rather than the host-to-host routing of the existing Internet. This paper investigates the potential of caching data at the router level in content-centric networks. To achieve this, two measurement sets are combined to gain an understanding of the potential caching benefits of deploying content-centric protocols over the current Internet topology. The first set of measurements is a study of the BitTorrent network, which provides detailed traces of content request patterns. This is then combined with CAIDA's ITDK Internet traces to replay the content requests over a real-world topology. Using this data, simulations are performed to measure how effective content-centric networking would have been if it were available to these consumers/providers. We find that larger cache sizes (10,000 packets) can significantly reduce packet path lengths. On average, 2.02 hops are saved through caching (a 20% reduction), whilst also allowing 11% of data requests to be maintained within the requester's AS. Importantly, we also show that these benefits extend well beyond those of edge caching, because transit ASes can also reduce their traffic.
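The replay idea described above can be illustrated with a minimal sketch. This is not the paper's simulator: the LRU replacement policy, the `replay` function, the `LRUCache` class, and the representation of a path as a list of router identifiers are all assumptions made for illustration, and caches here hold whole content identifiers rather than packets.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity least-recently-used cache of content identifiers.

    LRU is an assumed policy; the abstract does not name one.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def __contains__(self, key):
        return key in self._items

    def touch(self, key):
        """Insert key, or mark it most recently used; evict the oldest if full."""
        if key in self._items:
            self._items.move_to_end(key)
            return
        if self.capacity <= 0:
            return  # a zero-size cache stores nothing
        if len(self._items) >= self.capacity:
            self._items.popitem(last=False)
        self._items[key] = True


def replay(requests, cache_size):
    """Replay (content_id, path) requests over per-router caches.

    `path` lists the routers between requester and provider, nearest
    the requester first (a hypothetical input format). Reaching the
    provider costs len(path) + 1 hops; a hit at path[i] costs i + 1.
    Returns (hops_without_caching, hops_with_caching).
    """
    caches = {}
    baseline = 0
    actual = 0
    for content_id, path in requests:
        full = len(path) + 1
        baseline += full
        hops = full  # default: travel all the way to the provider
        for i, router in enumerate(path):
            cache = caches.setdefault(router, LRUCache(cache_size))
            if content_id in cache:
                hops = i + 1
                break
        actual += hops
        # The content flows back and is cached at every router traversed.
        for router in path[:hops]:
            caches.setdefault(router, LRUCache(cache_size)).touch(content_id)
    return baseline, actual
```

For example, replaying three requests for the same item over the paths `["r1", "r2", "r3"]`, `["r4", "r2", "r3"]`, and `["r1", "r2", "r3"]` with `cache_size=10` gives `(12, 7)`: the second request is served by the shared router `r2` and the third by `r1`, saving 5 of 12 hops.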