Two complementary approaches to mapping network boundaries from traceroute paths recently emerged [27,31]. Both approaches apply heuristics to inform inferences extracted from traceroute measurement campaigns. bdrmap [27] used targeted traceroutes from a specific network, alias resolution probing techniques, and AS relationship inferences, to infer the boundaries of that specific network and the other networks attached at each boundary. MAPIT [31] tackled the ambitious challenge of inferring all AS-level network boundaries in a massive archived collection of traceroutes launched from many different networks. Both were substantial contributions to the state-of-the-art, and inspired a collaboration to explore the potential to combine the approaches. We present and evaluate bdrmapIT, the result of that exploration, which yielded a more complete, accurate, and general solution to this persistent and central challenge of Internet topology research. bdrmapIT achieves 91.8%-98.8% accuracy when mapping AS boundaries in two Internet-wide traceroute datasets, vastly improving on MAP-IT's coverage without sacrificing bdrmap's ability to map a single network. The bdrmapIT source code is available at https://git.io/fAsI0.
Geolocating Internet routers is a long-standing and notoriously difficult challenge, and current solutions lack the accuracy and adaptability to yield reliable results. We revisit this problem, designing a solution capable of accurately and comprehensively extracting geographic information that network operators embed into router interface hostnames. We train our system using dictionaries that map geographic codes to known locations, and constrain inferences with delay measurements conducted from a distributed set of vantage points. While most operators use known geographic codes, some devise their own mnemonic codes for locations, which our system also extracts and interprets. We evaluate our system on Internet-wide topology datasets, automatically learning regular expressions (regexes) for 1023 different domain suffixes with IPv4 routers, and 241 different domain suffixes with IPv6 routers. We received ground truth from operators of 13 domain suffixes, all of whom confirmed the correctness of our learned regexes, and that our system correctly interpreted 78.6% of the custom geographic codes. For these 13 suffixes, our solution more accurately extracts and interprets geographic information than the previous state-of-the-art, correctly geolocating 94.0% of router hostnames with a geohint compared to DRoP (56.6%) and HLOC (73.1%). This work advances the ability of researchers and network operators to characterize the location of critical Internet infrastructure, a foundational building block of network performance, security, and resilience analysis. We release the source code of our system and our inferred regexes.
No abstract
Mapping the Internet at scale is increasingly important to network security, failure diagnosis, and performance analysis, yet remains challenging. Accurately determining the interface addresses used for inter-AS links from traceroute traces can be hard because these interfaces are often assigned addresses from neighboring ASes. Identifying these interfaces can benefit Internet researchers and network diagnosticians by providing accurate IP-to-AS mappings where such mapping is most difficult-at AS boundaries. We describe a new algorithm, Multipass Accurate Passive Inferences from Traceroute (MAP-IT), for inferring the exact interface addresses used for point-topoint inter-AS links, as well as the specific ASes involved. MAP-IT combines evidence of an AS switch from distinct traceroute traces; using traceroute data makes it portable across IP networks. Each pass leverages prior inferences to refine existing inferences and to discover additional inter-AS link interfaces. We test MAP-IT with interface-level ground truth information from Internet2, achieving 100% precision. Using approximate ground truth from Level 3 and Telia-Sonera yields 95.0% precision. These results suggest that MAP-IT is sufficiently reliable for network diagnostics.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.