
Nautilus: Cross-Layer Cartography of the Undersea Internet Backbone

Alagappan Ramanathan
Guest Author | University of California, Irvine
April 17, 2024

Submarine cables form the backbone of the Internet, carrying more than 99% of intercontinental traffic. However, this critical physical infrastructure is increasingly vulnerable to targeted attacks by malicious actors and accidental damage caused by human activities and natural disasters.

In March 2024 alone, two significant incidents underscored the fragility of global connectivity: Damage to three undersea cables in the Red Sea disrupted 25% of data traffic between Asia and Europe, while damage to four cables off the West Coast of Africa, currently attributed to undersea seismic activity, has led to daily economic losses amounting to hundreds of millions of dollars for many African countries.

Read: Major Internet Outages Across Western and Southern Africa

More than one hundred submarine cable failures occur each year on average, and their impact is amplified at higher layers of the Internet. Multiple Autonomous Systems (ASes) often share a single cable, and each AS may have numerous IP links on it. Consequently, when a cable fails, all IP links relying on that cable are affected.

Despite the critical importance of submarine cables, we have a limited understanding of how their failures affect end-to-end Internet connectivity. To shed light on the impact of submarine cable infrastructure on the global Internet, my colleagues and I at the University of California, Irvine, have developed a comprehensive two-part solution.

In the first part of this blog series, I will introduce Nautilus, a cross-layer cartography framework designed to map IP links to their corresponding submarine cables. By providing a clearer picture of the relationships between these network layers, Nautilus lays the foundation for a more thorough analysis of the Internet’s resilience to submarine cable disruptions, including rapid impact assessment and proactive strategies during cable failures.

Cross-Layer Cartography with Nautilus

Identifying the specific submarine cables supporting a given IP layer path is complex and daunting. Nautilus addresses this challenge by decomposing IP layer paths into their constituent IP links and mapping each link individually based on two fundamental principles:

  • Geolocation: The geographic locations of IP link endpoints can reveal potential submarine cable routes based on proximity to cable landing stations. This principle relies on the assumption that IP links are more likely to utilize the nearest available submarine cable infrastructure.
  • Ownership: Nautilus also considers the ownership and operational relationships between submarine cable systems and the ASes associated with each IP link. An IP link is more likely to use a particular submarine cable if the corresponding AS owns or operates the cable system. This principle considers the economic and strategic incentives for ASes to prioritize using their own submarine cable infrastructure when routing traffic.

Employing these principles, Nautilus generates its cross-layer mapping through four key stages.
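The decomposition of an IP layer path into its constituent IP links, mentioned above, can be sketched in a few lines. This is an illustrative sketch rather than the Nautilus code: the function name `path_to_links`, the use of `None` for unresponsive hops, and the documentation-range IP addresses are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

IPLink = Tuple[str, str]


def path_to_links(hops: List[Optional[str]]) -> List[IPLink]:
    """Split a traceroute path into links between consecutive responding hops.

    Assumption: None marks a hop that did not respond, and a link is only
    emitted when both adjacent hops responded and differ from each other.
    """
    links = []
    for near, far in zip(hops, hops[1:]):
        if near is not None and far is not None and near != far:
            links.append((near, far))
    return links


# Hypothetical path with one unresponsive hop in the middle.
example_path = ["192.0.2.1", "198.51.100.7", None, "203.0.113.9", "203.0.113.20"]
links = path_to_links(example_path)
```

Each resulting link can then be mapped on its own, which is what the four stages below operate on.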

Stage 1: Link Classification

In the first stage, we classify IP links based on their likelihood of traversing a submarine cable. Nautilus relies on the geolocation of IP endpoints to classify links into three categories: definitely submarine, potentially submarine, and definitely terrestrial.
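As a rough illustration of a three-way classification like this one, the sketch below labels a link by checking whether its candidate endpoint locations sit on different landmasses. The rule itself is an assumption made for the example (the exact criteria are described in the paper, not here), and the `classify_link` function and landmass labels are hypothetical.

```python
from itertools import product
from typing import Set


def classify_link(src_landmasses: Set[str], dst_landmasses: Set[str]) -> str:
    """Classify a link from the candidate landmasses of its two endpoints.

    Assumed rule (illustrative only): if every combination of candidate
    locations crosses a body of water the link must be submarine; if none
    does it must be terrestrial; otherwise it is ambiguous.
    """
    if not src_landmasses or not dst_landmasses:
        raise ValueError("each endpoint needs at least one candidate location")
    crossings = [a != b for a, b in product(src_landmasses, dst_landmasses)]
    if all(crossings):
        return "definitely submarine"
    if not any(crossings):
        return "definitely terrestrial"
    return "potentially submarine"


label = classify_link({"Eurasia", "Great Britain"}, {"Eurasia"})
```

Here the source endpoint has two plausible locations, only one of which would require a sea crossing, so the link lands in the ambiguous middle category.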

Stage 2: Geolocation (The Key Piece)

IP geolocation, which reveals the geographic locations of routers, can be correlated with nearby submarine cable landing stations to map IP links to their corresponding cables. However, city-level IP geolocation data can be inaccurate. To mitigate these inaccuracies, Nautilus employs a multi-pronged approach:

  • Using multiple geolocation sources to increase the likelihood of accurate results.
  • Validating geolocation data with Speed-of-Light (SoL) tests, which compute the minimum possible delay between two locations from the speed of light in fiber optic cables. A geolocation is considered invalid if the measured delay is less than the SoL delay, since no signal could have covered that distance so quickly.
  • Applying clustering techniques to consolidate valid geolocations and reduce the number of candidate locations for each link endpoint.
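The SoL test in the list above can be sketched as follows. This is a minimal sketch under stated assumptions: the delay is taken to be a round-trip time measured from a probe whose location is known, light in fiber is assumed to travel at about two-thirds of its vacuum speed, and the great-circle distance stands in for the shortest possible fiber route. The function names and coordinates are hypothetical.

```python
import math

SPEED_OF_LIGHT_KM_PER_MS = 299.792458  # speed of light in vacuum, km per millisecond
FIBER_FACTOR = 2.0 / 3.0               # assumed slowdown in optical fiber


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points on a spherical Earth."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def min_rtt_ms(distance_km: float) -> float:
    """Lowest physically possible round-trip time over the given distance."""
    return 2 * distance_km / (SPEED_OF_LIGHT_KM_PER_MS * FIBER_FACTOR)


def passes_sol_check(probe, candidate, measured_rtt_ms: float) -> bool:
    """Reject a candidate location if the measured RTT beats the SoL bound."""
    bound = min_rtt_ms(haversine_km(probe[0], probe[1], candidate[0], candidate[1]))
    return measured_rtt_ms >= bound


taipei = (25.03, 121.56)        # hypothetical probe location
los_angeles = (34.05, -118.24)  # hypothetical geolocation claimed for a router
```

With these inputs, a measured round trip of 30 ms would rule out the Los Angeles geolocation for a router probed from Taipei, while 130 ms would leave it as a valid candidate.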

After obtaining validated and clustered geolocation data, Nautilus correlates every combination of endpoint geolocations with the submarine cable map to identify potential cable candidates for each IP link.
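One way to picture this correlation step is the sketch below, which keeps every cable that has a landing station near both endpoints for at least one combination of candidate locations. The cable map, cable names, coordinates, and the 300 km radius are all made up for illustration; the real framework works against an actual submarine cable map and its own proximity criteria.

```python
import math
from itertools import product
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]  # (latitude, longitude)


def haversine_km(a: Point, b: Point) -> float:
    """Great-circle distance between two points on a spherical Earth."""
    phi1, phi2 = math.radians(a[0]), math.radians(b[0])
    dphi = math.radians(b[0] - a[0])
    dlmb = math.radians(b[1] - a[1])
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def cables_near(location: Point, cable_map: Dict[str, List[Point]], radius_km: float) -> Set[str]:
    """Cables with at least one landing station within radius_km of location."""
    return {
        name
        for name, stations in cable_map.items()
        if any(haversine_km(location, station) <= radius_km for station in stations)
    }


def candidate_cables(src_locations: List[Point], dst_locations: List[Point],
                     cable_map: Dict[str, List[Point]], radius_km: float = 300.0) -> Set[str]:
    """Union of cables landing near both endpoints, over all location pairs."""
    candidates: Set[str] = set()
    for src, dst in product(src_locations, dst_locations):
        candidates |= cables_near(src, cable_map, radius_km) & cables_near(dst, cable_map, radius_km)
    return candidates


# Hypothetical cable map: cable name -> landing station coordinates.
toy_cable_map = {
    "cable-A": [(25.10, 121.50), (33.90, -118.40)],   # Taiwan <-> US West Coast
    "cable-B": [(25.00, 121.60), (35.60, 139.80)],    # Taiwan <-> Japan
    "cable-C": [(33.90, -118.40), (-33.90, 151.20)],  # US West Coast <-> Australia
}
result = candidate_cables([(25.03, 121.56)], [(34.05, -118.24)], toy_cable_map)
```

Only `cable-A` lands near both endpoints in this toy map, so it is the sole candidate. In dense regions many cables would survive this step, which is the problem the ownership stage addresses.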

Figure 1 illustrates the process by which Nautilus generates candidate cable sets for an IP link based on the geolocation data and the submarine cable map.

Figure 1 — Example of Nautilus cable candidates generation for an IP link using geolocation.

Stage 3: Ownership (The Refining Factor)

While geolocation helps identify potential cable matches, relying solely on this method often yields multiple candidates for a single IP link, particularly in regions with a high density of submarine cables. For instance, in Figure 1, the Taiwan-US link has five candidate cables, making it difficult to determine the cable used.

Nautilus leverages a previously unexplored cross-layer property to refine its predictions: ownership information that bridges the physical and network layers. Nautilus assumes that if an entity owns both the IP link endpoint and a nearby cable, the IP link is more likely to use that specific cable.

This innovative use of ownership information across layers allows Nautilus to effectively narrow down the candidate cable sets for each IP link, even in regions with a high concentration of submarine cables.

Stage 4: Finalizing Cable Choices

Rather than using ownership as a simple binary filter, Nautilus refines its cable predictions with a weighted aggregation scheme that combines geolocation and ownership data. This scheme assigns a prediction score to each candidate cable, enabling the ranking and elimination of less probable cable matches.

The weighted aggregation scheme carefully balances the importance of geolocation and ownership information, assigning appropriate weights to each factor based on their relative significance in determining the likelihood of an IP link using a specific submarine cable. This approach allows Nautilus to make more informed decisions, even in cases where ownership information may be incomplete or ambiguous.
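A weighted aggregation of this kind might look like the sketch below. The weights, the per-cable input scores, and the pruning rule (keep candidates scoring within a fraction of the best one) are assumptions chosen for illustration, not the values or the exact formula Nautilus uses.

```python
from typing import Dict, List, Tuple


def score_candidates(geo_scores: Dict[str, float], owner_scores: Dict[str, float],
                     w_geo: float = 0.6, w_own: float = 0.4,
                     keep_ratio: float = 0.8) -> List[Tuple[str, float]]:
    """Rank candidate cables by a weighted sum and prune weak candidates.

    Assumptions: both inputs are scores in [0, 1]; a cable with no ownership
    evidence gets an ownership score of 0 rather than being discarded;
    candidates scoring below keep_ratio times the best score are eliminated.
    """
    scores = {
        cable: w_geo * geo + w_own * owner_scores.get(cable, 0.0)
        for cable, geo in geo_scores.items()
    }
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if not ranked:
        return []
    cutoff = keep_ratio * ranked[0][1]
    return [(cable, score) for cable, score in ranked if score >= cutoff]


# Hypothetical scores for three candidate cables of one IP link.
geo = {"cable-A": 0.9, "cable-B": 0.8, "cable-C": 0.85}
with_ownership = score_candidates(geo, {"cable-B": 1.0})
without_ownership = score_candidates(geo, {})
```

When the link's AS is known to own `cable-B`, that cable outscores the others and they are pruned. When no ownership information is available, all three candidates survive in geolocation order, which is the behavior a binary ownership filter could not provide.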

Figure 2 illustrates how Nautilus uses ownership information to assign scores and eliminate less probable cables for the example in Figure 1.

Figure 2 — Example of Nautilus using ownership to improve mapping.

A Glance at Nautilus Results

We generated a comprehensive submarine cross-layer map with Nautilus using a massive dataset—more than 235 million traceroutes collected by RIPE Atlas and CAIDA over 15 days in March 2022. These traceroutes yielded 8.9 million valid IP links (including IPv4 and IPv6), of which about 1.3 million are classified as definitely submarine links by Nautilus.

Figure 3 below illustrates the intercontinental distribution of definitely submarine IPv4 links.

Infographic showing the intercontinental distribution of definitely submarine IPv4 links.
Figure 3 — Intercontinental IPv4 link distribution. NA-EU makes up around 205K links, followed by EU-EU and AS-AS (not depicted in the figure), which make up at least 90K links each.

Nautilus successfully mapped 80% of IP links to submarine cables. On average, it mapped each IP link to 2.04 cables. The complete distribution is shown in Figure 4.

Column chart showing the number of links (in millions) that travel across x number of definite and potential submarine cables.
Figure 4 — The distribution of the number of cables predicted for links belonging to each category.

Validating Nautilus Mapping

To validate the accuracy of the map generated by Nautilus, we employ multiple techniques, one of which involves analyzing the impact of submarine cable failures. When a submarine cable segment fails, all IP links mapped to that segment should disappear from traceroutes. This provides an opportunity to evaluate Nautilus’s mapping accuracy by examining the visibility of mapped links during cable failure scenarios.

We tested this validation method using the four-day Falcon and SeaMeWe-5 cable outage in Yemen in 2022, which was caused by an airstrike. By analyzing traceroute data collected before, during, and after the outage, we observed a significant change in the visibility of links mapped to these cables. Approximately 100 mapped links were visible in the traceroutes before and after the disruption, but only five remained visible during the outage.

Notably, all five links belonged to the “potentially submarine” category, meaning Nautilus had already flagged them as links that might be terrestrial rather than submarine. This real-world validation, along with others detailed in our paper, demonstrates the efficacy of Nautilus’s mapping.
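The visibility comparison behind this kind of validation reduces to a set intersection per time window. The sketch below uses synthetic link identifiers sized to echo the approximate figures above; it is not the measurement pipeline used in the paper, and the function name `visible_fraction` is hypothetical.

```python
from typing import Set, Tuple

IPLink = Tuple[str, str]


def visible_fraction(mapped_links: Set[IPLink], observed_links: Set[IPLink]) -> float:
    """Share of links mapped to a cable that appear in a traceroute window."""
    if not mapped_links:
        raise ValueError("no links are mapped to this cable")
    return len(mapped_links & observed_links) / len(mapped_links)


# Synthetic data: 100 links mapped to the failed cables.
mapped = {(f"10.0.0.{i}", f"10.0.1.{i}") for i in range(100)}
seen_before = set(mapped)            # all mapped links visible before the outage
seen_during = set(sorted(mapped)[:5])  # only five remain visible during it
seen_after = set(mapped)             # visibility recovers afterwards

before = visible_fraction(mapped, seen_before)
during = visible_fraction(mapped, seen_during)
after = visible_fraction(mapped, seen_after)
```

A sharp drop in the middle window, followed by recovery, is the signature one would expect if the links were mapped to the right cables.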

If you want to learn more about Nautilus, please read our paper, slated to appear at SIGMETRICS’24. The Nautilus codebase and results are open-sourced. Stay tuned to my website (linked below) for future updates about Nautilus.

In the next part of this blog series, I will introduce Xaminer. This resilience analysis tool uses a cross-layer map, such as the one generated by Nautilus, and a failure event model to identify the cross-layer impact on the Internet and extract failure patterns and trends.

Alagappan Ramanathan is a PhD student at the University of California, Irvine, and a 2023 Pulse Research Fellow.