
Developing a Holistic Approach to Measuring Internet Outages

Mike Vandersanden
Guest Author | Hasselt University and Pulse Research Fellow
December 5, 2024
In short
  • A Pulse Research Fellow has developed a new Internet shutdown tracking system that retrieves open-source API data from several sources to provide higher-resolution Internet traffic measurements, including during anomalies.
  • Testing shows that each measurement system provides varying results, but collectively they can help with interpreting the cause of the anomalies.
  • More data sources and intuitive interfaces to automate the analysis process are being worked on.

Measuring the Internet is difficult. Each measurement is highly subjective, depending on the data you can access and how you interpret it. One way to overcome these challenges is to collate data from multiple sources to form a holistic understanding of Internet connectivity, both during significant events and in their absence.

As a 2024 Pulse Research Fellow, I’ve used this approach to extract meaningful correlations about Internet shutdowns.


Collating Multiple Sources of Data Provides Greater Resolution

The holistic system I’ve developed allows users to retrieve data from multiple trusted data sources monitoring the Internet, including the sources used in the examples below: Cloudflare Radar, Google, IODA, OONI, and CitizenLab.

The system allows you to layer data from these sources for a select period to compare each vantage point.
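As a rough illustration of what layering involves, the sketch below scales each source's series to its own peak and aligns the series on shared timestamps for a chosen period. It is not the system's actual code: the function names `normalize` and `layer`, the source labels, and all values are hypothetical sample data chosen for the example.

```python
from datetime import datetime, timedelta, timezone


def normalize(series):
    """Scale a {timestamp: value} series so that its maximum becomes 1.0."""
    peak = max(series.values())
    if peak == 0:
        return {ts: 0.0 for ts in series}
    return {ts: value / peak for ts, value in series.items()}


def layer(sources, start, end):
    """Align several normalized series on the timestamps within [start, end].

    Returns a list of (timestamp, {source_name: value_or_None}) rows.
    None marks a source that has no sample at that timestamp.
    """
    normalized = {name: normalize(series) for name, series in sources.items()}
    timestamps = sorted(
        {ts for series in sources.values() for ts in series if start <= ts <= end}
    )
    return [
        (ts, {name: series.get(ts) for name, series in normalized.items()})
        for ts in timestamps
    ]


# Hypothetical sample data: two sources reporting on very different scales.
t0 = datetime(2024, 6, 9, tzinfo=timezone.utc)
hour = timedelta(hours=1)
sample_sources = {
    "source_a": {t0: 100, t0 + hour: 40, t0 + 2 * hour: 90},
    "source_b": {t0: 2000, t0 + hour: 1900},  # no sample for the third hour
}

rows = layer(sample_sources, t0, t0 + 2 * hour)
for timestamp, values in rows:
    print(timestamp.isoformat(), values)
```

Scaling each source to its own peak is only one possible choice. It makes series with different units comparable on a single axis, at the cost of hiding absolute volumes.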

When investigating known deliberate Internet disruptions, it is apparent that not all data sources show a decrease in quality when an outage occurs. For example, Figure 1 shows connectivity data for Algeria from IODA and Google from 9 to 13 June 2024. The highlighted sections are annotations from Cloudflare Radar of reported government-directed disruptions, the timing of which corresponds with previous government orders to restrict Internet connectivity in the country during its Baccalaureate exams. You can see these shutdown events on the Pulse Internet Shutdown Tracker.

Time series line graph showing various Internet measurements for Algeria from 9 to 13 June.
Figure 1 – An example of connectivity data and annotations sourced from Cloudflare Radar, Google, and IODA for Algeria from 9 to 13 June. The highlighted sections are Internet shutdowns reported by Cloudflare Radar. 

Figure 2 is another example of how users can collate and annotate data from OONI and CitizenLab. It groups Internet services by category to show which categories experience the most anomalies, in this case, News Media and Social Networking.

Column graph showing the anomaly count for various web service categories.
Figure 2 – An example of OONI Web Connectivity Test data being categorized using CitizenLab’s test list. Note that the OONI Web Connectivity Test does provide a breakdown of categories online but not through its API. 
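The categorization behind a chart like Figure 2 can be sketched as a join between measurement results and a test list. The example below is a minimal sketch, not the system's implementation: the URLs and measurements are invented, and the field names (`input`, `anomaly`, `url`, `category_description`) are assumptions modeled on how OONI results and CitizenLab test lists are commonly laid out.

```python
import csv
import io
from collections import Counter

# Hypothetical test list in CSV form (assumed column names).
TEST_LIST_CSV = """url,category_code,category_description
https://news.example/,NEWS,News Media
https://social.example/,GRP,Social Networking
https://search.example/,SRCH,Search Engines
"""

# Hypothetical measurement results (assumed field names).
MEASUREMENTS = [
    {"input": "https://news.example/", "anomaly": True},
    {"input": "https://news.example/", "anomaly": True},
    {"input": "https://social.example/", "anomaly": True},
    {"input": "https://search.example/", "anomaly": False},
    {"input": "https://unlisted.example/", "anomaly": True},
]


def load_categories(csv_text):
    """Map each URL in the test list to its category description."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["url"]: row["category_description"] for row in reader}


def anomalies_by_category(measurements, categories):
    """Count anomalous measurements per category.

    URLs that are missing from the test list are counted as "Uncategorized".
    """
    counts = Counter()
    for measurement in measurements:
        if measurement["anomaly"]:
            category = categories.get(measurement["input"], "Uncategorized")
            counts[category] += 1
    return counts


categories = load_categories(TEST_LIST_CSV)
counts = anomalies_by_category(MEASUREMENTS, categories)
for category, count in counts.most_common():
    print(category, count)
```

Matching on the exact URL string is the simplest option. A real pipeline would probably need to normalize URLs (scheme, trailing slash, `www.` prefix) before the lookup.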

Based on these graphs, we can note the following: 

  • Predictable outage time and duration: The outages follow a pattern around the exam times, when the disruption is intended to prevent cheating. We can see this in the timeline, where outages are annotated, as well as in the raw data and through anomaly detection. 
  • No complete outage: The raw data shows that the disruption is usually partial rather than a complete loss of connectivity. 
  • Similar site categories affected: By looking at the specific sites that are disrupted or show anomalies, we can observe that similar types of websites are affected in every exam period. As the reasoning behind the different outages is the same, the same information can be expected to be impacted. 
  • Different methods of disruption: The outages appear in different data sources depending on the region, hinting that different methods are being used to disrupt the Internet. 

Besides looking at outages, it is essential to investigate periods that don’t have an outage annotation. These periods show what normal behavior looks like and can reveal further outage periods that still need annotation. In the example shown, we can observe a handful of potential outages on 9 June, before the first annotated outage occurs, as these periods show similar drops in several data sources. 
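One simple way to surface such candidates is to flag time steps where several sources drop well below their own typical level and no annotation exists yet. The sketch below uses a basic median-threshold heuristic as a stand-in; it is not the anomaly detection used in the system, and the function names, the 25% threshold, and the sample values are all assumptions made for illustration.

```python
from statistics import median


def drops(series, threshold=0.25):
    """Return indices where a value falls more than `threshold` below the median."""
    baseline = median(series)
    return {i for i, value in enumerate(series) if value < baseline * (1 - threshold)}


def candidate_outages(sources, annotated, min_sources=2, threshold=0.25):
    """Return unannotated time steps where at least `min_sources` sources drop.

    `sources` maps a source name to a list of values on a shared time grid.
    `annotated` is the set of time-step indices already marked as outages.
    """
    flagged = {name: drops(values, threshold) for name, values in sources.items()}
    length = min(len(values) for values in sources.values())
    candidates = []
    for step in range(length):
        hits = sum(1 for indices in flagged.values() if step in indices)
        if hits >= min_sources and step not in annotated:
            candidates.append(step)
    return candidates


# Hypothetical sample data: three sources on the same eight-step time grid.
sample = {
    "source_a": [100, 98, 60, 99, 101, 55, 100, 97],
    "source_b": [50, 51, 30, 49, 50, 28, 52, 50],
    "source_c": [10, 10, 10, 9, 10, 4, 10, 10],
}
already_annotated = {5}  # step 5 is a known, annotated outage

candidates = candidate_outages(sample, already_annotated)
print(candidates)
```

Requiring agreement between at least two sources reduces false positives from noise in a single measurement system, although it can also miss disruptions that only one vantage point is able to see.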

Next Steps 

We will continue to refine the system to improve its accuracy and user experience for professional and amateur data analysts. This includes: 

  • Incorporating additional data sources, including from the Internet Society Pulse API, to provide greater insight into the impact of outages.
  • Adding intuitive interfaces to automate the analysis process.

Analyzing Internet outages is an ongoing challenge but essential for understanding the causes and improving Internet resilience. We aim to improve our understanding and reporting of Internet shutdowns and other outages by employing a holistic approach with multiple vantage points. 

Applications for the 2025 Pulse Research Fellowship and Mentorship are now open.

Mike Vandersanden is a PhD student at Hasselt University and a 2024 Pulse Research Fellow.