Internet Resilience Index Methodology
Introduction
About the Index
The Internet plays a critical role in society today. Unfortunately, not all countries are on a level playing field with regard to resilient Internet infrastructure. Many low-income countries have under-provisioned networks and cable infrastructure, or lack redundant interconnection systems. In these countries (or regions), Internet outages are much more likely to occur than elsewhere.
Measuring Internet resilience is not an easy task, as several building blocks underpin the Internet's complex infrastructure. Additionally, the Internet landscape varies considerably around the world. To compare countries objectively, on common ground, there needs to be an objective set of metrics that track and record the different components contributing to the resilience of the Internet.
To achieve this task, the Internet Society created the Pulse Internet Resilience Index (IRI). This document outlines the approach used to build the Index, the selection of indicators and the underlying data sources, the weighting scheme, and the aggregation and imputation methods used.
The Four Pillars of a Resilient Internet Ecosystem
To grasp the multi-faceted nature of the Internet, the Index is built on four main pillars, which together contribute to the smooth operation of the Internet. The pillars are:
- Infrastructure: The existence and availability of physical infrastructure that provides Internet connectivity.
- Performance: The ability of the network to provide end-users with seamless and reliable access to Internet services.
- Security: The ability of the network to resist intentional or unintentional disruptions through the adoption of security technologies and best practices.
- Market Readiness: The ability of the market to self-regulate and provide affordable services to end-users as part of a diverse and competitive market.
The Internet Society Pulse IRI is built using existing best practices according to the Handbook on Constructing Composite Indicators of the European Commission Joint Research Centre and the OECD. The Pulse IRI adopts a similar methodology to other extant indices such as the GSMA Mobile Connectivity Index, the Facebook/EIU Inclusive Internet Index and the Web Foundation Web Index.
Data Sourcing
Selecting Indicators
Building a robust composite indicator requires careful selection of the underlying indicators. To date, there are no direct and readily available metrics that provide information about the Internet resilience of a network or a country. In the Internet Society Pulse IRI framework, the indicators selected are reflective of a specific aspect of resilience that needs to be quantified. The OECD/JRC handbook provides some guidance on the main characteristics to consider when selecting the indicators. In essence, they should be accurate, timely, and should cover as many countries as possible. Additionally, the Internet Society Pulse IRI relies exclusively on quantitative indicators as opposed to qualitative ones such as perception of Service Quality. This is to ensure that there is an objective set of metrics that can be used to make comparisons between countries.
Selection Criteria
The following criteria were used when selecting the indicators:
- Relevance: The indicator should work towards showing an increase or decrease in the resilience of the Internet in a selected country.
- Accuracy: The indicator should correctly estimate or describe the quantities or characteristics it is designed to measure.
- Coverage: The data should cover as many countries as possible, as the Index is intended to be a global index.
- Freshness: Any dataset should be at most two years old. Some datasets, such as performance or network coverage, should be as recent as possible. Others, such as the EGDI, do not change much from one year to the next, so it is acceptable to use them even when a year or two old.
- Continuity: To objectively compare the index over the years, it is important to work with a stable list of indicators, which will provide data consistently over time.
Types of Indicators
There are three main types of indicators that have been used to calculate the Internet Society Pulse IRI:
- Direct indicator: A direct indicator is a direct measure of an aspect of resilience e.g., percentage of HTTPS adoption, latency, bandwidth, etc. They have a specific unit of measurement, and the raw value can be on different scales depending on what is being measured.
- Composite indicator: A composite indicator provides a score, which itself has been derived from multiple other variables. Examples are the MANRS score, EGDI index, etc. The scale of a composite indicator is usually between 0 and 100.
- Proxy indicator: A proxy is used where it is difficult to find a specific metric to measure an aspect of resilience. Proxies can be either direct or composite indicators. For example, the IRI uses “Number of IXPs” and “Number of datacenters” as proxy indicators for the robustness of the local infrastructure.
Orientation of Indicators
An indicator can either be positive or negative. In the Internet Society Pulse IRI framework, both positive and negative indicators are used either individually or in combination with other indicators to characterise overall levels of resilience. An example of a positive indicator is "Number of secure Internet servers" as the higher the number the more secure the network will be. Conversely, "% of spam infections" is a negative indicator, as the higher the percentage, the less secure the underlying networks are.
Details of Some Indicators
Network Performance
Network performance data relating to bandwidth, latency and jitter is collected from the monthly Ookla Speedtest Global Index. It contains measurements about fixed and mobile network performance around the world. The median download, upload, latency and jitter values are calculated by country.
Upstream Redundancy
Upstream Redundancy is the average number of IPv4 upstream providers per active Autonomous System (AS) in the country. The higher the number of upstream providers per AS, the more resilient the overall ecosystem. The CAIDA AS-Relationship dataset is used to infer provider-to-customer relationships.
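The computation described above can be sketched as follows. This is a minimal illustration, not the production pipeline: the function name is ours, and it assumes CAIDA's published AS-relationship format, where provider-to-customer links appear as `<provider>|<customer>|-1` and comment lines start with `#`.

```python
from collections import defaultdict
from statistics import mean

def upstream_redundancy(as_rel_lines, country_asns):
    """Average number of IPv4 upstream providers per AS in a country.

    as_rel_lines: iterable of CAIDA AS-relationship records
                  ("<provider>|<customer>|-1" for p2c links).
    country_asns: set of the country's active (routed) ASNs.
    """
    providers = defaultdict(set)
    for line in as_rel_lines:
        if line.startswith("#"):
            continue  # skip comment/header lines
        a, b, rel = line.strip().split("|")[:3]
        if rel == "-1":  # a is a provider of b
            providers[int(b)].add(int(a))
    counts = [len(providers[asn]) for asn in country_asns]
    return mean(counts) if counts else 0.0
```

Peer-to-peer links (relationship `0`) are deliberately ignored, since only provider relationships count towards upstream redundancy.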
Peering Efficiency
The Peering Efficiency score of a country is calculated by taking the number of local networks peering at IXPs in that country and dividing it by the number of local and active (seen on the global routing table) networks in that country. PeeringDB provides data about IXP peers and RIPEstat provides data about active networks.
PE_c = (N_ixp,c / N_active,c) × 100

Where:

- PE_c is the Peering Efficiency score of country c
- N_ixp,c is the number of local networks peering at IXPs in country c
- N_active,c is the number of active local networks in country c
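As a sketch of this ratio, treating the PeeringDB and RIPEstat inputs as sets of ASNs (the function name and inputs are illustrative, not the actual pipeline code):

```python
def peering_efficiency(ixp_peers, active_asns):
    """Share (%) of a country's active ASes that peer at a local IXP.

    ixp_peers:   set of ASNs seen peering at the country's IXPs (PeeringDB).
    active_asns: set of local ASNs seen on the global routing table (RIPEstat).
    """
    if not active_asns:
        return 0.0
    # Intersect first so ASNs listed in PeeringDB but no longer routed
    # cannot push the score above 100%.
    return 100.0 * len(ixp_peers & active_asns) / len(active_asns)
```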
Market Concentration
The Internet Society Pulse IRI uses the Herfindahl-Hirschman Index (HHI) to calculate the market concentration score. APNIC ASPOP statistics provide market share information by AS and by country. We aggregate this data by organisation using as2org+. The HHI has a range between 0 and 10,000 where 0 means no concentration (a competitive market) and 10,000 means only one ASN is present i.e., with 100% market share.
HHI_c = Σ_i (s_i,c)²

Where:

- s_i,c is the market share (as a percentage) of organisation i in country c
- the sum runs over all organisations providing Internet access in country c
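The HHI itself is a one-line computation over percentage market shares. A minimal sketch (function name is ours):

```python
def hhi(market_shares):
    """Herfindahl-Hirschman Index over market shares given in percent.

    Shares should sum to ~100; the result ranges from near 0
    (many small players) to 10,000 (a single 100% monopoly).
    """
    return sum(s * s for s in market_shares)
```

For example, a monopoly scores 10,000, while four equal players at 25% each score 2,500.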
Upstream Provider Diversity
Diversity of upstream providers is an important element to measure as it indicates the extent to which the relationships of a given network are concentrated on a single network or group of networks. At a country-level, there are specific network operators providing international access and the more diverse the number of upstream Internet providers, the more resilient the country is in terms of network dependency.
The notion of network dependency can be proxied using AS Hegemony which is a score given to a network to quantify its centrality as observed by BGP monitors. AS hegemony ranges between 0 and 1 and can be interpreted as the average fraction of paths crossing a node. The higher the AS Hegemony score, the higher the dependency on that specific network.
Each network in a country has an AS Hegemony score based on how central it is for other networks in the same country. To calculate the diversity of the upstream provider distribution at a country-level, we use the HHI again. In a perfectly diverse scenario (HHI = 0), all networks would have the same AS Hegemony score. A high HHI value means that a small number of providers are dominant in the market for upstream Internet connectivity.
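One plausible sketch of this step, assuming the per-network AS Hegemony scores are first normalised into percentage shares before the HHI is applied (the function name and normalisation detail are our assumptions, not a statement of the exact pipeline):

```python
def upstream_diversity(hegemony_scores):
    """HHI over a country's AS Hegemony distribution.

    hegemony_scores: one AS Hegemony value (0-1) per upstream network.
    Scores are normalised into percentage shares, then squared and
    summed, so equal scores -> low HHI, one dominant network -> 10,000.
    """
    total = sum(hegemony_scores)
    if total == 0:
        return 0.0
    shares = [100.0 * h / total for h in hegemony_scores]
    return sum(s * s for s in shares)
```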
List of Indicators
Table 1 shows the list of indicators, the unit of measure and the source of the information.
| Indicator | Description | Unit | Source |
|---|---|---|---|
| Network Coverage | Mobile network coverage includes 2G/3G/4G with a composite score provided by the GSMA | Score (0 - 100) | GSMA |
| Spectrum Allocation | Spectrum allocation (composite score) | Score (0 - 100) | GSMA |
| Number of IXPs | Number of IXPs per city where city has population > 300,000 for countries with population of <=20,000,000 and city has population > 1,000,000 otherwise. | # of IXPs per city | PeeringDB |
| Datacenters | Number of datacenters | # of datacenters per 10 million population | PeeringDB |
| Mobile / Fixed Latency | Median latency observed to the nearest Ookla server | ms | Ookla |
| Mobile / Fixed Jitter | Median jitter observed to the nearest Ookla server | ms | Ookla |
| Mobile / Fixed Upload Speed | Median upload throughput measured to the nearest Ookla server | Mbps | Ookla |
| Mobile / Fixed Download Speed | Median download throughput measured to the nearest Ookla server | Mbps | Ookla |
| IPv6 | IPv6 enabled end users | % of IPv6 adoption | Akamai, Facebook, Google, APNIC |
| HTTPS | Pageloads using HTTPS | % of page loads using HTTPS | Mozilla |
| DNSSEC Validation | Users validating DNSSEC | % of users validating DNSSEC | APNIC |
| DNSSEC Adoption | Is the ccTLD DNSSEC signed? | True or False | DNS |
| MANRS Readiness | MANRS score (filtering, global coordination, IRR, RPKI) | Score (0 - 100) | MANRS Observatory |
| Upstream Redundancy | Average number of upstream IPv4 providers for a country's routed ASNs | Score (0 - 100) | CAIDA, NRO, RIPEstat |
| Secure Internet Servers | Number of secure Internet servers detected on the country's networks | # of secure servers per 1000 population | World Bank |
| Global Cybersecurity Index | Global Cybersecurity Index (Composite score) | Score (0 - 100) | ITU |
| DDoS Potential | Potential DDoS threat a country represents | Percentage | Cybergreen |
| Affordability | Mobile data and voice low-consumption basket. The basket is based on a monthly usage of a minimum of 70 voice minutes, 20 SMSs and 500 MB of data using at least 3G technology. | % of GNI per capita | ITU DataHub |
| Market Concentration | Herfindahl-Hirschman Index (HHI) calculates the market concentration based on market share information per network | Score (0 - 10000) | APNIC, PeeringDB, CAIDA |
| Upstream Provider Diversity | Herfindahl-Hirschman Index (HHI) calculated over the marketshare of transit networks with marketshare greater than 1% | Score (0 - 10000) | IIJ |
| Peering Efficiency | Ratio of networks peering at IXPs vs routed ASes in a country | Percentage | PeeringDB, RIPEstat |
| Domain Count | Domains registered by ccTLD | # of domains per ccTLD per 1000 population | DomainTools |
| EGDI | E-Government Development Index | Index (0 - 100) | UN |
Data Processing
Raw data comes in different shapes and forms and often contains artifacts - some datasets are normally distributed, while others are skewed. Before running any calculation or aggregation, we need to impute missing data and identify and handle outliers.
Missing Data
The following techniques have been used to impute missing data:
| Indicator | Technique | Details |
|---|---|---|
| Affordability | Substitution | We replace missing values with data from adjacent years |
| Fixed / Mobile Internet Performance | Substitution | We substitute mobile data for fixed data and vice-versa where values are otherwise unavailable |
| Market Concentration | Backward fill | Initial gaps in data are filled with first available datapoints |
| Fixed / Mobile Internet Performance, HTTPS Adoption, Market Concentration, Secure Internet Servers | Forward fill | Gaps in data are filled with most recent earlier datapoints |
| IPv6 | Substitution | We impute a value of 0 where datapoints are otherwise unavailable |
| Spectrum Allocation, Network Coverage | Substitution | Replacement by data from a country from the same region with similar GDP per capita |
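The forward-fill and backward-fill techniques from the table above can be sketched with pandas. This is an illustration with hypothetical values, not the IRI's actual pipeline; country codes and quarters are made up:

```python
import pandas as pd

# Quarterly series for one indicator, one row per country (hypothetical data).
df = pd.DataFrame(
    {"2023Q1": [None, 40.0], "2023Q2": [55.0, None], "2023Q3": [None, 42.0]},
    index=["AA", "BB"],
)

# Forward fill: gaps take the most recent earlier datapoint.
filled = df.ffill(axis=1)
# Backward fill: initial gaps take the first available datapoint.
filled = filled.bfill(axis=1)
```

After both passes, country "AA" (which only has a 2023Q2 value) carries 55.0 across all quarters, while "BB" keeps its observed endpoints and fills the middle quarter forward.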
Re-scaling and Treating Outliers
The scales used by the indicators are also different e.g., latency can range between 0 – 500ms, while domain count for a ccTLD can range between 0 – 2,000,000. It is important to scale the data so that indicators are comparable to one another, and to avoid the issue of the size of the country (i.e., larger countries in terms of population or GDP tend to have more networks, IXPs, datacenters, etc.).
Outliers, on the other hand, tend to skew the data and can therefore affect the overall score calculation, especially because the Internet Society Pulse IRI uses the min-max normalization method to scale the data (see Min-Max Normalization below). If an indicator has a very high or very low value, this is reflected directly in the min-max calculation.
The following transformations have been applied to the listed indicators as part of the framework:
- Denomination by population size: Number of datacenters, Number of domains
- Denomination by number of cities: Number of IXPs
- Log transformation*: Secure Internet Servers, Fixed/Mobile Internet performance
* A logarithmic transformation is useful for treating skewed datasets and compressing extreme values. Not only does it scale the data, it also reduces the influence of outliers while preserving the relative differences between values.
After scaling and transforming the above indicators, we run a check on the skewness and kurtosis values of the remaining indicators. For those having a skewness > 2 or kurtosis > 3.5 (general thresholds for outlier detection), the IRI makes use of the IQR (Interquartile Range: Q3 - Q1) method to trim down outliers. The following rules are applied:
- Any value greater than Q3 + 1.5*IQR is replaced by Q3 + 1.5*IQR
- Any value less than Q1 – 1.5*IQR is replaced by Q1 – 1.5*IQR
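The two outlier treatments above can be sketched with NumPy. The function names are ours; `iqr_clip` implements exactly the two replacement rules listed, and `log_scale` uses `log1p` so zero-valued datapoints remain valid:

```python
import numpy as np

def iqr_clip(values):
    """Replace values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] with the bound."""
    v = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(v, [25, 75])
    iqr = q3 - q1
    return np.clip(v, q1 - 1.5 * iqr, q3 + 1.5 * iqr)

def log_scale(values):
    """Log transformation for heavily skewed indicators (log1p handles zeros)."""
    return np.log1p(np.asarray(values, dtype=float))
```

For the sample `[1, 2, 3, 4, 100]`, Q1 = 2 and Q3 = 4, so the upper fence is 4 + 1.5 × 2 = 7 and the extreme value 100 is replaced by 7; the in-range values are untouched.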
Min-Max Normalization
The next step, after cleaning and transforming the data, is normalization. Normalization is important because indicators are collected using different units of measurement (percentage, ms, Mbps, count, etc.). It is therefore important to rebase them to a common unit such as a 0 - 100 scale, where 100 usually refers to the strongest and 0 to the weakest value.
The method chosen was the min-max normalization which is a common technique used by multiple known indices and as opposed to other techniques such as ranking and categorical scales, min-max keeps the interval between the countries consistent.
Below are the formulas the Internet Society Pulse IRI uses to calculate the normalized value of an indicator, depending on whether it is positive or negative. For a positive indicator:

x' = 100 × (x - min(x)) / (max(x) - min(x))

Positive indicators contribute towards increasing the index; negative indicators contribute to a decrease in the score, which is why we take the complement:

x' = 100 × (max(x) - x) / (max(x) - min(x))
We chose not to use the z-score standardization technique (which centres values on the mean with unit standard deviation) as not all the indicators followed a normal distribution.
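Both min-max variants can be captured in one small function (a sketch; the function name and the choice to return 0 for a constant series are our assumptions):

```python
def min_max(values, positive=True):
    """Min-max normalize a series to a 0-100 scale.

    positive=True:  higher raw value -> higher score.
    positive=False: higher raw value -> lower score (negative indicator).
    """
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]  # degenerate case: all values equal
    if positive:
        return [100.0 * (v - lo) / span for v in values]
    return [100.0 * (hi - v) / span for v in values]
```

For example, latency (a negative indicator) of `[0, 5, 10]` ms maps to `[100, 50, 0]`, while the same series treated as a positive indicator maps to `[0, 50, 100]`.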
Finally, the IRI only includes countries for which we have data (after imputation etc.) for all indicators and for every quarter since 2019 Q1.
Weighting and Aggregation
Assigning Weights
There are two main ways to aggregate the normalized indicators into a final score:
- An ad-hoc weighting scheme.
- Statistical (optimization) techniques.
The Internet Society Pulse IRI uses a weighting scheme as it is the simpler of the two techniques and relies on input that the Internet Society gathered through surveys and discussions with subject matter experts.
During the weighting process, the importance of each indicator was also considered using a lifecycle approach. For example, in the Performance pillar, the following weights were assigned to the underlying dimensions: Fixed networks (40%) and Mobile networks (60%). Higher weight was given to mobile networks as, globally, they are more widely relied upon for Internet access.
In the Internet Society Pulse IRI framework, the indicators are grouped into different dimensions, and the dimensions into pillars, which provide their own quantitative measures of a specific aspect of Internet resilience. Below is a table showing the indicators, dimensions and pillars and their associated weights, used for the calculation of the Internet Society Pulse IRI.
The weights are revisited on an annual basis.
| Pillar | Weight (%) | Dimension | Weight (%) | Indicator | Weight (%) |
|---|---|---|---|---|---|
| Infrastructure | 25 | Mobile connectivity | 50 | Network Coverage | 70 |
| | | | | Spectrum Allocation | 30 |
| | | Enabling infrastructure | 50 | Number of IXPs | 50 |
| | | | | Datacenters | 50 |
| Performance | 25 | Fixed networks | 40 | Latency | 20 |
| | | | | Upload | 30 |
| | | | | Download | 30 |
| | | | | Jitter | 20 |
| | | Mobile networks | 60 | Latency | 20 |
| | | | | Upload | 30 |
| | | | | Download | 30 |
| | | | | Jitter | 20 |
| Enabling technologies and security | 25 | Enabling technologies | 20 | IPv6 | 30 |
| | | | | HTTPS | 70 |
| | | DNS ecosystem | 30 | DNSSEC Validation | 50 |
| | | | | DNSSEC Adoption | 50 |
| | | Routing hygiene | 30 | MANRS Readiness | 50 |
| | | | | Upstream Redundancy | 50 |
| | | Security threat | 20 | Secure Internet Servers | 30 |
| | | | | Global Cybersecurity Index | 40 |
| | | | | DDoS Potential | 30 |
| Local ecosystem & Market readiness | 25 | Market structure | 50 | Affordability | 40 |
| | | | | Market concentration | 30 |
| | | | | Upstream provider diversity | 30 |
| | | Traffic localization | 50 | Peering efficiency | 40 |
| | | | | Domain count | 30 |
| | | | | EGDI | 30 |
Aggregation
The Internet Society Pulse IRI uses a weighted sum formula at each level (indicator, dimension, and pillar) to aggregate the data into a composite score. The following formula was used:
IRI_c = Σ_p (w_p × P_p,c)

Where:

P_p,c = Σ_d (w_d × D_d,c)

And where:

D_d,c = Σ_i (w_i × I_i,c)
In simple terms, the final index 𝐼𝑅𝐼 of country "c" is the sum of the weighted pillars "P". A pillar is the weighted sum of the underlying dimensions "D" and a dimension is the weighted sum of the indicators "I", all of country "c".
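The three-level roll-up can be sketched with a single weighted-sum helper applied at each level. The scores and weights below are hypothetical, purely to show the mechanics:

```python
def weighted_sum(scores, weights):
    """Weighted sum of child scores; weights are percentages summing to 100."""
    return sum(s * w for s, w in zip(scores, weights)) / 100.0

# Hypothetical roll-up: indicators -> dimension -> pillar -> index.
dimension = weighted_sum([80.0, 60.0], [70, 30])                   # two indicators
pillar = weighted_sum([dimension, 50.0], [50, 50])                 # two dimensions
iri = weighted_sum([pillar, 70.0, 65.0, 55.0], [25, 25, 25, 25])   # four pillars
```

Because every level normalizes by the weight total, the final score stays on the same 0-100 scale as the normalized indicators.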
Feedback
For any questions, comments, and feedback on the Internet Society Pulse IRI, please contact the Internet Society Pulse team ([email protected]).
Acknowledgements
The Internet Society would like to thank the following contributors for their valuable input to the conception of the Internet Society Pulse Internet Resilience Index (IRI). Amreesh Phokeer (Internet Society), Kevin Chege (Internet Society), Assane Gueye (Carnegie Mellon University-Africa), Josiah Chavula (University of Cape Town), and Ahmed Elmokashfi (Simula Research Lab).
