Characterizing Internet Centralization vs Regionalization

Rumaisa Habib
Guest Author | Stanford University
December 11, 2025
In short
  • A new set of standardized metrics can help measure Internet centralization and regionalization.
  • Geopolitical, historical, and linguistic factors also shape which companies and countries are relied upon for web infrastructure.
  • The research community should consider using, adapting, and building upon these metrics to study dependence and reliance in other facets of the web.

Internet centralization has become an increasingly active topic of debate. Understanding who controls the Internet has major implications for privacy, innovation, and resiliency. 

While there has been ample discussion on the topic, a consolidated metric is needed to capture how dependency is distributed across organizations and to support in-depth analysis. Moreover, we emphasize the previously overlooked notion of regionalization (geopolitical dependence on the Internet) in the discussion of centralization.

In this post, we introduce metrics that we recently presented at SIGCOMM 2025, which enable us to form a comprehensive picture of web dependence. We also discuss some of our results and explain why they are important in shaping discussions about the Internet.

Centralization vs Regionalization

In our paper, we introduce the following metrics to define centralization as well as regionalization:

Centralization: We introduce a hypothetical ‘decentralized’ scenario in which every website accessed in a country has a unique provider. Centralization is then calculated as how far the observed distribution of provider usage is from this fully decentralized case. We use the Wasserstein distance, also commonly known as Earth Mover’s Distance (EMD), to measure this difference.

Regionalization:
  • Usage: The sheer scale of a provider, measured by how many websites it serves.
  • Endemicity: The concentration of a provider’s usage in certain countries, or how specific its usage is to those countries.
  • Insularity: The fraction of websites for which a given infrastructure layer is served by a provider based in the same country.
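To make two of the regionalization metrics concrete, here is a minimal Python sketch of usage and insularity for one country and one layer. The record layout (website, provider, provider's home country), the function names, and the sample values are all hypothetical and chosen only for illustration. Usage is expressed here as a fraction of websites, which is an assumption; the paper's exact formulation may differ.

```python
from collections import Counter

def usage(records):
    """Fraction of the websites in `records` served by each provider."""
    total = len(records)
    if total == 0:
        return {}
    counts = Counter(provider for _, provider, _ in records)
    return {provider: count / total for provider, count in counts.items()}

def insularity(records, country):
    """Fraction of websites whose provider is based in `country`."""
    if not records:
        return 0.0
    domestic = sum(1 for _, _, home in records if home == country)
    return domestic / len(records)

# Hypothetical observations for websites popular in country "AA":
# (website, provider, country where the provider is based)
sample = [
    ("site1.example", "GlobalCDN", "BB"),
    ("site2.example", "GlobalCDN", "BB"),
    ("site3.example", "LocalHost", "AA"),
    ("site4.example", "OtherHost", "AA"),
]

print(usage(sample))
print(insularity(sample, "AA"))
```

In this toy sample, half of the websites are served by providers based in the same country, so the insularity of "AA" is 0.5.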

We propose that the research community use, adapt, and build upon these metrics to study dependence and reliance in other facets of the web.

Centralization

Consistent with prior work, we find that a large number of websites are hosted by a small fraction of providers. When we apply EMD to measure the degree of centralization, we find that in the most centralized case, 60% of websites in Thailand are served by a single provider, Cloudflare (Figure 1). 
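As a rough illustration of the idea behind the centralization score, the Python sketch below compares an observed provider distribution with the fully decentralized case using a one-dimensional EMD. It rests on several assumptions that are ours and not necessarily the paper's: providers are placed on a rank axis (largest first) with unit spacing, and the result is normalized by the worst case in which one provider serves every website. The function name `centralization_score` is hypothetical.

```python
from collections import Counter
from itertools import accumulate

def centralization_score(providers):
    """providers: one provider name per website, for a single country.

    Returns a value in [0, 1]: 0 when every website has a unique provider,
    1 when a single provider serves all websites.
    """
    n = len(providers)
    if n < 2:
        return 0.0

    # Observed share per provider rank, padded with zeros up to n ranks.
    counts = sorted(Counter(providers).values(), reverse=True)
    observed = [c / n for c in counts] + [0.0] * (n - len(counts))

    # Decentralized reference: n providers, each serving exactly one website.
    reference = [1.0 / n] * n

    # In one dimension with unit spacing, EMD is the summed gap between CDFs.
    emd = sum(abs(o - r)
              for o, r in zip(accumulate(observed), accumulate(reference)))

    # Worst case (one provider serves everything) gives (n - 1) / 2.
    return emd / ((n - 1) / 2)

print(centralization_score(["a", "b", "c", "d"]))  # fully decentralized
print(centralization_score(["a", "a", "b", "c"]))  # partly centralized
print(centralization_score(["a", "a", "a", "a"]))  # fully centralized
```

The choice of ground distance matters for any EMD-based score, so a different axis or normalization would yield different absolute values while preserving the same intuition.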

Figure 1: Cloudflare is the most popular provider in every country except Japan. While the most centralized countries rely heavily on Cloudflare, the least centralized countries tend to rely on a range of regional providers (represented by the red bars).

While the centralization of a country is strongly correlated with the use of two globally popular providers (Cloudflare and Amazon), we also find instances of regional providers (i.e., providers endemic to a region) making meaningful contributions to centralization. For example, 22% of websites in Bulgaria and Lithuania are hosted by SuperHosting.BG and UAB, respectively. Neither of these organizations has a global presence, which is why they are absent from prior work.
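One simple way to see the difference between a global and a regional provider is to ask how much of a provider's usage comes from its single strongest country. The Python sketch below is only a proxy for endemicity under that assumption; it is not the formula from the paper, and the function name, country codes, and share values are hypothetical.

```python
def endemicity_proxy(share_by_country):
    """share_by_country maps a country code to the fraction of that
    country's websites served by the provider.

    Returns the portion of the provider's summed shares that comes from
    its strongest country: close to 1 for a provider used almost only in
    one country, and close to 1/len(share_by_country) for an evenly
    spread one.
    """
    total = sum(share_by_country.values())
    if total == 0:
        return 0.0
    return max(share_by_country.values()) / total

# Hypothetical providers observed across four countries.
regional = {"AA": 0.2, "BB": 0.0, "CC": 0.0, "DD": 0.0}
spread = {"AA": 0.25, "BB": 0.25, "CC": 0.25, "DD": 0.25}

print(endemicity_proxy(regional))
print(endemicity_proxy(spread))
```

A provider like the hypothetical `regional` one would be invisible in a global ranking yet dominant at home, which is the pattern described above.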

Our work highlights that regional providers are an essential component in the discussion of centralization.

We additionally find that website certificates are highly centralized: seven Certificate Authorities (CAs), most of which are based in the U.S., account for 98% of websites. Centralization on a small number of CAs is not inherently bad for the security of the Internet, since many small CAs struggle with operational requirements, and a moderate degree of centralization reduces the attack surface of the WebPKI. Regardless, one clear consequence is that the vast majority of countries are dependent on U.S. infrastructure.

Regionalization

Centralization alone does not paint a complete picture of dependence. Stepping back from specific providers, we see patterns emerge between countries. 

Geopolitical, historical, and linguistic factors have shaped which countries are relied upon for web infrastructure. Indeed, while we observe a significant dependence on U.S. infrastructure across the board (due to the presence of large global providers in this region), we note that this has decreased in Russia following the Russian invasion of Ukraine in 2022. 

Other regional patterns also emerge when we examine historical contexts. For example, many post-Soviet states (excluding Ukraine) still rely on Russian providers, and a history of colonization can explain the existing reliance of African countries on French providers. 

Shared languages may also explain why people in a particular country commonly access websites hosted in another country. This may be why we see significant usage of Iranian providers in Afghanistan and German providers in Austria.

Greater Understanding Leads to Better Design

The geopolitical dynamics that we observe shed light on how we can better design for and evaluate networks globally. For instance, while major providers may roll out new protocols and optimizations to improve user experience, these benefits may not be felt uniformly. In many countries, these deployments affect only a modest fraction of the sites users frequent. Moreover, researchers studying web resilience would benefit from understanding how availability and performance could be impacted not only by a provider outage, but also by a geopolitical schism between two countries.

You can read our full paper to learn more about our data collection and classification methodology, as well as additional insights, such as the centralization of Top-Level Domains.

Rumaisa Habib is a third-year PhD candidate at Stanford University, working under the supervision of Zakir Durumeric in the Empirical Security Research Group.

The views expressed by the authors of this blog are their own and do not necessarily reflect the views of the Internet Society.