How does your location affect your online privacy?

Joe Bosso 5 Jan 2022

Each country and region has its own notion of online privacy and approach to structuring a privacy policy

Part of the ongoing collaboration between Diffbot and Avast is research into privacy issues for consumers. As part of this research, we’ve become interested in how an individual’s privacy could vary around the world depending on the location from which they connect to the internet. We wanted to know whether an individual's data is inherently more private if they reside in one country rather than another.

Our starting point for this research question was the Tranco list of domains, which ranks top sites by popularity. We analyzed the top one hundred thousand websites using Diffbot’s knowledge graph and combined the results with publicly available information about the domains.

First, we studied the relationship between a country having a free press and the privacy policies of domains in that country. Factors such as whether websites included a privacy policy at all were aggregated by country and compared with the Press Freedom Index, and we found that they were correlated (the correlation coefficient was 0.75). This makes sense, as a free press corps demands increased transparency from corporations and government.
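To make the correlation step concrete, here is a minimal sketch of how a Pearson correlation coefficient between two per-country series can be computed. The article does not say which correlation measure was used, so Pearson is an assumption, and the function name and the numbers in the usage example are invented for illustration only; they are not the study's data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-country values, NOT taken from the study:
# a press freedom score and the share of sites that publish a privacy policy.
press_freedom = [85.0, 72.0, 60.0, 41.0, 30.0]
policy_share = [0.90, 0.80, 0.75, 0.50, 0.45]
print(round(pearson(press_freedom, policy_share), 2))
```

A coefficient close to 1 means the two series rise together, a value close to -1 means one falls as the other rises, and a value near 0 means there is no linear relationship.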


Next, we looked at how the privacy policies themselves varied between countries and found that German sites had the longest privacy policies of any country. In general, European privacy policies tend to be longer than those of most other countries. This is likely due to the increased transparency required by their data protection rules, like the GDPR. An interesting observation was that South Africa stood out among African countries, which may be a result of its privacy law, the POPI Act.


Another factor that we looked at was the readability of English privacy policies, in other words, how easy a privacy policy is to read. We based that metric on the well-known Flesch reading ease formula and also considered the length of the sentences (by number of words) as well as the number of syllables per word. The intuition is simple: shorter sentences with simpler words are easier to read.
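For readers who want to see the metric in action, the standard Flesch reading ease score is 206.835 - 1.015 × (words per sentence) - 84.6 × (syllables per word). The sketch below applies it to a string of English text. The sentence splitting and the syllable counter are rough heuristics chosen for this example, not the method used in the study, and the function names are hypothetical.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: count groups of consecutive vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing silent "e" (as in "safe") as not adding a syllable.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text):
    """Flesch reading ease score; higher values mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


simple = "We keep your data safe. We do not sell it."
dense = ("Notwithstanding applicable jurisdictional requirements, organizational "
         "subsidiaries may disseminate personally identifiable information.")
print(flesch_reading_ease(simple) > flesch_reading_ease(dense))
```

Both terms that lower the score grow with sentence length and word complexity, which is why long sentences full of polysyllabic legal vocabulary score poorly.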

We found that Turkmenistan has the easiest privacy policies to read, partly because its overall number of policies is small and the policies it does have are short and simple. In contrast, South Korea, Japan, and Thailand had the hardest policies to read. When the readability scores are plotted on a world map, higher scores indicate that the policies are easier to read.


In addition to the length and readability of privacy policies, we also analyzed whether the English used within them was vague rather than clear and concise, since clear language is much easier for most people to understand. Vagueness is estimated as the rate of privacy policy segments containing vague words. Vague words were labeled by annotators in a previous study by a different research group. We found that European countries had less vague privacy policies than the US, which was expected, as the GDPR requires the use of clear language.
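The vagueness estimate described above can be sketched as the fraction of segments that contain at least one vague word. In this sketch, the word list is a small placeholder invented for illustration and is not the annotated list from the earlier study. Treating each segment as a plain string is also an assumption.

```python
import re

# Placeholder vocabulary for illustration only; the study relied on words
# labeled by annotators in earlier research, which are not reproduced here.
VAGUE_WORDS = frozenset({"may", "might", "some", "certain", "generally", "sometimes"})


def is_vague(segment, vague_words=VAGUE_WORDS):
    """True if the segment contains at least one word from the vague list."""
    tokens = re.findall(r"[a-z']+", segment.lower())
    return any(token in vague_words for token in tokens)


def vagueness_rate(segments, vague_words=VAGUE_WORDS):
    """Share of segments (between 0 and 1) that contain a vague word."""
    if not segments:
        raise ValueError("need at least one segment")
    return sum(is_vague(s, vague_words) for s in segments) / len(segments)


policy_segments = [
    "We may share some data with partners.",
    "We delete your account data after 30 days.",
]
print(vagueness_rate(policy_segments))
```

Matching on whole tokens rather than substrings keeps a word like "maybe" from being counted as "may".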


Countries with privacy-specific legislation, like those in Europe, tend to provide greater privacy for their residents. Legislation often requires the presence of a privacy policy in the first place. Germany is the leader here, with many European countries also scoring well.


Based on our findings, we can conclude that the location of the website domain does have an impact on privacy policies. There is no general, global privacy policy; instead, each country and region has its own notion of online privacy and its own approach to structuring a privacy policy.

That being said, you can't assume that what you learned from a privacy policy in one country will also apply in another. One common insight is that privacy policies are hard to read overall. The majority of privacy policies require a college-level education to understand, which means most people will have a difficult time understanding what they're agreeing to.
