Google Is the Largest ‘Web Tracker’, but by Far Not the Only One

We performed a study about web tracking. Web tracking describes the very widespread practice of websites embedding so-called web trackers on their pages. These have various purposes, such as optimizing the loading of images or embedding additional services on a site. The best-known examples are the “Facebook buttons” and “Twitter buttons” found on many websites, but the most widespread ones are by far those from Google, which in fact operates a whole range of such “tracking services”. There are many more, however, and what they all have in common is that they allow the company that runs them to track exactly which websites each user has visited.

Web trackers have been well known for quite some time, but until now no web-scale study had been carried out to measure their extent. We therefore performed one, analyzing 200 terabytes of data on 3.5 billion web pages.

The key insights are:

  • 90% of all websites contain trackers.
  • Google tracks 24.8% of all domains on the web, globally.
  • When taking into account that not all websites are visited equally often, Google’s reach is even greater: we estimated that 50.7% of all visited pages on the web contain trackers by Google. (Ironically, we estimated this by using PageRank, a measure originally associated with Google itself.)
  • The top three tracking systems deployed on the web are all operated by Google, and use the domains google-analytics.com, google.com, and googleapis.com.
  • The top three companies that have trackers on the web are Google, Facebook, and Twitter, in that order.
  • These big companies are by far not the only ones that track: there is a long tail of trackers on the web. 50% of the tracking services are embedded on fewer than ten thousand domains, while tracking services in the top 1% of the distribution are integrated into more than a million domains.
  • Google, Twitter and Facebook are the dominant tracking companies in almost all countries in the world. The exceptions are Russia and China, where local companies take the top rank: Yandex and CNZZ, respectively. Even in Iran, Google is the most deployed tracker.
  • Websites about topics that are particularly privacy-sensitive are less likely to contain trackers than other websites, but the majority of such sites still do. For instance, 60% of all forums and other sites about mental health, addiction, sexuality, and gender identity contain trackers, compared with 90% overall.
  • Many sites contain more than one tracker. Several trackers are so common that we were able to determine clusters of trackers that are often used together by individual websites. These clusters allow us to automatically detect different types of trackers, such as advertising trackers, counters, and sharing widgets. (See the picture above.)
  • Not all trackers have the explicit purpose of tracking people: many types of systems perform useful services in which tracking is a side effect. Examples are caching images, optimizing load times, enhancing the usability of a site, and so on.
    For many of these systems, web designers may not know that they enable tracking.

Note: The study was performed using data from 2012 by the Common Crawl project. Because their crawling strategy has changed since then, more recent data actually represents a smaller part of the whole web.

The study was performed by my colleague Sebastian Schelter from the Database Systems and Information Management Group of the Technical University of Berlin, and myself. The article is published in the Journal of Web Science and is available as open access. The complete dataset is available online on Sebastian’s website, and also through the KONECT project.
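
For readers who want to experiment with the dataset, the gap between the share of domains a tracker sits on and its estimated share of visited pages can be illustrated with a small sketch. It assumes that each domain carries a PageRank score that serves as a proxy for how often it is visited. The function names, the toy domains, and the scores are hypothetical and are not taken from the study, and this is not necessarily the exact procedure used in the article.

```python
from typing import Dict, Set


def unweighted_reach(tracked: Set[str], all_domains: Set[str]) -> float:
    """Fraction of domains that embed the tracker, counting every domain equally."""
    if not all_domains:
        return 0.0
    return len(tracked & all_domains) / len(all_domains)


def weighted_reach(tracked: Set[str], pagerank: Dict[str, float]) -> float:
    """Share of the total PageRank mass held by domains that embed the tracker.

    Assumption: PageRank stands in for visit frequency.
    """
    total = sum(pagerank.values())
    if total == 0:
        return 0.0
    tracked_mass = sum(score for domain, score in pagerank.items() if domain in tracked)
    return tracked_mass / total


# Hypothetical toy data: four domains, one of them far more popular than the rest.
pagerank = {
    "popular.example": 0.5,
    "small-a.example": 0.25,
    "small-b.example": 0.125,
    "small-c.example": 0.125,
}
# The tracker is embedded only on the most popular domain.
tracked = {"popular.example"}

print(f"unweighted reach: {unweighted_reach(tracked, set(pagerank)):.2f}")
print(f"weighted reach:   {weighted_reach(tracked, pagerank):.2f}")
```

In this toy example the tracker is present on a quarter of the domains but covers half of the PageRank mass, which is the same effect that lifts Google from 24.8% of domains to an estimated 50.7% of visited pages.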