IAB Europe can’t audit what 1000+ companies that use its TCF system do with our personal data

Thanks to Dr Panos Papadopoulos, Dr Krzysztof Franaszek, Zach Edwards, Dr Augustine Fou, and Pete Snyder for their insights. Any mistakes are my own.


20 January 2022

This note describes why IAB Europe’s new “vendor compliance programme” is unable to establish transparency and control for the TCF.

Summary 

Despite IAB Europe's claims, there is no way to audit what happens to personal data after it has been broadcast to thousands of companies, hundreds of billions of times a day. IAB Europe's new "Vendor Compliance Programme" cannot establish transparency and control for the TCF. RTB and the TCF remain a data protection free zone. 

The table below summarises why IAB Europe cannot audit what companies who use its TCF system do with personal data.

Background 

The "Real-Time Bidding" (RTB) online advertising auction system decides what advertisements appear in most ad slots on the Internet. There are hundreds of billions of RTB auctions every day.[1] Each RTB auction broadcasts intimate personal data about us as we use websites and apps.[2] This includes very private aspects of your life.

The GDPR prohibits the processing of personal data unless it is kept secure.[3] However, there is no security in the RTB system, as I have warned for the last half decade.[4]

Shortly before the GDPR went into effect in mid-2018, a tracking industry trade body called “IAB Europe” introduced a new system of misleading consent pop-ups as a gesture toward compliance. This system now plagues Europeans every day on almost all websites and apps viewed in Europe.

IAB Europe calls its “consent” system the “Transparency & Consent Framework” (TCF). The TCF purports to give people control over how their data are used by 1,058 technology companies[5] in the online tracking and data industry. But in fact, the inherent insecurity of RTB means that it does not matter what people click.

There are no technical controls in place to limit what companies do with the personal data they get from RTB broadcasts. Rather than control where personal data goes, and what happens to it, the TCF is an uncontrolled honour system. According to IAB TechLab’s documentation, “there is no technical way to limit the way data is used after the data is received”.[6] In other words, RTB and the TCF are a data free-for-all.

Until late 2021 IAB Europe denied responsibility for data protection in its TCF system.[7] However, a group of complainants[8] coordinated by ICCL has taken action, and prompted regulatory intervention by the Belgian Data Protection Authority, which is IAB Europe’s lead GDPR enforcer.

In response to this pressure, IAB Europe now claims that it will attempt to monitor for the first time whether companies honour or ignore TCF requests about how data is used.[9] IAB Europe calls this the “TCF Vendor Compliance Programme”. As we show below, this is technically impossible.

IAB Europe’s TCF Vendor Compliance Programme appears to operate as follows:

“technical auditing of Vendors on Publishers properties by accessing and crawling websites implementing a TCF CMP and, where the vendor is integrated, to scan tags and analyse URL, headers, postdata, and cookies of https requests”.[10]

This “crawling” approach examines what happens on an end-user device. 

IAB Europe has not said that it will audit RTB broadcasts of personal data, despite this being the primary security concern. One reason for this absence may be that such an audit is technically impossible.

Technical impossibility of auditing

A. The majority of RTB data traffic is impossible to see or audit

A crawler on an end-user device cannot see what happens between companies’ servers. It is impossible for IAB Europe to independently monitor the movement of RTB data behind the scenes between companies’ servers. It cannot observe what is sent in the bid request, which companies it was sent to, who those companies then passed the data on to, or what each company did with it. This “server-side” problem is insoluble, and is the consequence of RTB’s inherent insecurity.

There is no way to control or provide transparency of how data are used after they are broadcast in the RTB system, because of four factors.

First, ad exchanges broadcast personal data to a very large number of companies (called Demand Side Platforms (DSPs)) that represent potential advertisers, in order to solicit bids for the opportunity to show their ad to the particular person viewing the website or app.

This broadcast happens behind the scenes, between companies’ servers, where IAB Europe's crawler cannot observe it.

RTB broadcasts can be very broad indeed. Microsoft’s advertising exchange (called Xandr) claims the right to broadcast RTB data to 1,647 other companies.[11] Google’s “Authorized Buyers” advertising exchange claims the right to broadcast RTB data to 1,057 other companies.[12] There are many such ad exchanges, and this scale of data sharing is not unusual. According to IAB Europe’s documentation, “thousands” of companies may receive the data.[13]

Second, an RTB auction for an individual advertising slot on a website is often not just one single auction, but an auction of auctions in which several ad exchanges compete to find the best bid. Thus, for one single advertising slot shown on a single web page, several ad exchanges often each broadcast personal data about the person viewing the web page to an even larger number of other companies.

This means that data protection depends on whether hundreds or thousands of companies can be trusted to honour a TCF request, every time there is an RTB auction. The TCF has no way of verifying whether they do so. 

Third, hundreds of billions of auctions happen every day, each involving a broadcast of data to many companies. As a result, even very small companies receive very large volumes of sensitive data.[14]
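The visibility gap described above can be illustrated with a toy model. This is a minimal sketch, not a description of any real exchange: the exchange names, the number of exchanges, and the figure of 100 bidders per exchange are all hypothetical, and the model assumes the browser contacts each exchange directly. It only shows that a crawler on the device records the requests the device itself sends, while the broadcasts between servers never appear in its log.

```python
from dataclasses import dataclass

BROWSER = "browser"


@dataclass(frozen=True)
class Transfer:
    """One movement of bid-request data from a sender to a receiver."""
    sender: str
    receiver: str


def simulate_auction(exchanges, bidders_per_exchange):
    """Return every data transfer triggered by one ad slot (toy model)."""
    transfers = []
    for exchange in exchanges:
        # The only hop that starts on the end-user device.
        transfers.append(Transfer(BROWSER, exchange))
        # Server-to-server broadcast of the bid request to each bidder.
        for i in range(bidders_per_exchange):
            transfers.append(Transfer(exchange, f"{exchange}-bidder-{i}"))
    return transfers


def visible_to_crawler(transfers):
    """A crawler on the device only sees requests the device itself sends."""
    return [t for t in transfers if t.sender == BROWSER]


# Hypothetical figures: three exchanges, each broadcasting to 100 bidders.
transfers = simulate_auction(
    ["exchange-a", "exchange-b", "exchange-c"], bidders_per_exchange=100
)
seen = visible_to_crawler(transfers)
print(f"{len(seen)} of {len(transfers)} transfers are observable from the device")
```

With these assumed numbers the crawler observes 3 of 303 transfers. Any onward sharing by the bidders themselves would add further hops that are equally invisible from the device.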

Hundreds of billions of RTB broadcasts. Every day.

| Ad exchange | Daily RTB data broadcasts | Recipients of the data |
| --- | --- | --- |
| Google | Unknown (active on 9.8 million websites)[15] | May send data to 1,042 companies[16] |
| AT&T (Xandr) | 131 billion data broadcasts, daily | May send data to 1,647 companies[17] |
| Index Exchange | 120 billion data broadcasts, daily[18] | Unknown |
| Pubmatic | 100 billion data broadcasts, daily[19] | Unknown |
| OpenX | 100 billion data broadcasts, daily[20] | Unknown |
| Verizon Media | ? (600 billion ad requests, daily) | Unknown |
| Smaato | ? (60 billion ad requests, daily) | May send data to 1,044 companies |
| Facebook | Unknown | Unknown |
| Amazon | Unknown | Unknown |
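To give a sense of scale, the daily figures listed above can be converted into per-second rates. This is simple arithmetic on the four exchanges for which a daily broadcast figure is given. It assumes, for illustration only, that traffic is spread evenly over the day, which real traffic is not.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Daily data broadcasts, in billions, for the exchanges with a stated figure.
daily_broadcasts_billions = {
    "AT&T (Xandr)": 131,
    "Index Exchange": 120,
    "Pubmatic": 100,
    "OpenX": 100,
}


def per_second(billions_per_day):
    """Average broadcasts per second, assuming an even spread over the day."""
    return billions_per_day * 1_000_000_000 / SECONDS_PER_DAY


for name, billions in daily_broadcasts_billions.items():
    print(f"{name}: about {per_second(billions):,.0f} broadcasts per second")

total = sum(daily_broadcasts_billions.values())
print(f"Total for these four exchanges: {total} billion broadcasts per day")
```

Each of these exchanges averages more than a million broadcasts per second, and the four together account for 451 billion per day, before counting the exchanges whose volumes are unknown.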

Nor can IAB Europe audit related server-to-server operations, such as ‘clean room’ and other sharing of tracking profiles about large numbers of people. 

 

B. Limits to observable data on end-user device

IAB Europe can only observe what happens on the end-user devices that its crawler controls. However, there are problems here, too. In theory, each of the items identified by IAB Europe is visible to an end-user device that loads a website, and is typically trivial to monitor. However, all, except for cookies, can be obfuscated or encrypted to frustrate auditors.

The following problems apply, and are likely to apply more acutely over time if it becomes necessary for companies to adapt to evade auditing:

  1. What is visible on an end-user device can be encrypted and obfuscated. Network calls may point to seemingly benign end points, but actually be routed surreptitiously in a way that obscures who is involved and what information they are receiving.[21]
  2. “Header bidding”, which might be assumed to be observable to IAB Europe's crawler because it inspects HTTP headers, is often hidden. This is because after the initial network request is sent using the header, the rest of the data processing occurs between companies’ servers.[22]
  3. When companies under audit detect the crawler they can refrain from treating that particular end-user device or set of devices the way they treat others. 
  4. IAB Europe’s initial request for proposals did not refer to many of the tracking technologies that should be included.
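The first problem in this list, network calls that point to seemingly benign end points, can be illustrated with the CNAME cloaking method mentioned in note 21. The sketch below is illustrative only. It uses a hard-coded, hypothetical table of DNS CNAME records rather than real DNS queries, the hostname "publisher.tracking-company.com" is invented, and it approximates a site's registrable domain by its last two labels, which is wrong for suffixes such as "co.uk" (a real tool would need the Public Suffix List).

```python
# Hypothetical DNS CNAME records; a real check would query DNS instead.
CNAME_RECORDS = {
    "example.publisher.com": "publisher.tracking-company.com",
}


def resolve_cname_chain(hostname, records, max_hops=10):
    """Follow CNAME records from hostname, stopping after max_hops."""
    chain = [hostname]
    while chain[-1] in records and len(chain) <= max_hops:
        chain.append(records[chain[-1]])
    return chain


def site_of(hostname):
    """Naive approximation of the registrable domain: the last two labels."""
    return ".".join(hostname.split(".")[-2:])


def is_cloaked(hostname, records):
    """True if the hostname finally resolves to a different site."""
    chain = resolve_cname_chain(hostname, records)
    return site_of(chain[0]) != site_of(chain[-1])


for host in ["example.publisher.com", "www.publisher.com"]:
    print(host, "cloaked" if is_cloaked(host, CNAME_RECORDS) else "not cloaked")
```

An auditor who inspects only URLs, headers, and cookies sees a request to "example.publisher.com" and nothing more. Only an additional DNS lookup reveals that the end point belongs to a different company.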

In addition, IAB Europe says it will attempt this audit for “top websites in key markets”.[23] This is a fraction of what would be required if such an audit were possible. However, since the audit is entirely impossible, the number of websites is moot.

Also moot is the otherwise troubling fact that no mobile apps appear to be included.

Conclusion

The lack of transparency and control in the TCF is therefore unchanged by IAB Europe's new “TCF Vendor Compliance Programme”. It remains impossible for a person to know which companies actually receive their data, or what those companies will do with it, or for a person to enforce their rights under the GDPR over that data.

Notes

[1] See table “Number of bid requests per day”, in later section.

[2] Data sent can include what you are reading or watching or listening to, inferences about your sexual preferences, religious faith, ethnicity, health conditions, your political views, and where you physically are - sometimes right up to your GPS coordinates. RTB data also includes ID codes about us that help build intimate profiles about us by tying data together over time. See “Report from Dr Johnny Ryan – Behavioural advertising and personal data”, 12 September 2018, p. 2, 12-32 (URL: https://brave.com/static-assets/files/Behavioural-advertising-and-personal-data.pdf); see Lawsuit Ryan v IAB TechLab and others, at Hamburg District Court, 15 April 2021 (URL: https://www.iccl.ie/wp-content/uploads/2021/06/ENGLISH-TRANSLATION-MACHINE-TRANSLATED-COMPRESSED-Schriftsatz-an-das-Landgericht-Hamburg.pdf), p. 15-69.

[3] Article 5(1)(f) of the GDPR requires that personal data be

“processed in a manner that ensures appropriate security of the personal data, including protection against unauthorised or unlawful processing and against accidental loss, destruction or damage, using appropriate technical or organisational measures (‘integrity and confidentiality’).”

Article 32 elaborates on the requirements of this integrity and confidentiality principle.

[4] See for example "The 3 biggest challenges in GDPR for online media & advertising", PageFair Insider, 19 July 2017 (URL: (archived copy) https://assortedmaterials.com/2017/07/19/gdpr-3-deep-challenges/), “Report from Dr Johnny Ryan – Behavioural advertising and personal data”, 12 September 2018 (URL: https://brave.com/static-assets/files/Behavioural-advertising-and-personal-data.pdf); see Lawsuit Ryan v IAB TechLab and others, at Hamburg District Court, 15 April 2021 (URL: https://www.iccl.ie/wp-content/uploads/2021/06/ENGLISH-TRANSLATION-MACHINE-TRANSLATED-COMPRESSED-Schriftsatz-an-das-Landgericht-Hamburg.pdf), pp 71-89.

[5] The number as of 4 January 2021. “Vendor List TCF v2.0”, IAB Europe (URL: https://iabeurope.eu/vendor-list-tcf-v2-0/).

[6] Pubvendors.json v1.0: Transparency & Consent Framework, IAB Europe, 25 April 2018 (URL: https://github.com/InteractiveAdvertisingBureau/GDPR-Transparency-and-Consent-Framework/blob/master/pubvendors.json%20v1.0%20Draft%20for%20Public%20Comment.md#liability).

[7] “IAB Europe has not considered itself to be a data controller in the context of the TCF. Therefore, it has naturally not fulfilled certain obligations that accrue to data controllers under the Regulation.”
"Update On The Belgian Data Protection Authority’s Investigation Of IAB Europe", IAB Europe, 5 November 2021 (URL: https://iabeurope.eu/all-news/update-on-the-belgian-data-protection-authoritys-investigation-of-iab-europe/).

[8] The group of complainants includes: Panoptykon Foundation (Poland), Stichting Bits of Freedom (the Netherlands), Ligue des Droits Humains (Belgium), Dr Jef Ausloos, Dr Pierre Dewitte, and ICCL’s Dr Johnny Ryan. The Belgian procedure builds on the campaign to end the vast data breach at the heart of online advertising that Dr Ryan initiated in 2018.

[9] "IAB Europe Launches New TCF Vendor Compliance Programme", IAB Europe, 26 August 2021 (URL: https://iabeurope.eu/blog/iab-europe-launches-new-tcf-vendor-compliance-programme/).

[10] “Tags” are code, HTML or JavaScript, on a website or app that runs on the user’s device. “URLs” are web addresses, and can include requests and information about the device making the request. “Headers” are parts of a website that tell the user’s device to take an action. They are sometimes used in “header bidding” to launch an RTB auction that has several sub-auctions. “Postdata” is information that a user types into a form on a website. “Cookies” are files that a user’s device stores at the request of a website. See "Request for Proposal – TCF Vendor Compliance Programme", IAB Europe, 8 February 2021 (URL: https://iabeurope.eu/blog/request-for-proposal-tcf-vendor-compliance-programme/).

[11] “Supply partners”, Xandr (archived copy from 29 March 2021, as the original has been removed from public view by Xandr. URL: http://www.iccl.ie/wp-content/uploads/2022/01/K13-24032021-service_policies_3-24-2021.pdf).

[12] "Ad technology providers", Google Ad Manager (URL: https://support.google.com/admanager/answer/9012903, last checked 7 January 2022).

[13] “Pubvendors.json v1.0: Transparency & Consent Framework”, IAB Europe, May 2018 (URL:  https://github.com/InteractiveAdvertisingBureau/GDPR-Transparency-and-Consent-Framework/blob/master/pubvendors.json%20v1.0%20Draft%20for%20Public%20Comment.md).

[14] For example, in 2018, the French data protection supervisory authority, the CNIL, revealed that one small DSP called Vectaury had built a collection of data about 68.6 million individuals in just one year from RTB broadcasts. Vectaury had only €3.2 million turnover in the previous year. See “Décision n° MED 2018-042 du 30 octobre 2018 mettant en demeure la société X”, CNIL, 30 October 2018 (URL: https://www.legifrance.gouv.fr/cnil/id/CNILTEXT000037594451/).

[15] Google's Advertising Exchange is used on 13.5 million websites. Data from BuiltWith.com (URL: https://trends.builtwith.com/ads/DoubleClick.Net). Google itself has not published figures on daily auction volumes, but an analysis by the UK Competition Authority shows that it is by far the largest advertising exchange. See “Online platforms and digital advertising Market study final report”, UK Competition & Markets Authority, 1 June 2020 (URL: https://assets.publishing.service.gov.uk/media/5fa557668fa8f5788db46efc/Final_report_Digital_ALT_TEXT.pdf), p. 20. The CMA concludes that it has market power in OpenRTB, which means that its dominance is such that it can charge higher prices.

[16] "Ad technology providers", Google Ad Manager (URL: https://support.google.com/admanager/answer/9012903, last checked 7 January 2022).

[17] “Supply partners”, Xandr (archived copy from 29 March 2021, as the original has been removed from public view by Xandr. URL: http://www.iccl.ie/wp-content/uploads/2022/01/K13-24032021-service_policies_3-24-2021.pdf).

[18]  “IX Traffic Filter: Meeting 2020’s Business Challenges with Machine Learning”, Index Exchange, 6 August 2020 (URL: www.indexexchange.com/ix-traffic-filter-meeting-2020s-business-challenges-with-machine).

[19]  “Optimising data processing at scale”, PubMatic, 10 June 2020 (URL: https://pubmatic.com/blog/optimizing-data-processing-at-scale).

[20] “OpenX: Power the future of advertising with Google Cloud”, Google Cloud (URL: https://cloud.google.com/customers/openx).

[21] One method is CNAME cloaking, which operates as follows. The end point “example.publisher.com” appears to belong to the publisher, but a DNS CNAME record points it to servers controlled by an advertising technology firm whose own domain is “tracking-company.com”.

[22] See for example “Server Side Header Bidding Explained”, Sovrn (URL: https://knowledge.sovrn.com/server-side-header-bidding-explained).

[23] "IAB Europe Launches New TCF Vendor Compliance Programme", IAB Europe, 26 August 2021 (URL: https://iabeurope.eu/blog/iab-europe-launches-new-tcf-vendor-compliance-programme/).