It’s time to come clean – Data Clean Rooms

By Einav Mor-Samuels

If you’re a marketer, chances are that “data clean room” has come up in at least one conversation over the past few months, usually in an excited yet slightly confused tone.

What is this strange, hygienic chamber of data everyone’s talking about? 

Some refer to data clean rooms as “the Switzerland of data”, and rightfully so, because they offer a neutral, safe space where 1st-party user data can be leveraged collaboratively. In a data clean room environment, two or more parties can securely share and analyze data with full control of how, where, and when that data can be used.

In this way, brands get access to much-needed data in a regulation-compliant space that doesn’t violate consumers’ privacy. While user-level data goes into the data clean room, only aggregated insights come out, describing a commingled audience group called a cohort.

So, to get you well equipped for 2022, we’re going to take you on a journey through thick forests of unknowns and deep lakes of 1st-party data, in a series of three blogs entirely dedicated to the topic of data clean rooms.  

By the end of the series, you’ll know what they are, how they work, why marketers need them, and how they’re going to dramatically affect our ability to measure campaigns in the years to come.

But before we dive in, let’s begin with the story that led us all to this point.

It’s evolution, baby


Despite their resurgence in the past year, data clean rooms as an infrastructural concept have actually been around for a few years now.

Google did not coin the term, but it was the first company to commercialize a data clean room solution, launching its Ads Data Hub in 2017. The goal was to create a secure and private environment where advertisers could enrich their 1st-party data (from CRMs, CDPs, event logs, etc.) with user-level data contained within Google’s ecosystem, and then leverage it for Google campaigns.

A mere month later, Facebook announced its own data clean room offering for the purpose of sharing data with its customers. A coincidence? Probably not. 

But it was 2018 that truly fired the starter pistol of the user privacy era, with the GDPR and Apple’s Intelligent Tracking Prevention 2.0 becoming the new privacy sheriffs in town.

Following suit in 2019, Amazon launched a data clean room platform titled Amazon Marketing Cloud. The CCPA came into effect in early 2020, and in April 2021 the entire mobile app ecosystem gasped as Apple dropped its opt-in mechanism bomb in iOS 14.5, aka the App Tracking Transparency (ATT) framework.

Mounting user privacy laws and stricter data privacy standards have transformed the way advertisers and brands can collect and share consumer data.

Facebook announced in October of 2021 that it will no longer send user level campaign data to advertisers, but to Mobile Measurement Partners (MMPs) only, with other networks expected to join the party soon.

Between Apple’s game-changing ATT framework, Facebook’s user-level data decision, and Google’s planned phase-out of 3rd-party cookies in Chrome in 2023, the scale and breadth of data sharing are becoming increasingly limited, making campaign measurement and optimization more challenging than ever before.

So, brands are now scrambling to find new ways to gain meaningful marketing insights in a privacy-compliant way. 

Kicking off the data exchange alliance trend in 2019, Disney began collaborating with Target, Unilever joined forces with Facebook, Google and Twitter to create a cross-channel measurement model, ITV entered a partnership with Infosum in 2020, and in 2021, TransUnion launched its data collaboration with BlockGraph.

The binding element that enabled all these bountiful data collaborations that are only expected to increase? Why, Data Clean Rooms, of course.

What is a Data Clean Room anyway?

Data clean rooms allow marketers to harness the power of the combined data set while adhering to privacy regulations. Personally identifying information (PII) or attribution restricted data of individual users is not exposed to any of the involved contributors, which makes it impossible for them to single out users with unique identifiers.

PII and user-level data are processed so that they can be made available for a variety of measurement purposes, producing anonymized data that can then be cross-referenced and combined with data from different sources.

In most cases, the only outputs from the data clean room are aggregate level insights, e.g. users (plural!) who have performed action X should be offered Y. That being said, user level output can take place given the full consent of all involved parties.
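
To make the "aggregate only" idea a bit more tangible, here is a minimal Python sketch. It assumes a clean room that only releases per-action user counts and suppresses any cohort below a minimum size. The function name, the sample events, and the threshold of 3 are illustrative assumptions, not something a specific provider prescribes.

```python
MIN_COHORT_SIZE = 3  # hypothetical threshold, chosen only for this example


def aggregate_by_action(events, min_cohort_size=MIN_COHORT_SIZE):
    """Count distinct users per action and drop cohorts that are too small.

    `events` is an iterable of (user_id, action) pairs. Only action-level
    counts are returned; user IDs never appear in the output.
    """
    users_per_action = {}
    for user_id, action in events:
        users_per_action.setdefault(action, set()).add(user_id)
    return {
        action: len(users)
        for action, users in users_per_action.items()
        if len(users) >= min_cohort_size
    }


# Made-up sample data: three users added to cart, two purchased.
events = [
    ("u1", "added_to_cart"), ("u2", "added_to_cart"), ("u3", "added_to_cart"),
    ("u1", "purchased"), ("u4", "purchased"),
]
print(aggregate_by_action(events))  # {'added_to_cart': 3}
```

The "purchased" cohort has only two users, so it is withheld rather than reported.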

The key ingredient that makes data clean rooms a highly credible platform is the fact that access, availability, and usage of data are agreed upon by all data clean room parties, while data governance is enforced by the trusted data clean room provider. 

This framework ensures that one party can’t access the other’s data, which upholds the ground rule stating that individual or user level data can’t be shared between different companies without consent.

Let’s say a brand wants to share insights with Target. To facilitate that, each party needs to place its user level data into a data clean room – to see what the other already knows about audiences they have in common, e.g. reach and frequency, audience overlap, cross platform planning and distribution, purchasing behavior, and demographics.

Data clean rooms can also be used as an intermediary tool for measuring campaign performance. Instead of guesstimating audience insights, brands can actually look under Amazon or Google’s 1st-party data hood, all while being completely privacy-abiding.

In return, advertisers can get an aggregated output without individual identifiers, including segmentation and look-alike audiences, which can then be shared with a publisher, a DSP, or an ad network to inform a campaign. Alternatively, if you’re a retailer with an ad network, for example, you will be able to leverage this output when buying ads.

Making sense of it all – How does a Data Clean Room work?


A data clean room operation involves four parts: 

1 – Data ingestion

In the very beginning, 1st-party data (from CRMs, site/app, attribution, etc.) or 2nd-party data from collaborating parties (i.e. brands, partners, ad networks, publishers) is funneled into the data clean room. 

2 – Connection and enrichment

Data sets are then matched at the user level, and are made to complement one another using tools such as 3rd-party data enrichment.

3 – Analytics

At this stage, the data is analyzed for: 

  • Intersections or overlaps
  • Measurement and attribution
  • Propensity scoring

4 – Marketing applications

At the very end of the data clean room journey, aggregated data outputs enable marketers to: 

  • Build more relevant audiences
  • Optimize their customer experience and A/B testing
  • Execute cross platform planning and attribution
  • Perform reach and frequency measurement
  • Run deeper campaign analysis
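
The four stages above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: both parties already share a `match_key` column, and the field names (`purchases`, `impressions`) and sample rows are invented for the example. Real clean rooms run these steps inside a governed environment rather than in a single script.

```python
def ingest(rows):
    """Stage 1: load one party's rows, indexed by the shared match key."""
    return {row["match_key"]: row for row in rows}


def connect(advertiser, publisher):
    """Stage 2: keep only users present in both data sets and merge their fields."""
    shared = advertiser.keys() & publisher.keys()
    return [{**advertiser[key], **publisher[key]} for key in sorted(shared)]


def analyze(matched, advertiser_size):
    """Stage 3: compute overlap, reach, and average frequency for matched users."""
    reach = len(matched)
    impressions = sum(row["impressions"] for row in matched)
    return {
        "overlap_rate": reach / advertiser_size if advertiser_size else 0.0,
        "reach": reach,
        "avg_frequency": impressions / reach if reach else 0.0,
    }


# Hypothetical sample data for an advertiser and a publisher.
advertiser_rows = [
    {"match_key": "a", "purchases": 2},
    {"match_key": "b", "purchases": 0},
    {"match_key": "c", "purchases": 1},
    {"match_key": "d", "purchases": 5},
]
publisher_rows = [
    {"match_key": "b", "impressions": 4},
    {"match_key": "c", "impressions": 2},
    {"match_key": "z", "impressions": 9},
]

advertiser = ingest(advertiser_rows)
publisher = ingest(publisher_rows)

# Stage 4: only this aggregated report is handed to marketing.
report = analyze(connect(advertiser, publisher), len(advertiser))
print(report)  # {'overlap_rate': 0.5, 'reach': 2, 'avg_frequency': 3.0}
```

Note that the report contains no match keys at all, only totals and ratios.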

Now that we’ve covered the stages, how is the data actually matched?

When working with a data clean room, identifiers such as email, address, name, or mobile ID exist on both the advertiser and publisher side, which enables the two data sources to be matched successfully.
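
As a rough illustration of identifier matching, the Python sketch below normalizes email addresses and compares salted SHA-256 digests instead of raw values. Hashing with a shared salt is an assumption made for this example (the article does not say how a given provider protects identifiers), and `SHARED_SALT` and the sample addresses are made up.

```python
import hashlib


def match_key(email, salt):
    """Normalize an email address and return a salted SHA-256 hex digest."""
    normalized = email.strip().lower()
    return hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()


SHARED_SALT = "example-salt"  # hypothetical value agreed on by both parties

advertiser_emails = ["Ana@Example.com ", "bo@example.com"]
publisher_emails = ["ana@example.com", "cy@example.com"]

advertiser_keys = {match_key(email, SHARED_SALT) for email in advertiser_emails}
publisher_keys = {match_key(email, SHARED_SALT) for email in publisher_emails}

# Only the size of the overlap is reported, not the matched identifiers.
print(len(advertiser_keys & publisher_keys))  # 1
```

Normalizing before hashing matters: without it, "Ana@Example.com " and "ana@example.com" would produce different digests and the match would be missed.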

If such identifiers do not exist, advanced tools such as machine learning and probabilistic modeling could be applied to enhance matching capabilities.

Why do marketers need a Data Clean Room?


First and foremost – rising scrutiny around data privacy. 

Driven by privacy regulations and walled garden privacy initiatives (more on that in a bit), it’s becoming increasingly complex for advertisers and publishers to collect, store, analyze, and share data.

The second reason is a lack of commercial trust between parties. Handing over valuable 1st-party data outside of a data clean room is risky from both a legal and a commercial perspective.

The last reason is inefficient data synthesis: correlating data across separate data sets requires heavy lifting by data scientists, which is a costly and time-consuming endeavor.

Data Clean Rooms to the rescue!

When it comes to data privacy, all parties within a data clean room maintain full control over their data, which is usually fully encrypted throughout the process. A data clean room includes strict governance and permissions, where each party defines which of its data can be accessed and how it can be put to use.
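
One way to picture such permissions is as a policy that every query is checked against before it runs. The Python sketch below is purely hypothetical: the `Policy` class, its field names, and the operation names are invented for illustration and do not reflect any provider's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """What one party allows others to do with its data (hypothetical model)."""
    allowed_columns: frozenset
    allowed_operations: frozenset


def is_query_allowed(policy, columns, operation):
    """Allow a query only if the operation and every requested column are permitted."""
    return (
        operation in policy.allowed_operations
        and set(columns) <= policy.allowed_columns
    )


publisher_policy = Policy(
    allowed_columns=frozenset({"campaign_id", "region"}),
    allowed_operations=frozenset({"count", "sum"}),
)

print(is_query_allowed(publisher_policy, ["campaign_id"], "count"))   # True
print(is_query_allowed(publisher_policy, ["email"], "count"))         # False
print(is_query_allowed(publisher_policy, ["region"], "select_rows"))  # False
```

The second and third queries are rejected because they ask for a column or an operation the data owner never agreed to expose.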

Another important aspect that addresses the challenges mentioned above is differential privacy, which is designed to prevent a specific impression, click, or activity from being tied back to a specific user.

Last but certainly not least, data clean rooms offer privacy-centric computing, querying, and aggregated reporting, along with fit-for-purpose integrations, so data sets can be stitched together.
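
For the curious, here is a minimal Python sketch of one common differential privacy technique, the Laplace mechanism, applied to a simple count. This is an assumption for illustration only: the article does not say which mechanism any given clean room uses, and the `epsilon` value and the count of 1200 are made up.

```python
import random


def noisy_count(true_count, epsilon, rng):
    """Return a count with Laplace noise of scale 1/epsilon added.

    A counting query changes by at most 1 when one user is added or
    removed, so its sensitivity is 1 and the noise scale is 1/epsilon.
    """
    scale = 1.0 / epsilon
    # The difference of two independent exponential samples with mean
    # `scale` follows a Laplace distribution centered at 0 with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


rng = random.Random(7)  # seeded only to make the example repeatable
print(noisy_count(1200, epsilon=0.5, rng=rng))
```

A smaller `epsilon` means more noise and stronger privacy, while a larger one gives more accurate counts. The reported number stays useful in aggregate, but it no longer reveals whether any single user's activity was included.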

Types of Data Clean Rooms

We’ve discussed the business, technological and legal demands that led to the creation of data clean rooms. Now let’s have a high-level breakdown of the actual data clean room breeds that are out there:

Walled Gardens – Big Tech platforms


This group consists of closed ecosystems where the tech provider has significant control over the hardware, applications, or content.

Walled gardens were introduced by Google, Amazon, and Facebook to safely commercialize their 1st-party data, and capture ad spend from rivals. 

The upside of opting for a walled garden is the ability to support 1st-party data set enrichment with event-level data. Its downsides, however, include rigid architecture, lack of cross-platform activation of data (e.g. multi-touch attribution), lack of intercompany data collaboration, and strict query functionality.

Multi platform or neutral players

This type of data clean room includes three sub-groups, each with its own set of strengths and drawbacks.

The first sub-group, diversified providers, consists primarily of legacy businesses operating in adjacent industries like marketing applications or cloud data storage. Diversified providers offer organizations data collaboration mechanisms for gathering signals in a regulation-compliant way.

Their limited access to walled garden data is balanced out by their architectural flexibility and bespoke governance controls over the type of data and level of analysis.

The second sub-group is made up of young, small-scale data clean room providers. While offering flexibility, their 1st-party data granularity is limited, they often rely on 3rd-party infrastructure for data ingestion, and they offer a narrow pool of downstream integration options.

The third sub-group is mobile measurement partners (MMPs). Despite some limitations that could be imposed by self-reporting networks (SRNs), your mobile measurement partner can offer user-level and cross-channel data granularity, real-time conversion data, best-in-class analytics built for mobile apps’ business logic, flexible integration options, and top-quality aggregated reporting.

To assess the best data clean room provider for you, be sure to factor in your main channel (mobile, app, or web), business size, marketing needs, data structure, and internal resources.

Figure: Data clean room types comparison, assessing relative performance across the value chain

Where is the market heading?


1st-party data collection has already become a highly strategic mission, and this trajectory will continue to pick up speed in the years to come. Driven by this trend, the growing interest in privacy-preserving data collaboration beyond walled gardens has resulted in a proliferation of neutral data clean room providers. 

In fact, Gartner predicts that 80% of marketers with media budgets in excess of $1B will adopt data clean rooms by 2023. 

This is good news for our entire data-starving ecosystem, because the more diverse the options are, the easier it will be for businesses to adopt the most suitable data clean room platform for their unique needs.

And the more businesses collaborate over regulated intermediary data grounds such as data clean rooms, the easier it will be for marketers to measure, attribute, and optimize their campaigns.

Key takeaways

  • A data clean room is a secure, highly protected environment that supports the matching of two or more data sets in a transparent way, solving the technological and legal impediments to multi-party data collaboration.
  • What makes data clean rooms a highly credible platform is the fact that access, availability and usage of data are agreed upon by all data clean room parties, and data governance is enforced by the trusted data clean room provider.
  • Types of data clean rooms include walled gardens provided by tech giants such as Google, Amazon, and Facebook, and multi platform or neutral players that offer more flexibility around architecture, data governance, and integration options.
  • Each sub-group of data clean room solutions offers its own set of benefits and downsides, designed to meet varying business and data needs.
  • The growing hunger for 1st-party data for the sake of campaign measurement is resulting in a proliferation of data clean room providers. The more businesses collaborate using data clean rooms, the easier it will be for marketers to measure, attribute, and optimize their campaigns.

Einav Mor-Samuels

With extensive experience in digital marketing, Einav is a Content Writer at AppsFlyer. Over the course of the past 15 years, she has gained ample experience in the mobile marketing landscape, researching market trends, and offering tailored solutions to customers' digital problems. Einav fuels her content with data-driven insights, making even the most complex of topics accessible and clear.
