Differential privacy (DP)

Differential privacy is a system for publicly sharing information about a dataset by describing the characteristics of groups within the dataset, without compromising individual-level information within the dataset.

What is differential privacy?

Simply put, differential privacy takes data anonymization and aggregation a few steps further: it makes it practically impossible to tell from a computation's output whether any particular individual's data was included.

The idea behind DP is to eliminate the possibility of inferring much about any single individual. This upholds today's strict user privacy regulations while still enabling marketers to measure, optimize, and make data-driven decisions.

For example, DP is used by some government agencies to publish demographic information, while ensuring full confidentiality of survey responses. It enables a safe environment for collecting user-level data by controlling which data goes in, which goes out, and limits its visibility even for internal analysis.

Notable real-world adoptions of DP include Google, which began deploying differential privacy measures for sharing historical traffic statistics in 2015; Apple, which started using DP in iOS 10 in 2016 to improve its intelligent personal assistant technology; and LinkedIn, which applied DP to advertiser queries in 2020.

What are the benefits of leveraging differential privacy?

Differential privacy addresses the problems that arise whenever sensitive data, the parties publishing it, and potential adversaries are all in the picture.

A computation is differentially private if an observer seeing its output cannot tell whether a particular individual's information was used. Because of this, differentially private algorithms are resilient to adaptive identification and re-identification attacks.
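To make the definition concrete, here is a minimal sketch of the classic Laplace mechanism applied to a count query. This is an illustrative toy, not any production library; the function names, dataset, and the epsilon value are assumptions made for the example:

```python
import math
import random

def laplace_noise(scale):
    # Sample Laplace(0, scale) via an inverse-CDF transform of a uniform draw
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_count(records, predicate, epsilon):
    # A count query has sensitivity 1: adding or removing one person's
    # record changes the true count by at most 1, so Laplace noise with
    # scale 1/epsilon yields epsilon-differential privacy.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

# Two "neighboring" datasets, differing in a single individual, produce
# statistically similar noisy outputs, so an observer cannot tell from
# the result whether that individual participated.
with_one_more    = [25, 34, 41, 29, 52]
without_that_one = [25, 34, 41, 29]
print(private_count(with_one_more, lambda age: age >= 30, epsilon=0.5))
print(private_count(without_that_one, lambda age: age >= 30, epsilon=0.5))
```

Smaller epsilon values mean more noise and stronger privacy; larger values trade privacy for accuracy.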

DP is also at the core of aggregated measurement and attribution, alongside deterministic (where allowed) and non-deterministic methods, probabilistic modeling, and signals from Apple's SKAdNetwork.

How does differential privacy work?

Differential privacy works by adding carefully calibrated statistical "noise" to a dataset, and it's this noise that makes user-level identification practically impossible.

When a sufficiently large number of individuals participate in a DP analysis, the noise largely cancels itself out, and the computed average lands close to the true average.

The result: we still have data on average user behavior, all while keeping each user's privacy intact.
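The averaging effect can be simulated in a few lines of Python. The noise scale and the simulated per-user values below are arbitrary assumptions chosen for illustration:

```python
import math
import random

def laplace_noise(scale):
    # Sample Laplace(0, scale) via an inverse-CDF transform of a uniform draw
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

random.seed(42)  # make the demo reproducible

# 100,000 hypothetical users, each with a true value (e.g. session length)
true_values = [random.gauss(50, 10) for _ in range(100_000)]

# Each individual report is perturbed with heavy noise, so no single
# report reveals much about its user...
noisy_reports = [v + laplace_noise(scale=20.0) for v in true_values]

# ...but across many users the noise cancels out, and the aggregate
# average stays close to the true average.
true_avg = sum(true_values) / len(true_values)
noisy_avg = sum(noisy_reports) / len(noisy_reports)
print(f"true average:  {true_avg:.2f}")
print(f"noisy average: {noisy_avg:.2f}")
```

With a noise scale of 20, any single report can be off by tens of units, yet the average over 100,000 reports typically differs from the truth by well under one unit.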

Key takeaways

  • Differential privacy is a system for publicly sharing information about a dataset by describing the characteristics of groups within the dataset, without compromising individual-level information within the dataset. 
  • The idea behind DP is to eliminate the possibility of inferring much about any single individual. This upholds today's strict user privacy regulations while still enabling marketers to measure, optimize, and make data-driven decisions. 
  • Differentially private algorithms are resilient to adaptive identification and re-identification attacks, and are also at the core of aggregated measurement and attribution. 
