Necessary cautions when blocking fraud in real-time #foolsnomore
As an attribution provider, we are trusted by both marketers and media sources to measure and report their performance.
While marketers would like to block 100% of ad fraud in real-time, there is a risk in overzealous blocking as well – damaging publisher relationships, ad network revenues, and data accuracy.
Today, we will take a deep dive into how we identify and validate every fraud signature, delivering optimal coverage without compromising your data accuracy.
Most of our R&D team operates on an agile release process, testing and releasing product updates in standard sprints.
In parallel, our fraud solution team, Protect360, operates more like an antivirus or cybersecurity solution than a standard SaaS product team. The only difference is that the team is expected to take action against fraud in near real-time, blocking emerging fraud sources without compromising data integrity.
This requires a far more agile and vigorous update cycle.
Transparently blocking mobile ad fraud in real-time saves marketers both time and money.
However, behind the scenes, maintaining real-time protection against bots, click flooding, install hijacking, behavioral anomalies, and device farms is challenging.
Blocking these threats requires the following:
- A massive data set for analysis
- Supervised and unsupervised machine learning that can detect new and emerging anomalies and fraud patterns
- Extensive experience in the mobile advertising, app marketing, and fraud industries
Not every anomaly is fraud
Given the fragmented nature of the mobile ecosystem, there are dozens of points of failure, any one of which could set off false fraud alarms.
Unsupervised machine learning is great for detecting anomalies, but treating anomalies as fraud would be dangerous and misleading. Something as basic as a server delay or an external API bug can result in abnormal activity, even though no fraud has occurred.
Knowing when and what to block without compromising data quality and integrity is very difficult.
Every new fraud signature suggested by our automated system undergoes extensive testing and validation by our data scientists. In order to validate that an anomaly was the result of fraud and not a technical error or edge case, our data scientists must consider dozens of potential points of failure, comparing data across thousands of campaigns and advertisers.
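To make the distinction between an anomaly and fraud more concrete, here is a minimal sketch of one possible triage step. It is purely illustrative and is not Protect360's actual logic: the function names (`robust_z`, `triage`), the thresholds, and the sample install counts are all hypothetical. The assumption behind it is that an anomaly appearing across most media sources at the same moment looks more like a technical issue (for example, a server delay), while an anomaly isolated to a single source is a better candidate for fraud review by a data scientist.

```python
import math
from statistics import median


def robust_z(history, latest):
    """Score `latest` against `history` using the median and MAD.

    Median-based scores are less distorted by past outliers than
    mean/standard-deviation scores.
    """
    med = median(history)
    mad = median(abs(x - med) for x in history)
    if mad == 0:
        # No variation in the history: any deviation is extreme.
        return 0.0 if latest == med else math.copysign(math.inf, latest - med)
    return 0.6745 * (latest - med) / mad


def triage(series_by_source, z_threshold=3.5, widespread_share=0.5):
    """Label anomalous sources (hypothetical thresholds).

    `series_by_source` maps a source name to hourly install counts,
    where the last value is the hour under inspection.
    """
    anomalous = []
    for source, values in series_by_source.items():
        *history, latest = values
        if abs(robust_z(history, latest)) > z_threshold:
            anomalous.append(source)

    if not anomalous:
        return {}

    # If most sources are anomalous at once, suspect infrastructure, not fraud.
    share = len(anomalous) / len(series_by_source)
    label = (
        "likely_technical_issue"
        if share >= widespread_share
        else "needs_fraud_review"
    )
    return {source: label for source in anomalous}


# Made-up data: one source spikes while the others behave normally.
isolated = {
    "source_a": [100, 102, 98, 101, 99, 100],
    "source_b": [50, 52, 49, 51, 50, 51],
    "source_c": [200, 198, 202, 201, 199, 900],
}

# Made-up data: every source collapses in the same hour.
outage = {
    "source_a": [100, 102, 98, 101, 99, 5],
    "source_b": [50, 52, 49, 51, 50, 2],
    "source_c": [200, 198, 202, 201, 199, 10],
}

print(triage(isolated))  # {'source_c': 'needs_fraud_review'}
print(triage(outage))    # all three sources: 'likely_technical_issue'
```

Even in this toy version, neither label is a verdict. A flagged source is only a candidate, and the validation described above is what decides whether a signature is ever blocked.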
Given the speed and agility of today’s mobile fraudsters, everything from our SiteID blacklists to our click flooding protection and bot signature databases must always be up to date, protecting against the latest threats as they emerge.
The data to map the mobile fraud genome
Protect360, AppsFlyer’s fraud protection solution, is powered by our unique mobile engagement database.
We process over 1,000,000,000,000 (1 trillion) mobile events every month across over 5,700,000,000 (5.7 billion) devices. These aren’t just big numbers; they are key to our ability to find and block fraud.
Consider the following analogy:
One hundred years ago, physicians diagnosed diseases based on their symptoms. As our understanding of science and medicine improved, the industry began conducting clinical trials that adhered to the scientific method, improving results. Thanks to advances, including the mapping of the human genome, genetic testing can now accurately identify which specific diseases and genetic mutations are best treated with specific drugs.
In the business world, the only way to accurately identify the root causes of fraud and remove the bad actors without damaging the broader ecosystem is through big data.
By carefully calibrating our machine learning, we are better able to find additional variables for each fraud signature, allowing us to cut off even the most advanced fraud at its root.
Speed vs. accuracy
Finding the right balance between speed and accuracy is never easy.
Compromising accuracy for speed is short-sighted and leaves marketers exposed.
Our solution has been to distribute the workload, investing heavily in global fraud research, innovation, and collaboration.
Over the last six months, we onboarded additional data scientists focused on mobile fraud and founded an internal global task force to find and share potential fraud. Regional fraud leads and CSMs regularly share new anomalies and advertiser challenges, collaborating with colleagues across 20 global offices.
Going the extra mile
A few team members have started thinking outside the box, joining gray hat and black hat cybersecurity forums and dark web meetups to learn how fraudsters avoid blacklists, purchase DeviceID lists, operate and maintain botnets, and more.
We even met with a Russian broker who had visited an Asian device farm, and we interviewed a US college student running his own device farm to cover his tuition.
Meetings with partners and networks around the world are also helpful for sharing insights and learning from our shared experience.
With dozens of team members collaborating on fraud challenges and potential solutions, the insights poured in.
Though this investment in resources across Product, R&D, CSM, and Support teams has been significant, I am proud to say that nobody has ever questioned the amount of time and effort we put into mobile fraud research, or any of our measurement products.
An investment in delivering better data accuracy for our clients and partners is an investment in our future as a measurement provider.
In summary, blocking fraudulent traffic in real-time saves time and money while improving your data accuracy.
However, this strategy demands extreme care and precision.
Blocking fraud in real-time requires a massive amount of fresh data, as well as deep expertise in both machine learning and the mobile ecosystem.