A Recipe For Discrepancy
Looking up ‘discrepancy’ in the dictionary yields the following result: a lack of compatibility or similarity between two or more facts. But how can two or more facts be incompatible? They are facts. They are both true. That’s the exact problem with data discrepancies.
If you’ve been around the digital marketing business for a while, you’re probably balding. That’s because you’ve pulled your hair out trying to figure out why different platforms are displaying different numbers, particularly different conversion counts.
In order to help you avoid a similar fate, in the following blog post, I will do my best to help you understand the reasons behind discrepancies and what needs to be done to resolve and address them.
Difference in timezones

This is probably the easiest one to grasp. Different platforms use different timezones, and if the data you’re looking at is sliced by hour or day, discrepancies are likely to occur. That’s because data collected by one platform “leaks” into the previous or following day/hour on the other. For example, let’s assume one system is configured to record conversion timestamps in GMT, while another system is set to PST (GMT-8). In this case, results that are collected between 12:00am and 7:59am GMT will be recorded between 4:00pm and 11:59pm PST, on the previous day!
If you cannot align different platforms to use the same timezones, a good practice to minimize the discrepancy is to use a wider time frame (but not too wide) so the hourly difference between timezones becomes negligible – I recommend two weeks.
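The day-boundary leak described above is easy to demonstrate. The sketch below (the dates are illustrative) shows how a single event lands on different calendar days depending on the reporting timezone:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

# An event recorded at 2:00am GMT on June 2nd...
event_utc = datetime(2024, 6, 2, 2, 0, tzinfo=timezone.utc)

# ...is the same instant in time, but still June 1st for a
# platform that reports in US Pacific time.
event_pacific = event_utc.astimezone(ZoneInfo("America/Los_Angeles"))

print(event_utc.date())      # 2024-06-02
print(event_pacific.date())  # 2024-06-01
```

Any daily report that buckets this event by local date will count it on June 2nd in one platform and June 1st in the other, even though both platforms measured it correctly.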
Difference in terminology

Different platforms use different terms. For example, one platform’s “engagements” may mean ad clicks, while another’s may mean views/impressions. Similarly, one platform could be counting total engagements while another counts the number of unique people who engaged with an ad (AKA reach).
Conversions are another good example. What is the exact definition of a conversion? In the app ecosystem, we’re talking about an app install. Whereas some platforms may count every app download as an install, others may register an install only when the downloaded app is launched for the first time.
Another example would be an in-app event such as an in-app purchase, a tutorial/level completion or any other KPI you are measuring. It is critical to know how and when these KPIs are measured by the different tools you’re using. For instance, one platform could attribute an in-app purchase as soon as it is submitted by the user, while another platform counts an in-app purchase only once a receipt for the purchase has been validated.
Therefore, it’s really important to align your KPI definitions, or at the very least understand exactly what each one means. Once you do that, you can at least sort your numbers properly: red apples in one basket, green apples in another.
Difference in attribution methods
As a new age marketer, you’re most likely busy attributing digital actions to the actions that preceded them, so that you can properly optimize your campaigns. Unfortunately, not only do different services/platforms define actions differently, they may also attribute actions differently.
For instance, one platform could attribute based on the last engagement, while another attributes based on the first. Or one provider might attribute in-app purchases to the first time a user engaged with an ad, even before he or she was acquired, while another provider attributes them to the app install event.
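To see how much the attribution model alone can move the numbers, here is a minimal sketch of first-touch versus last-touch attribution over a single user's engagement history. The field names and network names are illustrative, not any real platform's schema:

```python
# A hypothetical list of ad engagements for one user, ordered by time.
engagements = [
    {"network": "NetworkA", "timestamp": "2024-06-01T10:00"},
    {"network": "NetworkB", "timestamp": "2024-06-03T18:30"},
    {"network": "NetworkC", "timestamp": "2024-06-05T09:15"},
]

def first_touch(engagements):
    """Credit the conversion to the earliest engagement."""
    return min(engagements, key=lambda e: e["timestamp"])["network"]

def last_touch(engagements):
    """Credit the conversion to the latest engagement."""
    return max(engagements, key=lambda e: e["timestamp"])["network"]

print(first_touch(engagements))  # NetworkA
print(last_touch(engagements))   # NetworkC
```

Same user, same conversion, yet two platforms applying different models would credit two different networks, and their per-network conversion totals would never reconcile.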
It’s important to remember that in the mobile marketing space there are companies that offer end-to-end or multiple solutions. They buy the media, measure the campaigns and optimize them. As such, their core business from which they earn most of their revenue involves buying and selling media, while measurement is secondary at best. In such a case, conflicts of interest may arise as they essentially measure their own success.
Cohorts / Apples to Apples
Cohorts by definition are users bundled together and treated as a group. As a marketer, your job is to look at cohorts and see how well they’ve performed based on a metric of your choice. The problem is that platforms could have different filters and grouping mechanisms so the odds of you creating the exact same cohort in two different platforms, and then applying the same exact metric are more or less the odds of you bumping into a unicorn (a real one) that was struck by lightning. Twice.
And that’s exactly the problem — by the time you create your cohorts and compare them, your data is so digested that instead of comparing apples to apples, you’re practically comparing cider to expired applesauce.
Keep calm and go raw
So what can you do to deal with discrepancies? The best way forward is to make your own apples. If you want to make a proper comparison, you must break down your aggregated/digested data into raw data points. Ideally, you should create a list of timestamped events with a unique key identifier that is common to the two (or more) platforms you suspect of having a discrepancy. If there is a discrepancy, comparing the lists will enable you to identify which events were recorded on one platform but missing from the other (the delta).
With this data at your disposal, you now have the ability to tackle the problem by taking samples of the data points and analyzing them. If your delta data points are missing from one system because of differences in mobile app attribution or other measurement methods, then you’ve found the root cause of your problem.
If different measurement methods do not explain the absence of the delta data points, it would be safe to assume that the issue is technical in nature, leading you to investigate in that direction. You will be able to use this raw data as evidence that events were or were not measured, and provide it to the technical teams of the respective parties for further investigation.
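The delta comparison described above boils down to a set difference on the shared unique key. A minimal sketch, with made-up transaction IDs standing in for real raw exports:

```python
# Hypothetical raw event exports from two platforms: each entry maps a
# unique key (e.g. a transaction ID) to the event's timestamp.
platform_a = {
    "txn-001": "2024-06-01T12:03",
    "txn-002": "2024-06-01T12:47",
    "txn-003": "2024-06-01T13:10",
}
platform_b = {
    "txn-001": "2024-06-01T12:03",
    "txn-003": "2024-06-01T13:11",
}

# The delta: events recorded on one platform but missing from the other.
missing_from_b = sorted(platform_a.keys() - platform_b.keys())
missing_from_a = sorted(platform_b.keys() - platform_a.keys())

print(missing_from_b)  # ['txn-002']
print(missing_from_a)  # []

# Relative discrepancy against the larger export, to compare with a tolerance.
discrepancy = len(missing_from_b) / max(len(platform_a), len(platform_b))
print(f"{discrepancy:.0%}")  # 33%
```

The resulting lists of missing keys are exactly the evidence you can hand to the technical teams, and the relative figure tells you whether you are within tolerance or looking at a real problem.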
Here are a few suggested rules of thumb:
- Comparing different digested data tables is not a valid proof of discrepancy; your technical resources will not be able to do anything about it.
- Only raw data is comparable — but make sure you have enough of it and that it is reproducible, to rule out any random / edge case anomalies.
- Recommended tolerance for discrepancy is 10%.
Always look on the bright side
Discrepancies are natural and are actually a good thing. When you investigate a discrepancy you’ll either find a technical issue or you’ll gain more knowledge about your measurement tool — how it is measuring the data and what is its level of accuracy and dependability.
More importantly, you’ll have the perfect recipe to prevent any further hair loss…