On average, consumers in the US use four devices each day. In all likelihood, they will not identify themselves (log in) on most of those devices. Every activity across these devices leaves behind crucial data about the customer which, if stitched correctly, can become extremely valuable over time.
‘Stitched’ is the key word here because your analytics tool is counting each of those devices as a unique user. Check the ‘New vs Returning’ report in your Google Analytics. It will most likely show an unusually high share of new visitors (>80%). This is because your customers are using multiple browsers across multiple devices, and your web analytics tool counts the visits from each browser as a separate person.
To add to the complexity, businesses are using multiple tools that store some form of customer information. Systems like CRM, Email, Ecommerce, Point-of-Sale and Social Media Tools store different customer attributes like contact details, transactions, messages, sessions and ad interactions. Businesses are left with isolated data sets and are paralyzed when it comes to connecting these data silos.
This makes it hard to maintain up-to-date, accurate customer profiles and behavior data, which businesses rely on to deliver relevant messaging and experiences.
Hence the need for a technology that can resolve customers’ identities across devices and databases into one profile. This unified profile is also called the ‘single view of customer’.
An Identity Graph (ID Graph) is a database that stores all identifiers that correlate with individual customers. These identifiers could be anything from usernames to email, phone, cookies and even offline identifiers like loyalty card numbers.
You could have the same customer in your eCommerce software, CRM, email marketing tool and ad platform. An ID Graph processes the data from all of these tools and stitches it into one profile.
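As a rough illustration of what stitching means, here is a minimal sketch that treats every identifier as a node and merges identifiers that appear together in the same source record, using a union-find structure. The class name `IdentityGraph`, the record layout and the sample identifiers are all hypothetical; a production ID Graph would sit on a dedicated database and handle conflicts, consent and scale, none of which is covered here.

```python
from collections import defaultdict


class IdentityGraph:
    """Toy identity graph (hypothetical): identifiers that co-occur are merged."""

    def __init__(self):
        self._parent = {}  # identifier -> parent identifier (union-find forest)

    def _find(self, node):
        # Register unseen identifiers as their own root, then walk to the root.
        self._parent.setdefault(node, node)
        while self._parent[node] != node:
            self._parent[node] = self._parent[self._parent[node]]  # path halving
            node = self._parent[node]
        return node

    def add_record(self, identifiers):
        """Link every identifier that appears together in one source record."""
        ids = list(identifiers)
        if not ids:
            return
        root = self._find(ids[0])
        for other in ids[1:]:
            self._parent[self._find(other)] = root

    def profiles(self):
        """Return one set of identifiers per stitched profile."""
        groups = defaultdict(set)
        for node in list(self._parent):
            groups[self._find(node)].add(node)
        return list(groups.values())


graph = IdentityGraph()
# Each call stands for one record exported from a different tool (made-up data).
graph.add_record([("email", "ana@example.com"), ("cookie", "c-123")])    # eCommerce
graph.add_record([("email", "ana@example.com"), ("phone", "+15550100")])  # CRM
graph.add_record([("phone", "+15550100"), ("loyalty_card", "LC-9")])     # point of sale
graph.add_record([("email", "bob@example.com"), ("cookie", "c-777")])    # eCommerce

for profile in graph.profiles():
    print(sorted(profile))
```

In this made-up data, the cookie, email, phone and loyalty card end up in one profile because each record shares at least one identifier with another, while the second customer stays separate.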
The process of matching and linking customer records from disparate sources is called identity resolution. While it makes sense to link customer records sitting in different silos, it is also important to link households and devices so you can fine-tune targeting as required.
It is common for members of a family to share the same account. Take Amazon Prime, Netflix and DoorDash, for example: many of these accounts are shared between family members. Businesses should therefore treat user accounts as households. The name of the person on the account is just the most prominent member of the household.
To identify users in a household, businesses can use device clustering, which is the process of grouping similar devices based on IP sharing history and browsing patterns.
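One simple way to picture device clustering is to compare the sets of IP addresses each device has been seen on and group devices whose histories overlap enough. The sketch below uses Jaccard similarity with an arbitrary threshold of 0.5; the function names, the threshold and the sample data are assumptions made for illustration, and browsing patterns are left out entirely.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of IP addresses two devices have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_devices(ip_history, threshold=0.5):
    """Group devices whose IP histories overlap by at least `threshold`.

    ip_history maps a device id to the set of IPs it was seen on.
    The threshold is an illustrative assumption, not a recommended value.
    """
    clusters = {device: {device} for device in ip_history}
    for d1, d2 in combinations(ip_history, 2):
        if jaccard(ip_history[d1], ip_history[d2]) >= threshold:
            merged = clusters[d1] | clusters[d2]
            for device in merged:
                clusters[device] = merged  # every member points at the merged set
    unique = {frozenset(c) for c in clusters.values()}
    return sorted(sorted(c) for c in unique)


# Made-up observations using documentation-only IP ranges.
ip_history = {
    "tv": {"203.0.113.7"},
    "tablet": {"203.0.113.7", "198.51.100.4"},
    "laptop": {"203.0.113.7", "198.51.100.4"},
    "phone-x": {"192.0.2.55"},
}

households = cluster_devices(ip_history)
print(households)
```

With this sample data the TV, tablet and laptop fall into one cluster because they share the home IP, and the unrelated phone stays on its own.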
The matching algorithms used to stitch the profiles together are of two types: deterministic and probabilistic.
Deterministic matching is when you match profiles with 100% certainty using identifiers like a hashed email, a phone number or a logged-in username.
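A minimal sketch of deterministic matching on hashed email might look like the following. It assumes SHA-256 over a trimmed, lowercased address, which is a common convention but not something the article specifies; the record layout and the function names are hypothetical.

```python
import hashlib


def hashed_email(email):
    """Normalize an email (trim, lowercase) and return its SHA-256 hex digest."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deterministic_matches(source_a, source_b):
    """Pair record ids from two sources whose hashed emails are identical."""
    index = {hashed_email(r["email"]): r["id"] for r in source_a}
    matches = []
    for record in source_b:
        key = hashed_email(record["email"])
        if key in index:
            matches.append((index[key], record["id"]))
    return matches


# Made-up records from two tools; only the email formatting differs.
crm = [{"id": "crm-1", "email": "Ana@Example.com "}]
shop = [
    {"id": "shop-9", "email": "ana@example.com"},
    {"id": "shop-10", "email": "bob@example.com"},
]

print(deterministic_matches(crm, shop))
```

Normalizing before hashing matters here: without it, the same address typed with different capitalization or stray spaces would produce different hashes and the match would be missed.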
Probabilistic matching is not 100% certain and relies on identifiers like IP address, device type, browser and OS.
One of the techniques in probabilistic matching is to track the IP addresses of devices at different times of the day. For example, if you detect three devices connecting from the same IP address during the daytime and all three connecting from a different IP address in the evening, it’s probable that these devices belong to the same individual. The hypothesis here is that the individual is carrying the devices with them and connecting to different Wi-Fi networks throughout the day. The match is not 100% certain, but you can achieve higher accuracy by triangulating with different algorithms.
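The IP co-location idea can be sketched as a simple scoring function. The version below counts, for each pair of devices, how many distinct networks they were seen on together in the same time window and how much of their combined history overlaps. The function names, the two thresholds and the sample observations are illustrative assumptions, and a real system would combine this signal with others rather than rely on it alone.

```python
from collections import defaultdict
from itertools import combinations


def co_location_scores(observations):
    """Score each device pair on shared (time window, IP) sightings.

    observations is an iterable of (device, window, ip) tuples.
    Returns {(device_a, device_b): (distinct_shared_ips, overlap_ratio)}.
    """
    seen = defaultdict(set)  # device -> {(window, ip)}
    for device, window, ip in observations:
        seen[device].add((window, ip))

    scores = {}
    for d1, d2 in combinations(sorted(seen), 2):
        shared = seen[d1] & seen[d2]
        distinct_ips = {ip for _, ip in shared}
        overlap = len(shared) / len(seen[d1] | seen[d2])
        scores[(d1, d2)] = (len(distinct_ips), overlap)
    return scores


def probable_same_owner(observations, min_networks=2, min_overlap=0.8):
    """Keep pairs that moved together across at least `min_networks` networks.

    Both thresholds are assumptions chosen for this example only.
    """
    return [
        pair
        for pair, (networks, overlap) in co_location_scores(observations).items()
        if networks >= min_networks and overlap >= min_overlap
    ]


# Made-up sightings: three devices travel together, a kiosk stays at the office.
observations = [
    ("phone", "day", "198.51.100.10"),
    ("laptop", "day", "198.51.100.10"),
    ("tablet", "day", "198.51.100.10"),
    ("kiosk", "day", "198.51.100.10"),
    ("phone", "evening", "203.0.113.20"),
    ("laptop", "evening", "203.0.113.20"),
    ("tablet", "evening", "203.0.113.20"),
]

print(probable_same_owner(observations))
```

Requiring at least two shared networks is what separates devices that travel together from devices that merely sit on the same office or café Wi-Fi.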
Amit R G
While there is no single technology solution for achieving identity resolution, the most essential component is the database responsible for maintaining billions of users and their relationships. The family of databases specially designed for this kind of application is called graph databases. These databases are extremely powerful and are built specifically for managing connections between billions of identities with millisecond latencies.
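To make the idea of storing relationships concrete, here is a toy in-memory stand-in for a graph store: identifiers are nodes, observed links are edges, and a lookup is a breadth-first traversal limited to a number of hops. The class `TinyGraph` and its methods are hypothetical and are not the API of any real graph database; actual products add persistence, indexing and a query language on top of the same basic model.

```python
from collections import defaultdict, deque


class TinyGraph:
    """Illustrative adjacency-list graph (not a real graph database API)."""

    def __init__(self):
        self._edges = defaultdict(set)  # node -> set of directly linked nodes

    def link(self, a, b):
        """Store an undirected relationship between two identifiers."""
        self._edges[a].add(b)
        self._edges[b].add(a)

    def reachable(self, start, max_hops):
        """Return every node within `max_hops` links of `start` (breadth-first)."""
        seen = {start}
        queue = deque([(start, 0)])
        while queue:
            node, hops = queue.popleft()
            if hops == max_hops:
                continue  # do not expand beyond the hop limit
            for neighbor in self._edges[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append((neighbor, hops + 1))
        return seen - {start}


store = TinyGraph()
# Made-up identifiers linked in a chain: cookie -> email -> phone -> loyalty card.
store.link("cookie:c-123", "email:ana@example.com")
store.link("email:ana@example.com", "phone:+15550100")
store.link("phone:+15550100", "loyalty:LC-9")

print(store.reachable("cookie:c-123", max_hops=1))
print(sorted(store.reachable("cookie:c-123", max_hops=3)))
```

Starting from an anonymous cookie, one hop reaches only the email address, while three hops reach the phone number and loyalty card as well, which is the kind of relationship query a graph database is built to answer quickly.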
Not surprisingly, every major tech company uses them. Facebook uses a graph database to determine connections like ‘friends of friends’ and to generate timeline feeds. Amazon uses a graph database to make product recommendations based on what others bought with the product you are looking at. Richpanel Customer Data Platform uses artificial intelligence to predict links between customers and devices and stores these relations in a graph database.
Identity graphs are primarily responsible for stitching customer identities and creating a ‘single view of customer’, which is the most accurate, up-to-date snapshot of customer attributes and behaviors. Once a business achieves this, it can use it for a number of applications, including: