This article is about finding and deleting xDB contact outliers automatically using Analytics Database Manager and PowerShell. Everyone talks about Sitecore SmartHub nowadays but huge number of clients are still with Sitecore XP and may face issues with xConnect performance.
Let's talk about xDB contacts and Interactions.
xDB Contacts
A contact represents an individual who interacts with or may potentially interact with your organization. Contacts are represented by the Sitecore.XConnect.Contact class, and are uniquely identified by ID (of type Guid) within the xDB. ref
Interactions
An interaction describes any point at which a contact interfaces with a brand, either online or offline. Examples of interactions include:
- Purchasing a something from a physical store
- Using an app
- Browsing a website
- A phone conversation
2022-05-27 05:05:30.910 -07:00 [Error] ["XdbContextLoggingPlugin"] XdbContext Batch Execution Exception
Sitecore.Xdb.Collection.Failures.DataProviderException: Execution Timeout Expired.
The timeout period elapsed prior to completion of the operation or the server is not responding. --->
System.Data.SqlClient.SqlException: Execution Timeout Expired.
The timeout period elapsed prior to completion of the operation or the server is not responding. --->
System.ComponentModel.Win32Exception: The wait operation timed out
If you have used Sitecore xConnect, you might have faced above error sometimes. Kelly Rusk has an excellent post on how excessive interactions can be prevented.
As per Sitecore support, contacts with >1000 interactions can be considered 'excessive'. so with this large number of interactions I suspect it was causing the SQL performance issues.
There is a module called Sitecore Analytics Database Manager (ADM) which allows performing various operations with collection database:
- Removing contact and interactions data.
- Viewing database statistics.
- Rebuild calculated facets.
- Remove all data for contacts that do not have interactions after the specified date or for date range.
- Remove only interactions after the specified date or for date range. Rebuild of all calculated facets is triggered automatically for all contacts when the process finishes. Contacts are not removed in this case.
- Rebuild the selected calculated facets.
- Remove all data for a specific contact defined by the contact id or source and identifier.
- Check if the contact with specific id or source and identifier exists.
- Retrieve the xDB statistics.
- Retrieve xDB Index statistics.
- Find outliers in the database.
- Authentication:
In order for calling any API, we need to authenticate the request. To get the access token, we can enable non-interactive client login. We can refer this article to enable it in Identity server. Once enabled, we will get Client_ID and Client_Secret. With that, we can call login API to get the Access Token. - Generate Latest xDB Outliers List
Using this API, xDB outliers list can be generated. It is a task and with this API, you can let ADM start this process.
This process may take minutes to hours based on the size of the xDB. It is better to start this task and we can get the status of the task using another API. Before we run the 3 step, we can execute the below API to get the status and if it is completed, we can start the next step. - Delete the Outliers
In this step, there is an API to get the Outliers. Once we get the Outliers, we can call another API to delete the contact with the Outliers contact IDs. Deletion is also a task so once we execute the delete request, we need to check if the task is completed using the 3rd API which will give the status of the deletion request. Once it is completed, we can repeat the whole process.
Get the Outliers:
Delete the Contact by IDs
Check if the Deletion request is completed
Repeat this step till we clear the contacts with abnormal number of interactions. This process can give us an overview on how we can find the Outliers using ADM and then delete it.