Coveo planned to deprecate its Coveo for Sitecore endpoint and asked clients to use the organization-specific endpoint instead, for better control over traffic and rate limiting. I wrote an earlier blog article with the steps to migrate to the organization-specific endpoint.
Coveo for Sitecore - Endpoint Deprecation & Migrate to Organization Specific Endpoint:
https://www.nehemiahj.com/2024/08/coveo-for-sitecore-endpoint-deprecation.html
Even after migrating to the correct organization-specific endpoint, we were still told that we were using the deprecated endpoint and might have issues with analytics and search API calls. In the Coveo Administration Console, the Visit Browser / Event Browser feature does not show which endpoint the incoming traffic used.
So every time we wanted to test whether traffic reached the right endpoint, we had to work with Coveo support, and they provided the results. The results were contradictory: sometimes requests reached the right endpoint and sometimes the wrong one 🤷‍♀️. Our setup is described below.
- Cloudflare is our entry point.
- Azure Front Door is used for routing, as we have multiple legacy sites and Sitecore sites.
- The Sitecore site is hosted on an Azure VM as an IaaS setup.
- The Coveo for Sitecore module is used to set up the Coveo-related pages.
As Coveo support requested, I made a few changes to Cloudflare and Azure Front Door.
- Bypass cache for Coveo requests in Cloudflare: bypassed for any URL path matching /coveo/rest/*
- Disable cache and WAF in Azure Front Door: we were not using WAF or caching in Azure Front Door, so no changes were needed.
Scenario 1: Disabled the Sitecore proxy
Result: All Coveo requests reached the correct organization endpoint. I could even see this in the browser's network tab, as Coveo requests go directly from the client browser to the Coveo organization endpoint domain.
Scenario 2: Enabled the Sitecore proxy & requests from the internet
This is the regular scenario, where the proxy is enabled in Sitecore and requests are made from the internet.
Result: All Coveo requests reached the deprecated endpoint. Coveo had asked us to bypass the cache in Cloudflare and disable cache & WAF in Azure Front Door, as mentioned above. Even with those changes in place, Coveo support reported that requests were still being sent to the deprecated endpoint.
Scenario 3: Enabled the Sitecore proxy & requests sent from inside the Sitecore server
This was just a random test request I made from the server itself, and Coveo support checked on their side whether it was sent to the deprecated endpoint.
Result: All Coveo requests reached the correct organization endpoint.
Coveo support brought their core engineering team into the discussion, and they suspected that some rule in Cloudflare or Azure Front Door was transforming the domain to the wrong one. I was sure that there were no outbound rules that could have caused this. They asked us to involve our network team to investigate, as they saw no problem on the Coveo side.
In-Depth Analysis
Enabled Coveo Debug Logging:
We enabled Coveo debug logging to see which domain the Coveo proxy requests were sent to. To enable debug logging, modify the <log4net> <logger> section in the file \App_Config\Include\Coveo\Coveo.SearchProvider.Custom.config.
After enabling debug logging, Coveo-related logs were captured in the Coveo.Search.txt file. They clearly showed that we were sending traffic to the organization-specific endpoint (the correct endpoint) in all scenarios. We still had to prove that to Coveo by other means.
Captured Coveo Header Information with Domain:
Since I could not install any network tools on the server, I created a simple .NET app that reads incoming requests and logs the header details to a text file. I hosted it on a separate development Azure VM with public access. The app's bindings were the Coveo deprecated endpoint domain and the Coveo organization-specific endpoint domain. On the Sitecore Azure VM, I added both domains to the server's hosts file, pointing to the development VM.
So any request to the Coveo deprecated endpoint or the organization endpoint reached the app hosted on the development VM. I logged the full request header information in both scenarios. On my side, I confirmed that all requests targeted the right endpoint, so the problem was not ours but something on Coveo's network side. We asked Coveo support to involve their network team and shared all of the log information.
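The capture app itself was written in .NET and is not shown here. As a rough illustration of the same idea, here is a minimal Python sketch of a server that appends every incoming request's headers to a text file. The file name, the port, and the function names are assumptions made for this example. It also listens on plain HTTP only, whereas a real capture of traffic meant for the Coveo domains would presumably need HTTPS bindings for those host names.

```python
import json
from datetime import datetime, timezone
from http.server import BaseHTTPRequestHandler, HTTPServer

LOG_FILE = "captured-headers.log"  # hypothetical log file name


def format_entry(method, path, header_pairs):
    """Build one JSON log line from the method, path, and (name, value) header pairs."""
    headers = dict(header_pairs)  # note: repeated header names collapse to the last value
    # Header names are case-insensitive, so look up Host without assuming its casing.
    host = next((v for k, v in headers.items() if k.lower() == "host"), None)
    return json.dumps({
        "time": datetime.now(timezone.utc).isoformat(),
        "method": method,
        "path": path,
        "host": host,
        "headers": headers,
    })


class HeaderLogger(BaseHTTPRequestHandler):
    """Appends the headers of every GET/POST request to LOG_FILE."""

    def _capture(self):
        entry = format_entry(self.command, self.path, self.headers.items())
        with open(LOG_FILE, "a", encoding="utf-8") as log:
            log.write(entry + "\n")
        body = b"logged\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    do_GET = _capture
    do_POST = _capture


def serve(port=8080):
    """Listen on all interfaces over plain HTTP until interrupted."""
    HTTPServer(("0.0.0.0", port), HeaderLogger).serve_forever()
```

Calling serve() starts the listener. Each log line then records the Host value the request was addressed to, along with every other header, which is the information that was shared with Coveo support.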
End Result
Coveo support found the issue on their side. When a Sitecore-proxied request reached Coveo with an X-Forwarded-Host header, Coveo treated it as an invalid request and routed the traffic to the deprecated endpoint. Requests sent from inside the server (bypassing Cloudflare and Azure Front Door) did not carry the X-Forwarded-Host header, so they were treated as valid and routed to the organization endpoint. Unfortunately, we have no way to customize the request that the Sitecore proxy sends to Coveo, as the proxy is part of the Coveo for Sitecore module.
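The difference between the two kinds of request can be made visible by comparing the header names of two captures. The Python sketch below does that with made-up data: the host names and header values are placeholders, not the real captured logs, and a real request through Cloudflare and Azure Front Door would normally carry more headers than shown here.

```python
def header_names(headers):
    """Return the header names in lowercase, since HTTP header names are case-insensitive."""
    return {name.lower() for name in headers}


def extra_headers(capture, baseline):
    """List the header names present in capture but missing from baseline."""
    return sorted(header_names(capture) - header_names(baseline))


# Hypothetical capture of a proxied request that came in from the internet.
internet_capture = {
    "Host": "org-endpoint.example.invalid",
    "Accept": "application/json",
    "X-Forwarded-Host": "www.example.invalid",
}

# Hypothetical capture of a proxied request made from inside the Sitecore server.
server_capture = {
    "Host": "org-endpoint.example.invalid",
    "Accept": "application/json",
}

print(extra_headers(internet_capture, server_capture))  # ['x-forwarded-host']
```

In both captures the Host value points at the organization endpoint. Only the extra header differs, which matches what Coveo support eventually found.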
On November 13th, 2024, Coveo support released the fix on their side. We gave them a couple of Visit IDs to test, and they confirmed that the requests reached the organization endpoint. Coveo also extended the deprecation deadline one last time, to June 3rd, 2025.
"The X-Forwarded-Host (XFH) header is a de-facto standard header for identifying the original host requested by the client in the Host HTTP request header. Host names and ports of reverse proxies (load balancers, CDNs) may differ from the origin server handling the request, in that case the X-Forwarded-Host header is useful to determine which Host was originally used." - Mozilla Developer Docs.
Since we have Cloudflare and Azure Front Door in front of the site, the X-Forwarded-Host header is added to the request before it reaches the Sitecore server. The Coveo for Sitecore module starts from the headers of the original request, replaces only a specific set of headers with different values, and copies the rest unchanged to the proxied Coveo request.
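The header-copying behaviour described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not the module's actual code: the function name, the header values, and the choice of Host as the only rewritten header are assumptions for the example.

```python
def build_proxied_headers(incoming, overrides):
    """Copy the incoming headers, replacing only the ones named in overrides."""
    override_names = {name.lower() for name in overrides}
    # Keep every incoming header that is not explicitly replaced.
    proxied = {
        name: value
        for name, value in incoming.items()
        if name.lower() not in override_names
    }
    proxied.update(overrides)
    return proxied


# Hypothetical request as it arrives at Sitecore behind Cloudflare and Azure Front Door.
incoming = {
    "Host": "www.example.invalid",
    "Accept": "application/json",
    "X-Forwarded-Host": "www.example.invalid",
}

# Hypothetical set of headers the proxy rewrites before calling Coveo.
overrides = {"Host": "org-endpoint.example.invalid"}

proxied = build_proxied_headers(incoming, overrides)
print(proxied["Host"])              # org-endpoint.example.invalid
print(proxied["X-Forwarded-Host"])  # www.example.invalid
```

The Host header is rewritten to the organization endpoint, but X-Forwarded-Host passes through untouched because it is not in the rewritten set. That is why the proxied request looked correct in our logs yet still carried the header that Coveo acted on.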
Once Coveo made the change on their side, this started working for all Coveo clients.
Our traffic data is also moving towards the organization-specific endpoint.
Happy Customers 😊😊😊