On Thursday afternoon, at around 16:25 CET, the AppServers request queues started filling up across all servers, and an alert was dispatched to our operations team.
We had rolled out new database changes earlier in the week. These changes, combined with other factors such as automated database backups and high traffic, resulted in slower queries for enrichments. When we reached peak traffic on Thursday, the queries took long enough that the request queues filled up very quickly.
Our operations team reacted quickly to the error codes we were seeing and scaled up to double capacity to mitigate the immediate issue. Shortly after the new servers came online, the request queues started to drop, and the issues were resolved for customers.
We then identified the database query in question and addressed it, so that we now see a significantly lower load than before.
We are bringing in additional monitoring and are watching closely for similar degradations.