We have written a full postmortem on our blog. We're very sorry for the slow queries and any inconvenience. We take the performance of our service very seriously and are working hard to ensure this won't happen again.
In summary:
To resolve the issue, we re-enabled the optimization layer, which allowed us to clear the query backlog and restored normal performance.
In response, we've implemented better monitoring around utilization and improved visibility into rate limits, and we're working on better notifications and on gathering more information.
If you'd like more details, please read our full post or contact us!