Timeouts Publishing Events to Keen
Incident Report for Keen
Postmortem

Our production environment ran into operational issues today when we saw a very large spike in inbound requests to our API. This manifested as a partial outage from 10:20am to 11:06am and from 12:55pm to 1:07pm PT. We are currently stable, and the team is continuing to provision more capacity to keep our service running smoothly. There was no data loss for events that we acknowledged. However, a number of write and query requests failed, and those failures were reflected back to our clients. The issues also extended to our website. This means that if you assume a request to Keen IO will always succeed, you may discover that some of your data was not recorded. Tomorrow we expect increased load throughout the day, but we believe our increased capacity will allow us to handle the traffic. The team will be closely monitoring the situation and responding as appropriate.
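
For clients that want to guard against unacknowledged writes, one common approach is to check the outcome of each publish call and retry with exponential backoff. The following is a minimal sketch, not an official Keen client: `publish_with_retry` and the `send` callable are hypothetical names, and `send` is assumed to return True only when the service acknowledged the event and to return False or raise an `OSError` (which includes timeouts and connection errors) otherwise.

```python
import random
import time


def publish_with_retry(send, event, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send(event) until it is acknowledged or attempts run out.

    `send` is a hypothetical callable supplied by the caller. It is assumed
    to return True on acknowledgement and to return False or raise OSError
    (e.g. TimeoutError, ConnectionError) on failure.
    Returns True if the event was acknowledged, False otherwise.
    """
    for attempt in range(max_attempts):
        try:
            if send(event):
                return True
        except OSError:
            # Treat timeouts and connection errors like a rejected request.
            pass
        if attempt < max_attempts - 1:
            # Exponential backoff with jitter, so that many clients do not
            # retry at the same moment and add to the load spike.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
    return False
```

If the function returns False, the event was never acknowledged, so the caller would need to keep it (for example in a local queue) and try again later rather than discard it.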

Posted Oct 04, 2016 - 16:04 PDT

Resolved
This incident has been resolved.
Posted Oct 04, 2016 - 16:03 PDT
Update
We continue to experience intermittent spikes in inbound event volume that are leading to connection timeouts. Our efforts to provision additional capacity are ongoing. We appreciate your patience and apologize for the inconvenience!
Posted Oct 04, 2016 - 13:15 PDT
Monitoring
We're no longer seeing any performance issues in our system. We've identified the cause as high incoming event volume. We're also working on adding more capacity so that we can handle such load bursts in the future.
Posted Oct 04, 2016 - 12:25 PDT
Investigating
We're investigating a higher-than-usual number of timeouts for events published to Keen. No data is lost once an event is accepted, but some requests to publish events might fail.
Posted Oct 04, 2016 - 10:53 PDT
This incident affected: Stream API.