Monday, July 27, 2015

[Status 10:40 AM EDT] Graph data delayed and Intermittent Timeouts - Additional details

Customers continue to see delayed data in charts, timeouts and degraded performance. We do not yet have an estimated time for resolution.

We have all available resources working around the clock to resolve the issue as quickly as possible.

Here is some more context:
  • Several days ago, we began to see reduced performance in the Cassandra database cluster that stores and serves metrics for charts.
  • The team initially attempted to restore that cluster to full performance.
  • We have decided that the best path forward is to replace the cluster, and that replacement is in progress.
  • When the new cluster comes online, you will initially see metrics for the most recent 24 hours. 
  • We archive all metrics that we receive and will backfill historical metrics from the archive once the new cluster is online. The backfill will take some time.
  • We are still receiving and storing metrics from your environment.
  • The alerting system continues to operate normally.