One of our AWS MemoryDB clusters ran out of memory. This cluster stores our table metadata and a cache of our data index. An issue with the data index cache caused the cluster to run out of memory, which prevented it from responding to requests for table metadata. Without this metadata, requests for measurement data couldn’t be processed.
The measurement ingestion service was switched to a new Redis client library that uses server-assisted client-side caching (see https://redis.io/docs/latest/develop/reference/client-side-caching/). This caching technique caused a steady increase in the MemoryDB cluster's memory consumption until it reached 100% memory utilization. At that point, the cluster could no longer serve requests for table metadata. Without this metadata, we weren’t able to retrieve raw measurements for processing.
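To illustrate why this kind of caching can cost memory on the server, here is a minimal toy model in Python. It assumes the default tracking mode described in the linked Redis documentation, in which the server remembers which keys each client has read so it can send invalidation messages later. The class `ToyTrackingServer`, the client id `ingest-1`, and the `index:` key names are hypothetical; this is not our service code and not Redis internals, and it may not match the exact mechanism behind this incident.

```python
from collections import defaultdict


class ToyTrackingServer:
    """Toy model of a server that remembers which clients read which keys."""

    def __init__(self):
        self.data = {}
        # key -> set of client ids that may hold a locally cached copy
        self.tracking = defaultdict(set)

    def get(self, client_id, key):
        # A read registers the client so it can be notified if the key changes.
        self.tracking[key].add(client_id)
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value
        # A write invalidates cached copies, so the tracking entry is dropped.
        # The returned set is the clients that would receive an invalidation.
        return self.tracking.pop(key, set())

    def tracked_keys(self):
        return len(self.tracking)


server = ToyTrackingServer()
for i in range(1000):
    server.set(f"index:{i}", i)

# A read-heavy client touches many distinct keys that are rarely rewritten,
# so the server-side tracking state grows with the number of keys read.
for i in range(1000):
    server.get("ingest-1", f"index:{i}")

print(server.tracked_keys())
```

In this model the tracking state only shrinks when a tracked key is written, so a workload that reads many distinct, rarely modified keys keeps adding to server memory.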
We scaled the MemoryDB cluster to a larger size.