Database round-trips were the bottleneck in our NestJS API. After adding a Redis cache layer, P95 latency dropped from 340ms to 28ms. Here is what actually moved the needle.
The problem
Our analytics dashboard was hitting the same endpooint on every page load, and each request triggered three joined queries across two collections. With 200K daily actives, we were pushing MongoDB to 80 percent of its CPU budget by 10am every day.
The fix was obvious: cache the response. The interesting part is how we cached it without making invalidation painful.
Cache-aside, not write-through
I picked the cache-aside pattern: the service reads from Redis first, falls back to MongoDB, then populates the cache. Write-through is tempting but couples your writer to cache health, and I do not want a Redis outage to block writes.
async getReport(userId: string) {
  const key = `reports:summary:${userId}`;

  // Cache-aside: try Redis first.
  const cached = await this.redis.get(key);
  if (cached) return JSON.parse(cached);

  // Miss: fall back to MongoDB, then populate the cache.
  const fresh = await this.reportsService.compute(userId);
  await this.redis.setEx(key, 3600, JSON.stringify(fresh)); // 1-hour TTL
  return fresh;
}
Key naming is a policy, not a guess
Flat key namespaces are a lie. Three months in, you will not remember which service owns which key. I enforce a prefix-per-service convention and keep the builders in one file so nothing drifts.
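A minimal sketch of what that single builders file can look like. The helper names and the second key shape are illustrative, not from our codebase; only the `reports:summary:` prefix matches the endpoint above.

```typescript
// keys.ts -- every Redis key this service can write, in one place.
// The prefix makes ownership explicit and keys grep-able.
const SERVICE_PREFIX = "reports";

export const keys = {
  // Matches the key used in getReport above.
  summary: (userId: string) => `${SERVICE_PREFIX}:summary:${userId}`,
  // Hypothetical second key, shown to illustrate the convention.
  daily: (userId: string, day: string) =>
    `${SERVICE_PREFIX}:daily:${userId}:${day}`,
};
```

Callers never build key strings inline; they import `keys` and call a builder, so a rename or prefix change happens in exactly one file.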
TTLs from the access pattern
Every key gets a TTL chosen from the data, not pulled from thin air. For the reports endpoint, upstream data refreshes hourly, so a 60-minute TTL means at most one stale read per user per hour. That is a trade the product team accepted up front, which is the important part.
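One way to keep that discipline visible is a single TTL table next to the key builders, with the reasoning recorded inline. The entries besides the reports one are hypothetical examples of the pattern, not values from our system.

```typescript
// ttl.ts -- TTLs in seconds, each derived from the data's refresh cadence.
export const TTL = {
  // Upstream data refreshes hourly, so at most one stale read per user per hour.
  reportsSummary: 3600,
  // Hypothetical: near-real-time widget, tolerates ~1 minute of staleness.
  liveCounters: 60,
} as const;
```

The `setEx` call in `getReport` would then read `TTL.reportsSummary` instead of a bare `3600`, so the trade-off the product team signed off on is named, not buried in a magic number.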
Results
- P95 latency: 340ms to 28ms
- MongoDB read ops: down 78 percent
- Redis memory footprint: 140MB for 200K active users
The lesson was not "caching is fast." It was that a disciplined key structure and a conservative TTL policy turned a sometimes-unreliable optimization into a boring, predictable one.