How I Implemented Redis Caching in NestJS
Database round-trips were the bottleneck in our NestJS API. Here is how a disciplined cache-aside layer dropped P95 latency from 340ms to 28ms without making invalidation painful.
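For readers unfamiliar with the term, here is a minimal sketch of the cache-aside pattern: read from the cache first, fall back to the database on a miss, and delete the key after a write. It is not the code from this post. The `KeyValueStore` interface, the `InMemoryStore` stand-in, and the `CacheAside`, `getOrLoad`, and `invalidate` names are all hypothetical, and a real setup would put a Redis client behind the same interface.

```typescript
// Hypothetical minimal key-value interface; a real Redis client would sit
// behind it. This is not the API of any specific Redis library.
interface KeyValueStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
  del(key: string): Promise<void>;
}

// In-memory stand-in so the sketch runs without a Redis server.
class InMemoryStore implements KeyValueStore {
  private entries = new Map<string, { value: string; expiresAt: number }>();

  async get(key: string): Promise<string | null> {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (Date.now() >= entry.expiresAt) {
      // Expired entries behave like misses.
      this.entries.delete(key);
      return null;
    }
    return entry.value;
  }

  async set(key: string, value: string, ttlSeconds: number): Promise<void> {
    this.entries.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  }

  async del(key: string): Promise<void> {
    this.entries.delete(key);
  }
}

class CacheAside {
  private readonly store: KeyValueStore;

  constructor(store: KeyValueStore) {
    this.store = store;
  }

  // Read path: try the cache, call the loader on a miss, then populate the cache.
  // The loaded value is assumed to be JSON-serializable.
  async getOrLoad<T>(
    key: string,
    ttlSeconds: number,
    loader: () => Promise<T>,
  ): Promise<T> {
    const cached = await this.store.get(key);
    if (cached !== null) {
      return JSON.parse(cached) as T;
    }
    const fresh = await loader();
    await this.store.set(key, JSON.stringify(fresh), ttlSeconds);
    return fresh;
  }

  // Write path: call this after the database write succeeds, so the next
  // read reloads fresh data instead of serving the stale entry.
  async invalidate(key: string): Promise<void> {
    await this.store.del(key);
  }
}
```

Deleting the key on write, rather than updating it in place, keeps invalidation simple: the next read repopulates the entry from the database, and the TTL limits how long a missed invalidation can serve stale data.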
Notes on backend engineering, system design, and the practical tradeoffs behind shipping software that holds up in production.
I have shipped REST, GraphQL, and now tRPC in production. For an app where one team owns both the frontend and the backend, tRPC is the most productive by a wide margin.
When I joined, the cluster was a single replica set handling 40K daily users. Eighteen months later we were at 200K daily users with a 2x better P95. Here is what actually moved the needle.
Zero downtime sounds like magic. It is mostly discipline: reload instead of restart, health checks before trusting the new version, and enough instances that one going down does not matter.