System Design Interview - Caching and Scalability

May 09, 2020

What is Caching?

Caching is storing frequently used or expensive-to-compute data in a faster storage tier, so that subsequent access to that data is faster.

How does Caching work?

You have a fast storage medium, say RAM: you store your data in RAM, and your CPU can access it faster than it can access the disk. But RAM is expensive, so you can store only a subset of your data there. In the computer memory hierarchy, the L1 and L2 caches come first, then RAM, and then disk. In applications, you precompute the result of a heavy operation beforehand and serve it, rather than doing the same heavy computation again and again.
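The "precompute once, serve many times" idea can be sketched with Python's built-in memoization decorator. The function name and the simulated delay here are illustrative, not from the original post:

```python
import functools
import time

# A hypothetical "heavy" computation: without caching, every call pays the full cost.
@functools.lru_cache(maxsize=128)  # keeps up to 128 results in fast memory (RAM)
def heavy_computation(n: int) -> int:
    time.sleep(0.1)  # simulate expensive work (disk I/O, complex math, ...)
    return n * n

heavy_computation(42)  # first call: slow, result is computed and cached
heavy_computation(42)  # second call: served from the in-memory cache, near-instant
```

The second call skips the computation entirely, which is exactly the latency win the memory hierarchy above describes.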


Caches are used in scalable systems at the networking layer, in CDNs (Content Delivery Networks), DNS, web apps, and databases. The main purpose of a cache is to reduce latency and improve IOPS (input/output operations per second).

Caching Best Practices

  • Cache eviction policy: decide when you will update the cache and when you will remove an element stored in it.
  • Cached information can be stale, so understand when, where, and to which type of data to apply caching. Don't use it for critical data or data that changes rapidly.
  • The expiration period of cache items must be chosen carefully, or you will lose the benefits of caching.
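The expiration point above can be made concrete with a minimal sketch of a cache with per-item TTL (time to live). The class and method names are illustrative, and a real system would also bound size and choose an eviction policy such as LRU:

```python
import time

class TTLCache:
    """Minimal sketch: each item expires a fixed number of seconds after it is set."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expiry_timestamp)

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None                        # cache miss
        value, expires_at = entry
        if time.monotonic() > expires_at:      # item is stale: evict and miss
            del self._store[key]
            return None
        return value

cache = TTLCache(ttl_seconds=0.2)
cache.set("user:1", {"name": "Ada"})
cache.get("user:1")   # hit while the item is still fresh
time.sleep(0.3)
cache.get("user:1")   # None: expired, forcing a re-fetch from the source of truth
```

Too short a TTL and you lose the caching benefit; too long and you serve stale data, which is the trade-off the bullet points describe.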

Cache Policies


Write Through

Write-through cache writes data to both the cache and storage. The advantage is that newly written data is always cached, so it can be read quickly. The drawback is that a write operation isn't considered complete until the data is written to both the cache and primary storage, so write-through caching introduces latency into write operations.
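A minimal write-through sketch, with a dict standing in for the slow primary store (all names here are illustrative, not a real API):

```python
class WriteThroughCache:
    def __init__(self):
        self.cache = {}
        self.storage = {}   # stands in for the slow primary store (e.g. a database)

    def write(self, key, value):
        self.cache[key] = value    # write to the cache...
        self.storage[key] = value  # ...and to primary storage, synchronously
        # only now is the write considered complete (hence the extra write latency)

    def read(self, key):
        # newly written data is always in the cache, so reads are fast
        return self.cache.get(key, self.storage.get(key))
```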

Write Back

Write-back cache is similar to write-through caching in that all the write operations are directed to the cache. However, with write-back cache, the write operation is considered complete after the data is cached. Later on, the data is copied from the cache to storage.
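A write-back sketch under the same illustrative setup: the write completes as soon as the cache is updated, and dirty entries are flushed to storage later.

```python
class WriteBackCache:
    def __init__(self):
        self.cache = {}
        self.storage = {}
        self.dirty = set()   # keys written to the cache but not yet to storage

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)  # write is "complete" here: low latency, but the
                             # data is lost if the cache fails before a flush

    def flush(self):
        # later on, copy dirty data from the cache to storage
        for key in self.dirty:
            self.storage[key] = self.cache[key]
        self.dirty.clear()
```

The speed comes at a durability cost: until `flush()` runs, storage lags behind the cache.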

Write Around

Write-around cache directs write operations to storage, skipping the cache altogether. This prevents the cache from being flooded when there are large amounts of write I/O. The disadvantage of this approach is that data isn't cached unless it's read from storage, so the first read after a write is relatively slow because the data hasn't been cached yet.
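And a write-around sketch in the same illustrative style: writes bypass the cache entirely, and data only enters the cache when it is read back.

```python
class WriteAroundCache:
    def __init__(self):
        self.cache = {}
        self.storage = {}

    def write(self, key, value):
        self.storage[key] = value  # straight to storage: the cache isn't flooded

    def read(self, key):
        if key in self.cache:
            return self.cache[key]        # fast path: cache hit
        value = self.storage.get(key)     # slow path: first read after a write
        if value is not None:
            self.cache[key] = value       # populate the cache for next time
        return value
```

This pattern suits write-heavy workloads where most written data is rarely read back.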


Content Distribution Networks

A particular kind of cache that comes into play for sites serving large amounts of static media is the content distribution network. In a typical CDN setup, a request first asks your CDN for a piece of static media, and the CDN serves that content if it has it locally available (HTTP headers are used to configure how the CDN caches a given piece of content). If the content isn't available, the CDN queries your servers for the file, caches it locally, and serves it to the requesting user (in this configuration, the CDN acts as a read-through cache).
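The read-through behavior of a CDN edge node can be sketched as follows; `CDNEdge` and `origin_fetch` are hypothetical names for illustration:

```python
class CDNEdge:
    """Minimal read-through sketch: serve locally cached content, otherwise
    fetch from the origin, cache it, and serve it."""

    def __init__(self, origin_fetch):
        self.local = {}
        self.origin_fetch = origin_fetch   # callable that hits your origin servers

    def get(self, path):
        if path in self.local:
            return self.local[path]        # cache hit: served directly from the edge
        content = self.origin_fetch(path)  # miss: query the origin for the file...
        self.local[path] = content         # ...cache it locally...
        return content                     # ...and serve it to the requesting user
```

After the first request for a file, subsequent requests never touch the origin, which is where the latency and bandwidth savings come from.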

Benefits of Caching

  • Improve Application Performance
  • Reduce Database Cost
  • Reduce the Load on the Backend
  • Predictable Performance
  • Eliminate Database Hotspots
  • Increase Read Throughput (IOPS)

Ashish Kumar Singh. I am a Software Engineer, I 😍 Code. [Twitter] [Linkedin]