XorByte

System Design Interview - Databases Part 2

May 06, 2020

Introduction

This is part 2 of the series of System Design Interview, In this post we are going to understand the diffferent types of database systems i.e Relational and NoSQL. Also we are going to look at some details on how data is stored in these databases and what are the further subtypes of these databases. Lets get started !

Databases - System design interview

Types of Database Systems

Broadly we can divid Database systems in 2 types, Relational and Non-Relational. Lets look at some details of both of these.

Relational

Those database systems which organize data in form of tables(Rows and Columns) and these tables can have relationships with each other. If you need data from 2 tables then you perform the join operation based upon some key. In relational databases you have to design the schema and this should be very well thought in advance because changes to the schema are expensive in terms of effort. Relational Databases have strong consistency and integrity rules. They have been tried and tested for long time and are still being used by huge enterprises. Examples : SQL Server, MySQL, Oracle, DB2 etc are relational Databases.

NoSQL(Not Only SQL) or Non-Relational

No-SQL Database systems store data in variety of formats like Key-value, Document or wide column or graph. Generally speaking, you don’t need joins in such systems because all the related data is kept in one place. There is no predefined schema, i.e the structure of the data can change. You don’t need to have all the columns for a particular row as opposed to RDBMS where you need to have NULLs corresponding the column whose value is non existent for a particular row. The flexibility and simplicity of these systems make it attractive for developers and because of Scalability and Availability concerns NoSQL databases are gaining popularity. Examples : MongoDB, DynamoDB, Redis, Neo4j

Which one to choose and when ?

The decision between NoSQL and Relational DBs need a very close introspection over following points:

  1. What is the nature of the data your application is going to generate/consume? Does it have a well defined structure? Is it going to change frequently?
  2. What is the scale of data? How many reads and how many writes per second is your database server going to handle? What is the read to write ratio?
  3. Which 2 of these your system has to support(Consistency, Availability, Partition Tolerance) ?
  4. What is the number of concurrent requests to your DB server?

Reasons for SQL:

  • Structured data
  • Strict schema
  • Relational data
  • Need for complex joins
  • Transactions
  • Clear patterns for scaling
  • More established: developers, community, code, tools, etc
  • Lookups by index are very fast

Reasons for NoSQL:

  • Semi-structured data
  • Dynamic or flexible schema
  • Non-relational data
  • No need for complex joins
  • Store many TB (or PB) of data
  • Very data intensive workload
  • Very high throughput for IOPS

Now don’t just say I am gonna use NoSQL database, understand the requirements of your application, A relational DB could be very well suited for your case, If you choose NoSQL over Relational or vice versa, be prepared to provide detailed reasoning and rationale in context of the application you are going to design in the interview.

This completes the Part 2 of the series, Stay tuned, we are going to get in to the details of scaling database systems for our interview in the next article ! Till then Enjoy and keep learning :)

Ashish Kumar Singh , I am a Software Engineer, I 😍 Code. [Twitter] [Linkedin]