Imagine you have an application designed and running that enables you to chat and share photos, videos, and blogs with your friends. After a couple of months, you and your friends started liking that app. Now you are trying to release your app in public just like Facebook, Instagram, etc.
After releasing your app in public. it quickly gains popularity and many people started using the app. Imagine previously when the app is just used by you and your friends, traffic will be less and your server handles those requests without much trouble.
Now we have lots of requests coming to our server and our server is not able to handle these requests due to which the server went down and people started complaining.
Not a good scenario, is it?
No need to panic. we have two solutions here:
- Buy a bigger machine to handle these request
- Buy more machines, connect them and handle these requests
Scalability
The ability to handle more requests by buying either a bigger machine or by buying more machines and connecting them together is called Scalability.
Buying a bigger machine => Vertical Scaling
Buying more machines => Horizontal Scaling
Horizontal Scaling
Here we buy more machines to handle the requests. To integrate these machines together we need a load balancer.
Since we have many machines, even if a server in one machine goes down, the load balancer will forward the request to the server of any other machine making this architecture more resilient.
The communication between these servers needs network calls. This might slow down the process
We always need to ensure that the data is synchronized between all these servers and this will increase the maintenance cost. If there is some synchronization issue occurs, data will be inconsistent and this will be a huge issue.
Since we don't use a single machine there are no hardware limits, in other terms, this architecture scales well as user limit increases.
Vertical Scaling
Here we are buying one bigger machine which means we are increasing the CPU, RAM capacity of a single machine. so no external load balancer is required here to navigate the request across the different servers.
We are more dependent on a single server, so if that server went down due to some issue, our whole application will face a downtime. Single point failure :(
No need for network calls, interprocess communication occurs within this server.
Since every operation is carried on in a single server, no need to worry about the synchronization of data. In other terms, data consistent model.
There are certain hardware and software limits for a single machine. We will not be able to purchase hardware and software more than this limit. This may lead to some scalability issues.
So which one to use?
There is no one particular solution for this. This all depends on our business and how we need to handle the requests. It's always better to choose a hybrid considering the good points of both horizontal and vertical scaling.
In vertical scaling, we can reduce the data inconsistency and make use of interprocess communication whereas in Horizontal Scaling there is no single point of failure and it also scales well as user limit increases.
Start your business with vertical scaling and once the users/ customer starts trusting you, we can scale out to Horizantal.