Scalability in Performance Engineering
There is a common misconception that you can increase an application's throughput simply by adding more hardware. Hardware is indeed cheap these days, but improving performance is rarely as easy as adding more of it. A solution can have many layers: web servers, application servers, database servers, authentication servers, and so on. Let's say your solution currently supports 50 transactions per second (tps) across 20 servers spread over these layers, and you now need to support 100 tps.
Are you going to add 20 more servers since the load has doubled?
Are you sure the application will scale to 100 tps with the added hardware?
Let's assume we have a perfectly scalable application. The first step is finding out which server is the bottleneck. Say the authentication server runs at 90% CPU utilization during peak load and is the only bottleneck in the system. Probably all you need is one more authentication server, and you could support 100 tps. Buying 20 more servers would waste time, money, and resources when a single additional server would have carried the 100 tps load. That is where performance engineering comes into the picture: a performance engineer is responsible for determining the scalability of the system and for finding the bottlenecks, both in the hardware resources and in the software application.
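The back-of-the-envelope sizing above can be sketched in a few lines of Python. This is a simplified model, not a real capacity planner: it assumes throughput scales linearly with aggregate CPU in the bottlenecked tier, and the function name and safe-utilization ceiling are my own illustrative choices.

```python
import math

def servers_needed(current_tps, target_tps, cpu_busy_pct,
                   current_servers=1, safe_cpu_pct=90.0):
    """Estimate servers a bottlenecked tier needs for a target load,
    assuming throughput scales linearly with aggregate CPU."""
    # CPU percentage consumed per tps across the whole tier at current load
    cpu_per_tps = (cpu_busy_pct * current_servers) / current_tps
    # Total CPU budget the tier needs at the target load
    required_cpu = cpu_per_tps * target_tps
    # Number of servers so that each stays at or below the safe ceiling
    return math.ceil(required_cpu / safe_cpu_pct)
```

With the numbers from the example, `servers_needed(50, 100, 90)` estimates 2 authentication servers in total, i.e. just one more than today, rather than the 20 extra servers a naive "load doubled, double everything" approach would buy.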
There are two types of scalability: vertical and horizontal. Vertical scalability means the software scales up as you add more resources (CPU, memory, I/O, network) within a single server. Horizontal scalability means the software scales up as you add more physical server machines (typically behind a load balancer). You determine vertical scalability by benchmarking maximum throughput on a particular hardware configuration, increasing the relevant resources, and running another benchmark for maximum throughput; if throughput grows proportionately, the software scales vertically. In the real world you often do not have time to test for vertical scalability, so you find the maximum throughput on a single server and then check whether the system scales horizontally. Testing for horizontal scalability requires adding additional servers. Try to get at least three points on a graph to better understand the scalability. For example, on a six-server architecture you could benchmark with one, three, and six servers, plot the throughput, and see whether the software scales. If the software does not scale proportionally, or does not scale at all, there is probably a software bottleneck. And if none of the hardware resources turns out to be a bottleneck, we have a tough task at hand to determine the software bottleneck.
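The one/three/six-server benchmarking idea can be reduced to a small helper that compares each measured point against ideal linear scaling. The function name and the sample throughput numbers below are purely illustrative, not real benchmark data.

```python
def scaling_efficiency(baseline_servers, baseline_tps, points):
    """For each (servers, tps) benchmark point, return measured throughput
    as a fraction of ideal linear scaling from the baseline point."""
    # Throughput one server delivers at the baseline measurement
    per_server = baseline_tps / baseline_servers
    # 1.0 means perfect linear scaling; well below 1.0 suggests a bottleneck
    return {n: tps / (per_server * n) for n, tps in points}

# Hypothetical results: 50 tps on 1 server, then 140 tps on 3, 250 tps on 6
eff = scaling_efficiency(1, 50, [(3, 140), (6, 250)])
```

Here efficiency drops from about 93% at three servers to about 83% at six. A curve that flattens like this as servers are added is exactly the signal that something, in hardware or software, is not scaling.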
If there is a bottleneck in your software, it will not scale no matter how much hardware you throw at it. The most difficult and crucial part is finding and removing software bottlenecks. On one critical application I worked on, my scalability tests showed that throughput stayed flat under increasing load while response time grew linearly. This looked like a classic software bottleneck, so I used every tool at my disposal (resource utilization tools, profilers, timestamp analysis tools) to narrow down where the problem was. It turned out to be a method called "processMessage" within a class. The method was the entry point for multiple threads, and there was one big lock sitting inside it: the class kept a DOM object as a member variable, and to make "processMessage" thread-safe, a mutex surrounded all processing of that DOM object. This serialized every request instead of letting different threads process them simultaneously. There was no need for the DOM object to be a member variable at all. All that was required was to instantiate the DOM object locally within "processMessage", remove it as a member variable, and remove the mutex. This was a huge bottleneck caused by careless programming, and it would have been a disaster had the product gone into production without a fix; a simple change resolved the problem. Often it is not this easy, and narrowing down a bottleneck takes much more time and repeated tests. Imagine having to find a software bottleneck in a production environment: the customer impact could be huge.
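The before/after shape of that fix can be sketched as follows. The original code was not necessarily in Python; this is an illustrative reconstruction of the pattern, with class names of my own invention, using Python's `threading.Lock` and `xml.dom.minidom` to stand in for the mutex and the DOM parser.

```python
import threading
import xml.dom.minidom as minidom

class MessageProcessorSerialized:
    """The bottleneck pattern: a DOM object kept as a member variable,
    guarded by one big lock, so concurrent requests run one at a time."""
    def __init__(self):
        self._lock = threading.Lock()
        self._doc = None  # shared mutable state forces the lock

    def process_message(self, xml_text):
        with self._lock:  # serializes every request across all threads
            self._doc = minidom.parseString(xml_text)
            return self._doc.documentElement.tagName

class MessageProcessorConcurrent:
    """The fix: the DOM object is a local variable, so there is no shared
    state to protect, no lock, and threads can process in parallel."""
    def process_message(self, xml_text):
        doc = minidom.parseString(xml_text)  # per-call object, thread-safe
        return doc.documentElement.tagName
```

Both versions return the same answer for a single call; the difference only shows under concurrent load, where the first version's throughput is capped by the lock while the second scales with the number of threads.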
The bottom line is that adding hardware is not always the right solution. Yes, if your solution is scalable, then adding hardware will increase throughput, but you still need to know which hardware to add. And sometimes adding hardware does not improve performance at all, so you need to know what the bottlenecks within the software are. This is one of the most difficult and challenging aspects of performance engineering. A good performance engineer can determine, by monitoring resources, whether there are hardware bottlenecks, and if the software still does not scale after those are resolved, can track down the software bottlenecks and resolve them for improved scalability. Bottlenecks also tend to move from one layer to another: you might resolve a bottleneck in the application layer only for it to reappear in the database layer. That is what makes performance engineering so challenging and interesting.