We developed a website for a well-known European sports institution which wanted to make sure it could handle a load of 9'000 visitors/sec. We knew the load could get quite heavy on match nights, and our task was to optimize the servers to cope with it as well as possible.
Architecture
We have 2 servers:
- a backend with Apache and MySQL installed
- a frontend with Varnish installed
Methodology
As we had no idea how many requests the servers could handle, we started by benchmarking the application in its current state, using Apache ab. To minimize network effects and measure the real throughput of the server, we ran the benchmark from the backend server against the frontend.
The ab tool was run with the following options (an example invocation is shown after the list):
- limit of 6000 requests
- keep-alive activated
- 500 concurrent requests (we also tested with more and with fewer, but it didn't change the results)
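In practice this corresponds to a command along these lines (the URL is a placeholder, not the real host we benchmarked):

ab -n 6000 -c 500 -k http://frontend.example.com/

where -n is the total number of requests, -c the concurrency level and -k enables HTTP keep-alive.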
Solution
Here's the ab output for one of the match pages, before any optimization:
Document Length: 7579 bytes
Requests per second: 11148 [#/sec] (mean)
Time per request: 44.849 [ms] (mean)
The first idea we had was to tune Varnish, Apache and MySQL. Basically, we increased the number of concurrent requests Apache can handle, let MySQL use more RAM (e.g. for query caching and key buffering), and followed a Varnish tuning guide; a rough sketch of the kind of directives involved is shown below.
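The exact directives and values here are only illustrative, not our production configuration:

# Apache (httpd.conf, prefork MPM): allow more simultaneous connections
ServerLimit 512
MaxClients  512

# MySQL (my.cnf): let MySQL use more RAM for caching
query_cache_size = 256M
key_buffer_size  = 256M

# Varnish (varnishd startup parameters): more worker threads
-p thread_pools=4 -p thread_pool_max=1500

That gave us the following results: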
Document Length: 7877 bytes
Requests per second: 14115 [#/sec] (mean)
Time per request: 35.363 [ms] (mean)
The results were not really what we expected: we had spent several hours fine-tuning the Varnish/Apache/MySQL configuration for a 26% gain in requests per second. After some more testing, we discovered that the benchmark was not fetching the gzipped version of the pages. Here's the ab output for the homepage after adding the gzip request header:
Document Length: 2420 bytes
Requests per second: 41412 [#/sec] (mean)
Time per request: 12.074 [ms] (mean)
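The only change to the benchmark was the extra request header (same placeholder URL as before):

ab -n 6000 -c 500 -k -H "Accept-Encoding: gzip" http://frontend.example.com/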
Those numbers are a lot better. We might think that handling 41'000 requests per second is more than enough to meet the client's requirement, but that ignores the fact that every visitor makes several requests per page (at least 10 in our case: the initial request plus 5 CSS files and 4 JS files). That brings the effective capacity down to about 4'500 visitors per second, which is not enough. Remembering Pierre's presentation about speedy apps, we decided to apply some of his tips.
We used YSlow to get a summary of the website's strong and weak points in content delivery and to help us optimize it. It reported that we were not aggregating the JS/CSS files and that Expires headers were not configured. We started by aggregating and minifying the CSS and JS files, which brought us from 9 files down to 2. We then configured the Expires and ETag headers correctly, which removes an additional 5 requests per page once the visitor has a cached copy of the files.
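As an illustration, this kind of policy can be set up in Apache roughly as follows, assuming mod_expires is enabled (the lifetimes are examples, not the exact values we used):

# far-future Expires headers for static assets
ExpiresActive On
ExpiresByType text/css "access plus 1 month"
ExpiresByType application/javascript "access plus 1 month"
# an ETag that stays stable, based on modification time and size only
FileETag MTime Size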
With these simple steps we already saved 7 requests per page, or 13 once the user has a cached copy of the static files. In the end, the visitor ends up making close to 1 request per page. That means faster rendering on the client side, and less load on the server side.
Conclusion
Our scenario was made easier by the fact that this is a sports events website: load peaks hit the same resources (i.e. recent/live match results), which gives Varnish a very good cache hit ratio (more than 90%). If that weren't the case, our benchmark would have had to cover other resources as well to generate a more realistic usage of the website.
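As a side note, the hit ratio itself can be read from Varnish's counters, for example (counter names vary slightly between Varnish versions):

varnishstat -1 | grep -E 'cache_hit|cache_miss'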
As we have seen, fine-tuning servers takes time and is not always rewarding. On the other hand, reducing the number of requests made by visitors is an easy way to provide a better browsing experience and relieve the servers. This aspect can't be seen from a server benchmarking tool's point of view, but it is very important and should be tried first, because it is guaranteed to give results.