@cespare
Created September 27, 2012 11:39
A Simple Webserver Comparison

This is a very simple benchmark comparing the response times of a few different webservers for an extremely simple response: just reply with a snippet of static json. It came up in discussion of a real-life service: in the actual server, a long-running thread/process periodically updates the state from a database, but requests will be served with the data directly from memory. It is imperative, though, that the latencies be extremely low for this service.

This comparison was partly inspired by this blog post.

Method

The code for the various servers may be found here. To conduct each test, I ran the server on a Linux desktop machine and then ran ab (apachebench) against the server from another machine connected via gigabit ethernet on the local network.

Server specs:

  • Ubuntu 12.04
  • Intel Core i5-2500K CPU (3.30GHz, 4 cores)
  • 8GB RAM

For each test, I made 20,000 requests with 1, 10, 100, and 1000 concurrent connections. I did 3 warm-up runs before collecting data. Below are the mean, median, 90th percentile, 99th percentile, and max latencies.

Contenders

I looked at both using Sinatra and writing Rack applications directly. For each of these options, I tested with Racer, Thin, and Unicorn. (This comparison isn't quite fair because I ran Unicorn with 8 workers, whereas Racer and Thin only have 1 worker.) Also in Ruby-land, I tested Goliath.

I was also interested in trying some JRuby servers, so I ran the plain Rack app under Trinidad and mizuno as well.

Additionally, I made a Scala app (using Scalatra) and a Go app (using the net/http standard library package).

Software versions:

  • Ruby 1.9.3-p194
  • JRuby 1.7.0-rc1
  • Scala 2.9.2
  • OpenJDK 1.7.0
  • Go 1.0.3

Numbers!

The c value is the number of concurrent connections. 90% and 99% are the 90th and 99th percentile latencies. All values are in milliseconds.

c = 1

<table>
<tr><th>Server</th><th>mean</th><th>median</th><th>90%</th><th>99%</th><th>max</th></tr>
<tr><td>Sinatra + Racer</td><td>0</td><td>0</td><td>0</td><td>1</td><td>11</td></tr>
<tr><td>Sinatra + Thin</td><td>1</td><td>0</td><td>1</td><td>1</td><td>9</td></tr>
<tr><td>Sinatra + Unicorn</td><td>1</td><td>1</td><td>1</td><td>7</td><td>12</td></tr>
<tr><td>Rack + Racer</td><td>0</td><td>0</td><td>0</td><td>0</td><td>7</td></tr>
<tr><td>Rack + Thin</td><td>0</td><td>0</td><td>0</td><td>0</td><td>12</td></tr>
<tr><td>Rack + Unicorn</td><td>1</td><td>1</td><td>1</td><td>5</td><td>7</td></tr>
<tr><td>Goliath</td><td>0</td><td>0</td><td>0</td><td>1</td><td>9</td></tr>
<tr><td>JRuby + Mizuno</td><td>1</td><td>1</td><td>1</td><td>1</td><td>12</td></tr>
<tr><td>JRuby + Trinidad</td><td>0</td><td>0</td><td>0</td><td>1</td><td>9</td></tr>
<tr><td>Scalatra</td><td>1</td><td>1</td><td>1</td><td>1</td><td>3</td></tr>
<tr><td>Go</td><td>0</td><td>0</td><td>0</td><td>1</td><td>4</td></tr>
</table>

c = 10

<table>
<tr><th>Server</th><th>mean</th><th>median</th><th>90%</th><th>99%</th><th>max</th></tr>
<tr><td>Sinatra + Racer</td><td>2</td><td>2</td><td>3</td><td>8</td><td>10</td></tr>
<tr><td>Sinatra + Thin</td><td>3</td><td>2</td><td>3</td><td>9</td><td>14</td></tr>
<tr><td>Sinatra + Unicorn</td><td>1</td><td>1</td><td>2</td><td>17</td><td>49</td></tr>
<tr><td>Rack + Racer</td><td>1</td><td>1</td><td>1</td><td>2</td><td>5</td></tr>
<tr><td>Rack + Thin</td><td>1</td><td>1</td><td>1</td><td>4</td><td>8</td></tr>
<tr><td>Rack + Unicorn</td><td>1</td><td>1</td><td>2</td><td>14</td><td>47</td></tr>
<tr><td>Goliath</td><td>2</td><td>2</td><td>2</td><td>8</td><td>12</td></tr>
<tr><td>JRuby + Mizuno</td><td>1</td><td>1</td><td>2</td><td>8</td><td>25</td></tr>
<tr><td>JRuby + Trinidad</td><td>1</td><td>0</td><td>1</td><td>4</td><td>13</td></tr>
<tr><td>Scalatra</td><td>3</td><td>2</td><td>4</td><td>5</td><td>9</td></tr>
<tr><td>Go</td><td>1</td><td>1</td><td>1</td><td>1</td><td>3</td></tr>
</table>

c = 100

<table>
<tr><th>Server</th><th>mean</th><th>median</th><th>90%</th><th>99%</th><th>max</th></tr>
<tr><td>Sinatra + Racer</td><td>18</td><td>3</td><td>7</td><td>12</td><td>3618</td></tr>
<tr><td>Sinatra + Thin</td><td>25</td><td>25</td><td>29</td><td>30</td><td>33</td></tr>
<tr><td>Sinatra + Unicorn</td><td>17</td><td>15</td><td>22</td><td>34</td><td>47</td></tr>
<tr><td>Rack + Racer</td><td>5</td><td>1</td><td>2</td><td>4</td><td>1623</td></tr>
<tr><td>Rack + Thin</td><td>10</td><td>10</td><td>13</td><td>17</td><td>18</td></tr>
<tr><td>Rack + Unicorn</td><td>14</td><td>13</td><td>16</td><td>27</td><td>42</td></tr>
<tr><td>Goliath</td><td>23</td><td>24</td><td>25</td><td>25</td><td>33</td></tr>
<tr><td>JRuby + Mizuno</td><td>10</td><td>5</td><td>10</td><td>27</td><td>1214</td></tr>
<tr><td>JRuby + Trinidad</td><td>6</td><td>6</td><td>7</td><td>14</td><td>17</td></tr>
<tr><td>Scalatra</td><td>22</td><td>20</td><td>25</td><td>30</td><td>1016</td></tr>
<tr><td>Go</td><td>4</td><td>4</td><td>5</td><td>7</td><td>9</td></tr>
</table>

c = 1000

<table>
<tr><th>Server</th><th>mean</th><th>median</th><th>90%</th><th>99%</th><th>max</th></tr>
<tr><td>Sinatra + Racer</td><td>160</td><td>4</td><td>1001</td><td>2438</td><td>6068</td></tr>
<tr><td>Sinatra + Thin</td><td colspan='5'>server failed</td></tr>
<tr><td>Sinatra + Unicorn</td><td colspan='5'>server failed</td></tr>
<tr><td>Rack + Racer</td><td>45</td><td>1</td><td>3</td><td>1460</td><td>6459</td></tr>
<tr><td>Rack + Thin</td><td>90</td><td>13</td><td>19</td><td>2468</td><td>6462</td></tr>
<tr><td>Rack + Unicorn</td><td colspan='5'>server failed</td></tr>
<tr><td>Goliath</td><td colspan='5'>server failed</td></tr>
<tr><td>JRuby + Mizuno</td><td>76</td><td>6</td><td>29</td><td>1229</td><td>2462</td></tr>
<tr><td>JRuby + Trinidad</td><td colspan='5'>server failed</td></tr>
<tr><td>Scalatra</td><td>231</td><td>238</td><td>273</td><td>1150</td><td>1414</td></tr>
<tr><td>Go</td><td>20</td><td>17</td><td>21</td><td>34</td><td>642</td></tr>
</table>

Implementation impressions

  • At this time, Racer seems much more like a proof of concept than a serious, production-ready webserver.
  • Rack is great, because it's super easy to drop in various webservers to run your app.
  • JRuby is really easy to use, and plays well with rbenv and bundler.
  • Deployment for JRuby apps may get complex, what with XML files, Tomcat configuration, and who knows what else. Projects like Warbler, which package your whole project into a single WAR file, may help a lot though.
  • JRuby startup time is really annoying.
  • Scala, as usual, was a massive pain to set up and get running. The "minimal" example project for Scalatra required three different tools to set up and configure, and the sbt configuration makes the whole thing a real mess that's very newcomer-unfriendly. This project has 13 files, compared with roughly 2-4 for each of the other implementations.
  • I wanted to try Lift as well but the setup was too daunting. sbt is awful.
  • Go is really great for this kind of thing. The server is dead-simple, configuration is non-existent, and the app is built to a single binary ready to be deployed to a server.

Conclusions

  • Avoid Sinatra if latency on the order of a dozen milliseconds matters more than ease of development.
  • Racer has really low latency, but also has a nasty habit of serving a small percentage of requests really slowly as load increases (see c=100, where the Rack + Racer combo served 99% of requests in 4ms but the max latency was 1623ms).
  • Assuming we rule out Racer (due to its lack of documentation, polish, and project momentum), Thin seems like the best choice for low-latency performance if you want to stick with MRI.
  • Both JRuby servers performed well; Trinidad, in particular, seems like a good bet for low latency. It did stop responding to requests once the concurrent requests got up to 1000, though.
  • If you're optimizing for latency, Goliath is a strictly worse choice than, say, Rack + Thin.
  • Scalatra is a poor choice for latency-critical applications. This could be due to all the framework code (as with Sinatra), or because my app was not tuned properly.
  • I'd be interested to see how a better Scala app (or a Java app) would perform in the context of a properly-tuned, high-performance Java webserver.
  • Go completely dominates this comparison. It would be a great choice for a small standalone webserver like this where latency matters. Note how the implementation doesn't even use any third-party libraries, yet fits into a 30-line file and builds with no extra configuration files whatsoever.