A brief update with some numbers for hardware load-balanced mongrels

Back in August, I posted about a good-sized evaluation I was about to start on how different proxy engines and load-balancers scale horizontally across lots of mongrels.

But in short, we’ve stayed with F5’s BIG-IPs for at least one reason beyond their ability to handle gigabits of traffic across many backend servers: the ability to see inside packets and write iRules. iRules that can scrub outgoing HTTP responses for credit card numbers and replace them with a “NOT ALLOWED TEXT”; iRules that can direct API traffic, bot traffic, and traffic from specific user-agents to different backend pools (so you can separate API traffic from what people are hitting in a web browser); and iRules that can send .gif, .jpeg, and .png requests off to different backend pools. All without any code changes, and this is just touching on some of the capabilities.
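To make the three iRule ideas above concrete, here is a minimal sketch of the decision logic in Python. It is not iRule syntax (iRules are written in F5's Tcl-based language) and not the rules running on these BIG-IPs. The pool names, the `/api/` path prefix, the bot markers, and the Luhn check used to cut down false positives are all assumptions made for illustration.

```python
import re

# Hypothetical pool names; the post does not name its pools.
STATIC_POOL = "static_pool"
API_POOL = "api_pool"
BOT_POOL = "bot_pool"
WEB_POOL = "web_pool"

STATIC_EXTENSIONS = (".gif", ".jpeg", ".png")
BOT_MARKERS = ("bot", "crawler", "spider")  # assumed User-Agent substrings


def choose_pool(path: str, user_agent: str) -> str:
    """Pick a backend pool from the request path and User-Agent header."""
    bare_path = path.split("?", 1)[0].lower()  # ignore any query string
    if bare_path.endswith(STATIC_EXTENSIONS):
        return STATIC_POOL
    if bare_path.startswith("/api/"):  # assumed URL layout for API traffic
        return API_POOL
    if any(marker in user_agent.lower() for marker in BOT_MARKERS):
        return BOT_POOL
    return WEB_POOL


# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def scrub_card_numbers(body: str, replacement: str = "NOT ALLOWED TEXT") -> str:
    """Replace anything in a response body that looks like a card number."""
    def replace(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group(0))
        return replacement if luhn_valid(digits) else match.group(0)

    return CARD_CANDIDATE.sub(replace, body)
```

The point of doing this on the load-balancer is that the application never sees any of it: routing and scrubbing happen on the wire, which is why no code changes are needed.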

And to give you a general performance example, we recently added another 48 mongrels on 6 Joyent Accelerators behind a single public IP address to a set of BIG-IPs that are already directing 100s of Mbps of traffic. Without any tuning yet, and without paying attention to the backend numbers and configuration, we’re able to run quick 1-second bursts (like the one below), then 10-second, 100-second, 1000-second, and one-hour benchmarks, and get numbers like these:

$ httperf --hog --server adomainname.com --port 80 --num-conn 4000 --rate 4000 --timeout 5 --uri=/theuri
httperf --hog --timeout=5 --client=0/1 --server=adomainname.com --port=80 --uri=/theuri --rate=4000 --send-buffer=4096 --recv-buffer=16384 --num-conns=4000 --num-calls=1

Total: connections 4000 requests 4000 replies 4000 test-duration 1.023 s

Connection rate: 3908.6 conn/s (0.3 ms/conn, <=99 concurrent connections)
Connection time [ms]: min 2.0 avg 6.0 max 66.3 median 3.5 stddev 9.4
Connection time [ms]: connect 0.5
Connection length [replies/conn]: 1.000

Request rate: 3908.6 req/s (0.3 ms/req)
Request size [B]: 85.0

Reply rate [replies/s]: min 0.0 avg 0.0 max 0.0 stddev 0.0 (0 samples)
Reply time [ms]: response 4.8 transfer 0.7
Reply size [B]: header 216.0 content 7850.0 footer 0.0 (total 8066.0)
Reply status: 1xx=0 2xx=4000 3xx=0 4xx=0 5xx=0

CPU time [s]: user 0.39 system 0.59 (user 38.5% system 58.0% total 96.5%)
Net I/O: 31112.6 KB/s (254.9*10^6 bps)
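As a quick sanity check, the headline figures in this report can be re-derived from each other. The sketch below uses only numbers copied from the output above. It assumes that Net I/O counts request plus reply bytes and that a KB here is 1024 bytes, which is what makes the figures line up. Small differences are expected because the test duration is printed rounded to three decimals.

```python
# Figures copied from the httperf report above.
connections = 4000
test_duration_s = 1.023
request_size_bytes = 85.0
reply_size_bytes = 8066.0  # header 216.0 + content 7850.0
reported_net_io_kb_per_s = 31112.6

# Connection (and request) rate: one request per connection in this run.
conn_rate = connections / test_duration_s

# Throughput, assuming Net I/O includes both directions and 1 KB = 1024 bytes.
bytes_per_s = connections * (request_size_bytes + reply_size_bytes) / test_duration_s
kb_per_s = bytes_per_s / 1024

# The bits-per-second figure shown in parentheses in the report.
reported_bps = reported_net_io_kb_per_s * 1024 * 8

print(f"connection rate: {conn_rate:.1f} conn/s")
print(f"derived Net I/O: {kb_per_s:.1f} KB/s")
print(f"reported Net I/O in bits: {reported_bps / 1e6:.1f}*10^6 bps")
```

In other words, roughly 255 Mbps of that one-second burst is a little over 8 KB per reply multiplied by nearly 4000 replies per second.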

Yes, that’s the sub page of a Rails application doing almost 4000 requests/second. But it’s also a small page with a blip of text and a couple of small avatarish images.
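For the longer runs mentioned earlier (10 seconds, 100 seconds, 1000 seconds, one hour), the only flag that has to change is the connection count, since the number of connections is the rate multiplied by the duration. The sketch below builds those command lines using the same flags as the run shown above. Holding the rate at 4000 for every duration is an assumption; the post does not say what rate the longer runs used.

```python
def httperf_command(server: str, uri: str, rate: int, duration_s: int,
                    port: int = 80, timeout: int = 5) -> str:
    """Build an httperf command that holds `rate` conn/s for `duration_s` seconds."""
    num_conns = rate * duration_s  # total connections = rate x duration
    return (
        f"httperf --hog --server {server} --port {port} "
        f"--num-conns {num_conns} --rate {rate} --timeout {timeout} --uri={uri}"
    )


# Durations from the post; the constant rate of 4000 is assumed.
for duration_s in (1, 10, 100, 1000, 3600):
    print(httperf_command("adomainname.com", "/theuri", 4000, duration_s))
```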

22 responses to “A brief update with some numbers for hardware load-balanced mongrels”

  1. Very impressive. How are you handling static files in a setup like this? Just letting the mongrels do static? or using an asset_host within rails?

  2. In this example, for kicks and to start, it’s all mongrels.

    The next general tuning setup is iRules for static assets coming from different backend pools.

    Ultimately one can use the RAM-cache that’s present as well to cache things by URI or extensions but honestly different browsers don’t always like that. I tend to prefer the above.

    The BIG-IPs can also simulate pipelining to browsers, or one can change their application to use the distributed assets method and get it streaming out that way.

  3. But Rails doesn’t scale. Seriously, if I get asked that question again I will direct them here. Nice one.

  4. That’s great. Now please get my accelerators using that Big IP. I’ve been waiting a couple weeks now. I want 4000 r/s!

  5. Sounds great. We’d love to get some Big IP action too!

  6. Nice! Out of curiosity: how many machines are the 48 mongrels running on?

  7. I would have never thought it would be possible. Have you tested it with pages full of objects, associations, and media inside an app that has plugins … i.e. a ‘real’ page?

  8. What website is all of this for?

  9. But it’s also a small page with a blip text and a couple of small small avatarish images.

    Could that be… Twitter?

  10. We’ve got a BIG-IP as well, and the iRules are one of the main reasons we use it. Mentioned this on another post of yours, but how would you recommend measuring the impact of BIG-IP acceleration?

  11. No PS Pipe Grep this week?

  12. Hi Vick, we taped one, it should be coming out.

  13. Why Mongrels? Wouldn’t lighty/fastcgi be faster?

  14. In your experience, have you found that the cost-outlay involved in scaling Rails applications (more hardware required) is acceptable due to the reduced development costs? And before you say that “hardware is cheap”, obviously “less hardware is cheaper”.

    This isn’t a bait, I’m just interested in whether or not you’ve sat back at some stage and said “damn, this requires a hell of a lot of hardware to scale”.

  15. Yeah, that rocks. 4000 r/s is really a lot. Wondering if there is more possible with the same hardware.

  16. @ Nathan, I haven’t really seen a greater cost-outlay relative to say a PHP application. It’s a combination of the fact that most computers are more than powerful enough, and that often the bottlenecks are really traffic direction and load-balancing issues (both in front of the app and in front of the db).

    @Dieter Yes, more is definitely possible; this is only ~1/10th the switching capacity of a BIG-IP. As that goes up, it would be time to attend to the databases.

  17. @Jason

    >> “I haven’t really seen a greater cost-outlay relative to say a PHP application.”

    I’m sorry for asking this, but I don’t understand.

    Are you saying that scaling Ruby on Rails is extremely expensive or inexpensive?

    And are you saying that PHP costs more or less to scale than Ruby on Rails?

  18. @ Blake, I don’t think it’s expensive, and there’s not much of a difference in the costs of scaling Rails versus PHP or anything else. Not when compared to what people cost, which is where it begins to get expensive.

  19. Jason,

    Out of interest, did you look into the Citrix Netscaler at all?

    We run them at work in front of our web server and Citrix application clusters with superb results.


  20. […] about the benchmarks that much because you can find ways to make anything fast as shown with Rails doing 4000 requests a second. Tagged: rails rails performance test […]

  21. […] a difference in the horizontal scalability for how many rails processes you can hit in the backend (previous joyeur). The limit is typically <1000 req/second and not that many mongrels, so it’s pretty easy […]

  22. […] Ah, it seems the TextDrive crew is quite busy pushing the envelope for high-end Rails hosting (via Tim Bray). It is quite impressive, but it’s also disappointing – I’d rather […]