Can Rails Scale? Absolutely!
There’s persistent chatter about scalability in discussions of Rails, especially among people who aren’t actually using it. When we talk with people considering Rails for new sites, concerns about scalability often come up.
In reality, while Rails is not the world’s speediest framework, the supposed scalability issues are very unlikely to be a legitimate reason not to use Rails.
Where Are the Huge Rails Sites?
Some Rails critics attempt to reinforce scalability concerns by noting the lack of large, well-known web sites running Rails. After all, you won’t find Rails behind the scenes at Yahoo, Google, eBay, MySpace, or Facebook.
The major factor here, however, is not scalability, but age. It takes time for sites to become huge and well known, and Rails is one of the newest web frameworks. So of course it isn’t at the heart of the relatively old, well-known web sites.
It is worth noting, however, that there’s strong interest in Ruby on Rails even at many of these large sites. eBay, Yahoo, and Amazon all have Rails projects under way. Facebook is working with Joyent to support Rails apps on their platform. IBM Global Services is building Rails sites for clients. And Thoughtworks is using Rails for a large fraction of its enterprise development projects.
So what are the largest Rails sites? The highest-traffic Rails site we’re aware of is yellowpages.com. This site isn’t in the same league as Yahoo and Google, but it sees much higher traffic than anything you’re likely to build (see figure). The second-highest-traffic site we’ve been able to identify is Revolution Health, headed by AOL founder Steve Case.
Note that this graph shows visits, so you can multiply the figures by perhaps 4 to 5 to get pageviews. That puts yellowpages.com around 100 million pageviews per month (Update: as you’ll see in the comments below, developers at YellowPages.com report that they’re serving around 170 million pages per month), and Revolution Health at around 30 million. Of course, these are unconfirmed, third-party estimates, so they aren’t precise, but they’re probably in the ballpark.
The oldest Rails site of all, Basecamp, is unfortunately hard to measure because its users are spread across many domains and subdomains. Basecamp claims over 1 million users, but it’s impossible to know what that really means.
Twitter is perhaps one of the highest-traffic Rails applications, in terms of transactions per second, but because most of those accesses are not web page views they don’t show up in public traffic measurements. Twitter had some well-publicized problems scaling their Rails infrastructure, but in fact these problems were overcome with a couple of months’ work. Somehow this fact generated less publicity. Twitter remains on Rails.
Some other well-known Rails sites include CNET’s Chowhound, 43things, Spock, and Penny Arcade. At least in how they show up on measurement services such as Alexa and Compete.com, however, their traffic isn’t close to that of yellowpages.com or Revolution Health.
So, for now, yellowpages.com seems like the best proof point. If you have any data to add about these or any other sites, please let us know.
Sources of Scaling Issues
Concerns about Rails scaling probably come, in part, from Ruby’s reputation as a relatively slow language. No one will argue that Ruby is a speed demon. But it’s also true that it is a very rare web application that comes anywhere close to being compute-bound. Database access and network delays are almost always the overwhelming factors. And in the coming year, we’ll see major improvements in Ruby speed, as Ruby 1.9 moves into widespread use and alternative execution environments, including JRuby and Rubinius, reach production systems.
Many different things can slow down a web application. Rails makes it easy to get your application running without worrying about any of the performance issues, and that’s a good thing. If you ignore everything about performance tuning, you’ll still have a working application, and from there you can tune. There’s little point in tuning for performance before you have something that is successful for a small group of people. And if you focus on building an optimally scalable site and end up late to market as a result, you’ll have achieved nothing.
Rails provides a rich set of caching mechanisms that can dramatically increase the speed of most web applications. You can typically achieve dramatic performance gains by applying page, action, and fragment caching. And for big sites, memcached is a popular solution. In many cases, these solutions largely take Rails out of the picture for the highest-traffic pages.
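The idea behind fragment caching can be sketched in plain Ruby without Rails at all. This is only an illustration of the fetch-or-compute pattern, not the Rails caching API: TinyCache, render_sidebar, and sidebar_for are hypothetical names made up for the example, and a real site would use Rails’ own caching helpers or memcached rather than an in-process Hash.

```ruby
# Minimal in-process cache illustrating the fetch-or-compute pattern.
# TinyCache is a hypothetical class for this sketch, not part of Rails.
class TinyCache
  def initialize
    @store = {}
  end

  # Return the cached value for key, or run the block once and store its result.
  def fetch(key)
    return @store[key] if @store.key?(key)
    @store[key] = yield
  end

  # Drop a cached entry, for example after the underlying data changes.
  def expire(key)
    @store.delete(key)
  end
end

# Counts how many times the expensive rendering work actually runs.
RENDER_COUNT = Hash.new(0)

# Stand-in for an expensive view fragment (database reads plus templating).
def render_sidebar(user_id)
  RENDER_COUNT[user_id] += 1
  "<ul><li>Projects for user #{user_id}</li></ul>"
end

CACHE = TinyCache.new

# Serve the fragment from cache when possible, rendering only on a miss.
def sidebar_for(user_id)
  CACHE.fetch("sidebar/#{user_id}") { render_sidebar(user_id) }
end

3.times { sidebar_for(42) }
puts RENDER_COUNT[42]   # prints 1: rendered once, served from cache twice
```

The hard part in practice is expiry: every place that changes the underlying data has to call something like expire, which is why caching is usually added only to the pages that need it.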
Another common source of performance problems is slow database queries. The ease with which you can access your data without writing any SQL or thinking too much about how associations really work can lead to very inefficient queries. So the next step after implementing caching is to look for queries that are being executed in loops and could be replaced with a single query. A simple change can sometimes yield an order-of-magnitude speedup.
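To make the query-in-a-loop problem concrete, here is a small plain-Ruby simulation. FakeDB is a hypothetical in-memory stand-in that simply counts how many "queries" it receives; it is not ActiveRecord, and the numbers only show the shape of the problem, not real timings.

```ruby
# FakeDB is a hypothetical in-memory stand-in that counts "queries".
class FakeDB
  attr_reader :query_count

  def initialize(comments)
    @comments = comments      # array of { post_id:, body: } hashes
    @query_count = 0
  end

  # One query per call: comments for a single post.
  def comments_for(post_id)
    @query_count += 1
    @comments.select { |c| c[:post_id] == post_id }
  end

  # One query in total: comments for many posts, grouped by post id.
  def comments_for_all(post_ids)
    @query_count += 1
    @comments.select { |c| post_ids.include?(c[:post_id]) }
             .group_by { |c| c[:post_id] }
  end
end

comments = (1..100).map { |i| { post_id: i, body: "comment on post #{i}" } }
post_ids = (1..100).to_a

# Query inside a loop: one round trip per post.
slow_db = FakeDB.new(comments)
slow = post_ids.map { |id| [id, slow_db.comments_for(id)] }.to_h

# Single batched query: one round trip for all posts.
fast_db = FakeDB.new(comments)
fast = fast_db.comments_for_all(post_ids)

puts "loop: #{slow_db.query_count} queries, batched: #{fast_db.query_count} query"
puts "same result: #{slow == fast}"
```

Both approaches return the same data; the difference is 100 round trips to the database versus one, and against a real database each round trip carries network and parsing overhead.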
The next step is to look for individual queries that are slow. Often, data can be cached, using memcached or other solutions, eliminating queries entirely. And denormalizing the data can yield huge gains.
A variety of tools is available to help you find slow queries. You may need to add database indexes, or add some :include clauses, or otherwise refactor your queries or your data. These are problems that apply to all frameworks and programming languages. Rails just makes it easier to ignore all these issues if your application isn’t performance limited, which is probably the case for 99.9% of all web applications.
Scaling the Hardware
If you’ve done all this and you’re down to a fundamental database performance bottleneck, you’re neither better nor worse off with Rails than with any other web framework. All the usual solutions, such as adding mirrored read-only database servers, can be applied just as well to Rails apps as to those built with Java, or PHP, or whatever.
When it comes to handling massive HTTP traffic, you can scale a Rails application horizontally just like any other by replicating your front-end web servers behind a load balancer. There’s nothing about Rails that makes this more difficult than with any other technology.
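As a rough illustration of what a load balancer does with a pool of identical application servers, here is a toy round-robin dispatcher in plain Ruby. RoundRobinBalancer and the backend addresses are hypothetical; real deployments use a dedicated load balancer or proxy rather than application code like this.

```ruby
# RoundRobinBalancer is a hypothetical name for this sketch. It only picks
# which backend should receive the next request; it does no networking.
class RoundRobinBalancer
  def initialize(backends)
    raise ArgumentError, "need at least one backend" if backends.empty?
    @backends = backends.dup
    @next = 0
  end

  # Add capacity by registering another application server.
  def add(backend)
    @backends << backend
  end

  # Choose the backend for the next request, cycling through the pool.
  def pick
    backend = @backends[@next % @backends.size]
    @next += 1
    backend
  end
end

pool = RoundRobinBalancer.new(%w[app1:8000 app2:8000])
first_four = Array.new(4) { pool.pick }
puts first_four.inspect

# Scaling out is just a matter of adding another identical server to the pool.
pool.add("app3:8000")
```

Because every application server runs the same code and keeps no request state of its own, adding a server changes nothing in the application itself.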
If you need to scale your Rails processing, you can add more Mongrels to your heart’s content. (Mongrel is the most widely used Rails application server.) It is possible that the overhead of Rails means you’ll need to devote more hardware to executing your application code, but as we discuss in the following section, the time savings of using Rails can pay for a lot of extra hardware.
This kind of horizontal scaling is getting easier and less expensive all the time, as hardware costs drop and “cloud computing” solutions such as Amazon’s EC2 become more widely used. RightScale can automate much of this scaling for Rails applications, even allowing you to quickly add more servers when demand peaks.
The Bottom Line
Scaling a web application to very high traffic levels is hard work, no matter what framework you use. Today, there is more experience in doing so with Java and PHP applications than with Rails applications. But every day there’s more experience with high-traffic Rails sites, and the majority of the techniques used apply equally well to all frameworks and programming languages.
Even if the overhead of Rails does increase your hardware requirements, it doesn’t necessarily increase your total costs. Suppose the ease of Rails development means that you can do the same work with 4 developers that would require 5 with another technology. (Many teams will tell you that this is a gross understatement of the savings, but it’s plenty to illustrate our point.)
Eliminating that single developer from your team will save you somewhere between $100,000 and $250,000 per year, depending on how senior they are, how well you pay, and what your overhead is. But let’s stick with the low end of the range. For $100,000 per year, you can get perhaps two dozen high-end servers at a first-class hosting facility. That will get you quite a few more Mongrels.
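The arithmetic behind that trade is easy to check. The sketch below uses only the two figures from the text ($100,000 per year and two dozen servers); the per-server amounts are derived from them and are not hosting price quotes, and budget_per_server is a hypothetical helper name.

```ruby
# Implied yearly budget per server when a fixed sum is spread across a pool.
def budget_per_server(yearly_budget, servers)
  yearly_budget / servers.to_f
end

developer_cost_per_year = 100_000   # low end of the range, in dollars
servers = 24                        # "two dozen"

per_year  = budget_per_server(developer_cost_per_year, servers)
per_month = per_year / 12

puts format("per server: $%.0f/year, $%.0f/month", per_year, per_month)
# prints "per server: $4167/year, $347/month"
```

At the high end of the salary range ($250,000), the same calculation gives roughly two and a half times that budget per server.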
If you’re at Yahoo or Google looking at replacing your core infrastructure, then you’d better look very carefully at scaling issues, and it might be worth using the most compute-efficient language to minimize the amount of hardware you need. But for nearly everyone else, all the concerns about Rails scalability are just noise. Don’t let the FUD keep you from choosing the technology that will maximize the efficiency of your development team.
For Further Reading
See our Performance Tuning section for links to a variety of related resources.
- It’s boring to scale with Ruby on Rails, a July 2005 article by David Heinemeier Hansson
- The adventures of scaling, Stage 1 by Patrick Lenz. He replaced a 50,000-line PHP site with a 5,000-line Rails site.
If you want to read up on the Twitter story, here are some references:
- Interview with Twitter developer Alex Payne, which is what started the whole controversy
- Scaling Twitter presentation from RailsConf 2007
- Video of Scaling Twitter presentation given at the Silicon Valley Ruby Conference in 2007
- Kevin Clark’s article on the controversy around Twitter’s scaling challenges
Add to the Community Knowledge
If you have first-hand knowledge of traffic numbers at a large Rails site, or stories of scaling challenges and solutions, please add your comment here, or send us an email.