I’ve been asked about this a few times, so I figured I’d post here. This is a brief description of a highly available Rails cluster I’ve built. Some preliminaries:
- There’s no invention here; I believe this setup is very common.
- High availability isn’t the same thing as load balancing. There is nothing here to intelligently share load across the frontend servers, and one backend server is essentially idle all the time.
- This cluster is built with a bunch of open-source software on non-fancy kit. As such it doesn’t have the enormous capacity of clusters built upon commercial shared-storage products, SAN kit, layer 7 web switches etc. Its ambition is to run a few busy Rails sites well whilst coping with hardware failure gracefully.
Layout
Operation
- Web traffic is spread across the managed frontend interfaces by multiple A records in the DNS.
- Wackamole uses a Spread messaging network to ensure these multiple A record IPs are always present across the frontend. It achieves this by managing the hosts’ interfaces when it detects hosts joining or leaving the cluster.
- A pair of MySQL servers run in master:master configuration on the backend hosts.
- The backend hosts use DRBD to maintain a mirrored block device between them.
- These block devices back an NFS filesystem.
- Heartbeat runs on the backend hosts to do several tasks:
  - Manage which host is the DRBD primary and can therefore be written to.
  - Manage which host has the DRBD filesystem mounted and exported with NFS.
  - Manage the IP through which the frontend mounts the filesystem and talks to MySQL.
- With all this in place, Nginx accepts web connections, serves static assets off the NFS mount and passes other requests to Mongrel, an HTTP server that’s well suited to running a Rails instance.
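To make the Wackamole step concrete, here is a toy model of the invariant it maintains: every published address is held by some live frontend. The round-robin assignment, the function name assign_vips, the host names and the 192.0.2.x addresses are all assumptions for illustration; Wackamole’s real balancing logic isn’t described here and may well differ.

```shell
# assign_vips "<addresses>" "<live hosts>"
# Prints one "address host" pair per line, handing addresses out round-robin.
# Toy model only: not Wackamole's actual algorithm.
assign_vips() {
  vips=$1
  hosts=$2
  set -- $hosts            # live hosts become $1, $2, ...
  n=$#
  if [ "$n" -eq 0 ]; then
    echo "no live hosts" >&2
    return 1
  fi
  i=0
  for vip in $vips; do
    idx=$(( i % n + 1 ))
    eval "host=\${$idx}"   # pick the idx-th live host
    echo "$vip $host"
    i=$(( i + 1 ))
  done
}

# All three frontends are up.
assign_vips "192.0.2.10 192.0.2.11 192.0.2.12" "web1 web2 web3"

# web2 has left the cluster: every address is still held by a live host.
assign_vips "192.0.2.10 192.0.2.11 192.0.2.12" "web1 web3"
```

The point is only that the set of addresses in the DNS never changes; what changes is which surviving host answers for each one.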
Notes
- One of the main hazards of MySQL master:master setups is primary key collision if an INSERT occurs on both hosts at once. We avoid that here by letting Heartbeat manage the IP that the frontends connect to, so only one MySQL server receives writes at any given time.
- I’ve built two of these clusters to date. The second one is now four servers wide on the frontend.
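The key-collision hazard in the first note can be shown with a toy model. This is only a sketch: next_id, dual_write and single_write are hypothetical names, and the arithmetic stands in for an auto-incrementing primary key rather than talking to a real MySQL server.

```shell
# next_id <highest id seen>: the id an auto-increment column would hand out next.
next_id() {
  echo $(( $1 + 1 ))
}

# Both masters have replicated up to the same id, then each accepts an
# INSERT before hearing about the other's. Prints the two ids chosen.
dual_write() {
  a=$(next_id "$1")
  b=$(next_id "$1")
  echo "$a $b"
}

# All writes go through one address, so the second INSERT sees the first.
single_write() {
  a=$(next_id "$1")
  b=$(next_id "$a")
  echo "$a $b"
}

dual_write 41     # the same id twice: the rows collide when replicated
single_write 41   # two distinct ids
```

With the Heartbeat-managed IP, the cluster is always in the single_write case, which is why the second MySQL server sits idle for writes.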
Future work
- DRBD can now run in dual-primary mode, allowing both hosts to accept writes. This makes it a candidate for cluster filesystems like GFS, which use shared storage to present a filesystem that can be written to from multiple hosts.
- To add some load balancing I’m considering using HAProxy or LVS to actively distribute traffic across the frontends.
- HA aside, there are also some cool things like evented Mongrel that would be interesting to try.
Replace https://trac.example.com with the URL of your Trac instance and drop this into a Firefox bookmark, perhaps on the toolbar:
javascript:q='%s';if(q=='%'+'s')void(q=prompt('Trac%20#',''));if(q)location.href='https://trac.example.com/trac/ticket/'+escape(q);else%20location.href='https://trac.example.com/trac/report/1'
Click / select the bookmark to be prompted for a Trac issue number, which you can leave blank to just load /report/1.
Extra credit for assigning a keyword (e.g. ‘ktx’) in the bookmark properties, allowing you to just type, e.g., ‘ktx 1234’ in the Location bar to achieve the same.
This is a rework of a similar hack for a much older ticketer, PTS, which is amazingly still in use at one of my previous workplaces. You can gauge its age by the fact that it was ported to PHP3 and was pretty open to most injection attacks!
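For anyone who finds the one-line bookmarklet hard to read, its branching can be written out longhand. The sketch below restates it as a shell function; trac_url is a hypothetical name, and the base URL is the same placeholder used in the bookmarklet. It skips the escaping that the bookmarklet applies with escape(), so it assumes a plain numeric ticket number.

```shell
# trac_url [ticket-number]
# With a ticket number, print the ticket URL; with none, print the
# default report, mirroring the two branches of the bookmarklet.
trac_url() {
  base="https://trac.example.com/trac"
  if [ -n "${1:-}" ]; then
    echo "$base/ticket/$1"
  else
    echo "$base/report/1"
  fi
}

trac_url 1234   # ticket branch
trac_url        # blank input: falls back to report 1
```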
OS X does a good job of tracking what applications are where and what they do. I have little idea how this works, but if I move VLC.app into a new home then all the files that open in VLC still work. Great. I imagine it involves FSEvents and gobs of XML somewhere.
However the magic doesn’t touch everything. I was so bold as to move Dictionary.app from its default home in /Applications into the Utilities subdirectory. This broke the Ctrl-Cmd-D system-wide (Cocoa-wide?) lookup box. I don’t use it that much but that irked me nonetheless.
Console logs included this:
com.apple.launchd[335] (com.apple.DictionaryPanelAgent[490]): posix_spawn("/Applications/Dictionary.app/Contents/SharedSupport/DictionaryPanel.app/Contents/MacOS/DictionaryPanel", ...): No such file or directory
Clearly something thinks it knows where the nested DictionaryPanel application should live. This thing is a launchd-managed process used by the keyboard shortcut, configured out of /System/Library/LaunchAgents/com.apple.DictionaryPanelAgent.plist.
The hoojah to tweak this does of course involve XML:
$ launchctl unload /System/Library/LaunchAgents/com.apple.DictionaryPanelAgent.plist
$ plutil -convert xml1 -o - /System/Library/LaunchAgents/com.apple.DictionaryPanelAgent.plist | perl -pe 's|/Applications/Dictionary.app/|/Applications/Utilities/Dictionary.app/|;' > /var/tmp/com.apple.DictionaryPanelAgent.plist
$ plutil -lint /var/tmp/com.apple.DictionaryPanelAgent.plist
/var/tmp/com.apple.DictionaryPanelAgent.plist: OK
$ plutil -convert binary1 /var/tmp/com.apple.DictionaryPanelAgent.plist
$ mkdir /Library/LaunchAgents-orig
$ sudo mv /System/Library/LaunchAgents/com.apple.DictionaryPanelAgent.plist /Library/LaunchAgents-orig && sudo cp /var/tmp/com.apple.DictionaryPanelAgent.plist /System/Library/LaunchAgents/com.apple.DictionaryPanelAgent.plist
$ launchctl load /System/Library/LaunchAgents/com.apple.DictionaryPanelAgent.plist
The last step was intended to make the change take effect in the current session, but it didn’t work. The DictionaryPanelAgent still loaded but did nothing until I logged out and in again. I think this is something to do with launchctl domains / sessiontype. Blunder factor is high.
This violates the “don’t frig with stuff in /System” principle but I don’t know how else to solve it. The modified plist could go in /Library/LaunchAgents of course, but I’d still need to disable the system version (with launchctl unload -w) which is equally naughty. I think.
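Given the blunder factor, it may be worth trying the path substitution on a sample line before pointing it at the real plist. Here is a minimal sketch of the same rewrite as the perl step in the pipeline above, using sed instead; rewrite_dictionary_path is a hypothetical name, and the sample input is just the path from the console log wrapped in a string element.

```shell
# rewrite_dictionary_path: read plist XML on stdin and point any reference
# to /Applications/Dictionary.app/ at the copy under Utilities instead.
rewrite_dictionary_path() {
  sed 's|/Applications/Dictionary\.app/|/Applications/Utilities/Dictionary.app/|'
}

sample='<string>/Applications/Dictionary.app/Contents/SharedSupport/DictionaryPanel.app/Contents/MacOS/DictionaryPanel</string>'

# One pass moves the path under Utilities.
echo "$sample" | rewrite_dictionary_path

# A second pass changes nothing, because the rewritten path no longer
# contains /Applications/Dictionary.app/.
echo "$sample" | rewrite_dictionary_path | rewrite_dictionary_path
```

Because the rewritten path no longer matches the pattern, running the substitution twice by accident should be harmless.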