Highly Available Internet Services
jamesp — Wed, 08/11/2010 - 21:18
As the trickle-down eventually supplied enough hardware, I wanted to move from a scattered install of public Internet services to a highly available cluster. I wanted certain things to be available almost all the time, particularly if they were public. I investigated what options for HA were available (on Linux, of course) and decided which things I wanted to commit to the cluster.
The backend storage system I chose was DRBD, often dubbed "network RAID". I chose it over the similar option of setting up a SAN (exposing the data via iSCSI) because it is redundant not only in the data itself but also in the hardware: the data is mirrored across two servers. I planned this reorg much more thoroughly than my previous moments of "I could do that! On........ this quasi-server system here!", which had bitten me more than once, so I had expansion in mind for this configuration. For expansion reasons, I decided to place my DRBD devices on top of LVM logical volumes. This had a couple of advantages. First, by giving each service with a volatile data directory its own LV+DRBD space, the dependent service could be moved independently of the others, and could also be resized independently if need be. There is likely some overhead in all the layers involved, but I haven't done any testing, and it hasn't caused noticeable latency except when the shares need to sync. Without getting into complicated details that the DRBD folks have covered in pretty good online documentation, DRBD's simplest configuration is limited to exactly two nodes. There are ways to have multiple secondary nodes, but that was beyond my (current) needs.
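To make the layering concrete, here is a minimal sketch of one per-service slice. The volume group, resource name, host names and addresses below are hypothetical, not my actual configuration:

    # Carve a logical volume out of the VG for one service's data
    lvcreate -L 10G -n mail_data vg0

    # /etc/drbd.d/mail.res -- a minimal two-node resource backed by that LV
    resource mail {
      device    /dev/drbd0;
      disk      /dev/vg0/mail_data;
      meta-disk internal;
      on node-a { address 192.168.1.10:7788; }
      on node-b { address 192.168.1.11:7788; }
    }

    # Growing that one service later, without touching the others:
    # extend the backing LV (on both nodes), then let DRBD pick up the change
    lvextend -L +5G /dev/vg0/mail_data
    drbdadm resize mail

The per-service split is what makes that last step painless: only the one resource has to sync its new space, not everything on the box.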
Next was clustering. I chose the well-supported Heartbeat clustering system. When I first implemented my cluster, I used the more limited heartbeat-1.x system, which basically bundled all the clustering into a couple of binaries configured with a couple of text files. I've since migrated to the newer, more modular system, which gives you options on the type of cluster, better monitoring of the services, and more complex configurations (like the >2-node DRBD systems).
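For flavor, here is roughly what the heartbeat-1.x style looked like; the node names, interface, addresses and resource chain are illustrative, not my actual files:

    # /etc/ha.d/ha.cf -- who is in the cluster and how the nodes talk
    bcast eth1
    keepalive 2
    deadtime 30
    node node-a
    node node-b
    auto_failback off

    # /etc/ha.d/haresources -- preferred owner, then the resource chain
    node-a drbddisk::mail Filesystem::/dev/drbd0::/srv/mail::ext3 IPaddr::192.168.1.100/24 apache2

On failover, everything after the node name is started left to right (promote the DRBD resource, mount it, bring up the service IP, start the daemon) and stopped in reverse order.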
Now I had arrived at the question of which OS to place this fun stuff on. I tried to remove distro religion from the equation and just weighed the pros and cons of distributions I felt would make a good so-called enterprise or server distribution. My candidates were RHEL (eliminated immediately because I'm not rich), CentOS, Ubuntu Server, Debian, and Gentoo. I decided that for these servers, even in a cluster with redundancy, I wanted minimal downtime. Of paramount importance was how quickly official packages are updated to address security issues and, to a lesser extent, to add new features. I wanted these servers to be low-maintenance once they were up. So, I eliminated Gentoo, mostly because of the library-hell you can end up in after running updates. I had been using Gentoo for some of the web servers before and liked its system for adding web apps to virtual host directories; it made it easy to handle multiple versions of the same webapp, if you wanted to hold an older virtual host at a working configuration while trying out a new version of the software in a different virtual host. Ubuntu Server won out over Debian in that Ubuntu has a more regular release cycle, complete with a "long term support" (LTS) release that receives security updates for five years, during which you have the option to upgrade to one of the releases that happen every six months; at the five-year mark, you can then safely upgrade to the then-current LTS. In the end, it came down to CentOS vs Ubuntu Server. CentOS is a binary equivalent of Red Hat Enterprise Linux (RHEL): they take the GPL'd source code released by Red Hat, remove the trademarks, and compile it into CentOS, with some additions (e.g. a kernel with more options). Both distributions seemed aggressive about security updates, and as I was (at the time) more familiar with CentOS, I chose it. I should note that in 2010 I ran another mini-comparison of CentOS and Ubuntu Server and decided to switch to Ubuntu Server: I found that the feature set in CentOS 5 was behind that of Ubuntu's 10.04 LTS with respect to the services in play in this cluster.
Finally, these are the choices I made for the services I would place in the cluster:
- Web Service: I would place all of my domains (jamespurl.com, jamespurl.org, mzrfzr.com, aliciapurl.org, baby.jamespurl.org, wiki.jamespurl.com and familybanks.org) into the cluster, so that they were highly available. Of course I used Apache, with standard plugins to support the CMSes I used for the sites on the domains: Drupal, Gallery, WordPress, MediaWiki and NanoBlogger. One of the differences I found in 2010 between CentOS and Ubuntu Server was that Ubuntu included official packages for all of the above (Gentoo also supported them, in its virtual host system). I previously had baby.jamespurl.org and familybanks.org on different versions of Drupal+Gallery because I wanted to leave the older familybanks site alone and working. I kept them separate for some time, even after moving into the cluster, because I had to hand-administer the web applications in CentOS; they've since been merged into proper multi-site configurations (where appropriate). A major advantage of the version differences appeared in this service: I had been using an unsupported (on CentOS) Apache plugin, mod_gnutls, for SNI, essentially allowing me to do virtual hosts over SSL and present the correct SSL certificate for each. This functionality made it into the version of Apache+mod_ssl available for Ubuntu Server, so I got to discard one of the packages I had to maintain manually in CentOS (see the vhost sketch after this list).
- Mail Service: This system will be described on its own page.
- MySQL: This is the network-wide production MySQL server for all uses in my LAN, except the database for MythTV and development use. I initially had MythTV included, but it is an always-on service that doesn't handle the occasional interruption of a node switch well if it is in the midst of recording. MythTV's normal operation also added quite a load to the server running MySQL, so it was best to move it to a more dedicated server.
- LDAP: This service provides a directory that Linux hosts can optionally be configured to authenticate against (a small sketch of that wiring follows the list). Originally, this service also included a Samba PDC, but that has since been moved into the Intranet.
- NTP: This service is one that did not require shared data; it was just configured identically on both nodes. Whichever node wasn't active would periodically run ntpdate against the one that was (a cron sketch follows the list).
- DNS: I used the standard BIND9 server on both nodes in the cluster. One is the master, and there are a total of 3 slaves to that master in the network. Three of the four servers are configured as stand-alone caching servers; the fourth is backed by OpenDNS (a rough named.conf sketch follows the list).
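To show what the SNI support buys for the web service above: several certificates behind one address. This is a minimal sketch in Apache 2.2-era mod_ssl syntax; the document roots and certificate paths are placeholders, not my live config:

    # With SNI, mod_ssl can serve several named SSL vhosts from one IP
    NameVirtualHost *:443

    <VirtualHost *:443>
        ServerName wiki.jamespurl.com
        DocumentRoot /srv/www/wiki
        SSLEngine on
        SSLCertificateFile    /etc/ssl/certs/wiki.jamespurl.com.crt
        SSLCertificateKeyFile /etc/ssl/private/wiki.jamespurl.com.key
    </VirtualHost>

    <VirtualHost *:443>
        ServerName familybanks.org
        DocumentRoot /srv/www/familybanks
        SSLEngine on
        SSLCertificateFile    /etc/ssl/certs/familybanks.org.crt
        SSLCertificateKeyFile /etc/ssl/private/familybanks.org.key
    </VirtualHost>

Without SNI, the server has to pick a certificate before it knows which hostname the client wants, which is why SSL vhosts used to need one IP each.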
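For the LDAP bullet: wiring Linux account lookups into the directory is, in the common nss_ldap/pam_ldap setup, mostly an nsswitch change. A generic sketch, not my exact configuration:

    # /etc/nsswitch.conf -- consult local files first, then the LDAP directory
    passwd: files ldap
    group:  files ldap
    shadow: files ldap

The actual server location and search base live in the client config (e.g. /etc/ldap.conf in the nss_ldap scheme).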
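For the NTP bullet, the "follow the active node" trick can be a single cron entry present on both machines. The cluster service address here is hypothetical:

    # /etc/cron.d/ntp-follow (illustrative) -- if this node does NOT hold
    # the cluster IP, step our clock from the node that does
    */30 * * * * root ip addr show | grep -qw 192.168.1.100 || ntpdate -u 192.168.1.100

The -u flag keeps ntpdate off the privileged NTP port, so it works even while the local ntpd is bound to it.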
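And for the BIND9 arrangement, the master/slave split plus the OpenDNS-backed box look roughly like this in named.conf terms (the zone, file paths and master address are illustrative; 208.67.222.222 and 208.67.220.220 are OpenDNS's public resolvers):

    // On the master
    zone "jamespurl.com" {
        type master;
        file "/etc/bind/db.jamespurl.com";
    };

    // On each slave
    zone "jamespurl.com" {
        type slave;
        masters { 192.168.1.100; };
        file "/var/cache/bind/db.jamespurl.com";
    };

    // On the one box backed by OpenDNS, non-local queries go upstream
    options {
        forwarders { 208.67.222.222; 208.67.220.220; };
    };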