Load balancing in general is a complicated process, but there’s some secret sauce in managing DNS along with multiple load balancers in the cloud. It requires that you draw from a few different sets of networking and “cloudy” concepts. In this second article in my best practices series (my first post covered how to use credentials within RightScale for storing sensitive or frequently used values), I’ll explain how to set up load balancers to build a fault-tolerant, highly available web application in the cloud.
Here’s what you’ll need:
- Multiple A records for a host name in the DNS service of your choice
- Multiple load balancers to protect against failure
Before I explain how the two work together, let’s check out how each of them works individually.
Multiple A Records for a Host Name
A records translate friendly DNS names to an IP address. For example, when you type rightscale.com in your browser, behind the scenes your computer is asking a DNS server to translate the name to an address.
I’m working from a Mac, and the process is a little different on Windows-based machines, so check the nslookup documentation for *nix or Windows as appropriate. I’m also using one of Google’s public DNS servers to perform my lookups. Check out the request below and note that when I query DNS, I get a single address back for my test domain of dnsdemo.cloudlord.com:
My test domain has one A record associated and it resolves to the IP noted.
Let’s check out a more complicated example — Google.com:
Note that as shown in Figure 3 below, Google returned six addresses (at the time I queried it) because Google has six A records registered to serve its main domain. When I ran the same nslookup query again, the same IP addresses were returned in a different order. This is commonly referred to as “DNS load balancing.”
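If you would rather script the lookup than read nslookup output, here is a minimal Python sketch using the standard library's socket.getaddrinfo. The function name resolve_a_records and its resolver parameter are my own illustrative choices, not part of any RightScale tooling, and the result depends on whichever DNS server your machine is configured to use.

```python
import socket


def resolve_a_records(hostname, port=80, resolver=socket.getaddrinfo):
    """Return the IPv4 addresses for hostname, in the order the resolver gave them.

    `resolver` is a hypothetical injection point so the function can be
    exercised without a network; by default it is socket.getaddrinfo.
    """
    infos = resolver(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    addresses = []
    for _family, _socktype, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:  # drop duplicates but keep the resolver's order
            addresses.append(ip)
    return addresses


# Example call (needs network access, so it is left commented out):
# print(resolve_a_records("google.com"))
```

A host name with several A records should come back as a list with several entries, and repeated calls may return them in a different order.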
Figure 4 below shows DNS load balancing in action using dnstest.cloudlord.com and a test file that indicates which server is being served up. For this example, I set up my dnstest.cloudlord.com domain with two A records. Note that this first request has one attempt and the content reads that “this is server 1.”
Next, I terminated the first server (to force a failed state) and repeated the request. The results are shown in Figure 5 below. Note that there’s a timeout on the first IP, and then the second attempt goes through without any issue. You’ll also notice that it’s returning a response of “this is server 2.”
Figure 5 – Curl request to test domain with primary A record in failed state (note timeout and new IP)
The order in which IP addresses are returned varies by the DNS server and provider used but often follows a round-robin or geographically specific algorithm.
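To make the round-robin case concrete, here is a toy Python model of a DNS server that rotates its A records on every answer. It is only a sketch: the class name RoundRobinZone and the 192.0.2.x addresses (a range reserved for documentation) are assumptions for illustration, and real DNS servers may order answers quite differently.

```python
from collections import deque


class RoundRobinZone:
    """Toy model of a DNS zone that rotates its A records per query."""

    def __init__(self, addresses):
        self._records = deque(addresses)

    def answer(self):
        # Hand back the current ordering, then rotate left by one so the
        # next query sees a different address at the head of the list.
        response = list(self._records)
        self._records.rotate(-1)
        return response


zone = RoundRobinZone(["192.0.2.10", "192.0.2.11", "192.0.2.12"])
first = zone.answer()
second = zone.answer()
```

Each client that takes the first address from its answer ends up on a different IP than the client before it.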
The idea here is that different clients will get differently ordered lists of IP addresses corresponding to your domain name. This has the effect of distributing requests across the group of IPs. If an IP address does not respond in an appropriate amount of time, the client will time out on that request and move on to the next IP address until the list is exhausted or it finds a connection that’s valid. Not every client behaves this way, but most modern browsers, along with curl as shown in the example above, follow this retry process.
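The retry process described above can be sketched in a few lines of Python. This is an illustration of the idea rather than how curl or any particular browser implements it; the function name connect_with_failover and the injectable connect parameter are hypothetical.

```python
import socket


def connect_with_failover(addresses, port, timeout=2.0,
                          connect=socket.create_connection):
    """Try each IP in order and return (ip, connection) for the first success.

    `connect` defaults to socket.create_connection; it is a parameter only
    so the logic can be tested without real network traffic.
    """
    last_error = None
    for ip in addresses:
        try:
            return ip, connect((ip, port), timeout)
        except OSError as exc:  # covers timeouts and refused connections
            last_error = exc    # remember the failure, move to the next IP
    raise ConnectionError(f"no address reachable: {addresses}") from last_error
```

Note that the client pays the full timeout for every dead address it tries before reaching a live one, which matches the delay visible in the failed-state request above.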
There are a few things to remember though:
- DNS failover doesn’t provide any additional features such as “sticky sessions” for your application.
- Upstream DNS caching is unpredictable — client DNS providers may or may not respect your TTL settings.
- This isn’t a replacement for TCP load balancing: because of the upstream DNS caching noted above, the resulting split of traffic is not very precise.
Multiple Load Balancers for Redundancy and Scalability
With multiple IP addresses routing to your deployment, each of these addresses can terminate at a load balancer that serves your back-end application (see Figure 6 below). Doing this, you’ll be able to present multiple endpoints to the public to serve your application (I’ll get back to why this is important in a minute).
In Figure 7 below, I go a step further and illustrate how connectivity to the application layer can be set up from multiple TCP load balancers. This allows you to have multiple incoming connections each serving up the same content, providing a redundant load balancing layer as well as a redundant application layer.
Figure 7 – Connection diagram for multiple load balancers connecting redundantly to the same application server tier
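As a rough model of that layout, the Python sketch below has two load balancers that each round-robin over the same pool of application servers and skip any server marked as down. It is a simplification for illustration only: the LoadBalancer class and the app1/app2/app3 names are hypothetical, and HAProxy's real health checks and balancing algorithms are more involved.

```python
class LoadBalancer:
    """Toy round-robin balancer over a shared list of app servers."""

    def __init__(self, name, app_servers):
        self.name = name
        self.app_servers = list(app_servers)
        self.healthy = set(self.app_servers)
        self._next = 0  # index of the next server to try

    def mark_down(self, server):
        self.healthy.discard(server)

    def route(self):
        # Look at each server at most once, starting where we left off.
        for _ in range(len(self.app_servers)):
            server = self.app_servers[self._next]
            self._next = (self._next + 1) % len(self.app_servers)
            if server in self.healthy:
                return server
        raise RuntimeError(f"{self.name}: no healthy app servers")


pool = ["app1", "app2", "app3"]
lb1 = LoadBalancer("lb1", pool)
lb2 = LoadBalancer("lb2", pool)
```

Because both balancers point at the same pool, losing either balancer or any single application server still leaves a working path to the application.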
DNS Load Balancing: Bringing It All Together
The end result is that by using DNS load balancing, you can achieve a fairly rough balance of traffic between multiple TCP load balancers, which in turn distribute load across your application servers at a more granular level:
Figure 8 – Full incoming connection diagram showing multiple load balancers with their own IP address
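To see how the two layers combine, here is a small Python simulation under idealized assumptions: DNS rotates perfectly on every request, there is no upstream caching, and each client uses the first load balancer in its answer that is still up. The function name simulate and all addresses and server names are hypothetical.

```python
from collections import Counter, deque


def simulate(lb_ips, app_servers, requests, down=frozenset()):
    """Count requests per load balancer and per app server."""
    dns = deque(lb_ips)
    next_app = {ip: 0 for ip in lb_ips}  # per-balancer round-robin position
    per_lb, per_app = Counter(), Counter()
    for _ in range(requests):
        answer = list(dns)
        dns.rotate(-1)  # DNS round robin: next client sees a new ordering
        # Client-side failover: first balancer in the answer that is not down.
        lb = next((ip for ip in answer if ip not in down), None)
        if lb is None:
            raise RuntimeError("all load balancers are down")
        # Balancer-side round robin over the shared application tier.
        app = app_servers[next_app[lb] % len(app_servers)]
        next_app[lb] += 1
        per_lb[lb] += 1
        per_app[app] += 1
    return per_lb, per_app


lbs = ["192.0.2.10", "192.0.2.11"]
apps = ["app1", "app2", "app3"]
normal_lb, normal_app = simulate(lbs, apps, 12)
failed_lb, failed_app = simulate(lbs, apps, 12, down={"192.0.2.10"})
```

In this idealized run the split is perfectly even; with real-world DNS caching the split between load balancers will only be approximate, which is why the finer-grained balancing happens at the TCP layer.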
This is a great way to protect against failure and increase overall throughput, giving you the ability to scale for high availability and high performance. For more information on metrics related to configuration and throughput on HAProxy in the cloud, check out this white paper, Load Balancing in the Cloud: Tools, Tips and Techniques.
Setting up DNS load balancing can be a bit of a hassle, but the Load Balancer with HAProxy ServerTemplate, along with scripts for application servers to attach to load balancers, simplifies the process. The RightScale ServerTemplate™ and scripts use a tag-based, managed solution that will keep your HAProxy config files synchronized and that will automate the deployment, registration, and detach process for all servers involved. To use the ServerTemplate for setting up DNS load balancing, sign up for a free RightScale trial.