Global Server Load Balancing takes the functionality
that network-based local load balancing offers and extends
it across the Internet.
Allan Liska, special to HostingTech
aliska@hostingtech.com
Security and disaster recovery have become the new catch phrases
in the hosting industry, and it is not just enterprise customers
who are interested. Customers of all sizes want to know what
will happen if their server fails, or a datacenter experiences
a complete network failure. One of the answers to this problem
is global server load balancing (GSLB). GSLB is a relatively
new service that has been implemented by major load-balancing
vendors, such as Cisco, Nortel, F5, Foundry, Extreme, and Radware.
GSLB allows a website, or websites, to be served from geographically
diverse datacenters. This provides customers with a disaster
recovery solution, and provides an additional revenue stream
for hosting providers.
GSLB is an enhancement to local load balancing; it takes the
functionality that network-based local load balancing offers
and extends it across the Internet. In order to understand how
GSLB works, it is important to comprehend the basics of local
load balancing.
Balancing act
According to the Computer History Museum, Cisco introduced LocalDirector, the first network-based load balancer, in 1996. A network-based load balancer sits on the network, separate from the servers, and performs health checks on each of them. The IP (Internet Protocol) address of the website, known as a virtual IP or VIP, is bound to the load balancer. The load balancer forwards packets to one of the available Web servers, called the real servers, based on customer-specified algorithms.
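The behavior described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not any vendor's implementation: the class name, the method names, and the real-server addresses are hypothetical, and round-robin stands in for whatever algorithm a customer might specify.

```python
class LocalLoadBalancer:
    """Toy model of a VIP fronting several real servers (illustrative only)."""

    def __init__(self, vip, real_servers):
        self.vip = vip                          # address clients connect to
        self.real_servers = list(real_servers)  # hypothetical back-end addresses
        self.healthy = set(self.real_servers)   # updated by health checks
        self._next = 0                          # round-robin position

    def mark(self, server, is_up):
        """Record the result of a health check for one real server."""
        if is_up:
            self.healthy.add(server)
        else:
            self.healthy.discard(server)

    def pick(self):
        """Round-robin over the real servers, skipping unhealthy ones."""
        for _ in range(len(self.real_servers)):
            server = self.real_servers[self._next % len(self.real_servers)]
            self._next += 1
            if server in self.healthy:
                return server
        return None  # no real server is currently available


lb = LocalLoadBalancer("10.10.0.1", ["10.10.1.11", "10.10.1.12", "10.10.1.13"])
first_three = [lb.pick() for _ in range(3)]
lb.mark("10.10.1.12", False)  # a health check fails
after_failure = [lb.pick() for _ in range(2)]
```

Once a health check marks a real server as down, the balancer simply stops choosing it; clients keep connecting to the same VIP and never see the change.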
Load balancing has become a popular add-on for many hosting companies.
It provides an excellent level of redundancy within the datacenter,
and because most load balancers are network-based devices, they
are capable of providing enhanced security services, such as denial
of service attack prevention and protection against worms like
Nimda or Code Red.
Depending on your vendor, GSLB can be configured in multiple ways.
GSLB can be handled by the same devices that do the local load
balancing, or it can be handed off to a separate, dedicated device.
GSLB can also be implemented using DNS (Domain Name System), BGP
(Border Gateway Protocol), or packet encapsulation, but DNS is the
most common method.
The GSLB process begins by configuring local load balancing in
each datacenter. When local load balancing has been successfully
configured and tested, there should be a separate VIP for each
datacenter. Once the VIPs exist, the GSLB process begins to
diverge and become more complicated, depending on the particular
solution.
In a basic GSLB configuration, where both local load balancing
and GSLB are performed on the same box, the next step is to add
the VIP from the other datacenter to each load balancer as a real
address. Each load balancer can now distribute traffic between
its local servers and the remote VIP, which it treats as just
another real server.
Traffic between datacenters
The big question is, of course, how do you distribute traffic
between the two datacenters? The answer involves tricks with
DNS. In a local load-balancing situation, the process is simple:
Point the A record for your domain to the IP address of the
VIP. If the domain is www.example.com,
and the VIP is 10.10.0.1, you would make an entry in the zone
file mapping www.example.com
to the VIP:
www IN A 10.10.0.1
In a GSLB situation, a slightly different approach is required.
A VIP for the website exists on the load balancer for each datacenter
in the GSLB configuration; however, simply listing every VIP as an
A record in the zone file would not work. If one of the datacenters
were to become unavailable, the DNS server would not know and
would continue to hand out the address of the unavailable datacenter.
Instead, traffic should only be directed to available datacenters.
Distributing traffic to available datacenters is accomplished
by delegating the www records to the load balancers in the primary
zone file. Again, using example.com as the domain, the first step
is to create a CNAME record in the example.com zone file:
www IN CNAME www.gslb.example.com.
Each load balancer should have an administrative IP address; the
next step is to create A records for those administrative IP addresses.
For example, if the administrative IP address for the load balancer
in Datacenter 1 is 10.10.200.1 and the administrative IP address
for the load balancer in Datacenter 2 is 192.168.200.1, the following
entries would be created in the example.com zone file:
lbdc1 IN A 10.10.200.1
lbdc2 IN A 192.168.200.1
The next step is to direct DNS requests for www.example.com to
the two load balancers, where the DNS server on the load balancer
will process them. To do this, create name server records for
gslb.example.com pointing to the two load balancers:
gslb IN NS lbdc1.example.com.
     IN NS lbdc2.example.com.
Now when someone requests www.example.com, the request goes to their
caching name server, which queries the authoritative name servers
for example.com. The authoritative name servers respond that
www.example.com is really www.gslb.example.com, and that to find
out about www.gslb.example.com, the caching name server needs to
query either lbdc1.example.com or lbdc2.example.com. A name server
record would be added for every datacenter participating in GSLB
for example.com.
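The lookup chain just described can be traced with a small Python sketch. It is a simplified model rather than a real resolver: the `ZONE` table mirrors the records discussed above, while the function names, the `VIPS` table, and the second datacenter's VIP (192.168.0.1) are assumptions made for illustration.

```python
# Records in the example.com zone, as discussed in the text.
ZONE = {
    ("www.example.com.", "CNAME"): ["www.gslb.example.com."],
    ("gslb.example.com.", "NS"): ["lbdc1.example.com.", "lbdc2.example.com."],
    ("lbdc1.example.com.", "A"): ["10.10.200.1"],
    ("lbdc2.example.com.", "A"): ["192.168.200.1"],
}

# Per-datacenter VIPs. 10.10.0.1 appears in the article; 192.168.0.1 is assumed.
VIPS = {"dc1": "10.10.0.1", "dc2": "192.168.0.1"}


def gslb_answer(healthy_datacenters):
    """What a load balancer's DNS server might return: VIPs of healthy sites only."""
    return [VIPS[dc] for dc in VIPS if dc in healthy_datacenters]


def resolve(name, healthy_datacenters, reachable_lbs):
    """Follow the CNAME, then the NS delegation, as a caching name server would."""
    target = ZONE[(name, "CNAME")][0]       # www.gslb.example.com.
    delegated_zone = target.split(".", 1)[1]  # gslb.example.com.
    for ns in ZONE[(delegated_zone, "NS")]:
        if ns in reachable_lbs:               # try each delegated name server
            return gslb_answer(healthy_datacenters)
    return []                                 # no load balancer could be reached


both_up = resolve("www.example.com.", {"dc1", "dc2"},
                  {"lbdc1.example.com.", "lbdc2.example.com."})
dc1_down = resolve("www.example.com.", {"dc2"}, {"lbdc2.example.com."})
```

The key difference from a static zone file is that the answer comes from a device that knows which datacenters are healthy, so an unavailable VIP is never handed out.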
All backed up
GSLB works because there are multiple levels of redundancy. The first level is that the authoritative name servers are set up on different networks, in different locations. The second level is that the load balancers are in multiple datacenters and should be deployed in pairs. The third level is that there are multiple servers in each datacenter. Even if there are several simultaneous equipment failures within a Web infrastructure, there is a good chance a GSLB-enabled site will remain available.
If one of the authoritative name servers fails, the caching name
server will get information from the second. If a datacenter becomes
unavailable, the caching name server has the second datacenter
from which to pull information. Even if an authoritative name
server fails, and one of the datacenters is unavailable, there
is still a path to the website.
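A rough way to see why stacked redundancy helps is to combine availabilities arithmetically. The Python sketch below assumes failures are independent, which real networks do not guarantee, and every number in it is a made-up illustration rather than a measurement.

```python
from math import prod


def parallel(availabilities):
    """Redundant components: the group is up if at least one member is up."""
    return 1 - prod(1 - a for a in availabilities)


def series(availabilities):
    """Chained stages: the path is up only if every stage is up."""
    return prod(availabilities)


# Hypothetical figures: two name servers, two datacenters, each 99% available.
name_servers = parallel([0.99, 0.99])
datacenters = parallel([0.99, 0.99])
whole_path = series([name_servers, datacenters])
```

Under these assumptions each redundant pair is far more available than either member alone, but the end-to-end figure is still below 100 percent because every level must work at once.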
To increase availability, most GSLB devices set the TTL (Time
To Live) for the domain gslb.example.com to one second. This forces
the caching name server to request updated information from the
load balancers almost continuously (assuming the caching name
server honors one-second TTLs). Because the caching name server
is constantly refreshing information about www.gslb.example.com,
if one of the datacenters becomes unavailable, new requests from
site visitors are quickly directed to the other datacenter.
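The effect of the TTL can be illustrated with a minimal Python model of a caching name server. The class and its interface are hypothetical, time is passed in explicitly so the example is deterministic, and 192.168.0.1 is an assumed VIP for the second datacenter.

```python
class CachingResolver:
    """Toy cache: reuses an answer until its TTL expires (illustrative only)."""

    def __init__(self, authoritative, ttl):
        self.authoritative = authoritative  # callable returning the current answer
        self.ttl = ttl                      # seconds an answer may be reused
        self._answer = None
        self._expires = float("-inf")       # forces a query on first lookup

    def lookup(self, now):
        """Return the cached answer, refreshing it if the TTL has run out."""
        if now >= self._expires:
            self._answer = self.authoritative()
            self._expires = now + self.ttl
        return self._answer


state = {"vip": "10.10.0.1"}               # datacenter 1 is healthy
resolver = CachingResolver(lambda: state["vip"], ttl=1)

before = resolver.lookup(now=0.0)
state["vip"] = "192.168.0.1"               # datacenter 1 fails; GSLB switches
still_cached = resolver.lookup(now=0.5)    # TTL has not expired yet
after_expiry = resolver.lookup(now=1.0)    # TTL expired, so the cache re-queries
```

With a one-second TTL the stale answer survives for at most a second; with a TTL of an hour the same cache would keep sending visitors to the failed datacenter for up to an hour.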
Case study
One company that has had a lot of success with this approach
to load balancing is MSNBC (www.msnbc.com).
MSNBC implemented a GSLB solution by F5 Networks (www.F5.com)
prior to the 2002 Winter Olympics. The F5 solution involves
two boxes: the F5 BIG-IP Controller handles the local load balancing,
while the F5 3-DNS Controller manages the GSLB portion of availability.
When MSNBC received the contract to host the website for the Olympics,
they added a second datacenter and started looking for ways to
distribute traffic between the two. After evaluating several vendors,
MSNBC decided on F5. According to Mike Corrigan, director of technology
for MSNBC, F5 helped integrate more than 50 Windows 2000 DataCenter
and Advanced Servers into a GSLB solution.
During the Olympics, the number of users on the MSNBC website
went from about 3.5 million per day to 8.8 million, with 75 million
page views daily. Traffic to the site grew sharply as well.
Prior to the Olympics, the traffic averaged 150 Mbps. During the
Olympics, traffic exploded to over 1.3 Gbps. During this time,
the F5 solution helped keep the availability of the MSNBC website
to 99.98 percent at the datacenters and 99.8 percent at the edge.
According to Corrigan, "MSNBC has been happy with the selection
of F5. The combination of local and global load balancing has
strengthened the overall availability and survivability of the
MSNBC site."
GSLB is not without its problems. Alex Samonte, chief engineer
of Metromedia Fiber Networks (www.mfn.com),
says, "GSLB is an easy concept to understand, but it doesn't
always quite work the way you would imagine it. Caching DNS
servers in odd locations, asymmetric routing, route flapping,
and many other things will cause GSLB not to work as you expect."
This point is one of the most common complaints about GSLB. On
paper, it sounds like it should provide 100 percent availability.
Unfortunately, there are so many unknown factors and uncontrollable
aspects of the network that offering 100 percent availability
is simply not possible.
Samonte adds, "The main problem encountered is mismatched expectations. Most
customers are sold GSLB as the end-all solution to load balancing
and multiple sites."