Improving Performance of Shared Web Hosting Service on a Web Server Cluster
Hewlett Packard Labs
1501 Page Mill Road
Palo Alto, CA 94303, USA
Traditional load balancing solutions (RRDNS, Local Director from Cisco, BigIP from F5 Labs, ACEdirector from Alteon, SecureWay Network Dispatcher from IBM, etc.) try to distribute requests uniformly across all nodes regardless of the content. This interferes with efficient use of RAM in the cluster: popular files tend to occupy RAM on every node, and this redundant replication leaves much less RAM available for the rest of the content, which degrades overall system performance.
A better approach is to partition the content among the machines, thus avoiding replication of documents across the nodes' RAM. However, static partitioning is inevitably inefficient and inflexible, since access patterns vary over time and a static partition cannot accommodate these changes.
The observations above have led to the design of ``locality aware'' balancing strategies [LARD], which aim to avoid unnecessary document replication across the RAM of the nodes in order to improve the overall performance of the system. These are also known as content-based routing strategies.
In this work, we promote a new scalable, ``locality aware'' solution, FLEX [C99], for load balancing and management of an efficient Web hosting service. The goal of FLEX is to assign sites to the nodes in the cluster so as to achieve both load balancing and efficient memory usage. We use the working set sizes and access rates of the sites as metrics for judging their memory and load requirements.
Let there be a total of S sites hosted on a cluster of N web servers. For each web site s, using the information which can be extracted from the web server access logs of the site s, we build the initial ``site profile'' SP_s by evaluating the following characteristics:
- A(s) - the access rate to the content of a site s (in bytes transferred during the observed period P);
- W(s) - the ``working set'' of a site s: the combined size (in bytes) of all the files of site s accessed during the observed period P.
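As an illustration, both characteristics can be computed in one pass over an access log. The following is a minimal sketch, not the FLEX implementation: it assumes the log has already been parsed into (site, path, bytes_sent) tuples, and it approximates a file's size by the largest transfer observed for it. The names `SiteProfile` and `build_site_profiles` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class SiteProfile:
    access_rate: int   # A(s): bytes transferred during the period P
    working_set: int   # W(s): combined size of distinct files accessed during P


def build_site_profiles(entries):
    """Build {site: SiteProfile} from (site, path, bytes_sent) log entries.

    Assumption: a file's size is approximated by the largest response
    observed for it, since access logs record bytes sent, not file sizes.
    """
    transferred = defaultdict(int)
    file_sizes = defaultdict(dict)
    for site, path, bytes_sent in entries:
        transferred[site] += bytes_sent
        seen = file_sizes[site]
        seen[path] = max(seen.get(path, 0), bytes_sent)
    return {
        site: SiteProfile(transferred[site], sum(file_sizes[site].values()))
        for site in transferred
    }


# Hypothetical log entries for two sites.
log = [
    ("a.example", "/index.html", 2000),
    ("a.example", "/index.html", 2000),
    ("a.example", "/logo.gif", 500),
    ("b.example", "/big.pdf", 10000),
]
profiles = build_site_profiles(log)
print(profiles["a.example"])  # SiteProfile(access_rate=4500, working_set=2500)
```

Repeated requests for the same file add to A(s) but not to W(s), which is why a site can be heavy on load while light on memory, or the reverse.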
Thus, the FLEX solution consists of the following steps: 1) collection of the web sites' logs; 2) analysis of the web sites' profiles, based on evaluation of their working sets and access rates; 3) execution of the "Closest" algorithm, which allocates web sites to servers; 4) submission of new DNS configuration files with the corresponding assignment of web sites to servers.
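The "Closest" algorithm itself is specified in [C99] and is not reproduced here. To make step 3 concrete, the sketch below uses a simple greedy heuristic as a stand-in: it normalizes access rates and working sets so that an ideal node carries 1.0 of each, then places the most demanding sites first on the node where the larger of the two resulting loads is smallest. The function name `assign_sites` and the heuristic are assumptions for illustration only.

```python
def assign_sites(profiles, num_nodes):
    """Greedy stand-in for step 3; NOT the "Closest" algorithm from [C99].

    profiles: {site: (access_rate, working_set)}.
    Returns {site: node_index}.
    """
    total_a = sum(a for a, _ in profiles.values()) or 1
    total_w = sum(w for _, w in profiles.values()) or 1
    # Normalize so that a perfectly balanced node carries 1.0 of each metric.
    norm = {
        s: (a * num_nodes / total_a, w * num_nodes / total_w)
        for s, (a, w) in profiles.items()
    }
    load = [[0.0, 0.0] for _ in range(num_nodes)]
    assignment = {}
    # Place the most demanding sites first.
    for site in sorted(norm, key=lambda s: max(norm[s]), reverse=True):
        a, w = norm[site]
        # Pick the node whose worse metric stays lowest after adding the site.
        node = min(
            range(num_nodes),
            key=lambda n: max(load[n][0] + a, load[n][1] + w),
        )
        load[node][0] += a
        load[node][1] += w
        assignment[site] = node
    return assignment


# Two traffic-heavy sites and two memory-heavy sites on a two-node cluster.
example = {"s1": (100, 10), "s2": (100, 10), "s3": (10, 100), "s4": (10, 100)}
print(assign_sites(example, 2))
```

In this example each node ends up with one traffic-heavy and one memory-heavy site, so neither access load nor RAM demand concentrates on a single node.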
This solution is flexible and easy to manage. Tuning can be done on a daily or weekly basis. If analysis of the server logs shows sufficient changes in the working sets and access rates, the "Closest" algorithm finds a better partitioning of the sites among the nodes in the cluster (while keeping the number of sites whose routing changes to a minimum), and new DNS configuration files are generated. The latest version of BIND (8.1.2) supports the Dynamic Update standard described in RFC 2136, which allows authorized agents to update zone data by sending special update messages that add or delete resource records, without restarting the DNS server. Once the DNS server has updated its configuration tables, new requests are routed according to the new configuration, which leads to more efficient traffic balancing on the cluster. Entries from the old configuration tables may still be cached by other servers and used for request routing without going to the primary DNS server. However, cached entries are valid only for a limited time, dictated by the TTL (time to live); once the TTL expires, the primary DNS server is queried for updated information. During the TTL interval, both the old and the new routing can coexist. This does not lead to any problems, since all servers have access to the whole content and can satisfy any request. Such a self-monitoring solution helps in observing changing site traffic patterns, predicting future trends, and planning for them.
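One way to picture the reconfiguration step is as a diff between the old and new site-to-node assignments, emitted as Dynamic Update requests only for the sites that moved. The sketch below generates input for BIND's `nsupdate` utility, using the `server`, `update delete`, `update add`, and `send` commands as found in current BIND releases. The function name, host names, addresses, and the 300-second TTL are hypothetical, and the sketch assumes one A record per site.

```python
def nsupdate_script(old, new, server_ips, dns_server, ttl=300):
    """Return nsupdate commands for the sites whose node changed.

    old, new: {site_hostname: node_index}; server_ips: list of node IPs.
    All names and the TTL default are illustrative assumptions.
    Returns an empty string when no site needs to move.
    """
    updates = []
    for site in sorted(new):
        if old.get(site) == new[site]:
            continue  # routing unchanged, leave the record alone
        updates.append(f"update delete {site}. A")
        updates.append(f"update add {site}. {ttl} A {server_ips[new[site]]}")
    if not updates:
        return ""
    return "\n".join([f"server {dns_server}", *updates, "send"])


old = {"www.site-a.example": 0, "www.site-b.example": 1}
new = {"www.site-a.example": 0, "www.site-b.example": 0}
print(nsupdate_script(old, new, ["192.0.2.10", "192.0.2.11"], "ns1.example"))
```

A shorter TTL narrows the window during which old and new routing coexist, at the cost of more queries reaching the primary DNS server.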
Using two case studies (based on real traces), we evaluate the potential benefits of the new solution. We compare the performance of FLEX against Round-Robin and an Optimal strategy (which avoids all document replication and performs perfect load balancing at per-request granularity). FLEX significantly outperforms Round-Robin (by up to 130% in average server throughput), getting within 5%-15% of the optimal performance achievable for those traces. The miss ratio improves by a factor of 2-6. To study the scalability of the Round-Robin versus the FLEX strategy, we performed simulations for four- and eight-node clusters. The ``speedup'' under the Round-Robin strategy for an eight-node cluster is due only to the doubled ``processing'' power. The FLEX solution shows superlinear speedup when the number of nodes is increased from four to eight, because it takes advantage of both the doubled processing power and the doubled memory.
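The effect of routing on the miss ratio can be reproduced on a toy scale. The following sketch is not the simulator used in the case studies: it replays a deliberately contrived synthetic trace over per-node LRU caches, assumes every file occupies one cache slot, and compares round-robin routing with a fixed site-to-node partition. All names and parameters are hypothetical, and the resulting numbers say nothing about the real traces.

```python
from collections import OrderedDict


def miss_ratio(trace, num_nodes, cache_slots, route):
    """Replay a trace over per-node LRU caches; return the fraction of misses.

    trace: list of (site, file) requests; every file occupies one slot.
    route: function (request_index, site) -> node index.
    """
    caches = [OrderedDict() for _ in range(num_nodes)]
    misses = 0
    for i, (site, name) in enumerate(trace):
        cache = caches[route(i, site)]
        key = (site, name)
        if key in cache:
            cache.move_to_end(key)          # refresh LRU position
        else:
            misses += 1
            cache[key] = True
            if len(cache) > cache_slots:
                cache.popitem(last=False)   # evict least recently used
    return misses / len(trace)


# Synthetic trace: 3 sites x 3 files, requested cyclically for 10 rounds.
files = [(site, f) for site in range(3) for f in range(3)]
trace = files * 10

partition = {0: 0, 1: 0, 2: 1}  # hypothetical site-to-node assignment
round_robin = miss_ratio(trace, 2, 6, lambda i, site: i % 2)
partitioned = miss_ratio(trace, 2, 6, lambda i, site: partition[site])
print(round_robin, partitioned)  # 1.0 0.1
```

Under round-robin each node eventually sees all nine files, which do not fit in six slots, so the cyclic trace defeats LRU completely. Under the partition each node sees at most six files and misses only on first access.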
The main attractions of the FLEX approach are its ease of deployment and an extremely attractive cost/performance tradeoff. The solution requires no special hardware support or protocol changes. There is no single front-end routing component; such a component can easily become a bottleneck, especially if content-based routing requires it to perform operations such as TCP connection hand-offs. FLEX can be easily implemented on top of the current infrastructure used by Web hosting service providers.
A detailed description of FLEX and its simulation results can be found in [C99, CDP00]:
[C99] L. Cherkasova: FLEX: Design and Management Strategy for Scalable Web Hosting Service. HP Laboratories Report No. HPL-1999-64R1, May, 1999.
[CDP00] L. Cherkasova, M. DeSouza, S. Ponnekanti: Performance Analysis of Scalable Web Hosting Service with FLEX: Two Case Studies. HP Laboratories Report No. HPL-2000-28, February, 2000.