Next: Design of a scalable Up: Scalability of content-aware server Previous: Scalability of content-aware server

Introduction

Scalability remains a primary requirement of a modern Web-based system, which should be able to accommodate user requests that grow in number and complexity. Unfortunately, simply increasing the number of servers is not a valid solution to the scalability problem, because it moves the bottleneck from the (back-end) server side to the front-end side. This risk is even more serious when we consider that new Web-based services require the front-end component to extract from each request as much information as possible that exists at the application level but not at the TCP level. Content-aware features aggravate the front-end scalability issues by one to two orders of magnitude.
Many solutions have appeared to improve the delivery of Web content [5,3,8,11] through locally distributed Web-server systems, or Web clusters for short. For a recent survey on the topic, see [6]. Basically, a Web cluster is a set of server machines interconnected through a high-speed LAN. The cluster is publicized through one site name and one virtual IP address, which typically corresponds to the address of a dedicated front-end node. This important component, also called the Web switch, is the main focus of this paper. It acts as an interface between the nodes of the cluster and the rest of the Internet, thus masking the distributed architecture of the site from the users and the clients. The Web switch receives all client requests and routes them to a Web server node through some centralized dispatching policy.
We distinguish layer-4 from layer-7 Web switches. A layer-4 switch performs content-blind routing; that is, it does not take into account any content information in the client request when making assignment decisions. On the other hand, a layer-7 Web switch performs content-aware routing: it first establishes a complete TCP connection with the client, parses each request, and assigns a Web server node according to the content.
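The difference between the two kinds of switch can be illustrated with a minimal Python sketch. It is not the dispatching logic of the switch described in this paper: the server names, the round-robin choice for the layer-4 case, and the hash-on-path choice for the layer-7 case are all assumptions made only to show which information each decision is allowed to use.

```python
import zlib
from itertools import cycle

# Hypothetical back-end node names, used only for illustration.
SERVERS = ["server-a", "server-b", "server-c"]

_round_robin = cycle(SERVERS)


def layer4_dispatch(client_ip: str, client_port: int) -> str:
    """Content-blind choice: made at connection setup, before any HTTP
    bytes are seen, so only TCP/IP-level information is available.
    Round-robin is an assumed policy; it ignores even that information."""
    return next(_round_robin)


def layer7_dispatch(http_request: bytes) -> str:
    """Content-aware choice: the request has already been received over an
    established TCP connection, so the URL path can drive the decision."""
    request_line = http_request.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
    parts = request_line.split()
    # A request line has the form "METHOD PATH VERSION"; fall back to "/".
    path = parts[1] if len(parts) >= 2 else "/"
    # Assumed policy: hash the path so the same resource always reaches
    # the same node, which favours that node's cache.
    return SERVERS[zlib.crc32(path.encode()) % len(SERVERS)]
```

With this sketch, two requests for the same path always reach the same node under `layer7_dispatch`, whereas `layer4_dispatch` spreads consecutive connections over the nodes regardless of what they ask for. The price of the content-aware variant is that the switch must accept the connection and parse the request before it can choose a server.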
Content-aware routing allows a Web cluster to use sophisticated dispatching strategies, improves cache hit rates, permits content partitioning, and exposes a much larger set of user/client information. However, a layer-7 Web switch tends not to be used as the front-end component of a popular Web-based information system because it has been shown to be less efficient than a layer-4 Web switch. As an example, Aron et al. [3] show that the peak throughput achieved by a layer-7 switch is limited to 3500 conn/sec, while a software-based layer-4 switch implemented on the same hardware is able to sustain a throughput of up to 20000 conn/sec. To improve the scalability of layer-7 architectures, alternative solutions for scalable Web-server systems that combine content-blind and content-aware request functionality have been proposed, e.g., [18,27].
The motivation of this paper comes from the observation that absolute efficiency is not the right measure for judging whether a content-aware Web switch can be used. Instead, its performance should be related to the operational requirements that a Web switch has to satisfy in a realistic multi-tier environment. These include the connection of the Web cluster to the Internet (for example, for economic reasons the large majority of Web clusters use nothing faster than T3-based connections, which have a peak bandwidth of 45 Mbps), the HTTP servers (with typical workloads and modern hardware, they are no longer the system bottleneck, unless they have to manage secure transmissions), and the back-end servers (which can easily become the system bottleneck when dynamic requests are computationally expensive). Moreover, when the classes of services provided by the Web site require peak throughputs higher than 40-50 Mbps, it is more likely that a different architecture should be considered, for example a system distributed over a geographical area.
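A back-of-the-envelope calculation shows why the outgoing link matters when judging the switch. The sketch below relates the 45 Mbps T3 figure quoted above to a response rate. The average response size of 10 KB is purely an assumption chosen for illustration, not a measurement from this paper, and the calculation ignores protocol overhead and assumes one response per connection.

```python
def link_limited_rate(link_mbps: float, avg_response_kbytes: float) -> float:
    """Responses per second that a link can carry if every response has
    the given size (1 KB taken as 1000 bytes, 1 Mbps as 10**6 bit/s)."""
    bits_per_response = avg_response_kbytes * 1000 * 8
    return link_mbps * 1_000_000 / bits_per_response


T3_MBPS = 45                  # peak T3 bandwidth, as quoted in the text
ASSUMED_RESPONSE_KBYTES = 10  # assumption, for illustration only

rate = link_limited_rate(T3_MBPS, ASSUMED_RESPONSE_KBYTES)
print(f"link-limited rate: {rate} responses/sec")  # 562.5 under these assumptions
```

Under these assumptions the link saturates at a few hundred responses per second, well below the 3500 conn/sec reported for a layer-7 switch in [3]. Smaller responses raise the link-limited rate proportionally, so the comparison depends strongly on the assumed size.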
These motivations induced us to investigate whether the previous prejudices against layer-7 Web switches are still valid when one considers modern hardware and multi-tier architectures for content-aware distribution in cluster-based Web information systems. We describe the design and implementation of an efficient, content-aware Web switch (called ClubWeb-1w) that takes advantage of the features and optimizations available on modern PC-based architectures. We demonstrate that careful design and implementation choices produce a Web switch with content-aware functionality and very limited overhead. A careful analysis of its performance demonstrates that the proposed solution is extremely scalable, thus making a content-aware Web switch a viable solution to the performance requirements of the majority of popular Web sites based on cluster architectures. The most important contributions of the layer-7 Web switch are outlined below and discussed in the following sections.
The implemented Web switch has been subjected to a large variety of performance tests. All results confirm that the proposed layer-7 Web switch has a low overhead, even when the Web servers tend to be saturated. Moreover, we show that the Web switch scales well across multiple server nodes. Finally, we also evaluate the performance of the Web cluster under realistic workload conditions. Again, we show that the switch is able to handle several thousand connections per second without being the bottleneck of the whole system. We can conclude that the proposed one-way architecture is extremely scalable, thus making content-aware routing a viable solution to the requirements of the majority of network services provided by cluster-based architectures.
The rest of this paper is organized as follows. In Section 2, we describe the main requirements, the major issues, and our solutions for an efficient design of the layer-7 one-way Web switch. Section 3 outlines two content-aware dispatching policies that we use for the experiments. Section 4 presents the implementation details, with a focus on the techniques to obtain the best performance from single- and dual-processor architectures. Section 5 contains the performance study. Section 6 concludes the paper with some final remarks.
Mauro Andreolini 2003-03-13