Next: Implementation details Up: Scalability of content-aware server Previous: Design solutions for efficiency
The CAP policy [7] is oriented to Web sites providing heterogeneous services with different computational impact on system resources. The set of static and dynamic services provided by the Web site is divided into classes, each one stressing the system components in a different way. The CAP algorithm works as follows. A list of circular pointers to servers is maintained, one for each service class. As soon as a client request is received at the switch, the parser module extracts the embedded URL and identifies the associated service class. Then, a round robin assignment within the given service class is performed, using the appropriate pointer.

The basic observation of CAP is that, when the Web site provides heterogeneous services, each client request could stress a different Web system resource. Although the Web switch cannot accurately estimate the service time of a static or dynamic request, it can distinguish the class of the request from the URL and estimate its main impact on each Web system resource. A feasible classification for CAP is to consider disk-bound, CPU-bound, and network-bound services, but other choices are possible depending on the content and services provided by the Web site.

We also implemented the Locality-Aware Request Distribution (LARD) policy [2,22], which tends to maximize cache hit rates of static resources. As soon as the Web switch receives an HTTP request, the parser module extracts the URL. Next, it checks whether the requested URL has already been handled by any Web server node. If this is the case, the request is forwarded to that node, unless it is overloaded. To avoid potentially unfair assignments, the server load is estimated through a centralized load monitor that counts the number of active connections for a given request class (static, dynamic). A Web server is considered overloaded if the number of open connections exceeds a given threshold. If the chosen server is overloaded, the least loaded node is chosen instead.
If the URL has not yet been assigned to a Web server, the least loaded node is chosen as well. The rationale behind LARD is that, by assigning the same Web object to the same Web server, the requested object is more likely to be found in the disk cache of the server node.
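The two dispatching policies described above can be illustrated with a minimal Python sketch. It is not the implementation used in the Web switch, only an approximation under stated assumptions: the class names `CapDispatcher` and `LardDispatcher`, the `classify` rules, and the server labels are hypothetical; the LARD load monitor is simplified to a single counter of open connections per server instead of one per request class; and the sketch assumes that a URL is remapped to the newly chosen node when its previous node is overloaded, which the text does not specify.

```python
from collections import defaultdict


def classify(url):
    """Hypothetical URL-to-class rules; a real site would define its own."""
    if url.startswith("/cgi-bin/"):
        return "cpu"
    if url.endswith((".mpg", ".zip")):
        return "network"
    return "disk"


class CapDispatcher:
    """CAP-style sketch: round robin with one circular pointer per service class."""

    def __init__(self, servers, classify_fn):
        self.servers = list(servers)
        self.classify_fn = classify_fn
        self.pointer = defaultdict(int)  # service class -> index of next server

    def assign(self, url):
        cls = self.classify_fn(url)
        server = self.servers[self.pointer[cls]]
        # Advance only the pointer of this class, wrapping around the server list.
        self.pointer[cls] = (self.pointer[cls] + 1) % len(self.servers)
        return server


class LardDispatcher:
    """LARD-style sketch: keep a URL on the node that served it, unless overloaded."""

    def __init__(self, servers, threshold):
        self.servers = list(servers)
        self.threshold = threshold
        self.active = {s: 0 for s in self.servers}  # simplified load monitor
        self.owner = {}  # URL -> node that last handled it

    def least_loaded(self):
        return min(self.servers, key=lambda s: self.active[s])

    def assign(self, url):
        server = self.owner.get(url)
        # Unknown URL, or known node over the threshold: pick the least loaded node.
        if server is None or self.active[server] > self.threshold:
            server = self.least_loaded()
            self.owner[url] = server  # assumption: remap the URL to the new node
        self.active[server] += 1
        return server

    def release(self, server):
        """Call when a connection to `server` is closed."""
        self.active[server] -= 1


if __name__ == "__main__":
    cap = CapDispatcher(["s1", "s2", "s3"], classify)
    for url in ["/a.html", "/cgi-bin/x", "/b.html"]:
        print("CAP ", url, "->", cap.assign(url))

    lard = LardDispatcher(["s1", "s2"], threshold=1)
    for url in ["/a.html", "/a.html", "/a.html"]:
        print("LARD", url, "->", lard.assign(url))
```

In the CAP sketch, requests of different classes advance independent pointers, so two consecutive CPU-bound requests never land on the same node even when disk-bound requests are interleaved between them. In the LARD sketch, repeated requests for the same URL stay on one node until its connection count exceeds the threshold.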
Mauro Andreolini 2003-03-13