BLT: Bi-Layer Tracing of HTTP and TCP/IP
Universität des Saarlandes, Saarbrücken, Germany
We describe BLT, a tool for extracting full HTTP-level as well as TCP-level traces via packet monitoring. This paper presents the software architecture that allows us to collect traces continuously, online, and at any point in the network. The software has been used to extract extensive traces within AT&T WorldNet since spring 1997, as well as at AT&T Labs-Research. BLT offers a much richer alternative to Web proxy logs, client logs, and Web server logs due to the accuracy of its timestamps, the level of detail available by considering several protocol layers (TCP/IP and HTTP events), and its non-intrusive way of gathering data. Traces gathered using BLT have provided the foundation of several Web performance studies [16,29,14,18,24,25].
To improve the performance of the network and its protocols, it is important to characterize the dominant applications [7,12,13,18,29,32,33]. Given the popularity of the Web, it is crucial to understand how usage relates to the performance of the network, the servers, and the clients. Only by utilizing data about all events initiated by the Web (including TCP and HTTP events) can one hope to understand the chain of performance problems that current Web users face. Such comprehensive information is only available via packet monitoring. Unfortunately, extracting HTTP information from packet sniffer data is non-trivial due to the huge volume of data, the line speed of the monitored links, the need for continuous monitoring, the need to preserve privacy, and the need to be able to monitor at any point in the network. These needs translate into requirements for online processing and online extraction of the relevant data, the topic of this paper.
The software described in this paper runs on the PacketScope monitor developed by AT&T Labs. PacketScope is deployed at several locations within AT&T WorldNet, a production IP network, and at AT&T Labs-Research. One PacketScope monitors T3 backbone links, another monitors traffic generated by a large set of modems on a FDDI ring (or traffic on other FDDI rings), and a third monitors traffic between AT&T Labs-Research and the Internet. First deployed in spring 1997, the software has run without interruption for weeks at a time, collecting and reconstructing detailed logs of millions of Web downloads with a worst-case packet loss of less than 0.3%.
The rest of this paper is organized as follows. Section 2 discusses the advantages of packet sniffing and Section 3 outlines some of the difficulties of extracting HTTP data from packet traces. The overall software architecture is described in Section 4. Our solution (including the logfile format) is presented in Sections 5-7. In Section 8 we revisit some of the studies based upon data collected by BLT and point out how each study benefited from the data. Finally, Section 9 briefly summarizes some of the lessons learned.
There are many ways of gaining access to information about user accesses to the Web:
- from users running modified Web Browsers;
- from Web content provider logging information about which data is retrieved from their Web server;
- from Web proxies logging information about which data is requested by the users of the Web proxy;
- from the wire via packet monitoring.
While each of these methods has its advantages, most have severe limitations regarding the detail of information that can be logged. Distributing modified Web browsers to a representative sample of consumers, and having them agree to monitor their browsing behavior, is problematic, especially since Microsoft Internet Explorer and Netscape's browser became more popular than Mosaic and Lynx. The source code to Microsoft Internet Explorer is not available, and the source code to Netscape has only recently become available. Some studies, such as Crovella et al., clearly show the benefit of such data sources. Yet, to evaluate changes in Web client access patterns between 1995 and 1999, the same authors augmented a proxy instead of modifying the client.
While the logfiles from Web servers are extremely useful for tuning the performance of a specific Web server, they are not necessarily representative of the overall Web. The access patterns from users to specific files are heavily influenced by what content the Web server is offering [14,28]. Therefore, many Web server logs have to be analyzed in order to generalize to the overall Web. While possible [3,11,28], this is non-trivial. Another limitation is that the standard log files generated by Web servers currently do not include sufficient detail regarding the timing of all aspects of data retrieval.
Using a Web proxy for logging information can be suboptimal, especially if not all users are encouraged (or forced) to use the Web proxy, if it is impossible to instrument the Web proxy, or if insufficient detail is available in the logged information. The work of Mogul et al. [31,26] shows how useful Web proxy traces can be. One benefit of packet traces over proxy traces is very precise timestamps.
The information gathered from packet monitoring includes the full HTTP header information plus detailed timestamps of HTTP and TCP events. This may include, e.g., the timestamp for when a GET request was issued, when the corresponding HTTP response was sent, and when the first/last data packet was sent. In addition, full IP packet headers can be collected. The advantages of monitoring on the wire via packet sniffing include that this methodology is passive and therefore oblivious to the user; it does not impact the performance of the network. The amount of detail that can be gathered is sufficient to capture TCP and HTTP interactions. If desired and allowed, BLT has the potential to collect the actual downloaded Web page (including results from CGI scripts). Having this detail available has enabled studies of the effectiveness of delta-encoding and compression, the rate of change of Web pages, Web cache coherency schemes, the benefit of Web caching for heterogeneous bandwidth environments [16,8], and the characterization of IP-flows for WWW traffic.
Three other projects have used packet level data to extract Web data. A group from IBM augmented their Web server logs with partial packet level data during the collection of traces at the Olympics. This data allows TCP level performance characterization and analysis. Still, while it is possible to glean information about the access patterns within a site, it is impossible to learn about cross-site effects. A group at Berkeley used a packet sniffer on the HomeIP network at the University of California at Berkeley. They wrote their own software to continuously extract HTTP information on top of the Internet Protocol Scanning Engine (ISPE). Their user-level HTTP module sits on top of the TCP module and mainly logs HTTP level information. Since the main interest of the authors at the time of the sniffer code development was focused on HTTP traces, they currently log neither full HTTP headers nor the full set of timestamps for TCP events. For studies of Web caching and the burstiness of the arrival pattern of Web requests [16,8,17,20], these missing details can lead to misleading predictions. A group at Virginia Tech developed HTTPDump to extract HTTP headers from tcpdump traces. The performance of their general tool is not sufficient to collect continuous traces on a 10 Mbit/second Ethernet. The simpler PERL version, which only parses the first packet of the first HTTP request/response on a TCP connection, promises good performance but is severely limited in its generality.
Other applications that access Web data at the packet level are Layer 5 switching and content-based request distribution schemes. Both redirect HTTP requests towards different servers based upon the content of the HTTP request, by either moving the TCP state, rewriting the TCP sequence numbers, or a combination of the two methods. Layer 5 switching is easier than layer 5 information extraction because the switch is in the data path and close to the Web server. Therefore it can throttle the server, and it should see both sides of the packet stream.
The work most closely related to ours in terms of being able to collect 24x7 (24 hours a day, 7 days a week) passive measurements at key locations in the network is Windmill. Windmill offers an extensible experimental platform in which application modules can process the subsets of the packet stream that they need. Our software design is driven by the desire to collect extensive TCP/IP and HTTP level traces. As such, we have identified the key events to log for Web performance studies, and, since more than 70% of the traffic is Web traffic, we have designed the system for maximal throughput and minimal interaction between data collection and data processing.
Adding support for Web trace extraction goes well beyond the basic idea of packet sniffers like tcpdump, namely that any packet can be processed in isolation. Indeed, the extraction software has to run almost a full TCP and HTTP stack in order to demultiplex the packets and extract the content. The software has to go from packets to TCP connections, from TCP connections to individual HTTP transactions (there may be more than one), and from individual HTTP transactions to HTTP requests, HTTP responses, and the transferred data. While this is hard enough to implement correctly on an end system, it is even harder here because the software is incapable of throttling the end systems, cannot make any assumptions about the compliance of either the clients or the servers with the TCP and HTTP specifications, and may not see both halves of a TCP connection. Even worse, due to packet-by-packet load balancing, it may see only every other packet for some (currently small) fraction of the transfers. On the other hand, the software has the advantage that it does not have to be perfect. Our desire is to gather continuous traces without downtime on a high speed transmission medium such as FDDI or multiple T3's with capacities greater than 100 Mbit/second. In such a setting it is almost impossible not to lose some small fraction of the HTTP transactions due to packet losses at the sniffer. It is possible to keep packet losses small (e.g., by running the sniffer at higher priority), but it is impossible to guarantee that no packet will ever be lost. Therefore the resulting trace data should only be used for analyses that are statistically robust against losing a very small fraction of the transactions.
To get a better flavor of the problems that need to be addressed, consider the following subproblems: assumptions about how Web pages and their meta information are fragmented into packets, assumptions about how TCP connections are used by HTTP, demultiplexing (including reordering and loss) of TCP packets into HTTP transactions, and sanity checking of the extracted information.
Web pages and their meta information are often fragmented into TCP packets in an unexpected fashion:
- A single HTTP request or response header can be almost arbitrarily long. Sizes of greater than 5,000 bytes are not too uncommon (as such, they are certainly longer than the typical 1,500 bytes per packet).
- A single HTTP header (even if less than 500 bytes) can easily be split among multiple packets. (It is not too uncommon to see each line of an HTTP request or response being transmitted in a single packet.)
- Retransmitted data may have a completely different fragmentation. For example, the original HTTP request and the first 300 bytes of data might have been transmitted in 5 separate packets; the end system, thinking that the packets got lost, may retransmit all of the data in a single packet.
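These fragmentation issues are why a robust extractor must reassemble the byte stream keyed on TCP sequence numbers instead of parsing packet by packet. The following is a minimal illustrative sketch of that idea (ours, not BLT's C implementation), in which a retransmission with different packet boundaries overlays the earlier fragments cleanly:

```python
# Sketch: reassemble a byte stream by absolute sequence offset, so that a
# retransmission fragmented differently does not confuse the parser.

def reassemble(segments):
    """segments: list of (seq, payload_bytes); returns the contiguous prefix."""
    buf = {}
    for seq, data in segments:
        for i, b in enumerate(data):
            buf[seq + i] = b          # a retransmitted byte simply overwrites itself
    if not buf:
        return b""
    out, pos = bytearray(), min(buf)
    while pos in buf:                 # stop at the first gap (missing packet)
        out.append(buf[pos])
        pos += 1
    return bytes(out)

# Original transfer: request split across three small packets...
orig = [(1000, b"GET /inde"), (1009, b"x.html HT"), (1018, b"TP/1.0\r\n")]
# ...later retransmitted as one packet with different fragmentation:
retx = [(1000, b"GET /index.html HTTP/1.0\r\n")]
assert reassemble(orig + retx) == b"GET /index.html HTTP/1.0\r\n"
```

A per-byte dictionary is wasteful but makes the overlay semantics explicit; a production implementation would merge ranges instead.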
HTTP uses TCP as its underlying transport protocol leading to the following issues:
- A TCP connection may be terminated at any point.
- On abort, some HTTP clients will close the connection via RST; others will send a FIN.
- Even in HTTP 1.0 one TCP connection can be used to transfer multiple HTTP requests and responses.
- Within a single TCP connection, determining when a transfer has completed and new meta information starts is non-trivial, because nothing marks the data transfer as completed.
- HTTP header information does not always start at the beginning of a packet.
- Even an HTTP GET request may contain data.
- Multiple HTTP requests can be pipelined on a TCP connection.
Demultiplexing of the TCP packets into HTTP transactions implies dealing with lost packets, retransmitted packets, and reordered packets:
- The packet sniffer may lose any packet (even the ones containing the TCP open connection or close connection events). Therefore using a TCP connection as the demultiplexing unit is problematic.
- The sniffer may lose the packet containing the HTTP header or response information and therefore has to ignore the data associated with the request.
- Even packets containing the newline separating the HTTP response from the HTTP data can get lost, making it tricky to decide when the HTTP meta data ends and when the real data begins.
- The packet containing the Web page data may not always arrive at the sniffer location before the packet containing the HTTP response.
- A packet containing the HTTP request may be received after the packet for the HTTP response for the HTTP request is received. (This is possible since the source might have thought that the packet got lost and retransmitted the packet.)
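The demultiplexing steps above can be sketched as a small preprocessing pass (an illustrative Python sketch under our own helper names, not BLT's code): sort the captured segments, drop exact duplicates, and flag a GAP whenever the monitor evidently missed a packet.

```python
# Sketch: order captured segments, eliminate duplicate retransmissions,
# and detect holes caused by packets lost at the monitor.

def order_and_check(segments):
    """segments: list of (seq, length). Returns (ordered_segments, gap_flag)."""
    ordered = sorted(set(segments))          # dedup identical retransmissions, sort
    gap = False
    expected = ordered[0][0] if ordered else 0
    for seq, length in ordered:
        if seq > expected:                   # hole: a packet the monitor never saw
            gap = True
        expected = max(expected, seq + length)
    return ordered, gap

# Reordered arrival plus one duplicate retransmission:
segs = [(2000, 100), (1000, 500), (1500, 500), (1000, 500)]
ordered, gap = order_and_check(segs)
assert [s for s, _ in ordered] == [1000, 1500, 2000]
assert gap is False

# Bytes 1100-1199 never arrived at the sniffer:
_, gap2 = order_and_check([(1000, 100), (1200, 100)])
assert gap2 is True
```

As described in Section 5, transactions with the GAP flag set are excluded from analysis rather than repaired.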
While it seems easy to debug HTTP extraction software, not every apparent bug is actually a bug in the software:
- The HTTP response content length is not always accurate.
- HTTP header fields can contain wrong or misleading information, e.g., bogus last-modified times.
- Requests to the same Web page need not yield the same results; some sites return customized data (e.g., depending on the browser type).
All of the above indicate that one needs a sophisticated tool to extract HTTP information from packet level data and that just inspecting the first x bytes of each Web TCP connection is insufficient.
The hardware and software design for the monitoring system was driven by the desire to gather continuous traces without downtime on a high speed transmission medium. If a collection machine is capable of capturing packets on a medium, then it should be possible to run BLT. The software should be deployable even on backbone links. Due to asymmetric routing, common in today's Internet, backbone links may see the packets of only one direction of a TCP connection.
Figure 1: AT&T PacketScope architecture
Hardware design: The hardware of the AT&T PacketScope consists of standard components: a DEC Alpha 500 MHz workstation with an 8 Gigabyte RAID disk array and a 7-tape DLT tape robot. For more details on the hardware architecture see Figure 1. Several security precautions have been taken, including using no IP addresses and using read-only device drivers. The DEC Alpha platform was chosen because of the kernel performance optimizations to support packet sniffing by Mogul and Ramakrishnan.
Software design, online vs.\ offline extraction: Given that HTTP headers can easily be larger than 1500 bytes and will span multiple packets, we had no choice but to collect full packet traces off the wire. At speeds of 100 Mbit/second this implies that the processing of the data into the log format has to be done on the monitoring machine itself: no current DLT tape technology can transfer data to tape at a rate anywhere close to 100 Mbit/second, nor does the disk system allow storage of more than a few hours of data. Besides, processing the logs offline would introduce serious privacy concerns with respect to the data content of the packets. Since the PacketScope, due to its placement in the network, may only see packets either directed to the Web server or to the Web client, no matching of HTTP requests with their HTTP responses is done online. Rather, where possible, this is done offline.
Software design, partitioning of the software: Packet sniffing involves having the packets pass through at least some part of the protocol stack on the monitoring machine at interrupt level. At line rate, even pure packet sniffing can stress such powerful machines as the DEC Alphas. The amount of processing per packet for HTTP header extraction is variable and potentially quite large. For example, one can imagine collecting all packets of a TCP connection and extracting the HTTP information only upon receiving a packet with a FIN flag; the processing time for the packet with the FIN flag would then be much larger than for any other packet of the same TCP connection. Due to this variable processing time, we separated the processing priorities of the tasks: high priority for packet sniffing, lower priorities for HTTP extraction and any other software. To avoid interference between the HTTP extraction software and packet sniffing, the extraction software should avoid processing at interrupt level. Splitting the software into two stages introduces the need to pipeline the processing. We chose to use files as buffers between the collection and the processing stages.
Figure 2: Control flow of HTTP header extraction software.
We decompose the overall task into four components: packet sniffing, a control script, HTTP header extraction, and HTTP header matching. Figure 2 shows how the first three interact with each other.
Packet sniffing: software based on tcpdump that copies a fixed number of bytes from each packet to a file. Once it has processed some number of packets, this software closes the current file, moves it to a different directory, and opens a new file. In addition, all IP addresses are encrypted as they come off the wire, before being saved to disk. This process runs at normal priority.
Control script: a perl script that controls the pipeline. It monitors a directory and starts the HTTP header extraction software for each file that the packet sniffing software generates. Once the header extraction software is done, it copies the logfiles to tape and cleans up the disk. Besides controlling the copying of files, the control script also monitors tape usage, switches tapes on the tape robot, and allows personnel at the PacketScope locations to change tape sets at any point in time.
HTTP header extraction: software that processes the files generated by tcpdump (containing packets with full data content) and extracts logfiles containing full HTTP request and response headers, relevant timestamp information, TCP timestamps, and data summarizing the data portion of the HTTP requests/responses. In addition, the software creates pure packet header tcpdump files for the observed traffic. The software extends tcpdump to reconstruct HTTP sessions and is run niced to the maximum possible level.
HTTP header matching: offline post-processing software that matches HTTP request information with HTTP response information where applicable. The match is based upon either a match between sequence numbers and acknowledgment numbers or additional heuristics.
The benefit of building most of the software on top of tcpdump is that we can take full advantage of its filtering mechanism and its built-in knowledge of the IP/TCP protocol stack. The filtering mechanism is especially useful if BLT is run in an environment where the capturing hardware is at its limits; in this case a more restrictive filter may provide BLT with the necessary cycles. (Note that not all Web traffic uses port 80.) Adding support for multiple files to the packet sniffing software is a trivial extension of tcpdump. Next, we give more detail on the HTTP header extraction (Section 5), the logfile format (Section 6), and the HTTP header matching software (Section 7).
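The control script's core loop can be sketched as follows. This is a simplified Python sketch of the idea only (the real controller is a perl script that additionally manages the tape robot); the function and parameter names are ours.

```python
# Sketch: watch the handoff directory and run the extractor on each file the
# sniffer has rotated out, then archive and remove it.
import os
import subprocess
import time

def control_loop(handoff_dir, extractor_cmd, poll_secs=5, once=False):
    """Run extractor_cmd on every new file appearing in handoff_dir."""
    seen = set()
    while True:
        for name in sorted(os.listdir(handoff_dir)):
            path = os.path.join(handoff_dir, name)
            if path in seen:
                continue
            seen.add(path)
            # The real extractor runs "niced to the maximum possible level"
            # so it never competes with the packet sniffer for cycles.
            subprocess.run(extractor_cmd + [path], check=False)
            os.remove(path)   # stand-in for "copy logfiles to tape, clean up disk"
        if once:
            return
        time.sleep(poll_secs)
```

Using files as the buffer between the two stages means the sniffer never blocks on the extractor: a slow extraction run simply lets files accumulate on disk.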
The software is built along the following lines:
- Demultiplexing packets according to ``Web pages'', using ``IP-flows''
- Reordering packets according to TCP sequence numbers
- Eliminating duplicate packets (due to retransmissions)
- Identifying missing packets (due to loss at the monitor, or due to packet-by-packet load balancing)
- Extracting the HTTP protocol header information and the HTTP body part and timestamp information from the data content of the TCP packets
- Extracting relevant TCP timestamp information
- Computing the HTTP protocol information and summarizing information about the HTTP data part, such as the length of the data content, and starting and ending sequence numbers.
- Unless the policy of the collection location allows the storage of the HTTP data part, it is discarded immediately and should never leave the monitoring machine.
To not impede the packet sniffing effort, it is crucial to avoid unnecessary file I/O; therefore the software should stay memory resident. This makes it impossible to follow the above recipe step by step while continuously monitoring packets: the memory requirements for storing about 200,000 packets, each of size 1,500 bytes, alone exceed the memory of our monitoring machine. Therefore it is necessary to split the steps outlined above into substages. Whenever all packets have been received for one HTTP transaction, its information is extracted. Unfortunately, a single transaction can involve thousands of packets; therefore even this step has to be staged: whenever a sufficient number of packets has been received for one HTTP transaction, its partial information is extracted. The clean-up step controls this staging.
Figure 3: HTTP Header extraction
Instead of using TCP connections, we use IP-flows [10,18] to demultiplex packets. This accounts for the possibility of losing packets with TCP flags, monitoring only packets from one side of a TCP connection, and the fragmentation of Web pages and their meta information. An IP-flow is a set of packets that are close in time and that have the same IP addresses and TCP port numbers for both the source and the destination. Our definition of close in time is somewhat looser than earlier definitions, using a 10 minute timeout value (a compile time constant). For the most part, all packets in an IP-flow correspond to a single unidirectional TCP connection, and all packets in a single unidirectional TCP connection correspond to a single flow. The main data structure is a per-flow list of packets and a list of partial information extracted from this flow. The goal is to append any new incoming packet to the correct list of packets and then, at appropriate time intervals, extract the HTTP information. Figure 3 shows a schematic of how the various steps operate on the per-flow data structure.
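The IP-flow demultiplexing rule can be made concrete with a small sketch (illustrative Python under our own names, not BLT's data structures): packets map to the same flow if they share the unidirectional 4-tuple and arrive within the timeout.

```python
# Sketch: flow table keyed on the unidirectional 4-tuple with a 10-minute
# idle timeout (a compile-time constant in BLT).
FLOW_TIMEOUT = 600  # seconds

class FlowTable:
    def __init__(self):
        self.flows = {}    # 4-tuple -> (flow_id, last_seen)
        self.next_id = 0

    def lookup(self, src_ip, src_port, dst_ip, dst_port, now):
        key = (src_ip, src_port, dst_ip, dst_port)
        entry = self.flows.get(key)
        if entry is None or now - entry[1] > FLOW_TIMEOUT:
            entry = (self.next_id, now)      # brand-new flow, or the old one aged out
            self.next_id += 1
        self.flows[key] = (entry[0], now)
        return entry[0]

ft = FlowTable()
a = ft.lookup("10.0.0.1", 1024, "192.0.2.1", 80, now=0)
b = ft.lookup("10.0.0.1", 1024, "192.0.2.1", 80, now=30)    # same flow
c = ft.lookup("10.0.0.1", 1024, "192.0.2.1", 80, now=1000)  # idle > 600s: new flow
assert a == b and c != a
```

Note that the key is unidirectional: the two halves of one TCP connection, if both are visible, form two distinct flows that are joined only during offline post-processing.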
If every TCP connection contained exactly one HTTP transaction, the key for indexing the per-flow data structure would correspond to a single HTTP request or response. Unfortunately, the use of persistent connections is common enough, even in HTTP/1.0, that the key is not sufficient. Instead, the matching of HTTP requests and HTTP responses is done by matching the sequence numbers with the acknowledgments. There may not always be a match, because not all HTTP requests generate an HTTP response message. Given the complications of finding the match between HTTP requests and HTTP responses, and the fact that it is separable from the information extraction, this step is done during offline post-processing.
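The sequence/acknowledgment matching heuristic can be sketched as follows. The record layout and field names here are hypothetical simplifications; the point is only that a response acknowledges the sequence numbers of the request it answers, which disambiguates multiple transactions on one persistent connection.

```python
# Sketch: pair each response with the request whose ending sequence number
# its acknowledgment covers.

def match(requests, responses):
    """requests: [(req_id, end_seq)]; responses: [(resp_id, ack)].
    Returns matched (req_id, resp_id) pairs; unmatched requests are dropped,
    mirroring the fact that not every request draws a response."""
    by_end_seq = {end_seq: req_id for req_id, end_seq in requests}
    pairs = []
    for resp_id, ack in responses:
        if ack in by_end_seq:
            pairs.append((by_end_seq[ack], resp_id))
    return pairs

# Two requests pipelined on one persistent connection:
reqs = [("GET /a", 1250), ("GET /b", 1600)]
resps = [("200 /a", 1250), ("200 /b", 1600)]
assert match(reqs, resps) == [("GET /a", "200 /a"), ("GET /b", "200 /b")]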
The extraction stage performs the steps outlined at the beginning of the section on a single, sufficiently long, list of packets. First, the packets are ordered according to their sequence numbers. (The sorting is efficient since most packets are only slightly or not at all out of order. The chosen sorting routine handles sequence number overflows correctly.) Next, either all or an initial subset of the packets is used to extract the TCP/HTTP timing and HTTP request/response information. If no packet has been received for an IP-flow within the last 10 minutes, all packets are processed. If the list contains more than 300 packets, the first, say, 200 packets (both numbers are compile time constants) within the list are processed. By processing only about 2/3 of the packet list, current gaps in the later part of the list are likely to be filled by packets that are still in transit. (The TCP window size is limited.) In contrast, processing all packets would lead to many more missing packets and a much more incomplete logfile.
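Handling sequence number overflow correctly means sorting in the 32-bit circular sequence space rather than by plain integer comparison. A minimal sketch of such a wraparound-safe comparison (our own illustration, in the style of serial number arithmetic, not BLT's routine):

```python
# Sketch: order TCP sequence numbers in their 32-bit circular space, so a
# stream that wraps past 2**32 still sorts correctly.
import functools

def seq_cmp(a, b):
    """Return <0 if a precedes b in 32-bit sequence space, >0 if it follows."""
    d = (a - b) & 0xFFFFFFFF
    if d == 0:
        return 0
    return -1 if d > 0x80000000 else 1

# A stream wrapping past 2**32: plain sorting would put 0xFFFFFFF0 last.
segs = [0x00000010, 0xFFFFFFF0, 0x00000001]
ordered = sorted(segs, key=functools.cmp_to_key(seq_cmp))
assert ordered == [0xFFFFFFF0, 0x00000001, 0x00000010]
```

Since most packets arrive nearly in order, an insertion-style sort over this comparison runs in close to linear time in practice.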
Due to persistent connections, an HTTP header can start at any point during an IP-flow. The HTTP header information is found by looking for one of the following patterns. (For simplicity we denote the patterns using perl notation, even though the implementation is in C.)
Here \n may be the UNIX or the MSDOS newline character. The end of the HTTP header is found by looking for two consecutive CRLFs or upon exceeding a preset limit of 50,000 bytes. This limit is necessary in case the packet containing the newline is lost. Whenever a gap in the sequence numbers is discovered, the flag GAP is set in the logfile and the data should be disregarded in further analysis. In general the packet loss is well below 0.3%, and very few HTTP transactions are affected by losses.
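The header-end scan can be sketched as follows (an illustrative Python sketch; the constant and function names are ours). It tolerates both CRLF and bare LF line endings and gives up after the preset limit, in case the packet carrying the blank line was lost.

```python
# Sketch: locate the blank line terminating an HTTP header, bounded by a
# 50,000-byte limit so a lost terminator packet cannot stall the extractor.
import re

HEADER_LIMIT = 50_000
END_OF_HEADER = re.compile(rb"\r?\n\r?\n")   # CRLFCRLF, or bare LFLF

def find_header_end(stream):
    """Return the offset just past the header terminator, or None if no
    terminator appears within HEADER_LIMIT bytes."""
    m = END_OF_HEADER.search(stream[:HEADER_LIMIT])
    return m.end() if m else None

hdr = b"HTTP/1.0 200 OK\r\nContent-Length: 5\r\n\r\nhello"
end = find_header_end(hdr)
assert hdr[:end].endswith(b"\r\n\r\n") and hdr[end:] == b"hello"

# No blank line within the limit: give up rather than buffer forever.
assert find_header_end(b"X-Endless: " + b"a" * HEADER_LIMIT) is None
```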
To support partial processing of packets, our software keeps a state machine for any active flow. The state machine records
- the number of TCP connections using this flowid since the flowid was started
- the number of HTTP requests within the TCP connection
- timestamp of the packet with the first sequence number
- file containing the extracted data content (only at appropriate trace locations)
- flag indicating partial HTTP data extraction yes/no
- size of partially extracted data content
- the next sequence number
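The per-flow state listed above can be summarized as a small record. The field names below are ours, mirroring the list rather than BLT's actual C struct:

```python
# Sketch: per-flow state machine record for partial processing.
from dataclasses import dataclass
from typing import Optional

@dataclass
class FlowState:
    tcp_connections: int = 0           # TCP connections seen on this flowid
    http_requests: int = 0             # HTTP requests within the current connection
    first_seq_time: Optional[float] = None  # timestamp of the first-sequence packet
    body_file: Optional[str] = None    # extracted data content (where permitted)
    in_partial_body: bool = False      # currently mid-way through an HTTP body?
    partial_body_size: int = 0         # bytes of the body extracted so far
    next_seq: int = 0                  # next expected sequence number

s = FlowState()
s.tcp_connections += 1                 # a new connection reuses this flowid
s.in_partial_body = True               # partial extraction paused inside a body
s.partial_body_size, s.next_seq = 1460, 4660
assert (s.tcp_connections, s.partial_body_size) == (1, 1460)
```

The last three fields are exactly what makes it safe to process only a prefix of a flow's packet list: extraction can resume at `next_seq` knowing how much body has already been written out.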
It is necessary to record the number of TCP connections using a flowid even though it is unlikely that the same customer reuses a given flowid for a different TCP connection. In modem environments, however, reuse is more likely, since a given IP address can be reassigned to different customers, who may use the same port number to visit, for example, the same Web server. (Port numbers on newly rebooted machines usually start at a fixed number, 1024.)
Since persistent connections can use a given TCP connection for more than one HTTP request, the index of an HTTP request within a TCP connection provides a useful heuristic for matching HTTP requests and responses. The next field keeps track of information extracted from each HTTP body: the timestamp of the first packet containing data and the filename containing the partially reassembled body (where appropriate). The remaining fields store the flow state once a partial list of packets has been processed. The state consists of a bit indicating whether the software is currently extracting an HTTP body, the size of the body extracted so far, and the next sequence number. This state is sufficient under the assumption that partial processing never ends within an HTTP header. To keep this invariant, partial processing is continued beyond the first 200 packets of the current list if an HTTP header is being reconstructed. (While HTTP headers can be spread among many packets, we have not yet found an HTTP header spread among more than 50 packets.) As soon as some complete piece of information, e.g., about a TCP event or an HTTP header, has been retrieved from the packets, it is written to the logfile.
This stage is used to age flows. It triggers the extraction stage for an IP-flow if either the list of its packets has grown beyond 300 packets or no packet has been received for this flow within the 10 minute timeout. Choosing the time intervals for scheduling the clean-up stage is complicated by the desire to balance processing overhead against memory. Currently, the clean-up stage is executed after processing 50,000 packets (another compile-time constant).
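The clean-up rule can be sketched as a simple predicate over the flow table (illustrative Python; the function shape is ours, the three constants are the compile-time values mentioned above):

```python
# Sketch: after every batch of packets, select flows whose buffered list has
# grown too long (partial extraction) or that have gone idle (final flush).
MAX_LIST = 300          # packets buffered before partial extraction
IDLE_TIMEOUT = 600      # seconds of inactivity before a flow is flushed
CLEANUP_EVERY = 50_000  # packets processed between clean-up passes

def flows_to_process(flows, now):
    """flows: {flow_id: (num_buffered_packets, last_seen_time)}."""
    full = {f for f, (n, _) in flows.items() if n > MAX_LIST}
    idle = {f for f, (_, t) in flows.items() if now - t > IDLE_TIMEOUT}
    return full, idle   # full flows: process a prefix; idle flows: process all

full, idle = flows_to_process({1: (450, 650), 2: (10, 100), 3: (5, 10)}, now=700)
assert full == {1} and idle == {3}
```

Running this pass only every 50,000 packets keeps its overhead off the fast path while bounding how long a flow's packets can sit in memory.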
Finding an appropriate logfile format for BLT is crucial, since the online processing paradigm makes it impossible to go back in time and augment the logfile with additional information. Another motivation for a detailed discussion of the logfile format is that it shows the breadth of information that is available by tracing across multiple protocol layers. Our choice of logfile format for BLT was guided by the following concerns:
We want to distill all essential information from the traces; this includes full HTTP header information, full IP/TCP header information, and TCP connection information.
Most Web performance studies do not need the full IP/TCP header information but they gain significantly from accurate timestamp information about HTTP events.
Web performance studies also gain from knowledge of lower layer events such as TCP connection establishment timestamps. These timestamps have a natural equivalent in the application and supply crucial timing information.
The HTTP header information may contain fields that reveal information about the source or destination of the HTTP request. Privacy concerns demand that one separates this information from IP address information.
To meet our goals the HTTP header extraction software splits the information into three different files:
- Packet header: IP/TCP packet headers of all observed traffic stored in tcpdump file format.
- Flow: information about every unidirectional IP flow.
- HTTP/TCP: TCP events and HTTP events; the logfile includes the raw text of every HTTP request/response.
In terms of size, the packet header logs are by far the largest; next are the HTTP/TCP logs, while the per-flow files are the smallest. By separating IP/TCP packet headers from HTTP level information we address the conciseness problem. Yet, by keeping strategic TCP events and HTTP events together with the HTTP header information, we ensure a level of completeness sufficient for most Web performance studies. In case this level of detail is insufficient, the packet header information is structured to allow an easy join of the datasets. This level of detail is sometimes necessary to verify assumptions and simplifications made using timing information available at the higher levels. For example, we used the packet level information to estimate the impact of slow start on the time savings yielded by first applying delta encoding or compression before transferring the data. By keeping TCP events and HTTP events in the same file, it becomes natural to consider cross protocol effects. We keep full HTTP header information since the HTTP protocol is still under development, subject to customization (e.g., cookies), and subject to use by other applications as their transport protocol. (Ignoring any such header can potentially lead to misleading numbers; e.g., ignoring cookies may lead to much higher cache hit rates for Web caching.) For privacy reasons it is necessary to separate the per-flow information from the HTTP header information, since the per-flow information contains encrypted IP addresses and the HTTP header may, e.g., contain the hostname of the contacted host. (There is not always a one-to-one correspondence between IP address and hostname.) Keeping the information separate lessens the privacy impact.
Table 1: Format of the event and header logfile
Table 2: Content of the HTTP header logfile
In general the file formats were chosen to facilitate easy processing by scripting languages such as awk and perl. The per-flow files contain the encrypted source and destination IP addresses and port numbers; each entry contains a unique identifier for the flow that is used to cross reference the HTTP/TCP logfile. This cross referencing is needed in order to match HTTP requests with HTTP responses.
The file format of the HTTP/TCP file is more complicated, in part because it needs to record different kinds of information, such as TCP events, HTTP events, and HTTP request/response headers. More importantly, the reconstruction procedure may emit information for a particular request at any time. We can identify the parts that are associated with the same HTTP transfer by taking advantage of the per-flow state. One can identify all TCP events associated with the TCP connection used by a particular HTTP transfer by locating all TCP events with the same flow identifier and flow count. A TCP connection identified via flow identifier and flow count is a persistent connection if it carries more than one HTTP request.
The file format of the HTTP/TCP file consists of two parts. The first part consists of the basic flow information, including the flow identifier, the number of TCP connections seen on this flow identifier, and the number of HTTP requests seen on this flow identifier (see Table 1). Each of these counters is initialized to zero. The second part consists of a string identifying what kind of record to expect, followed by the record-specific information. We distinguish four kinds of records: TCP, DATA, REQUEST, and HTTP headers.
TCP events are identified by the TCP flag they carry: SYN, FIN, RST. In addition, we differentiate between the first instance of such an event and additional instances. Most analyses are concerned only with when the first such event happened, yet others (e.g., those that track error conditions) care about repeated TCP signaling and therefore about repeated SYNs, FINs, and RSTs. By labeling them differently, both kinds are easier to find or eliminate. The record-specific information is just the timestamp of the packet carrying the TCP flag.
DATA events summarize the information about an HTTP body: the time of the first packet of the body, the time of the last packet of the body, the length of the body, and potentially the name of the file that contains the data. In addition, the record contains a flag that indicates whether BLT suspects that a missing packet may have created a gap in the data content.
REQUEST events and HTTP headers occur together. The former records whether BLT encountered a potential gap, together with the timestamps of the first and last packets contributing to this HTTP header. We delimit the raw text of the HTTP header fields with two ``random'' magic numbers to simplify post-processing of the log files. The HTTP header field starts with the magic number 0xa1b2c3d4 and ends with the magic number 0xb1b2c3d4 on a separate line. Between these two magic numbers we store the header length, the start and end sequence numbers, the start and end acknowledgment numbers, and the actual content of the HTTP header fields.
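The magic-number delimiters make the header records easy to recover in a post-processing script. The sketch below assumes, for illustration, that each delimiter appears on its own line and that the record contents are plain text lines; the exact layout of the real logfile may differ.

```python
START_MAGIC = "0xa1b2c3d4"
END_MAGIC = "0xb1b2c3d4"

def extract_header_blocks(lines):
    """Yield the raw lines stored between the two magic-number delimiters."""
    block, inside = [], False
    for line in lines:
        stripped = line.strip()
        if stripped == START_MAGIC:
            inside, block = True, []
        elif stripped == END_MAGIC and inside:
            inside = False
            yield block
        elif inside:
            block.append(line)

# Hypothetical logfile fragment: a TCP event followed by one header record.
sample = [
    "TCP SYN 870839085.884436",
    START_MAGIC,
    "285 3871952 3872237 68743 68743",  # length, seq range, ack numbers
    "GET /index.html HTTP/1.0",
    END_MAGIC,
]
blocks = list(extract_header_blocks(sample))
```

A scanner of this shape never has to parse HTTP syntax itself, which is exactly why the delimiters simplify post-processing.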
Table 3: Sample entry from a logfile
This entry means that for flow 211 and the first TCP connection on this flow, the first SYN was observed at time 870839085.884436. The first HTTP request on connection 1 of flow 211 started at time 870839086.513424 and ended at time 870839086.513424. The HTTP request header was 285 bytes in size, started at sequence number 3871952, and ended at sequence number 3872237. The acknowledgments were for sequence number 68743, and the actual text of the header is: HTTP/1.0 200 OK Date: Wed, 06 Aug 1997 03:40:57 GMT etc.
Figure 4: Timeline of a Web transfer
Most Internet service providers (ISPs) use hot-potato routing to hand traffic off to other ISPs as early as possible, creating many asymmetric routes in the Internet. Therefore, it is very unlikely that a packet monitor will see the HTTP response that is generated by an HTTP request unless the packet monitor is deployed close to either the Web clients or the Web servers. Close here means that there is exactly one path from the Web clients or the Web servers to the rest of the Internet and that the packet monitor is on this path. Since our goal is to be able to deploy BLT at any point in the network, we match HTTP requests with HTTP responses in a separate offline step. This has the additional advantage of reducing processing overheads on the monitoring machine itself. The timeline on the left of Figure 4 shows the basic steps in a Web transfer. In the simplest possible case each line corresponds to a single packet. For the purpose of matching HTTP requests and HTTP responses, note that the HTTP response is the first data sent back to the client and acknowledges the last byte of the HTTP request. Consequently, the sequence number of the last byte of the HTTP request should equal the acknowledgment number of the HTTP response. In addition, the first sequence number of the HTTP response should equal the acknowledgment number of the request. This reasoning holds even if the client and server use persistent TCP connections, as long as no HTTP requests are pipelined. If HTTP requests are pipelined (see the right timeline in Figure 4), detectable by finding more HTTP requests than HTTP responses during a time interval, the above equalities become inequalities. In this case we need additional information: the logfile contains the index of each HTTP request/response on a given TCP connection.
Missing HTTP requests/responses are detected by monitoring the inequalities on the sequence numbers, the acknowledgment numbers, and the timing information. Any inconsistencies are handled by adding/subtracting an offset to the request index number. The matching of requests with responses, both for non-pipelined as well as for pipelined requests/responses, uses the same information that we would have used if the matching had been done online. But for pipelined requests the matching may incorrectly match a request with a response. When matching requests and responses that were collected at different places in the network, one has to be especially careful with regard to the clock synchronization of the monitoring machines.
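The non-pipelined matching rule can be sketched directly from the two equalities above. The tuple layouts here are hypothetical stand-ins for the logfile fields: a request carries the sequence number of its last byte and the acknowledgment number it sent; a response carries its first sequence number and its acknowledgment number.

```python
def match_pairs(requests, responses):
    """Match requests to responses on one connection via seq/ack equality."""
    pairs = []
    for req_id, req_last_seq, req_ack in requests:
        for resp_id, resp_first_seq, resp_ack in responses:
            # The response acknowledges the last byte of the request, and its
            # first sequence number equals the request's acknowledgment number.
            if resp_ack == req_last_seq and resp_first_seq == req_ack:
                pairs.append((req_id, resp_id))
    return pairs

# Numbers taken from the sample entry in Table 3.
requests = [("req-1", 3872237, 68743)]
responses = [("resp-1", 68743, 3872237)]
pairs = match_pairs(requests, responses)
```

For pipelined requests these equalities become inequalities, so a real implementation would fall back on the per-connection request index recorded in the logfile.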
Besides matching the HTTP requests and responses, the HTTP header matching software (written in C) produces a second logfile that contains HTTP request/response pair information. The design of this second logfile format is significantly simpler than that of the original logfile format (Section 6) since it can be recomputed from the initial logfile. The choice of logfile format was guided by the following concerns:
- Index: There is a logfile entry for every HTTP request/response pair. Therefore the request/response pair provides a good index for this log file.
- Completeness: Relevant information about each HTTP request/response pair should be included. This includes timing information from both the HTTP as well as the TCP events.
- Simplicity: Not every program using this logfile should need to parse the full HTTP request and response headers.
- Privacy: Customer privacy needs to be protected.
To meet these goals, the traces that are extracted in an online fashion are processed on a file-by-file basis. (Request/response pairs that span more than one file are not matched.) While processing a file, the software creates an index of all events and parses the HTTP request and response headers. Any HTTP header that is questionable (e.g., because of a missing packet or a mis-parsing of the HTTP header pattern) is rejected. While parsing the HTTP headers, the presence of certain header fields and their values is noted and stored. Once all events have been processed, the requests and responses are matched. Next, information about the associated TCP events and DATA transfers is added to the records and a log entry is written.
A logfile entry consists of information that describes the events, plus entries that make post-processing simpler, e.g., a unique index for each HTTP request/response pair. To be able to sort all requests in the order in which they were issued, the first element of the log entry is the timestamp of the first packet of the HTTP request. A flag field is added to mark those request/response pairs whose transfers were affected by a packet loss. The per-HTTP-request information includes the type of request (one of GET, HEAD, POST), the URL, the referrer field, and the size of the HTTP request header. The per-HTTP-response information includes the response timestamp, the response code, the number of bytes in the response header, the number of bytes of data that were received, the content length and type from the HTTP response header, and, where appropriate, the name of the file that stores the reassembled content. Additional timestamps are the SYN/FIN/RST timestamps from the sender and the receiver, the timestamp of the first DATA packet, and the timestamp of the last DATA packet. The header fields that are extracted from the HTTP headers include pragma, cache, authorization, authentication, refresh, cookie, set-cookie, expire time, if-modified-since, last-modified, and cache-last-checked directives.
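A subset of this per-pair record can be modeled as below. The field names are illustrative assumptions (the actual logfile is a flat text record, and several fields such as the SYN/FIN/RST timestamps are omitted here for brevity); the point is that leading with the request timestamp makes issue-order sorting trivial.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PairEntry:
    request_time: float             # timestamp of first packet of the request
    loss_flag: bool                 # transfer affected by packet loss?
    method: str                     # GET, HEAD, or POST
    url: str
    referrer: Optional[str]
    request_header_bytes: int
    response_time: float
    response_code: int
    response_header_bytes: int
    data_bytes: int
    content_length: Optional[int]
    content_type: Optional[str]

entries = [
    PairEntry(870839087.1, False, "GET", "/b.html", None, 290, 870839087.4,
              200, 210, 5120, 5120, "text/html"),
    PairEntry(870839086.5, False, "GET", "/a.html", None, 285, 870839086.9,
              200, 200, 1024, 1024, "text/html"),
]
# Because the request timestamp is the first element of each entry,
# sorting on it recovers the order in which requests were issued.
entries.sort(key=lambda e: e.request_time)
```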
A simple indexing scheme for our logfile uses the unique flow identifier and the timestamp of the request. Studies [16,29,14] have shown that the logfile contains the relevant timing and HTTP header information and as such is fairly complete, yet also concise and simple. Privacy is achieved by eliminating any reference to even the scrambled IP addresses. The flow identifier still allows the identification of requests involving the same source/destination IP addresses.
Data collected by BLT and its predecessors have been used for various Web performance studies. In this section we point out which part of BLT enabled each study.
- Benefits of compression and delta encoding:
- This study was possible because BLT is capable of extracting the full content of the HTTP body plus detailed timestamp information. We needed the HTTP bodies to evaluate (1) what percentage of all bodies is compressible and to what degree, and (2) to what percentage of bodies delta encoding can be applied and to what degree. We used the bodies of the HTTP responses to estimate the byte savings achieved by delta encoding and compression, and we used the detailed timestamp information (HTTP request time, HTTP response time, timestamps of the first and last data packets) to evaluate the latency savings that could be gained by deploying compression and delta encoding. In addition, we used the TCP/IP timestamp information to evaluate the impact of neglecting slow start and per-packet dynamics.
- Rate of change:
- Besides providing us with a precise timestamp and a checksum of the HTTP body associated with each HTTP request, BLT extracts information from the HTTP headers such as the last-modified timestamp, the age of the resource, etc. This data was used to evaluate the frequency with which different resources are requested, how often they change, what the distribution of Web page ages is, etc.
- Policies for Web traffic over flow-switched networks:
- The main emphasis of this paper is the analysis of the TCP/IP traces. Still, the availability of the cross-level traces from BLT enabled us to correlate the statistical results about IP-flow distributions with the actual events in the network, such as the number of requests with response code 304. This allowed us to explain the observed phenomena in terms of an application and to identify traffic invariants. Datasets extracted by BLT were used in a similar way to reason about traffic invariants in scaling phenomena.
- Performance of Web proxy caching:
- This study considers the impact of cookies, aborted connections, and persistent connections on the performance of Web caching in terms of memory usage and latency savings. It would not have been possible without the level of detail provided by BLT about TCP as well as HTTP events.
- Piggyback cache validation/server invalidation:
- Both of these studies address the problem of maintaining cache coherency for proxy caches. At AT&T Labs-Research nobody is required to use a proxy to browse the Web. Since modifying the clients was not a possibility, BLT provided the only way to extract the Web client traces necessary to perform Web proxy studies at AT&T Labs-Research. The studies took advantage of the time-of-request field, the last-modified time, the status information, the size information, and the flow identifier that identifies the client and the server.
In addition to the studies mentioned above, the data collected by BLT has been used to derive various statistics about the popularity of Web sites, the usage of HTTP header fields, and the behavior of consumers and researchers browsing the Web.
Yet another use for BLT involves augmenting active measurements with passive measurements, e.g., to measure the performance of retrieving a Web page from a Web server. The level of detail available via BLT allows us to distinguish DNS delay from TCP connection setup delay, from the delay to process the HTTP request, and from the delay to send the data. The biggest benefit of using BLT to augment active measurements is that one does not need a specially instrumented client. Rather, one can use a standard Web client such as Netscape and control it via its remote-control features. This approach enables one to separate delays due to rendering at the client from delays due to the network or the Web server.
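The delay decomposition described above can be sketched from per-download timestamps. The timestamp names and values below are hypothetical; BLT's actual output format differs, but the arithmetic is the same.

```python
# Hypothetical per-download timestamps (seconds, relative to the first event).
timestamps = {
    "dns_start": 0.000, "dns_done": 0.040,   # DNS lookup
    "syn": 0.041, "syn_ack": 0.101,          # TCP connection setup
    "request": 0.102, "first_data": 0.202,   # server processing the request
    "last_data": 0.702,                      # data transfer
}

# Decompose the total download time into the four components named in the text.
delays = {
    "dns": timestamps["dns_done"] - timestamps["dns_start"],
    "tcp_setup": timestamps["syn_ack"] - timestamps["syn"],
    "server": timestamps["first_data"] - timestamps["request"],
    "transfer": timestamps["last_data"] - timestamps["first_data"],
}
```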
BLT has been designed to allow continuous collection of real-world traces at many different locations in the Internet. It has been used to collect several months of real-world traces from AT&T WorldNet, a consumer ISP, and from AT&T Labs-Research in Florham Park. BLT is unique in giving us access to HTTP and TCP level traces at the same time. The collected datasets are novel (1) in the degree of detailed information they provide, (2) in presenting us with a client-side view of the Web, and (3) in the duration of the traces. The latter both challenges and benefits any analysis driven by data collected by BLT. Without datasets such as those collected by BLT, one can only speculate about the Web or construct artificial datasets with all their pitfalls. The richness of the datasets and their completeness have motivated and enabled several studies.
The most important lesson we learned from writing BLT is: expect the unexpected, and respect the challenges to HTTP header reconstruction discussed in Section 3. Once data about multiple layers of the networking stack is available, it provides a playground for many analyses. To avoid preempting these analyses it is important to create well-documented, precise, yet complete logfiles. (E.g., include the full HTTP headers.) In extracting the information it is crucial to avoid assumptions about how well-behaved the clients, the servers, or the network might be. They are not. Other common lessons from the implementation include: don't try to do too much processing in the time-critical steps of the logfile extraction; simplify wherever sensible and reasonable; reduce memory use and disk I/O. But in the end the most crucial lesson was to never expect a perfect logfile. There will always be one more exception or one more misbehaved client/server. Therefore, the matching software and any analysis program should test whatever assumptions the data has to satisfy and eliminate any data that violates them. With enough care, the number of requests discarded by each step is small.
It is currently possible to monitor links of up to 100 Mbit/s using off-the-shelf computer components. As link speeds grow, the memory and CPU performance of these systems become bottlenecks. In this case the processing of the data could be pushed closer to the link, e.g., onto the interface cards. Alternatively, one could develop special-purpose hardware or restrict the observed traffic to a specific subset of interest. There are two options for the latter approach: select a subset and perform the same computation, or select all traffic and perform a simpler computation that approximates the full one. The experience collected with tools like BLT is crucial for judging the quality of the resulting datasets.
The software needs to undergo continuous evolution. Even as we are outlining the current design of BLT, the next generation is being developed. The new version incorporates, among others, the following significant improvements: (1) There is no notion of files, and request/response pairs will be properly matched. (2) It is not necessary to parse the data content, since the new tool can determine the length of the HTTP content from the HTTP header information unless an RST is encountered. (3) This enables a direct split of the HTTP content from the HTTP header information and has the potential to reduce the overhead of protocol information extraction significantly. (4) The linked list of packets is replaced with a modified splay tree routine that automatically accounts for retransmitted packets and/or gaps. Another avenue of future work is to extend the protocol awareness to other protocols such as RTSP. Such protocols add the complication of using dynamically assigned UDP ports for exchanging media data. mmdump is a tool that allows users to monitor such multimedia traffic.
I acknowledge all my colleagues at AT&T Labs who are involved in the measurement effort and their help in developing the software architecture. Special thanks go to A. Greenberg, R. Caceres, N. Duffield, P. Mishra, C. Kalmanek, K.K. Ramakrishnan, and J. Rexford. Many thanks to everyone in WorldNet who made the deployment of the PacketScopes possible.
I am very grateful to J. Rexford and B. Krishnamurthy for many discussions and constructive criticism on the presentation of the material. Many thanks to G. Glass for writing the HTTP header matching software.
- 1 N. Anerousis, R. Cáceres, N. Duffield, A. Feldmann, A. Greenberg, C. Kalmanek, P. Mishra, K.K. Ramakrishnan, and J. Rexford. Using the AT&T Labs PacketScope for Internet Measurement, Design, and Performance Analysis. AT&T Labs-Research Internal TM, 1997.
- 2 G. Apostolopoulos, V. Peris, P. Pradhan, and D. Saha. A self-learning layer 5 switch, 1999. IBM Research Report.
- 3 M.F. Arlitt and C.L. Williamson. Internet Web servers: Workload characterization and implications. In IEEE/ACM Trans. Networking, 5(5):631-644, October 1997.
- 4 M. Aron, P. Druschel, and W. Zwaenepoel. Efficient support for P-HTTP in cluster-based Web servers. In Proceedings of the USENIX 1999 Annual Technical Conference, 1999.
- 5 H. Balakrishnan, V.N. Padmanabhan, S. Seshan, M. Stemm, and R.H. Katz. TCP behavior of a busy Internet server: Analysis and improvements. In Proc. IEEE INFOCOM, April 1998.
- 6 P. Barford, A. Bestavros, A. Bradley, and M.E. Crovella. Changes in Web client access patterns: Characteristics and caching implications. World Wide Web, Special Issue on Characterization and Performance Evaluation, 1999.
- 7 P. Barford and M.E. Crovella. Generating representative Web workloads for network and server performance evaluation. In Proc. ACM SIGMETRICS, June 1998.
- 8 R. Caceres, F. Douglis, A. Feldmann, G. Glass, and M. Rabinovich. Web proxy caching: The devil is in the details. In Proc. Workshop on Internet Server Performance, June 1998.
- 9 R. Caceres, C.J. Sreenan, and J.E. van der Merwe. mmdump - A tool for monitoring multimedia usage on the internet, 1999.
- 10 K.C. Claffy, H.-W. Braun, and G.C. Polyzos. A parameterizable methodology for Internet traffic flow profiling. IEEE Journal on Selected Areas in Communications, 13(8):1481-1494, October 1995.
- 11 E. Cohen, B. Krishnamurthy, and J. Rexford. Improving end-to-end performance of the Web using server volumes and proxy filters. In Proceedings of ACM SIGCOMM, September 1998.
- 12 M.E. Crovella and A. Bestavros. Self-similarity in World Wide Web traffic: Evidence and causes. In Proc. ACM SIGMETRICS, pages 160-169, May 1996.
- 13 P.B. Danzig, S. Jamin, R. Cáceres, D.J. Mitzel, and D. Estrin. An empirical workload model for driving wide-area TCP/IP network simulations. Internetworking: Research and Experience, 3(1):1-26, 1992.
- 14 F. Douglis, A. Feldmann, B. Krishnamurthy, and J.C. Mogul. Rate of change and other metrics: A live study of the World Wide Web. In Proc. USENIX Symp. on Internet Technologies and Systems, pages 147-158, December 1997.
- 15 A. Feldmann. Popularity of HTTP header fields, December 1998. www.research.att.com/~anja/w3c_webchar/.
- 16 A. Feldmann, R. Caceres, F. Douglis, G. Glass, and M. Rabinovich. Performance of Web proxy caching in heterogeneous bandwidth environments. In Proc. IEEE INFOCOM, 1999.
- 17 A. Feldmann, A.C. Gilbert, W. Willinger, and T.G. Kurtz. The changing nature of network traffic: Scaling phenomena. ACM Computer Communication Review, 28(2), April 1998.
- 18 A. Feldmann, J. Rexford, and R. Caceres. Reducing overhead in flow-switched networks: An empirical study of Web traffic. In Proc. IEEE INFOCOM, April 1998.
- 19 S.D. Gribble. System design issues for internet middleware services: Deductions from a large client trace. Master's thesis, U.C. Berkeley, 1997.
- 20 S.D. Gribble and E.A. Brewer. System design issues for Internet middleware services: Deductions from a large client trace. In Proc. USENIX Symp. on Internet Technologies and Systems, December 1997.
- 21 V. Jacobson, C. Leres, and S. McCanne. tcpdump, available at ftp://ftp.ee.lbl.gov, June 1989.
- 22 B. Krishnamurthy and M. Arlitt. Pro-cow: Protocol compliance on the Web, July 1999. Submitted.
- 24 B. Krishnamurthy and C.E. Wills. Study of piggyback cache validation for proxy caches in the World Wide Web. In Proc. USENIX Symp. on Internet Technologies and Systems, pages 1-12, December 1997.
- 25 B. Krishnamurthy and C.E. Wills. Piggyback server invalidation for proxy cache coherency. In Proc. World Wide Web Conference, April 1998.
- 26 T.M. Kroeger, D.E. Long, and J.C. Mogul. Exploring the bounds of Web latency reduction from caching and prefetching. In Proc. USENIX Symp. on Internet Technologies and Systems, pages 13-22, December 1997.
- 27 G.R. Mallan and F. Jahanian. An extensible probe architecture for network protocol performance measurement. In Proceedings of ACM SIGCOMM, 1999.
- 28 S. Manley and M. Seltzer. Web facts and fantasy. In Proc. USENIX Symp. on Internet Technologies and Systems, pages 125-133, December 1997.
- 29 J.C. Mogul, F. Douglis, A. Feldmann, and B. Krishnamurthy. Potential benefits of delta encoding and data compression for HTTP. In Proc. ACM SIGCOMM, pages 181-194, September 1997.
- 30 J.C. Mogul and K.K. Ramakrishnan. Eliminating receive livelock in an interrupt-driven kernel. In Proceedings of Winter 1996 USENIX Conference, USENIX Association, January 1996.
- 31 V.N. Padmanabhan and J.C. Mogul. Improving HTTP latency. Computer Networks and ISDN Systems, 28(1/2):25-35, December 1995.
- 32 V. Paxson and S. Floyd. Wide-area traffic: The failure of Poisson modeling. In IEEE/ACM Trans. Networking, 3(3):226-255, June 1995.
- 33 K. Thompson, G.J. Miller, and R. Wilder. Wide-area internet traffic patterns and characteristics. IEEE Network Magazine, 11(6):10-23, November/December 1997.
- 34 L. Wall and R.L. Schwartz. Programming perl, O'Reilly & Associates, Inc., 1991.
- 35 R. Wooster, S. Williams, and P. Brooks. HTTPDUMP: a network HTTP packet snooper. 1996.
- 36 R. Wooster, S. Williams, and P. Brooks. HTTPDUMP. http://www.cs.vt.edu/~chitra/httpdump/, 1998.
1 An earlier version of this paper was presented as a position paper at the W3C Web Characterisation Workshop, Nov 1998, Cambridge, Massachusetts.