To understand the hit inflation problem, we must first understand how a legitimate click-through appears in HTTP messages. We begin with the simple case of a click-through program run directly by a target site for its referrers. The case of a third-party click-through program provider is discussed subsequently.
Let R denote a referring site, T denote the target site, and U denote a user's Web browser. A click-through begins when U retrieves a Web page pageR.html from R that contains a hypertext link to a page pageT.html on site T (see Figure 1). When the user clicks on that link, the user's browser issues a request to site T for pageT.html. An important component of this request is the Referer header of the HTTP request for pageT.html. This header is set by the user's browser and names the Web page that "referred" the user to pageT.html, in this case pageR.html. T uses this Referer header to record the URL of the page that referred the user to pageT.html, along with the IP address of U. T then returns pageT.html to U for display in the browser.
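The bookkeeping that T performs on each request can be sketched as follows. This is only an illustrative sketch, not the logic of any particular click-through program: the function name `handle_request`, the in-memory `log`, and the host name `r.example` are assumptions introduced here.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory log of (referring page URL, client IP address) pairs.
ClickLog = List[Tuple[str, str]]


def handle_request(path: str, headers: Dict[str, str],
                   client_ip: str, log: ClickLog) -> str:
    """Serve pageT.html, recording the Referer header and the client's IP."""
    referer = headers.get("Referer")
    # Record a click-through only when the browser named a referring page.
    if path == "/pageT.html" and referer is not None:
        log.append((referer, client_ip))
    # T then returns the page to U for display (placeholder content).
    return "<html>contents of pageT.html</html>"


log: ClickLog = []
handle_request("/pageT.html",
               {"Referer": "http://r.example/pageR.html"},
               "192.0.2.7", log)
```

After this call, `log` holds one entry pairing pageR.html's URL with U's IP address.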
In a click-through payment program, T periodically pays R some previously agreed-upon amount for each click-through from R to T. The fact that T pays for click-throughs gives R an incentive to mount hit inflation attacks on T, in which R somehow causes T's record of click-throughs from R to exceed the correct number. Here we do not define precisely what the "correct number" is. Rather, we simply characterize a hit inflation attack as one in which T receives a request for pageT.html with a Referer header naming pageR.html when no corresponding Web user clicked to pageT.html after viewing pageR.html. For example, a straightforward attempt to inflate R's click-through count is for the webmaster of R to run a program that repeatedly sends requests of the appropriate form to T. This attempt gains little, however, because most click-through programs pay only for "unique" referrals, i.e., click-throughs from users with different IP addresses: multiple click-throughs where the user is at the same site are counted as only one click-through for payment purposes. As an aside, we remark that counting unique IP addresses is becoming increasingly ineffective, as more user requests are directed through proxy servers, either due to the default configuration of the user's ISP (e.g., 99% of AOL subscribers) or to enhance user privacy.
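The "unique referrals" rule can be illustrated with a short sketch. It assumes that T's records are available as (referring page, client IP) pairs; the function name `count_unique_referrals` and the sample addresses are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def count_unique_referrals(log: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count, for each referring page, the number of distinct client IPs."""
    ips_by_referrer: Dict[str, Set[str]] = defaultdict(set)
    for referer, client_ip in log:
        ips_by_referrer[referer].add(client_ip)
    return {referer: len(ips) for referer, ips in ips_by_referrer.items()}


# A script run by R's webmaster from one machine sends 1000 requests,
# while two genuine users click through once each.
records = [("http://r.example/pageR.html", "192.0.2.1")] * 1000
records += [("http://r.example/pageR.html", "192.0.2.2"),
            ("http://r.example/pageR.html", "192.0.2.3")]
payable = count_unique_referrals(records)
```

Here the 1000 repeated requests contribute a single payable click-through. The same logic also shows the weakness noted above: many distinct users behind one proxy address would likewise collapse into one.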
A sophisticated attacker could issue multiple requests to T with forged IP source addresses, thereby circumventing the unique referrals rule. However, this requires a further degree of technical sophistication and effort on the attacker's part (see, e.g., ). Moreover, these attacks can be detected by T, because in all likelihood no browser will receive the response from T. So, for example, if pageT.html is constructed with links to images or other HTML elements that a browser would immediately retrieve upon interpreting pageT.html, then a request for pageT.html with a forged IP source address will not be followed by requests for the HTML elements contained in pageT.html. If it is feared that the attacker will go one step further and issue these follow-up requests predictively to avoid detection, then T can dynamically generate pageT.html each time with links to different URLs (in the limit, containing a nonce in the URL), thereby foiling any attempt by the attacker to predict the URLs to request. The end result is that requests with forged IP addresses will stand out to T as those for which correct follow-up requests were not received. Moreover, the perpetrator of this attack will be revealed by the Referer field of these requests, as this field must name the referrer that is trying to inflate its hits.
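A minimal sketch of the nonce-based check described above follows. The class `FollowUpTracker`, the `/img/<nonce>.gif` URL layout, and the sample addresses are assumptions made for illustration; a real deployment would also need a timeout before treating a missing follow-up as suspicious.

```python
import secrets
from typing import Dict, List, Tuple


class FollowUpTracker:
    """Track which requests for pageT.html were followed by the embedded fetch."""

    def __init__(self) -> None:
        # nonce -> (referer, client_ip) of a page request awaiting its follow-up
        self.pending: Dict[str, Tuple[str, str]] = {}

    def serve_page(self, referer: str, client_ip: str) -> str:
        """Generate pageT.html with an image URL containing a fresh nonce."""
        nonce = secrets.token_hex(16)
        self.pending[nonce] = (referer, client_ip)
        return f'<html><img src="/img/{nonce}.gif"></html>'

    def serve_image(self, path: str) -> None:
        """Handle a follow-up request such as /img/<nonce>.gif."""
        nonce = path.rsplit("/", 1)[-1].split(".")[0]
        self.pending.pop(nonce, None)

    def unconfirmed(self) -> List[Tuple[str, str]]:
        """Return page requests for which no correct follow-up has arrived."""
        return list(self.pending.values())


tracker = FollowUpTracker()
# A genuine browser receives the page and fetches the embedded image.
page = tracker.serve_page("http://r.example/pageR.html", "192.0.2.10")
image_path = page.split('src="')[1].split('"')[0]
tracker.serve_image(image_path)
# A forged-source request never sees the page, so it cannot learn the nonce.
tracker.serve_page("http://r.example/pageR.html", "198.51.100.99")
```

Only the forged request remains in `tracker.unconfirmed()`, and its recorded Referer names the referrer that stands to benefit.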
Because of the difficulty and detectability of IP address forgery attacks, probably the most common form of hit inflation today is one in which the referrer R forces the user to visit the target T by constructing pageR.html so as to automatically "click" the user to pageT.html (e.g., see ). This simulated click can be accomplished using constructs that will also play a role in our attacks; we thus defer an explanation of these techniques to Section 3. The simulated click can be visible to the user, in which case the user will see, e.g., a new window containing pageT.html pop up unsolicited on the screen. Alternatively, the window can be hidden from the user (e.g., behind the window containing pageR.html), so that the user is unaware of being "used" by R to gain payment from T. Regardless of whether this hit inflation is visible to the user, these attacks can still be detected by T if the webmaster of T periodically visits the Web pages of the referrers that she pays (preferably from a machine outside her own domain, to avoid detection by the referrer). By inspecting the constructions in those Web pages, and observing the behavior of these pages when interpreted by her browser, the webmaster of T can detect that hit inflation is occurring. Indeed, this examination could even be automated, as it suffices to detect whether the referrer's page, when interpreted, automatically causes a request to T's site.
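As a rough illustration of how such an automated examination might start, the sketch below statically scans a referrer's page for elements that a browser would fetch without any user action and reports those that point at T's host. The set of tags checked, the class name `AutoRequestFinder`, and the host names are assumptions made here, not taken from the paper. A static scan of this kind cannot see requests triggered by scripts, which would require actually interpreting the page.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin, urlparse


class AutoRequestFinder(HTMLParser):
    """Collect URLs on the target host that the page loads without a click."""

    # Assumed set of tags whose src attribute a browser fetches automatically.
    AUTO_SRC_TAGS = {"img", "iframe", "frame", "script", "embed"}

    def __init__(self, base_url: str, target_host: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.target_host = target_host.lower()
        self.hits: List[str] = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        url = None
        if tag in self.AUTO_SRC_TAGS:
            url = attr.get("src")
        elif tag == "meta" and (attr.get("http-equiv") or "").lower() == "refresh":
            # The content attribute looks like "0; url=http://host/page".
            _, _, rest = (attr.get("content") or "").partition(";")
            key, _, value = rest.strip().partition("=")
            if key.strip().lower() == "url":
                url = value.strip()
        if url:
            absolute = urljoin(self.base_url, url)
            if urlparse(absolute).hostname == self.target_host:
                self.hits.append(absolute)


def auto_requests_to_target(html: str, base_url: str, target_host: str) -> List[str]:
    finder = AutoRequestFinder(base_url, target_host)
    finder.feed(html)
    finder.close()
    return finder.hits


honest = '<a href="http://t.example/pageT.html">Visit T</a>'
inflating = honest + '<iframe src="http://t.example/pageT.html"></iframe>'
base = "http://r.example/pageR.html"
honest_hits = auto_requests_to_target(honest, base, "t.example")
inflating_hits = auto_requests_to_target(inflating, base, "t.example")
```

An ordinary hypertext link is not flagged, since it produces a request only when the user clicks it; the second page is flagged because it fetches pageT.html as soon as it is rendered.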
There are numerous variations on click-through programs as described above. In particular, in a program run by a third-party provider, the interaction differs from the above description in that the third party takes the place of T. The third party records the click-through and then redirects the request to the actual target site. Another variation is that some click-through programs do not make use of the HTTP Referer header, but rather simply have each referrer refer to a different URL on the target site. This approach has the advantage of not relying on the Referer field being set correctly, and thus works in conjunction with privacy-enhancing tools that eliminate the Referer field from the HTTP header. However, this approach exposes the click-through program to additional risks: in particular, the referrer's webmaster can broadcast-email ("spam") the referrer's banner ad to increase the referrer's click-through count. Thus, most click-through programs of this form explicitly prohibit spamming to increase click-throughs, and will cancel the referrer's account if the referrer is detected doing so.
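Both variations can be sketched briefly. The function names, the `/from/...` URL layout, the account names, and the use of a 302 redirect are illustrative assumptions; actual programs may arrange these details differently.

```python
from typing import Dict, List, Optional, Tuple

ClickLog = List[Tuple[str, str]]

# Hypothetical per-referrer URLs on the target site, one for each referrer.
REFERRER_BY_PATH: Dict[str, str] = {
    "/from/r1/pageT.html": "R1",
    "/from/r2/pageT.html": "R2",
}


def third_party_click(referer: str, client_ip: str, log: ClickLog,
                      target_url: str) -> Tuple[int, Dict[str, str]]:
    """Third-party variant: record the click, then redirect to the real target."""
    log.append((referer, client_ip))
    return 302, {"Location": target_url}


def credit_by_url(path: str, client_ip: str, log: ClickLog) -> Optional[str]:
    """Per-referrer URL variant: the requested path alone identifies the referrer."""
    referrer = REFERRER_BY_PATH.get(path)
    if referrer is not None:
        log.append((referrer, client_ip))
    return referrer


provider_log: ClickLog = []
status, headers = third_party_click("http://r.example/pageR.html", "192.0.2.5",
                                    provider_log, "http://t.example/pageT.html")

target_log: ClickLog = []
credited = credit_by_url("/from/r1/pageT.html", "192.0.2.6", target_log)
```

In the second variant no Referer header is consulted, which is why a request arriving from a link in an e-mail message is credited just like one arriving from the referrer's Web page.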
None of these variations deters the attack we present in Section 3. On the contrary, if the Referer header is not used by the target site, then our attack becomes easier, as will be discussed in Section 3.
Mike Reiter