
Discussion

 

The attack detailed in this section is effective even if a third-party click-through program provider is used. In this case, T is the third-party provider and not the actual target site, but this distinction has no bearing on the mechanism behind our attack. Another difference is that third-party programs often do not make use of the Referer header for identifying the referrer, but rather simply use a different URL per referrer. In this case, however, our attack just becomes easier since there may be less of a need to retain the correct Referer header when performing simulated clicks.
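The two ways of identifying a referrer mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the function name `identify_referrer`, the query parameter `ref`, and the example host names are hypothetical and are not taken from any particular click-through program.

```python
from urllib.parse import urlparse, parse_qs


def identify_referrer(request_url, headers, per_referrer_urls=True):
    """Return the referrer to credit for a click, or None if unknown."""
    if per_referrer_urls:
        # Third-party style: the referrer's identity is part of the URL.
        query = parse_qs(urlparse(request_url).query)
        ids = query.get("ref")  # "ref" is a hypothetical parameter name
        return ids[0] if ids else None
    # Referer-header style: credit the host named in the Referer header.
    referer = headers.get("Referer")
    return urlparse(referer).hostname if referer else None
```

In the per-referrer-URL case the Referer header plays no role in crediting the click, which is why a simulated click need not preserve it.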

Our attack has other implications. As mentioned in Section 2, most click-through programs are not agreeable to the use of spamming by a referrer to increase click-through counts, and in fact, many click-through programs explicitly prohibit the use of spamming in their contracts with referrers. Our attack, however, makes target sites susceptible to "indirect" spamming that is hard to detect: A spammer (an agent of S) can drive a large number of users to S, triggering the inflation attack. The lack of an obvious relationship between R and the spammer or S makes it difficult for the webmaster of T to detect this practice.

Many click-through programs desire "high quality" referrals, i.e., referrer sites with a targeted audience (e.g., technology oriented sites). Our attack enables a referrer site R with appropriate content to register in the click-through program, while using a different site S with completely different content to attract the click-throughs. Furthermore, many click-through programs disallow referrers with illicit material, regardless of their popularity. Our attack enables referrers R to use such sites to draw users and register click-throughs for R at the target.

To see the potential for profit from this attack, consider that the average click-through rate for banner ads is 1-3%, and that payments for click-throughs are calculated accordingly. Our attack can yield an effective rate of almost 100% for users who visit pageS.html and thus (unknowingly) click through pageR.html to pageT.html. We can go a step further and use S in conjunction with several (say 10) sites R1,...,R10 that are enrolled in different click-through programs, and thereby get an effective click-through rate of 1000%, i.e., ten paid click-throughs per visit to pageS.html. This is undetectable as long as the different target sites do not compare the IP addresses from which they receive clicks at the same time. (Thus, this multi-target attack might be impossible with target sites that are on the same third-party click-through program.)
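The comparison of IP addresses across target sites could look roughly like the sketch below, which is the kind of check a shared third-party provider would be in a position to run. The log format (per-target lists of IP address and timestamp), the function name `coincident_clicks`, and the 5-second window are assumptions chosen for illustration.

```python
from collections import defaultdict


def coincident_clicks(logs, window=5.0):
    """Find IPs that click through to two or more targets at about the same time.

    logs: dict mapping target name -> list of (ip, timestamp_in_seconds).
    Returns a dict mapping each flagged IP to the set of targets involved.
    """
    by_ip = defaultdict(list)
    for target, clicks in logs.items():
        for ip, ts in clicks:
            by_ip[ip].append((ts, target))

    flagged = {}
    for ip, events in by_ip.items():
        events.sort()
        for i, (ts_i, target_i) in enumerate(events):
            group = {target_i}
            # Collect later clicks from this IP that fall inside the window.
            for ts_j, target_j in events[i + 1:]:
                if ts_j - ts_i > window:
                    break
                group.add(target_j)
            if len(group) >= 2:
                flagged.setdefault(ip, set()).update(group)
    return flagged
```

A flagged IP is not proof of the attack (a proxy shared by many users can produce the same pattern), so the result is better read as a lead for further investigation.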

An attacker might draw suspicion if the target site T monitors the click-through rate (CTR) of its ads. The target can monitor the CTR if R's page is required to load the ads from a site that is controlled by the target. A high click-through rate (say greater than 5%) is likely to attract the attention of the target's webmaster, if only to learn the marketing practices of the referrer. The attacker can prevent such inquiries by keeping the CTR low. One way to achieve this is to register site R with, say, 20 different targets. Whenever R receives a request with a Referer field naming pageS.html, it returns a page containing ads for each of the targets, and performs a simulated click on one of these ads at random. The attacker is paid for 100% of the visits to S, while the CTR seen by each target stays at about 5% (one click per 20 ad impressions). This method can of course be extended to achieve lower CTR or higher payment rates.
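The arithmetic behind this trade-off can be checked with a small simulation. With n targets and exactly one click registered per visit, chosen uniformly at random, each target sees an expected CTR of 1/n (5% for n = 20), while the overall payment rate is one paid click per visit. The function name and parameters below are illustrative, and the model assumes every visit loads one ad from every target.

```python
import random


def simulate_ctr(visits, num_targets, seed=0):
    """Return (payment_rate, per_target_ctr) for the model described above.

    Each visit produces one ad impression per target and one click on a
    single target chosen uniformly at random.
    """
    rng = random.Random(seed)
    clicks = [0] * num_targets
    for _ in range(visits):
        clicks[rng.randrange(num_targets)] += 1
    payment_rate = sum(clicks) / visits          # paid clicks per visit
    per_target_ctr = [c / visits for c in clicks]  # clicks per impression
    return payment_rate, per_target_ctr


payment_rate, ctrs = simulate_ctr(visits=100_000, num_targets=20)
print(payment_rate, max(ctrs))
```

Under this model, a target that flags referrers only above a fixed CTR threshold can always be undercut by enrolling more targets, which suggests that CTR monitoring alone is a weak defense.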

Another way for the target site T to detect the attack is to search for pages that have links to pageR.html, in an effort to find pageS.html. A simple approach would be to use existing search engines to find pages that refer to pageR.html. However, S can easily avoid detection by serving a different, benign version of pageS.html to spiders of search engines. A second approach that T can try is to perform the search for pages like pageS.html itself, using a spider. This reconnaissance operation is of almost the same scale as building a search engine, and can be complex and costly. Moreover, R and S can extend the attack in a natural way to use a chain of three or more simulated clicks, e.g., from some S' to S to R to T. This further complicates efforts to "trace backward" along the chain to find the page that initiates the attack.

Probably the most viable way of detecting the attack is for T to monitor user activity (e.g., mouse movement, mouse clicks, or filling out a form) on T's site. A real user will typically either click further into the site or leave the site immediately. The former is easily detectable and confirms the existence of a real user. To detect the latter case, pageT.html could be constructed to include a "back" button that both returns the user to the referrer page and informs T that the user clicked on this button. However, this does not capture the cases in which a user next directs her browser to a bookmarked location, uses the browser's own "Back" button to leave T's site, or closes the window containing pageT.html. Similarly, pageT.html could be constructed with JavaScript code to inform T of mouse movement over pageT.html, or to inform T of the length of time that the page was active in the browser (e.g., by causing a message to be sent to T every few seconds). The latter offers little information to T if pageR.html closes the window containing pageT.html after a random amount of time. The former, i.e., detecting mouse movement over pageT.html, possibly lets T confirm that a user sees the page (if the user moves the mouse over it). However, again it does not enable T to determine that a user did not see the page. In the limit, T could occasionally serve a version of pageT.html that contains a newly generated question for the user to answer (and perhaps offers a financial incentive to do so), to see if a user responds.
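On T's side, the signals discussed above might be combined as in the following sketch. The event names (`internal_click`, `back_button`, `mousemove`, `form_submit`, `answer`, `heartbeat`) are hypothetical labels for whatever pageT.html reports back; they are not part of any existing system. Note the asymmetry the text describes: some events confirm a real user, but their absence confirms nothing.

```python
# Hypothetical event labels that indicate a real user was present.
CONFIRMING_EVENTS = {
    "internal_click",  # user clicked further into T's site
    "back_button",     # user pressed the instrumented "back" button on pageT.html
    "mousemove",       # mouse moved over pageT.html
    "form_submit",     # user filled out a form
    "answer",          # user answered a freshly generated question
}


def classify(events):
    """Return "confirmed" if any event shows a real user, else "unconfirmed".

    Periodic "heartbeat" messages are deliberately not treated as confirming,
    since the referring page could keep the window open for a random time.
    """
    return "confirmed" if CONFIRMING_EVENTS & set(events) else "unconfirmed"


def confirmed_fraction(sessions):
    """Fraction of sessions (each a list of event labels) that are confirmed."""
    if not sessions:
        return 0.0
    confirmed = sum(1 for events in sessions if classify(events) == "confirmed")
    return confirmed / len(sessions)
```

A per-referrer value of `confirmed_fraction` that is far below the site-wide figure does not identify any individual visit as simulated, but it is the kind of aggregate signal T can act on.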

While none of these techniques can offer proof that the attack is taking place, they can offer T statistical evidence of the attack if the attack is mounted aggressively through a single referrer R. As such, detecting user activity seems to be the most promising direction for coping with this attack, and in fact is the same principle that is behind pay-per-lead and pay-per-sale schemes discussed in Section 4.
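One way to turn such observations into statistical evidence is a one-sided binomial test: given a baseline probability that a genuine visitor produces some confirming activity, how unlikely is it that a referrer's visitors would produce so little? The sketch below is a minimal illustration; the baseline rate and the significance level `alpha` are assumptions that T would have to estimate or choose for itself.

```python
from math import comb


def binomial_lower_tail(k, n, p):
    """P[X <= k] for X ~ Binomial(n, p), computed by direct summation.

    Adequate for n up to roughly 1000; larger n would need a log-space
    or library implementation to avoid floating-point overflow.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def is_suspicious(confirmed, total, baseline_rate, alpha=0.001):
    """Flag a referrer whose confirmed-activity count is improbably low.

    confirmed: referred visits showing confirming user activity
    total: all visits credited to the referrer
    baseline_rate: assumed probability that a genuine visit is confirmed
    """
    return binomial_lower_tail(confirmed, total, baseline_rate) < alpha
```

For example, under an assumed baseline of 40%, a referrer with 2 confirmed visits out of 500 would be flagged, whereas one with 195 out of 500 would not. A flag is evidence rather than proof, in keeping with the caveat above.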

Finally, it is worth noting that legal means could also be used to discourage hit inflation attacks. Extreme hit inflation attacks could be grounds for a civil lawsuit if detected. If the threat of civil action is combined with suitable criminal penalties, these threats may effectively deter large-scale hit inflation.


Mike Reiter
3/9/1999