Track: Data Mining
Demographic Prediction based on User's Browsing Behavior
- Jian Hu (Microsoft Research Asia)
- Hua-Jun Zeng (Microsoft Research Asia)
- Hua Li (Microsoft Research Asia)
- Cheng Niu (Microsoft Research Asia)
- Zheng Chen (Microsoft Research Asia)
Demographic information plays an important role in personalized web applications. However, personal data such as age and gender is usually not easy to obtain. In this paper, we make a first attempt to predict users' gender and age from their Web browsing behavior, treating webpage view information as a hidden variable that propagates demographic information between users. Our approach has three main steps. First, learning from Web page click-through data, Web pages are associated with the age and gender tendencies of users whose demographics are known, through a discriminative model. Second, the age and gender of users whose demographics are unknown are predicted from the demographic information of the Web pages they visit, through a Bayesian framework. Third, based on the observation that Web pages visited by similar users tend to be associated with similar demographic tendencies, and that users with similar demographic information tend to visit similar Web pages, a smoothing component is employed to overcome the data sparseness of the Web click-through log. Experiments are conducted on a real Web click-through log to demonstrate the effectiveness of the proposed approach. The experimental results show that the proposed algorithm achieves improvements of up to 30.4% on gender prediction and 50.3% on age prediction in terms of macro F1, compared with baseline algorithms.
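To make the second step more concrete, the following is a minimal sketch of how page-level demographic tendencies might be combined into a user-level prediction. The abstract does not give the exact formulation, so this assumes a naive-Bayes-style combination in which page visits are conditionally independent given the class. The function name `predict_user`, the input layout, and the example page names are all hypothetical and are not taken from the paper.

```python
import math

def predict_user(page_probs, visits, prior, eps=1e-9):
    """Combine per-page class probabilities into a user-level posterior.

    Assumption (not from the paper): visits are independent given the
    class, so P(c | user) is proportional to
        P(c) * prod_over_pages (P(c | page) / P(c)) ** visit_count.

    page_probs: {page: {class: P(class | page)}}
    visits:     {page: number of visits by this user}
    prior:      {class: P(class)}
    """
    log_scores = {}
    for c, p_c in prior.items():
        score = math.log(p_c)
        for page, n in visits.items():
            # Clamp to eps so a zero probability does not break the log.
            p = max(page_probs[page][c], eps)
            score += n * (math.log(p) - math.log(p_c))
        log_scores[c] = score

    # Normalize in log space for numerical stability.
    m = max(log_scores.values())
    unnorm = {c: math.exp(s - m) for c, s in log_scores.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}

# Hypothetical page tendencies, as the first step might produce them.
page_probs = {
    "sports.example": {"male": 0.8, "female": 0.2},
    "news.example": {"male": 0.5, "female": 0.5},
}
visits = {"sports.example": 2, "news.example": 1}
prior = {"male": 0.5, "female": 0.5}

posterior = predict_user(page_probs, visits, prior)
```

In this toy input the uninformative page leaves the odds unchanged, while two visits to the male-leaning page multiply the male-to-female odds by (0.8 / 0.2) squared. The smoothing component described in the third step is not modeled here.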