WWW2006 - Refereed Track: Data Mining

Refereed Track: Data Mining

With the phenomenal growth of the Web, an ever-increasing volume of data and information is being published in Web pages. Research in Web data mining aims to develop new techniques to extract and mine useful knowledge from these Web sources. Because Web data is heterogeneous and largely unstructured, automated discovery of targeted or unexpected knowledge is a challenging task. It calls for novel methods that draw on a wide range of fields, including data mining, machine learning, natural language processing, statistics, databases, and information retrieval.

For the data mining track, we invite original, high-quality submissions addressing all aspects of Web data mining. Relevant topics include, but are not restricted to, the following:

  • Classifying, clustering and recommending text/Web documents
  • Mining Web content, link structure and usage data
  • Building user profiles and providing recommendations
  • Spatio-temporal analysis of blogs, reviews, discussions
  • Change detection and monitoring of Web pages/sites
  • Entity and relationship extraction
  • Schema and data integration, data cleaning
  • Integrating linguistic and domain knowledge in Web mining
  • Privacy-preserving Web data mining

Accepted Papers

  • Finding Advertising Keywords on Web Pages
    Wen-tau Yih, Joshua Goodman, Vitor R. Carvalho
  • A Probabilistic Approach to Spatiotemporal Theme Pattern Mining on Weblogs
    Qiaozhu Mei, Chao Liu, Hang Su, ChengXiang Zhai
  • A Comparison of Implicit and Explicit Links for Web Page Classification
    Dou Shen, Jian-Tao Sun, Qiang Yang, Zheng Chen
  • Large-Scale Text Categorization by Batch Mode Active Learning
    Steven C. H. Hoi, Rong Jin, Michael R. Lyu
  • What's Really New on the Web? Identifying New Pages from a Series of Unstable Web Snapshots
    Masashi Toyoda, Masaru Kitsuregawa
  • Interactive Wrapper Generation with Minimal User Effort
    Utku Irmak, Torsten Suel
  • Automatic Identification of User Interest For Personalized Search
    Feng Qiu, Junghoo Cho
  • Improved Annotation of the Blogosphere via Autotagging and Hierarchical Clustering
    Christopher H. Brooks, Nancy Montanez
  • Time-Dependent Semantic Similarity Measure of Queries Using Historical Click-Through Data
    Qiankun Zhao, Steven C. H. Hoi, Tie-Yan Liu, Sourav S Bhowmick, Michael R. Lyu, Wei-Ying Ma


Chairs

  • Ramakrishnan Srikant (Google)
  • Soumen Chakrabarti (IIT Bombay)
  • Email: www2006-datamining-chairs@www2006.org

PC Members

  • Corin Anderson (Google)
  • Roberto Bayardo (IBM Almaden Research Center)
  • Ming-Syan Chen (National Taiwan University)
  • Byron Dom (Yahoo! Inc.)
  • Tina Eliassi-Rad (Lawrence Livermore National Laboratory)
  • Charles Elkan (CSAIL, MIT)
  • Ronen Feldman (Bar-Ilan University)
  • Rayid Ghani (Accenture Technology Labs)
  • David Gibson (IBM Almaden Research Center)
  • Aristides Gionis (University of Helsinki)
  • Daniel Gruhl (IBM Almaden Research Center)
  • Ramanathan Guha (Google)
  • Thomas Hofmann (Technical University of Darmstadt and Fraunhofer IPSI)
  • Bing Liu (University of Illinois at Chicago)
  • Wei-Ying Ma (Microsoft Research Asia)
  • Shinichi Morishita (University of Tokyo)
  • Rajeev Motwani (Stanford University)
  • Ion Muslea (Language Weaver, Inc.)
  • Dmitry Pavlov (Yahoo! Inc.)
  • Prabhakar Raghavan (Yahoo! Inc.)
  • Raghu Ramakrishnan (University of Wisconsin, Madison)
  • Matthew Richardson (Microsoft Research)
  • Myra Spiliopoulou (Otto-von-Guericke-University Magdeburg)
  • Jaideep Srivastava (University of Minnesota)
  • Philip S. Yu (IBM Watson Research Center)
  • Osmar Zaiane (University of Alberta)

Organised by ECS in association with BCS and ACM
