WWW2008 Posters - WWW 2008: Posters

Track: Posters

Paper Title:
HisTrace: Building a Search Engine of Historical Events

Authors:

  • Lian'en Huang (Peking University)
  • Jonathan J. H. Zhu (City University of Hong Kong)
  • Xiaoming Li (Peking University)

Abstract:
In this paper, we describe an experimental search engine built on our Chinese web archive, which has been collected since 2001. The original data set contains nearly 3 billion Chinese web pages crawled over the past 5 years. From this collection, 430 million “article-like” pages are selected and then partitioned into 68 million sets of similar pages. A title and a publication date are determined for each page, and an index is built. At search time, the system returns related pages in chronological order. Thus, a user interested in news reports or commentaries on a past event can conveniently find a rich set of highly related pages.
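
The pipeline outlined in the abstract (group similar pages, index them, return hits in chronological order) can be illustrated with a toy in-memory sketch. This is not the HisTrace implementation: the abstract does not say how similarity is computed, so the exact-match fingerprint below is a stand-in assumption, and the names Page, fingerprint, group_similar, build_index and search are hypothetical. The tokenizer is also simplistic; real Chinese text would need word segmentation.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
import hashlib
import re


@dataclass(frozen=True)
class Page:
    """Hypothetical record for one archived article-like page."""
    url: str
    title: str
    published: date
    text: str


def tokenize(text):
    # Naive tokenizer: lowercase runs of word characters (assumption).
    return re.findall(r"\w+", text.lower())


def fingerprint(text):
    # Stand-in for similarity detection: hash of the normalized token stream.
    normalized = " ".join(tokenize(text))
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def group_similar(pages):
    # Partition pages into sets that share the same fingerprint.
    groups = defaultdict(list)
    for page in pages:
        groups[fingerprint(page.text)].append(page)
    return list(groups.values())


def build_index(groups):
    # Inverted index: token -> ids of the groups whose pages contain it.
    index = defaultdict(set)
    for gid, group in enumerate(groups):
        for page in group:
            for token in tokenize(page.title + " " + page.text):
                index[token].add(gid)
    return index


def search(query, groups, index):
    # Return one page per matching group, oldest first.
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    # Represent each group by its earliest page (ties broken by URL).
    reps = [min(groups[g], key=lambda p: (p.published, p.url)) for g in hits]
    return sorted(reps, key=lambda p: (p.published, p.url))
```

Representing each set of similar pages by its earliest copy is one possible design choice; it keeps the result list free of duplicates while preserving the date on which a report first appeared.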
