WWW2007 Paper Details
Track:
Semantic Web
Paper Title:
Yago: A Core of Semantic Knowledge - Unifying WordNet and Wikipedia
Authors:
  • Fabian M. Suchanek (Max-Planck-Institute for Computer Science)
  • Gjergji Kasneci (Max-Planck-Institute for Computer Science)
  • Gerhard Weikum (Max-Planck-Institute for Computer Science)
Abstract:
We present YAGO, a light-weight and extensible ontology with high coverage and quality. YAGO builds on entities and relations and currently contains roughly 900,000 entities and 5,000,000 facts. This includes the Is-A hierarchy as well as non-taxonomic relations between entities (such as hasWonPrize). The facts have been automatically extracted from the unification of Wikipedia and WordNet, using a carefully designed combination of rule-based and heuristic methods described in this paper. The resulting knowledge base is a major step beyond WordNet: in quality, by adding knowledge about individuals such as persons, organizations, and products together with their semantic relationships, and in quantity, by increasing the number of facts by more than an order of magnitude. Our empirical evaluation of fact correctness shows an accuracy of about 95%. YAGO is based on a logically clean model, which is decidable, extensible, and compatible with RDFS. Finally, we show how YAGO can be further extended by state-of-the-art information extraction techniques.
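To make the entity-and-relation model in the abstract concrete, the following is a minimal sketch of a fact store holding (subject, relation, object) triples, with an Is-A check that follows the class hierarchy transitively. It is not the authors' implementation or YAGO's actual data format: the class `FactStore`, the relation names `type` and `subClassOf`, and the sample facts are assumptions chosen for illustration. Only the relation name `hasWonPrize` comes from the abstract.

```python
# Illustrative sketch only: relation names "type" and "subClassOf" and all
# sample facts below are assumptions, not taken from the YAGO data itself.
TYPE = "type"
SUBCLASS_OF = "subClassOf"


class FactStore:
    """A tiny in-memory store of (subject, relation, object) facts."""

    def __init__(self):
        # Maps each subject to the set of (relation, object) pairs about it.
        self._by_subject = {}

    def add(self, subject, relation, obj):
        self._by_subject.setdefault(subject, set()).add((relation, obj))

    def objects(self, subject, relation):
        """Return all objects o such that (subject, relation, o) is stored."""
        pairs = self._by_subject.get(subject, set())
        return {o for (r, o) in pairs if r == relation}

    def superclasses(self, cls):
        """Return every class reachable from cls via subClassOf edges."""
        seen = set()
        stack = [cls]
        while stack:
            current = stack.pop()
            for parent in self.objects(current, SUBCLASS_OF):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def is_a(self, entity, cls):
        """True if entity has type cls directly or through the hierarchy."""
        for direct in self.objects(entity, TYPE):
            if direct == cls or cls in self.superclasses(direct):
                return True
        return False


def build_sample_store():
    """Build a store with a few hand-written example facts."""
    store = FactStore()
    store.add("Albert_Einstein", TYPE, "physicist")
    store.add("physicist", SUBCLASS_OF, "scientist")
    store.add("scientist", SUBCLASS_OF, "person")
    # Non-taxonomic relation of the kind named in the abstract.
    store.add("Albert_Einstein", "hasWonPrize", "Nobel_Prize")
    return store


if __name__ == "__main__":
    store = build_sample_store()
    print(store.is_a("Albert_Einstein", "person"))
    print(store.objects("Albert_Einstein", "hasWonPrize"))
```

In this sketch the taxonomic facts (`type`, `subClassOf`) and the non-taxonomic fact (`hasWonPrize`) share one triple representation, which is the property that makes such a model straightforward to map onto RDFS-style statements.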
Slot:
Shaughnessy, Saturday, May 12, 2007, 10:30am to 12 noon.
Full-text:
PDF version