Web Document Summarization by Context

Jean-Yves Delort
LIP6 - UPMC
8 rue du capitaine Scott
75015 Paris, France
Jean-Yves.Delort@lip6.fr

Bernadette Bouchon-Meunier
LIP6 - UPMC
8 rue du capitaine Scott
75015 Paris, France
Bernadette.Bouchon-Meunier@lip6.fr

Maria Rifqi
LIP6 - UPMC
8 rue du capitaine Scott
75015 Paris, France
Maria.Rifqi@lip6.fr

ABSTRACT

This paper addresses the issue of Web document summarization. We define the context of a Web document as the set of pieces of information extracted from the content of all the documents linked to it. We put forward two new summarization-by-context algorithms. The first uses both the content and the context of the target document; the second relies only on the elements of the context. We show that summaries based on the context are usually much more relevant than those built only from the content of the target document. We also study the conditions on the size of the content and of the context under which the best summaries are obtained.