SmartView: Flexible Viewing of Web Page Contents
Microsoft Research Ltd.
7 J J Thomson Avenue
Cambridge CB3 0FB
+44 (0)1223 479 700
SmartView is a document-viewer feature that partitions the content of an HTML document into logical sections, which the user can then select and view independently of the rest of the document. The SmartView interface reinforces the concept of a document by letting the user view both a document overview (e.g., a zoomed-out version of the document or one or more document thumbnails) and a detailed view of the selected section. The layout of the detailed view is modified to achieve the presentation that the user prefers or that is optimal for the device.
SmartView is essential when a document layout is optimized for displays of a certain size but the document needs to be viewed on smaller devices, a situation that requires significant document scaling and layout changes (e.g., viewing Web pages on PDAs). It is equally important when only a portion of a document needs to be displayed (e.g., showing a section of a Web page on a large screen). The current implementation of SmartView is compatible with Microsoft Internet Explorer v.6.0.
Keywords: Document viewer, Web Browser, PDA, Large screen, flexible layout, page layout
1. Introduction
The ability to display digital documents on a variety of devices has become highly important in on-line document publishing. For example, with the emergence of e-books there is great demand for efficient dynamic modification of the book layout to accommodate the user's viewing preferences, such as page size, font size, and font type.
Furthermore, with the proliferation of mobile phones used to access Web pages, Web sites have resorted to creating content pages specifically for mobile phones. At the same time, Browsers on PDAs apply a number of scaling heuristics to provide reasonable viewing of Web pages. Unfortunately, the right trade-off between the readability of the content and the amount of horizontal and vertical scrolling required to view a page is very difficult to achieve.
Finally, the ability to zoom in on a part of a document remains a feature of specialized viewers (e.g., the Adobe PDF viewer, or GhostView for PostScript documents). Even there, the experience is rather unsatisfactory since no layout considerations are taken into account. We would like to support the user in displaying any part of the document in an optimal manner. A typical application would be displaying a portion of a Web page within a Browser so that it can be projected onto a large screen.
SmartView is a prototype application that attempts to solve both the scaling-down and the scaling-up problems that arise when viewing HTML documents. It analyses the layout of an HTML document and partitions it into logical sections, which the user can then select and view independently of the rest of the document.
In the following section we briefly describe the design and implementations of SmartView.
2. SmartView Design and Implementations
Our design of SmartView has been driven by the two main applications we had in mind: viewing Web pages on PDAs and on large screens. We implemented two versions of SmartView for Microsoft Internet Explorer (IE): one for the browser running on PDAs and the other for the desktop PC version, in particular IE v.6.0.
The only reason for having two designs is the discrepancy in the capabilities of the two browsers: IE v.6.0 supports zooming in and out, while the versions running on PDAs do not. In both implementations we kept an important requirement in mind: the GUI needs to provide simultaneous viewing of, or an easy switch between, the full view of a Web page (e.g., a zoomed-out view or a thumbnail image of the page) and the detailed view of the selected part of the document. Both implementations of SmartView involve the following steps:
Analysis of the HTML page layout using heuristics about the use of HTML structures (in particular, the use of tables)
Partitioning of the page into logical units (typically by table elements)
Determining the appropriate zoom-out level for the page (on the desktop) or the size of a thumbnail image (on the PDA) in order to provide a visual overview of the page (a standard PDA screen size is used as the predefined target size)
If desired, marking the page partition on the zoomed-out page or overlaying it on the thumbnail image of the page
Extracting the portion of the HTML that corresponds to the selected part of the document
Creating a new HTML document for the selection, with the layout modified to conform to the device display specifications or the user's viewing preferences.
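The partitioning and extraction steps above can be illustrated with a minimal Python sketch. It assumes the simplest form of the table heuristic: each cell of the outermost table is treated as one logical unit. The names TableCellPartitioner and build_section_document, and the fixed-width style rule, are hypothetical illustrations and are not taken from the SmartView prototype, whose actual heuristics are not detailed here.

```python
from html.parser import HTMLParser


class TableCellPartitioner(HTMLParser):
    """Collects the inner HTML of each cell of the outermost table.

    Assumption: one cell of the outermost layout table = one logical section.
    """

    def __init__(self):
        # Keep character references as written so they can be re-emitted.
        super().__init__(convert_charrefs=False)
        self.table_depth = 0   # nesting depth of <table> elements
        self.in_cell = False   # currently inside an outermost-table cell?
        self.sections = []     # finished sections, as HTML strings
        self._buf = []         # pieces of the section being collected

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.table_depth += 1
        if self.in_cell:
            # Nested markup (including nested tables) belongs to the section.
            self._buf.append(self.get_starttag_text())
        elif tag in ("td", "th") and self.table_depth == 1:
            self.in_cell = True
            self._buf = []

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied verbatim.
        if self.in_cell:
            self._buf.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.in_cell and tag in ("td", "th") and self.table_depth == 1:
            self.sections.append("".join(self._buf).strip())
            self.in_cell = False
        elif self.in_cell:
            self._buf.append(f"</{tag}>")
        if tag == "table":
            self.table_depth -= 1

    def handle_data(self, data):
        if self.in_cell:
            self._buf.append(data)

    def handle_entityref(self, name):
        if self.in_cell:
            self._buf.append(f"&{name};")

    def handle_charref(self, name):
        if self.in_cell:
            self._buf.append(f"&#{name};")


def build_section_document(section_html, target_width_px):
    """Wraps one extracted section in a new, standalone HTML document."""
    return (
        "<html><head><style>"
        f"body {{ width: {target_width_px}px; margin: 0; }}"
        "</style></head>"
        f"<body>{section_html}</body></html>"
    )


page = (
    "<html><body><table><tr>"
    "<td><h1>News</h1></td>"
    "<td><table><tr><td>a</td></tr></table></td>"
    "</tr></table></body></html>"
)
partitioner = TableCellPartitioner()
partitioner.feed(page)
new_document = build_section_document(partitioner.sections[0], 240)
```

In this sketch a nested table stays inside the section of its enclosing cell; a fuller analysis would need additional heuristics to decide when a nested table should itself be split.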
SmartView for PDAs - Proxy Implementation
SmartView for PDAs relies on a proxy service that performs the page layout analysis and partitioning, thumbnail creation, and layout modification. When the user of the PDA requests a Web page, the proxy downloads the page, creates a thumbnail image of it, analyses and partitions the page, and sends the PDA Browser the thumbnail image together with the partition details. When the user selects a particular section on the thumbnail, the proxy responds by extracting the HTML code of the desired section, creating a new HTML document that satisfies the new layout specifications, and delivering the new document to the PDA Browser for display.
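As an illustration of the selection step, the following Python sketch maps a tap on the thumbnail back to a section of the page. It assumes that the partition details consist of one bounding box per section, expressed in full-page pixels, and that the thumbnail is a uniformly scaled copy of the page. The Section type and the section_at function are hypothetical; the format in which the proxy actually delivers the partition is not specified here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Section:
    """Bounding box of one logical section, in full-page pixels (assumed format)."""
    section_id: int
    left: int
    top: int
    width: int
    height: int


def section_at(sections: Sequence[Section], click_x: float, click_y: float,
               scale: float) -> Optional[Section]:
    """Returns the section under a thumbnail click, or None if there is none.

    `scale` is the thumbnail size divided by the page size (e.g. 0.25).
    """
    # Convert thumbnail coordinates back to page coordinates.
    page_x = click_x / scale
    page_y = click_y / scale
    for s in sections:
        inside_x = s.left <= page_x < s.left + s.width
        inside_y = s.top <= page_y < s.top + s.height
        if inside_x and inside_y:
            return s
    return None


partition = [
    Section(section_id=0, left=0, top=0, width=400, height=100),
    Section(section_id=1, left=0, top=100, width=400, height=500),
]
selected = section_at(partition, click_x=50, click_y=30, scale=0.25)
```

The proxy would then extract the HTML for the returned section and build the new document from it.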
SmartView for Desktop PC - Client Implementation
SmartView for IE v.6.0 performs all of the page analysis on the client side. It uses the IE zoom-out facility to create the overview representation of the page, e.g., a view of the whole document in the viewport with the partition of the document indicated. It then allows the user to select one of the elements in the partition and creates a new HTML document that is displayed on the screen. On a desktop PC a zoomed-out view of the page can be kept readily available (with the section currently being viewed highlighted), so that the user can use it as a reference view and as a tool for navigating through the document.
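One way to choose the zoom-out level is sketched below in Python, under the assumption that the overview should show the whole page inside the viewport without enlarging it. The same calculation would apply to sizing a thumbnail for a target PDA screen. The functions fit_scale and highlight_rect are hypothetical helpers, not part of IE or of the SmartView prototype.

```python
def fit_scale(page_w, page_h, view_w, view_h):
    """Largest uniform scale (at most 1.0) at which the page fits the viewport."""
    if min(page_w, page_h, view_w, view_h) <= 0:
        raise ValueError("all dimensions must be positive")
    return min(view_w / page_w, view_h / page_h, 1.0)


def highlight_rect(section_box, scale):
    """Scales a (left, top, width, height) page box into overview coordinates."""
    left, top, width, height = section_box
    return tuple(round(v * scale) for v in (left, top, width, height))


# A 960 x 2400 pixel page shown on an assumed 240 x 320 pixel target screen.
scale = fit_scale(960, 2400, 240, 320)
overlay = highlight_rect((0, 1200, 960, 600), scale)
```

Taking the smaller of the two ratios keeps the aspect ratio of the page, so the section boxes can be drawn on the overview by multiplying their page coordinates by the same factor.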
Fig. 1: Web page with complex layout partitioned into logical units (left).
These units can be individually selected and displayed (right).
3. Related Research
Most attempts to overcome the problem of displaying documents on small devices focus on presenting the user with the information contained on the page in some form rather than the page itself. For example, Buyukkokten et al. (2000, 2001) focused on devising methods to create suitable summaries of a single page or of multiple pages and to present those to the user. While this approach has many benefits, we believe that there is value in delivering the content of a page as originally designed by the author while, at the same time, allowing the user to pick and choose what might be of interest for detailed viewing. Furthermore, we believe that in many situations a discrepancy between the page views on different devices can hinder the optimal use of information.
We should point out that research on the automatic extraction of document architecture, including the logical and reference structure, has been conducted to some degree in the context of optimizing text authoring tools (see Iwai et al., 1989). Future extensions of SmartView to other document formats would certainly include some of these more sophisticated document analyses.
Furthermore, the idea of various partial views of Web documents has been explored in the context of collaborative Web browsing by Han et al. (2000). However, that work is restricted to Web pages created within a specialized XML framework designed to enable multi-device and multi-user viewing of document content. Since it relies on specialized XML tags and an XML splitting policy defined by the author, this approach is not easily extensible to general HTML Web pages. In our approach, by contrast, which is based on the automatic analysis of HTML pages, we can easily encode the results of the analysis into the HTML to avoid repeated processing of the page, and further enhance the document with the ability to receive the device specifications from the device placing a request.
Finally, modification of a document layout to accommodate various devices has been explored in the context of e-book research (see Thacker and Sommerer, 1999). In the current prototype, we implemented rather simple layout adjustments using heuristics for re-flowing the content of HTML tables, which are typically used to implement the layout of Web pages.
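To make the idea of re-flowing table content concrete, here is a minimal Python sketch of one possible heuristic: the cells of a layout table are stacked into a single column of fixed-width blocks, in row-major order, and empty cells are dropped. This particular rule and the function name reflow_table are assumptions made for illustration; the heuristics used in the prototype are not described in detail here.

```python
def reflow_table(rows, target_width_px):
    """Stacks the non-empty cells of a layout table into one narrow column.

    `rows` is a list of rows, each a list of cell HTML strings.
    """
    # Row-major order: left to right within a row, then the next row.
    cells = [cell for row in rows for cell in row if cell.strip()]
    return "\n".join(
        f'<div style="width:{target_width_px}px">{cell}</div>' for cell in cells
    )


layout = [
    ["<p>Nav</p>", "<p>Story</p>"],
    ["", "<p>Footer</p>"],
]
narrow_html = reflow_table(layout, 240)
```

A column-major traversal would instead keep each visual column of the original page together, which may suit pages whose layout table has a navigation column beside the main content.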
4. Future Work
The focus of our future work will be extensive user evaluation of the SmartView prototype and design modifications based on user feedback. We will also attempt to extend the SmartView functionality to a variety of document formats.
References
- Buyukkokten, O., Garcia-Molina, H., Paepcke, A., and T. Winograd. Power Browser: Efficient Web Browsing for PDAs. In Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI 2000), 2000.
- Buyukkokten, O., Garcia-Molina, H., and A. Paepcke. Seeing the Whole in Parts: Text Summarization for Web Browsing on Handheld Devices. In Proceedings of the Tenth International World Wide Web Conference (WWW 10), 2001.
- Iwai, I., Doi, M., Yamaguchi, K., Fukui, M., and Y. Takebayashi. A Document Layout System Using Automatic Document Architecture Extraction. In Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI '89), 1989.
- Han, R., Perret, V., and M. Naghshineh. WebSplitter: A Unified XML Framework for Multi-Device Collaborative Web Browsing. In Proceedings of the ACM Conference on Computer Supported Cooperative Work (CSCW 2000), 2000.
- Thacker, Ch. and R. Sommerer. A Prototype Electronic Book. (Unpublished but available on-line at http://research.microsoft.com/~som), 1999.