WWW2002 - gNCX: Structure-based Navigation to Enhance Usability of Multimedia Content

Markku T. Hakkinen
Information Center
Japanese Society for Rehabilitation of Persons with Disabilities
P.O. Box 6632, Lawrenceville, New Jersey 08648 USA
Marisa DeMeglio
Information Center
Japanese Society for Rehabilitation of Persons with Disabilities
Hiroshi Kawamura
Information Center
Japanese Society for Rehabilitation of Persons with Disabilities
7F Shinjuku Mitsui Bldg No 2
3-2-11, Nishishinjuku
Shinjuku-ku, Tokyo
160-0023, JAPAN


Recent developments in the design of digital talking books for the visually disabled have resulted in a model of document navigation that can offer enhanced usability for a broad base of web-based content. The navigation model allows the semantic structure of a document, document set, or presentation to be represented external to the content and utilized by a navigation-aware user agent. Using the navigation structure, a user agent can immediately expose the overall structure of the content to the user and provide efficient navigation without the need to process the entire document set. Application of the navigation model to re-purposing of content, streaming multimedia presentations, and web devices is discussed, and a specific example of an emergency information presentation system is shown.


Accessibility, disability, multimedia, open standards, usability, navigation, user interface, synchronized multimedia, device independence.


Digital Talking Books have been developed as a modern hybrid of the traditional audio book integrated with the full textual version of the corresponding publication, giving the listener a rich reading experience with rapid navigation and exploration of both modalities of presentation. The concept of structured audio navigation was pioneered by the Daisy Consortium, and has led to the creation of the Daisy 3.0/NISO Digital Talking Book Standard [1]. A key goal in the Daisy standards effort is the adoption and promotion of open standards that are implementable across multiple platforms and support internationalization, with specific attention paid to the information needs of developing countries.

The NISO standard defines a Digital Talking Book publication as a set of XML documents, W3C SMIL documents, and digital audio streams. Key to the user interaction with the digital talking book is the Navigation Control definition (NCX). The NCX is an XML application that creates a navigation layer on top of digital talking book content. Arising out of a need to move beyond the traditional linear, sequential model of listening to audio tapes, early work on structured audio navigation led to the definition of a formal mark-up specification to represent a publication's navigation structure [2].

Using authoring tools, the producer of a digital talking book, who may or may not be the original content author, defines which elements in the source content are navigationally significant. These elements become the exposed entry points into the DTB presentation, and are structured into a hierarchy that reflects the structure of the source content. Because source styles for DTBs are so diverse, ranging from well-structured XML files, to flat, unstructured legacy HTML, to books that consist of audio only, the NCX can also be used to provide a structured navigation flow over non-structured content.
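The hierarchy of entry points described above can be pictured as a small tree of labeled targets. The following is a minimal sketch only, not the NCX data model: the `NavPoint` class, its field names, the `outline` helper, and the audio file names and time offsets are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class NavPoint:
    """One navigationally significant entry point (hypothetical model)."""
    label: str    # text shown or spoken to the user
    target: str   # URI of the content this entry point leads to
    children: List["NavPoint"] = field(default_factory=list)


def outline(points: List[NavPoint], depth: int = 0) -> Iterator[Tuple[int, str, str]]:
    """Walk the hierarchy depth-first, yielding (depth, label, target)."""
    for point in points:
        yield depth, point.label, point.target
        yield from outline(point.children, depth + 1)


# Navigation layered over an audio-only book. The targets are illustrative
# offsets into audio files, so the content itself needs no structure mark-up.
book = [
    NavPoint("Chapter 1", "audio/track01.mp3#t=0", [
        NavPoint("Section 1.1", "audio/track01.mp3#t=95"),
        NavPoint("Section 1.2", "audio/track01.mp3#t=410"),
    ]),
    NavPoint("Chapter 2", "audio/track02.mp3#t=0"),
]

for depth, label, target in outline(book):
    print("  " * depth + f"{label} -> {target}")
```

Because the tree holds only labels and targets, a user agent could present the full outline before fetching or decoding any of the audio.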

Figure 1: Basic Structure of a Digital Talking Book

2. gNCX: Design, Authoring, and Navigating

Throughout the development of the NCX, the extensibility of the structuring model offered promise for application to a broader range of content, including SMIL multimedia presentations, electronic publications, MP3 playlists, and content to be delivered via mobile devices. To move beyond the digital talking book, we are undertaking a formal effort to define what we call gNCX, or generalized NCX. Where the NCX was designed specifically for the requirements of the Digital Talking Book, the gNCX extends the model with additional metadata and features to provide a robust language for describing navigational models for diverse content. The gNCX XML application is designed for easy implementation as a plug-in or wrapper for existing browsers and media players, or as a dedicated thin user agent on a variety of devices, such as portable entertainment players or digital televisions.

One advantage of the gNCX is that it can supply structure where none was authored in the source mark-up. Flat or otherwise non-structured documents (or presentations) that did not include explicit structure mark-up can, through the gNCX authoring process, have their navigationally significant points identified and annotated. SMIL presentations, for example, allow definition of complex sequences of multimedia content, but the language itself contains no mechanism for adding navigational semantics. In addition to adding navigation structure, application-specific navigation orders can be authored to supplement or re-purpose an existing set of documents or presentations. The implication is that a content producer (for example, an editor, anthologist, or educator) can customize the navigational significance of the source material, allowing for the easy creation of alternate views or entirely new, composite works, without touching or modifying the source content. The gNCX model may thus prove useful in creating the user interface for repurposed content from multimedia repositories [3].
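The idea of authoring several navigation orders over one unmodified source can be sketched as follows. This is an illustration under stated assumptions rather than gNCX syntax: the `SOURCE` mapping, its URIs and text, the two navigation orders, and the `resolve` helper are all hypothetical.

```python
from types import MappingProxyType

# Hypothetical source content keyed by URI. The read-only mapping stands in
# for documents that the navigation author cannot, or should not, modify.
SOURCE = MappingProxyType({
    "guide.html#intro": "Introduction",
    "guide.html#supplies": "Supplies to keep at home",
    "guide.html#evacuate": "Evacuation routes",
    "guide.html#contacts": "Emergency contacts",
})

# A navigation order is an ordered list of (label, target) pairs kept
# outside the content. Two orders can point into the same source.
author_order = [
    ("Introduction", "guide.html#intro"),
    ("Supplies", "guide.html#supplies"),
    ("Evacuation", "guide.html#evacuate"),
    ("Contacts", "guide.html#contacts"),
]
quick_reference = [
    ("Who to call", "guide.html#contacts"),
    ("Where to go", "guide.html#evacuate"),
]


def resolve(order, source):
    """Return the content each navigation entry points at, in navigation order."""
    return [(label, source[target]) for label, target in order]


for label, content in resolve(quick_reference, SOURCE):
    print(f"{label}: {content}")
```

The second order relabels, reorders, and omits entries, yet both orders read from the same unchanged source.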

2.1 Design of gNCX

A gNCX document consists of several key structures:

2.2 Navigating with gNCX

User agents that implement gNCX can render the navigation structure in a variety of forms. Existing implementations of the digital talking book systems, from which the gNCX is derived, use visual tree views of the table of contents, audio-only interfaces, and multi-modal interfaces. These interfaces run on a variety of platforms, from traditional PCs, to handheld devices (with and without displays), to telephony-based interfaces. Experiments have also been undertaken to transform the NCX into traditional HTML for rendering in non-NCX-aware browsers, as well as into VoiceXML for interaction via the telephone.
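A transformation of a navigation structure into HTML, of the kind mentioned above, might look like the following minimal sketch. The XML vocabulary here (`nav`, `point`, `label`, `src`) is invented for the example and is not the actual NCX or gNCX mark-up; the sample entries and the `to_html` function are likewise hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Illustrative navigation document. Element and attribute names are
# assumptions for this sketch, not the real NCX/gNCX vocabulary.
NAV_XML = """
<nav>
  <point label="Before the earthquake" src="guide.smil#before">
    <point label="Prepare a kit" src="guide.smil#kit"/>
  </point>
  <point label="During the earthquake" src="guide.smil#during"/>
</nav>
"""


def to_html(parent: ET.Element) -> str:
    """Render the child <point> elements of `parent` as a nested HTML list."""
    points = parent.findall("point")
    if not points:
        return ""
    items = []
    for point in points:
        link = '<a href="{}">{}</a>'.format(
            escape(point.get("src", ""), quote=True),
            escape(point.get("label", "")),
        )
        # Recurse so that child points become a nested list inside this item.
        items.append("<li>" + link + to_html(point) + "</li>")
    return "<ul>" + "".join(items) + "</ul>"


html_toc = to_html(ET.fromstring(NAV_XML))
print(html_toc)
```

The same recursive walk could emit a different target format, such as a spoken menu, by changing only the strings produced for each entry.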

One area under investigation is the use of gNCX structural navigation to enhance streaming multimedia presentations. The specific application is the creation of accessible and easily navigable presentations of emergency information. By combining structured navigation that incorporates audio, text, and graphic cues with a gNCX-aware multimedia player, our goal is to provide effective information transfer and review for persons with physical and cognitive/intellectual disabilities. An example of this interface is shown in Figure 2.

Figure 2. Prototype of an accessible multimedia player interface using gNCX for an emergency information guide.


Through work to address the specific needs of the visually impaired, a navigation model was created to enhance the usability of digital talking books. This model has been extended to apply to a broad range of content and interface styles, and can serve to make multimedia information more accessible to persons with and without disabilities. Because it is based upon open standards, gNCX will be implementable across a variety of platforms.


  1. NISO. File Specifications for the Digital Talking Book: ANSI/NISO Z39.86-200X.
  2. M. T. Hakkinen and G. Kerscher. Applying a Navigation Layer to Digital Talking Books: SMIL, XML and NCX. The Web and Multimedia Workshop - WWW9, Amsterdam, May 2000.
  3. S. Ujitdehaage, C. Chandler and S. Dennis. Supporting health sciences education with IMS-based multimedia repository. WWW10 Conference Proceedings, May 2001.