Boris Rogge
Multimedia Lab - Ghent University
Sint-Pietersnieuwstraat 41
B-9000 Ghent, Belgium
+32 (0)9 264 89 11
Rik Van de Walle
Multimedia Lab - Ghent University
Sint-Pietersnieuwstraat 41
B-9000 Ghent, Belgium
+32 (0)9 264 33 68


In this poster, we describe the validation process of a functional metadata (FMD) document encapsulated in an MPEG-21 digital item. The validation starts with an MPEG-21 digital item containing the functional metadata and ends with a valid functional metadata document or with a list of error codes. Since the W3C schema language does not suffice to validate all relationships within a functional metadata document, a set of extra rules is defined. These rules are implemented through the use of transformations on functional metadata documents. These transformations are implemented using XSLT.


MPEG-21, Functional Metadata, XSL transformations, W3C XML Schema


The need for content description in modern multimedia applications has been recognized by numerous researchers. The most thorough effort in this domain is undoubtedly the Multimedia Content Description Interface, commonly known as MPEG-7 [1]. MPEG-7 offers an extensive framework for describing the content of multimedia documents. This content ranges from simple user preferences, through audio sample descriptions, to full-featured texture description matrices. Rogge et al. introduced the concept of Functional Metadata (FMD) in [2,3]. Whereas MPEG-7 descriptors and descriptor schemes enable the description of the multimedia content, FMD enables the description of functionality associated with this multimedia data. The functional metadata framework provides a method for defining functionality through the use of XML-based descriptors. A set of FMD definitions is grouped into an MPEG-21 Digital Item [4], thus coupling multimedia data (resources), content description data (MPEG-7 descriptors and descriptor schemes) and functional metadata.


Functional metadata is defined through an XML-schema [5,6,7] in combination with a set of extra rules. An XML-schema can describe the document structure and the uniqueness constraints within a functional metadata document. However, the conditional constraints cannot be described using an XML-schema. In order to describe all relationships within a functional metadata document, a set of extra rules is enforced through the use of XSLT transformations [8]. Although the FMD framework only describes the document structure and allows the description of the different relationship types, we can state that the framework consists of three parts. The first part is the encoder, which creates functional metadata documents. The second part is the storage and/or transmission of the functional metadata documents. The third and last part of the framework is the decoder. The complete architecture is shown in Fig. 1. A terminal builder implementing the FMD architecture can select existing technologies to implement the architecture or can choose to create proprietary tools. The implementation shown in [3] uses QuickTime as a starting point for implementing the architecture. The actual FMD-schema can be seen at In order to use functional metadata, the implementor of the framework must create an encoder (which generates the correct functional metadata descriptors) and a decoder (which interprets the received descriptors and enables instantiation and execution of these descriptors). Note that a possible implementation of such an encoder and decoder is discussed in [3].

Figure 1: FMD architecture overview.


The functional metadata frame is used as a transport/storage unit that holds the descriptions to be associated with the multimedia data (at a certain point in time). The functional metadata stored within such a frame is tightly coupled with the multimedia data through the use of an MPEG-21 digital item. Table 1 shows a typical MPEG-21 digital item used within the FMD framework. This digital item consists of a single Item containing a Descriptor and a Component. The Descriptor is used to describe the FMD frame content. The Component combines the functional metadata with a reference to the resource it is associated with.

<didl:DIDL xmlns:didl="urn:mpeg:mpeg21:2002/01-DIDL-NS" xmlns:xsi="">
    <didl:Item>
        <didl:Descriptor id="frameDescriptor">
            <!-- Description of the functional metadata frame -->
        </didl:Descriptor>
        <didl:Component id="componentID">
            <didl:Descriptor id="frameContent">
                <didl:Statement type="urn:mpeg:mpeg21:data-format:IPTCMime-TypeCS:text/xml">
                    <!-- Functional metadata frame is to be inserted here -->
                </didl:Statement>
            </didl:Descriptor>
            <didl:Resource ref="resourceURL" type="dataType" />
        </didl:Component>
    </didl:Item>
</didl:DIDL>
Table 1: MPEG-21 digital item.
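
As an illustration of how a decoder might pull the functional metadata frame and its resource reference out of a digital item shaped like Table 1, the following is a minimal sketch in Python (standard library only) rather than XSLT. The sample document, the `frame` element name and the `extract_frame` helper are assumptions made for this example; they are not taken from the FMD-schema.

```python
import xml.etree.ElementTree as ET

# Namespace URI as it appears in Table 1.
NS = {"didl": "urn:mpeg:mpeg21:2002/01-DIDL-NS"}

# Hypothetical sample item; <frame> stands in for a real FMD frame.
SAMPLE = """\
<didl:DIDL xmlns:didl="urn:mpeg:mpeg21:2002/01-DIDL-NS">
  <didl:Item>
    <didl:Descriptor id="frameDescriptor"/>
    <didl:Component id="componentID">
      <didl:Descriptor id="frameContent">
        <didl:Statement type="urn:mpeg:mpeg21:data-format:IPTCMime-TypeCS:text/xml">
          <frame id="f1"/>
        </didl:Statement>
      </didl:Descriptor>
      <didl:Resource ref="resourceURL" type="dataType"/>
    </didl:Component>
  </didl:Item>
</didl:DIDL>
"""


def extract_frame(didl_xml):
    """Return (frame_element, resource_ref) from a DIDL document string."""
    root = ET.fromstring(didl_xml)
    component = root.find("didl:Item/didl:Component", NS)
    if component is None:
        raise ValueError("no didl:Component found")
    statement = component.find("didl:Descriptor/didl:Statement", NS)
    if statement is None or len(statement) != 1:
        raise ValueError("expected exactly one element inside didl:Statement")
    resource = component.find("didl:Resource", NS)
    ref = resource.get("ref") if resource is not None else None
    return statement[0], ref


frame, ref = extract_frame(SAMPLE)
print(frame.tag, frame.get("id"), ref)
```

The sketch only mirrors the coupling described above: the Statement carries the frame, and the sibling Resource names the multimedia data it belongs to.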


Since functional metadata is XML-based, an XML-schema for functional metadata documents is used to validate the documents. However, the current W3C schema standard does not provide all the means needed to validate a complete functional metadata document. There are three types of relationships used in a functional metadata document. The first type of relationship defines the document structure. A validating parser can check the validity of these relationships for a particular functional metadata document. The FMD-schema defines the different parts of an FMD document [2]. The second type of relationship is defined as unique constraints. These relationships make sure that all functional metadata constructs are uniquely identifiable and that certain metadata constructs can only occur once in a functional metadata document. They are enforced using the unique and key mechanisms available in the W3C schema language. The third and last type of relationship cannot be enforced using the W3C schema language. We call this type of relationship a conditional constraint. Such a conditional constraint enforces a certain rule when a particular condition is met. An example of such a constraint could be: if the attribute A of element E1 is present, then element E2, being a child of element E3, must also be present. Since the W3C schema language does not support this type of constraint, an extra validation step is introduced. The validation of the third type is done by means of XSLT stylesheets.
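
The example constraint above can be made concrete with a short sketch. It is written in Python (standard library only) as a stand-in for an XSLT stylesheet, and it assumes one particular reading of the rule: if any E1 element carries attribute A, at least one E3 element must have an E2 child. The function name and the error code `ERR_E2_MISSING` are hypothetical.

```python
import xml.etree.ElementTree as ET


def check_conditional(root):
    """Return a list of error codes for the E1/A -> E3/E2 rule."""
    errors = []
    # Condition: some E1 element carries attribute A.
    condition_met = any(e1.get("A") is not None for e1 in root.iter("E1"))
    # Rule: when the condition holds, an E2 child of some E3 must exist.
    if condition_met and root.find(".//E3/E2") is None:
        errors.append("ERR_E2_MISSING")  # hypothetical error code
    return errors


valid = ET.fromstring('<doc><E1 A="x"/><E3><E2/></E3></doc>')
invalid = ET.fromstring('<doc><E1 A="x"/><E3/></doc>')
unconstrained = ET.fromstring('<doc><E1/><E3/></doc>')

print(check_conditional(valid))          # []
print(check_conditional(invalid))        # ['ERR_E2_MISSING']
print(check_conditional(unconstrained))  # []
```

The third document passes because the condition is never triggered. A structural schema has no direct way to express this kind of dependency between the presence of an attribute and the presence of an unrelated element.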

Fig. 2 gives a high-level overview of the W3C schemas and the chain of transformations (XSLT stylesheets) applied to the original MPEG-21 document. The conversion from an MPEG-21 digital item containing functional metadata into a valid functional metadata document is done through a combination of XML-schema and XSLT. The first two types of relationships are validated against a set of schemas using a validating parser. The third type of relationship is validated using a transformation described in an XSLT stylesheet. The validation chain shown in Fig. 2 starts with an MPEG-21 document called fmd-didl.xml. The content of this document is compliant with the document structure shown in Table 1. First, the content of the MPEG-21 document is validated against the mpeg21.xsd schema. When no errors are found, the second step is performed: the didl.xslt stylesheet extracts the functional metadata from the MPEG-21 document and stores this information in a file called fmd-frame.xml. This document is validated using the FMD-schema (fmd.xsd) before the next transformation (defined in fmd-merge.xslt) checks whether the FMD document is complete. If not, the fmd-merge.xslt transformation adds the necessary items to the document from the decoder's repository. The newly generated fmd-complete-frame.xml file is then validated using the fmd.xsd schema. This step checks all relationships of the first and second type contained within the functional metadata document. Finally, the third relationship type is validated using the fmd-sem.xslt transformation. This transformation uses a set of patterns to validate the functional metadata frame. The final result of this validation chain is an XML document containing error and success codes. The listed codes describe the type of error that occurred. All schemas and stylesheets described in this paper can be found at

Figure 2: XSLT Validation chain.
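
The control flow of the chain can be sketched as a sequence of steps that stops as soon as one step reports errors. The Python sketch below is heavily simplified and is not the actual implementation. Documents are modelled as plain dictionaries, the schema checks are stubs that always succeed, and the error codes and repository content are hypothetical. Only the step labels are taken from the text.

```python
def run_chain(document, steps):
    """Apply steps in order; stop at the first step that reports errors."""
    for name, step in steps:
        document, errors = step(document)
        if errors:
            return None, [f"{name}: {code}" for code in errors]
    return document, []


# Each stand-in step takes a document and returns (document, error_codes).
def schema_ok(doc):
    # Stub for a validating parser run (mpeg21.xsd or fmd.xsd).
    return doc, []


def extract(doc):
    # Stand-in for didl.xslt: pull the frame out of the digital item.
    if "frame" not in doc:
        return doc, ["NO_FRAME"]
    return doc["frame"], []


def merge(doc):
    # Stand-in for fmd-merge.xslt: add missing items from a repository.
    repository = {"defaults": "from-repository"}  # hypothetical content
    return {**repository, **doc}, []


def semantic(doc):
    # Stand-in for fmd-sem.xslt: one conditional constraint.
    errors = []
    if "A" in doc and "E2" not in doc:
        errors.append("COND_FAILED")
    return doc, errors


STEPS = [
    ("mpeg21.xsd", schema_ok),
    ("didl.xslt", extract),
    ("fmd.xsd", schema_ok),
    ("fmd-merge.xslt", merge),
    ("fmd.xsd", schema_ok),
    ("fmd-sem.xslt", semantic),
]

ok_doc, ok_errors = run_chain({"frame": {"A": 1, "E2": 2}}, STEPS)
bad_doc, bad_errors = run_chain({"frame": {"A": 1}}, STEPS)
print(ok_errors)   # []
print(bad_errors)  # ['fmd-sem.xslt: COND_FAILED']
```

Prefixing each error code with the step that produced it is a choice made for this sketch; it makes it easy to see how far a document progressed through the chain.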


In this poster we present the preliminary results of a hybrid validation method. XML Schemas are used to validate the document structure and the uniqueness constraints, whereas XSL transformations are used to validate the conditional constraints. A transformation and validation chain was presented that converts an MPEG-21 document into a valid functional metadata document. Since the W3C schema language does not suffice to validate the complete document, an extra validation step was introduced.


  1. "Text of ISO/IEC 15938-5 FDIS Information Technology Multimedia Content Description Interface Part 5 Multimedia Description Schemes," Oct. 2001, N4242.
  2. Boris Rogge, Rik Van de Walle, and Ignace Lemahieu, "MPEG-7 based dynamic metadata," in IEEE International Conference on Multimedia and Expo, Tokyo, Japan, Aug. 2001, IEEE, pp. 165-168.
  3. Boris Rogge, Rik Van de Walle, and Ignace Lemahieu, "Functional Metadata," in International Conference on IT and Telecommunications, Denver, CO, USA, Aug. 2001, PSIE, pp. 63-70.
  4. Todd Schwartz, Vaughn Iverson, Young-Won Song, Rik Van de Walle, Doim Chang, and Ernesto Santos, "MPEG-21 Digital Item Declaration FCD," MPEG, Dec. 2001.
  5. David C. Fallside, "XML Schema Part 0: Primer," W3C, May 2001.
  6. Henry S. Thompson, David Beech, Murray Maloney, and Noah Mendelsohn, "XML Schema Part 1: Structures," W3C, May 2001.
  7. Paul V. Biron and Ashok Malhotra, "XML Schema Part 2: Datatypes," W3C, May 2001.
  8. James Clark, "XSL Transformations XSLT version 1.1," W3C, Aug. 2001.