CMIL and Metaspace: Visualizing Hypermedia with Contextual Metadata

Richard B. Dietz
Institute for Communication Research, Indiana University, USA


Recently there has been increased interest in the use of contextual information (in particular, location) in computer applications. Context-aware mobile, handheld and wireless devices have begun to proliferate. But the use of context data is not limited to hardware devices; contextual media (or context-encoded media) has great potential to enhance our digital canvas as well. In this poster, we outline one such contextual media application native to the internet. We introduce three things: the Contextual Media Integration Language (CMIL) [1], a prototype visual CMIL user agent, and Metaspace, the internet application that the two make possible.

This work draws from many converging areas of inquiry in computer and information sciences. Among them are context-aware systems, information visualization and structured documents. This work has been particularly informed by research in location-specific information [2], novel context-encoded media capturing systems [3], and new media-centric models for internet content delivery such as SMIL [4].

Contextual Media Integration Language

CMIL is an XML 1.0 [5] markup language designed to associate contextual metadata with internet media. These metadata serve as categorical or continuous dimensions along which media resources such as images and text may be displayed by a CMIL visualization agent. CMIL documents, called "metascenes" or more simply "scenes", aggregate digital media resources from the internet and specify contextual metadata such as physical (or fanciful) spatial locations, time of creation or duration, cardinal orientation, associated hyperlinks, authorship, etc. The linked resources may be digital images, audio files, video, text and HTML documents. A CMIL user agent maps these media spatially by the values of their contextual metadata attributes. We use the name Metaspace for both this internet application and the global aggregation of CMIL media and metascenes. An excerpt from a simple CMIL document is presented in Figure 1.

<?xml version="1.0" encoding="UTF-8"?>
      <title>Bloomington Community Farmers' Market</title>
      <node class="circle" id="node01" title="At the Fountain" >
            <loc coords="39.1697,-86.5363,771" datum="lla"/>
            <orient bearing="s"/>
            <time begin="19990612T103020"/>
            <att name="temperature" content="29.4" metric="celsius"/>
            <image src="media/images/little_nipper.png"/>
            <audio src="media/audio/kids_playing.aiff"/>
            <text>There's a strange man in my way.</text>
      </node>
      ... additional node elements ...
Figure 1: An excerpt from a CMIL document. The "node" element brings together digital internet media and contextual metadata (attributes).
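
To make the structure of a node concrete, the following is a minimal sketch of how a program might read the contextual metadata and media references out of a single node using Python's standard XML parser. It is not taken from the Shakti implementation. The function name `parse_node` is hypothetical, the XML is a trimmed and closed copy of the node in Figure 1, and the reading of the "lla" datum as latitude, longitude, altitude is an assumption based on the attribute name.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Trimmed copy of the node in Figure 1, closed so that it parses on its own.
NODE_XML = """
<node class="circle" id="node01" title="At the Fountain">
    <loc coords="39.1697,-86.5363,771" datum="lla"/>
    <time begin="19990612T103020"/>
    <image src="media/images/little_nipper.png"/>
    <audio src="media/audio/kids_playing.aiff"/>
    <text>There's a strange man in my way.</text>
</node>
"""


def parse_node(xml_text):
    """Extract location, time, and media references from one node element."""
    node = ET.fromstring(xml_text)

    # Assumption: "lla" coordinates are latitude, longitude, altitude.
    lat, lon, alt = (float(v) for v in node.find("loc").get("coords").split(","))

    # The timestamp in Figure 1 has the compact form YYYYMMDDTHHMMSS.
    begin = datetime.strptime(node.find("time").get("begin"), "%Y%m%dT%H%M%S")

    # Any child element carrying a "src" attribute is treated as linked media.
    media = [(child.tag, child.get("src")) for child in node if child.get("src")]

    return {
        "id": node.get("id"),
        "title": node.get("title"),
        "lat": lat,
        "lon": lon,
        "alt": alt,
        "begin": begin,
        "media": media,
        "text": node.findtext("text"),
    }


info = parse_node(NODE_XML)
print(info["title"], info["lat"], info["lon"], info["begin"].isoformat())
```

A user agent would need the full document structure and error handling for missing elements; this sketch assumes every node carries both a "loc" and a "time" element.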


A prototype CMIL user agent named Shakti was developed to explore the overall concept and to test and refine the language. CMIL metascene documents may be retrieved by user agents locally or from any server over HTTP. The interaction paradigm differs significantly from that of traditional hypertexts; instead of a page-oriented view, the organization of contextual media is spatial and user-driven. Multiple metascene documents may be loaded and displayed simultaneously, and when mapped to geophysical coordinates, a user agent may retrieve underlying map data from an internet map server to provide a backdrop.
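
The spatial organization described above can be illustrated with a small sketch that places nodes on a drawing surface according to their geographic coordinates. This is only one plausible approach and not necessarily how Shakti lays out a scene: it scales latitude and longitude linearly into a bounding box, ignoring map projection and aspect ratio. The function name `to_canvas` and its parameters are hypothetical.

```python
def to_canvas(points, width, height, margin=10):
    """Linearly map (lat, lon) pairs to (x, y) pixels on a width x height canvas.

    North is drawn at the top, so larger latitudes give smaller y values.
    """
    if not points:
        return []

    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]

    # Guard against a zero span when all nodes share a coordinate.
    lat_span = (max(lats) - min(lats)) or 1.0
    lon_span = (max(lons) - min(lons)) or 1.0

    placed = []
    for lat, lon in points:
        x = margin + (lon - min(lons)) / lon_span * (width - 2 * margin)
        y = margin + (max(lats) - lat) / lat_span * (height - 2 * margin)
        placed.append((x, y))
    return placed


# Two illustrative points (not taken from the metascene in this poster).
print(to_canvas([(39.0, -87.0), (40.0, -86.0)], width=220, height=120))
```

For scenes covering a small area, such as a single market, a linear mapping of this kind is usually adequate; aligning nodes with a backdrop from a map server would require using the same projection as the map data.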

In Practice

The metascene in Figure 2 is a record of a multimedia recording session at a local farmers' market. In this instance, the nodes (circles) essentially represent multimedia snapshots that have been localized and frozen in time, which we call vivagraphs. Figure 2 shows two views of the Shakti user agent and an open vivagraph.

Figure 2: A spatial (left) and temporal (right) view of the same CMIL metascene. The media specified in this particular open node (center) consist of a digital image, an ambient audio file and inline text as specified in the "node" element in Figure 1.
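
A temporal view such as the one in Figure 2 amounts to ordering the same nodes along a time axis rather than by location. The sketch below shows one way this could be computed, assuming each node carries a "begin" timestamp in the compact format used in Figure 1. The function name `timeline_offsets` is hypothetical, and the timestamps for node02 and node03 are invented for illustration.

```python
from datetime import datetime


def timeline_offsets(nodes):
    """Return (node_id, seconds since the earliest node) in chronological order."""
    parsed = [
        (node_id, datetime.strptime(begin, "%Y%m%dT%H%M%S"))
        for node_id, begin in nodes
    ]
    if not parsed:
        return []

    parsed.sort(key=lambda item: item[1])
    start = parsed[0][1]
    return [(node_id, (when - start).total_seconds()) for node_id, when in parsed]


# node01 uses the timestamp from Figure 1; the other two are made up.
nodes = [
    ("node02", "19990612T104500"),
    ("node01", "19990612T103020"),
    ("node03", "19990612T111510"),
]
print(timeline_offsets(nodes))
```

The resulting offsets can be scaled to screen positions along one axis in the same way as any other continuous metadata dimension.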

Metaspace provides a way to enhance digital internet media with a sense of context. It is a spatial information environment, the character of which is conveyed by the media present within it. The generic nature of CMIL allows for a variety of contextual media applications such as multimedia snapshots (vivagraphs), new forms of personal media sharing and interaction, as well as commerce, entertainment, and information visualization applications.

Future Developments

Currently, the Shakti user agent supports only a minimal subset of the CMIL specification. Future work will focus on refining CMIL, developing a companion style language, providing more robust support for CMIL in the prototype user agent, and addressing outstanding usability concerns.


References

  1. R. B. Dietz, Contextual Media Integration Language (CMIL 0.9).
  2. J. C. Spohrer, Information in Places, IBM Systems Journal, Vol. 38, No. 4 (Pervasive Computing), pp. 602-628.
  3. H. D. Wactlar, et al., Informedia Experience-on-Demand: Capturing, Integrating and Communicating Experience across People, Time and Space.
  4. P. Hoschka, editor, Synchronized Multimedia Integration Language (SMIL) 1.0 Specification.
  5. T. Bray, J. Paoli, C. M. Sperberg-McQueen, editors, Extensible Markup Language (XML) 1.0.