Users are increasingly accessing the Internet from information appliances such as PDAs, cell phones, and set-top boxes. Since these devices do not have the same rendering capabilities as desktop computers, it is necessary for Web content to be adapted, or transcoded, for proper presentation on a variety of client devices. In this paper, we propose an annotation-based system for Web content transcoding. First, we introduce a framework of external annotation, in which existing Web documents are associated with content adaptation hints as separate annotation files. We then explain an annotation-based transcoding system with particular focus on the authoring-time integration between a WYSIWYG annotation tool and a transcoding module. Finally, after giving an example of content adaptation using a page fragmentation module for small-screen devices, we compare our approach with related work.
As more and more Web-enabled personal devices are becoming available for connecting to the Internet, the same Web content needs to be rendered differently on client devices, taking account of their physical and performance constraints such as screen size, memory size, and connection bandwidth. For example, a large full-color image may be reduced with regard to size and color depth, removing unimportant portions of the content. Such content adaptation, also called transcoding, is exploited for either an individual element or a set of consecutive elements in a Web document, and results in better presentation and faster delivery to the client device. Content adaptation is thus crucial for transparent Web access under different conditions, which may depend on client capabilities, network connectivity, or user preferences [4,8,15].
Although most existing HTML documents are created to be displayed on desktop computers, they can be augmented with meta-information to facilitate adaptation of their contents to other types of client device. It is important to note here that the result of applying an annotation to a document depends on the transcoding policy. The role of annotations is to provide hints that enable a transcoding engine to make better decisions on the content adaptation. To put it another way, annotation plays the role of a mediating representation, which provides semantics to be shared between meta-content authors and a content adaptation engine. A potential advantage of an annotation-based transcoding approach is the possibility of content adaptation based on semantics. This cannot be achieved with existing commercial products, which adapt contents on the basis of Web document syntax.
Annotations can range from simple to complex descriptions. A simple annotation, for example, specifies an importance value for a document element, and an element with lower importance may be ignored when display space is limited. In a complex case, an annotation may consist of alternative image contents for different device types and perhaps for different user preferences. It is possible for such an annotation to be embedded into an HTML document, as the value of an extra attribute that would be a proprietary extension of an HTML tag. Web content transcoding, however, usually results in the adaptation of content presentation to device capabilities. Therefore, mixing of contents and adaptation hints would not be acceptable according to the design consideration that separates content from presentation.
Markup languages such as HTML embed annotations into documents. For example, an <ol> tag indicates the start of an ordered list, and a paragraph begins with a <p> tag. On the other hand, annotations can be external, residing in a file separate from the original document. It would be impractical to incorporate meta-information for content adaptation into each existing HTML document, in terms of both the difficulty of changing the established HTML specification and the need to modify a huge number of existing documents. External annotation is thus a key to adaptation of Web documents to various constraints stemming from user preferences, client devices, media types, and so on. Although making annotations external may require additional bookkeeping tasks, it has the substantial advantage of not requiring any modification of existing Web documents that are already published as HTML or XML files.
Various demands for content annotation are emerging, such as Web accessibility, speech-enabled applications on the Internet, language translation, and document summarization. In this paper, we focus on page fragmentation for small-screen devices. In the following sections, we elaborate a framework for external annotation. We then introduce an annotation-based transcoding system developed on top of a programmable proxy server, and explain the integration between a WYSIWYG annotation tool and a transcoding module at the time of content authoring. Finally, after giving an example of content adaptation using a page fragmentation module, we compare our approach with related work.
This section describes an external annotation framework that prescribes a scheme for representing annotation files and a way of associating original documents with external annotations. The role of annotation is to characterize ways of content adaptation rather than to describe individual contents themselves. Therefore, the framework needs to specify a vocabulary for constraining the possibilities for decomposition, combination, and partial replacement of contents. The annotation vocabulary is briefly introduced in the following subsection. The basic ideas behind this annotation framework are twofold. One is that new elements and/or attributes should not be introduced into a document-type definition of annotated documents. The other is that annotations need to be created for arbitrary parts of annotated documents.
This framework is motivated by the requirements of rendering already-published HTML documents on various types of Web-enabled personal devices. Although external annotation is a general concept that has many potential applications, we focus on annotations of HTML documents that facilitate contents adaptation for personal computing devices. Figure 1 depicts several paths from an original HTML document to different client devices. An HTML document, which is provided for desktop computers (path 1), is analyzed and annotated with a separate file by using an annotation tool (path 2). The annotated document must be viewable in a normal browser on a desktop computer (path 3). Furthermore, such an annotated document can also be authored by using a standalone editor (path 4). Upon receiving a request from a personal device, a proxy server may adapt the document on the basis of associated annotations (path 5). The rendered document is then downloaded to a client device with a small display (path 6).
External annotation files contain hints associated with elements in an original document. The Resource Description Framework (RDF) is used as the syntax of annotation files. An annotation file, which is an XML document, therefore consists of a set of RDF descriptions. The RDF data model defines a simple model for describing relations among resources in terms of named properties and values. In the process of document transcoding, it is necessary to exploit user preferences and device capabilities for content adaptation. Such information profiles can be described by using Composite Capability/Preference Profiles (CC/PP). By using the RDF data model, CC/PP specifies client-side profiles, which can be delivered to a proxy server over HTTP. Furthermore, ways of describing document profiles are currently being investigated so that requirements for desired rendering can be clarified, and RDF is being employed for encoding the conformance profiles. Taking account of these standardization activities, it is reasonable for annotation vocabularies to be encoded in RDF, so that comprehensive content adaptation mechanisms can be pursued.
In addition, XPath and XPointer are used for associating annotated portions of a document with annotating descriptions. Figure 2 illustrates a way of associating an external description with a portion of an existing document. An annotation file refers to portions of an annotated document. A reference may point to a single element (e.g., an IMG element), or a range of elements (e.g., an H2 element and the following paragraphs). For example, /HTML/BODY/P[3] points to the third P element of the BODY element of the annotated document. If a target element has an id attribute, the attribute can be used for direct addressing without the need for a long path expression. In contrast to XPath, which selects particular parts of a tree derived from the elements or markup constructs of an XML document, XPointer makes it possible to select a range of elements by using the range expression.
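As a minimal sketch of this addressing scheme, the Python fragment below resolves a positional path and an id-based reference against a small, well-formed stand-in document. It uses the standard library's ElementTree, which supports only a subset of XPath. The sample markup, the `resolve` helper, and the id value are assumptions made for illustration; real-world HTML would first have to be parsed into a tree, and XPointer range expressions are not covered.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for an annotated HTML document.
DOC = """<HTML><BODY>
<H2 id="news">News</H2>
<P>first</P><P>second</P><P>third</P>
</BODY></HTML>"""

def resolve(root, path):
    """Resolve an absolute path such as /HTML/BODY/P[3] against the root element.

    ElementTree paths are relative to the element they are applied to, so the
    leading step is compared with the root tag by hand and the remaining steps
    are delegated to find().
    """
    steps = path.strip("/").split("/")
    if steps[0] != root.tag:
        return None
    if len(steps) == 1:
        return root
    return root.find("/".join(steps[1:]))

root = ET.fromstring(DOC)
third = resolve(root, "/HTML/BODY/P[3]")   # positional addressing
by_id = root.find(".//*[@id='news']")      # direct addressing through an id attribute
print(third.text, by_id.tag)
```

The id-based lookup stays valid when paragraphs are inserted before the target, whereas the positional path has to be recomputed. This is one reason to prefer id attributes where they exist.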
When annotation files are stored in a repository, an appropriate annotation file for a Web document needs to be selected dynamically from the repository either implicitly by means of a structural analysis of the subject document or explicitly by means of a reference contained in the subject document or some other association database. An annotation file can be associated with a single document file, but the relation is not limited to one-to-one. It is possible for multiple annotation files to be associated with a single document file, when each annotation file contains descriptions related to different portions of an annotated document. On the other hand, a single annotation file may contain meta-information to be shared among multiple document files. This type of annotation would be useful when it is necessary to annotate common parts of Web documents, such as a page header, a company logo image, and a side bar menu.
The above framework prescribes a skeletal structure of annotation, without regard to the ways in which concrete adaptation hints are described. This section briefly explains an annotation vocabulary that is used for adaptation hints on rendering HTML documents for personal computing devices. The vocabulary includes three types of annotation: alternatives, splitting hints and selection criteria. A namespace  prefix, pcd, is used for the transcoding vocabulary. Further details on this vocabulary can be found in .
Alternative representations of a document or any set of its elements can be provided. For example, a color image may have a grayscale image as an alternative for clients with monochrome displays. A transcoding proxy selects the one alternative that best suits the capabilities of the requested client device. Elements in the annotated document can then be altered either by replacement or by on-demand conversion. The <pcd:Alternatives> tag specifies a list of alternative representations for an annotated element. The <rdf:Alt> tag provided by the RDF data model is used to specify alternatives to be included in the pcd:Alternatives element. Each item in the RDF containers (rdf:Alt, rdf:Bag and rdf:Seq) may include a pcd:Replace element.
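The selection step can be illustrated with a short Python sketch. The `Alternative` and `Device` records, their fields, and the "richest representation that still fits" policy are all assumptions introduced here for illustration; the actual policy is left to the transcoding proxy.

```python
from dataclasses import dataclass

@dataclass
class Alternative:
    url: str
    color_depth: int   # bits per pixel needed to display this representation
    width: int         # width in pixels

@dataclass
class Device:
    color_depth: int   # bits per pixel the display supports
    screen_width: int  # usable width in pixels

def select_alternative(alternatives, device):
    """Return the richest alternative the device can display, or None if none fits."""
    usable = [a for a in alternatives
              if a.color_depth <= device.color_depth and a.width <= device.screen_width]
    return max(usable, key=lambda a: (a.color_depth, a.width), default=None)

color = Alternative("logo-color.png", color_depth=24, width=320)
gray = Alternative("logo-gray.png", color_depth=1, width=160)
chosen = select_alternative([color, gray], Device(color_depth=1, screen_width=160))
print(chosen.url)
```

When no alternative fits, the sketch returns None, leaving it to the caller to fall back on on-demand conversion or to omit the element.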
An HTML file, which can be shown as a single page on a normal desktop computer, may be divided into multiple pages on clients with smaller display screens. The <pcd:Group> tag specifies a set of elements to be considered as a logical unit. Another use for the pcd:Group element is to provide hints for determining appropriate page break points. Alternatives may be provided for the group as a whole. For example, a news headline may be associated with an alternative for a news story that consists of paragraphs of text and some images. In the following example, the range of elements from the second occurrence of an H2 element through the second occurrence of a P element is annotated as a group.
<rdf:Description about="http://foo.com/catalog.html#xpointer(//H2[2] to //P[2])">
  <pcd:Group />
</rdf:Description>
An annotation may contain information to help a transcoding proxy select from several alternative representations the one that best suits the client device. The selection criteria include the following information:
The <pcd:role> tag, for example, specifies the role of an annotated element. A transcoder may use this meta-information in order to make decisions on the allocation of client resources (display area, data volume for transmission, etc.). This role tag is provided with a value attribute, which may be specified as either proper content, advertisement, decoration, or icon, for example. The <pcd:importance> tag specifies the priority of an annotated element relative to the other elements in the page. When the importance of an element is low, for example, it will be ignored or displayed in a very small font. The importance value is a real number ranging from -1 (lowest priority) to 1 (highest priority). The default importance value is 0. When an importance is specified with a value outside the range, the default value, namely, 0 is used. By referring to this importance value, a transcoding proxy can make decisions on the allocation of client resources for each element. For example, an element may not be sent to a lightweight client, when the element is provided with a decoration role and a low importance value such as -0.2. The RDF description of such annotations is given as follows.
<rdf:Description about="http://foo.com/catalog.html#//IMG">
  <pcd:role value="decoration" />
  <pcd:importance value="-0.2" />
</rdf:Description>
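A transcoder might read such a description along the following lines. This is only a sketch. The pcd namespace URI is a placeholder, since the real URI is not given here. The enclosing rdf:RDF element is added only to make the fragment parseable. The `should_send` policy is an illustrative assumption rather than the behavior of the actual transcoding module.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PCD_NS = "http://example.org/pcd#"  # placeholder: the real pcd namespace URI is not given here

ANNOTATION = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:pcd="{PCD_NS}">
  <rdf:Description about="http://foo.com/catalog.html#//IMG">
    <pcd:role value="decoration" />
    <pcd:importance value="-0.2" />
  </rdf:Description>
</rdf:RDF>"""

def importance_of(description):
    """Read pcd:importance; fall back to the default 0 when absent, malformed, or out of range."""
    node = description.find(f"{{{PCD_NS}}}importance")
    if node is None:
        return 0.0
    try:
        value = float(node.get("value", "0"))
    except ValueError:
        return 0.0
    return value if -1.0 <= value <= 1.0 else 0.0

def role_of(description):
    """Return the value of pcd:role, or None when no role is annotated."""
    node = description.find(f"{{{PCD_NS}}}role")
    return node.get("value") if node is not None else None

def should_send(description, lightweight_client):
    """Illustrative policy: drop low-importance decorations for lightweight clients."""
    if not lightweight_client:
        return True
    return not (role_of(description) == "decoration" and importance_of(description) < 0)

root = ET.fromstring(ANNOTATION)
desc = root.find(f"{{{RDF_NS}}}Description")
print(desc.get("about"), role_of(desc), importance_of(desc), should_send(desc, True))
```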
Since content adaptation can be done on either a content server, a proxy, or a client terminal, a transcoding engine should not be forced to reside in any particular location. To avoid such a restriction, a proxy-based approach has been adopted for content adaptation. Computational entities that reside along the Web transaction path are called intermediaries, and existing approaches to annotation systems conform to a common abstract architecture based on intermediaries.
As shown in Figure 3, intermediaries are entities that reside along the path of information streams, and facilitate an approach to making ordinary information streams into smart streams that enhance the quality of communication. An intermediary processor or a transcoding proxy can operate on a document to be delivered, and transform the contents with reference to associated annotation files. From a computational perspective, the use of an intermediary architecture is an approach to providing pluggable services between a Web client and server. To put it another way, intermediaries provide special-purpose proxy servers for protocol extension, document caching, Web personalization, content adaptation, and so on. This intermediary-based approach is suitable for realizing annotation-based content adaptation, because it allows us to provide a transcoding module as an intermediary without modifying Web browsers or content servers in any way.
As a transcoding platform, we use a programmable proxy server called Web Intermediaries (WBI). WBI is a programmable processor for HTTP requests and responses. It receives an HTTP request from a client such as a Web browser, and produces an HTTP response to be returned to the client. The processing in between is controlled by modules or plugins available at an intermediary processor. A WBI plugin is constructed from three fundamental building blocks: Monitors, Editors, and Generators. Monitors observe transactions without affecting them. Editors modify outgoing requests or incoming documents. Generators produce documents in response to requests.
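The division of labor among the three building blocks can be pictured with the following Python sketch. It is not the WBI programming interface: the class names mirror the terminology above, but every method, field, and the `handle` function are hypothetical and serve only to show how a generator, editors, and monitors might cooperate on one transaction.

```python
from dataclasses import dataclass

@dataclass
class Transaction:
    """A request/response pair flowing through the intermediary."""
    url: str
    body: str = ""

class Generator:
    """Produces a document in response to a request."""
    def generate(self, tx):
        tx.body = f"<HTML><BODY><P>page for {tx.url}</P></BODY></HTML>"

class Editor:
    """Modifies a document on its way back to the client."""
    def edit(self, tx):
        tx.body = tx.body.replace("<BODY>", "<BODY><!-- adapted -->")

class Monitor:
    """Observes a transaction without affecting it."""
    def __init__(self):
        self.seen = []

    def observe(self, tx):
        self.seen.append((tx.url, len(tx.body)))

def handle(tx, generator, editors, monitors):
    """Run one transaction through the hypothetical plugin chain."""
    generator.generate(tx)
    for editor in editors:
        editor.edit(tx)
    for monitor in monitors:
        monitor.observe(tx)
    return tx.body

monitor = Monitor()
result = handle(Transaction("http://foo.com/catalog.html"), Generator(), [Editor()], [monitor])
print(result)
```

In this picture, a transcoding module such as a page splitter would take the place of the editor, since it rewrites the document on its way back to the client.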
We realized a page-splitting module as a WBI plugin that adapts a requested document to the capabilities of a particular client (Figure 4). Adaptation hints are expressed by using the aforementioned transcoding vocabulary. The execution sequence of the page-splitting module for the first access to a Web document can be briefly described as follows:
Note that whether an HTTP request is the first access to a document is determined with reference to the URL and a session identifier. Each anchor element linking to a fragmented page is provided with an href attribute value that includes the corresponding session identifier. Currently, a single annotation file is retrieved, but the module can be extended to handle multiple annotation files.
Annotation descriptions could be too complicated for a simple source tag editor to maintain, because addressing by XPath/XPointer follows a hierarchy of document elements from the root to a focal element, and alternative contents are structured as a hierarchy of conjunctive/disjunctive elements for replacement. Therefore, it is crucial to provide an annotation tool for the external annotation approach. We have developed such a tool by extending an existing HTML authoring tool. The high-level design of the annotation tool is depicted in Figure 5. It consists of a WYSIWYG editor, a source tag editor, and a previewer. The WYSIWYG editor is used not only to modify a subject HTML document but also to specify a portion of the HTML document to be annotated. When the previewer is invoked, a transcoding proxy is called over HTTP and the corresponding annotation is applied to the subject document. An adapted document is then sent back to the previewer, in which the result is displayed. In this way, the annotation tool is fully integrated with the transcoding proxy, so that tool users can see the results of content adaptation and revise annotations on the fly.
Figure 6 shows a screen copy of the annotation tool. The child window in the upper half is for WYSIWYG editing of HTML contents. When an element is selected in the WYSIWYG editor [Figure 6 (a)], a user can pop up a dialog window for annotation [Figure 6 (b)]. After the completion of annotation input, a DOM for an XML annotation is internally modified or created for the first time. The region highlighted in reverse video in the lower-half child window [Figure 6 (c)] is an RDF description that annotates the role and importance of the header text "Information on TRL" found in the WYSIWYG editor.
This section shows the results of applying the page-splitting plugin to real-life Web contents. The Web page used as an example is a news page from a corporate Web site (Figure 7). Use of tables for page layout is inappropriate not only as regards a clear distinction between style and content, but also as regards Web content accessibility. According to the accessibility guidelines 12 to 14, content developers are encouraged to make contents navigable. In reality, however, a large number of Web pages employ table elements for layout. The news page consists of three tables stacked from top to bottom, as depicted on the right in Figure 7. The top and middle tables correspond respectively to a header menu and a search form. The bottom table (labeled "Layouter (3)" in the figure), however, is used for layout.
Figure 8 shows a portion of an annotation file to be associated with the news page mentioned above. The annotation contains RDF descriptions specifying the roles of the tables. For example, the first description in the figure is related to the top table, for a header menu ["Header (1)" in Figure 7]. According to the header role annotation, the page-splitting plugin handles the element that is to appear as a header of every fragmented page created from this news page. In addition, the importance value "+1.00" indicates that this element should not be omitted in any case. The last RDF description in Figure 8 concerns the menu bar on the left ["Side bar menu (31)" in Figure 7]. Since the role of this element is annotated as auxiliary, this portion of the news page will be presented as a separate page upon receipt of a request from a small-screen device. On the other hand, the role of the bottom table in the news page is annotated as layouter. Therefore, the bottom table will not be retained in the display for small-screen devices.
Figure 9 illustrates how the news page will be fragmented in a small display. Thanks to the capability for authoring-time transcoding, users of the annotation tool can check the result of page fragmentation on the fly, by simply switching to the previewer (Figure 10). A user can also change the size of the display area by clicking the button at the bottom right. According to the header role of the top table, the same header appears in the previewer as in the original page. The "Side bar" anchor in the center is created in accordance with the auxiliary role of the vertical side bar menu. In contrast, because the importance value is "-1.00," the search form table in the original page is omitted. The main news content then starts after the "Side bar" anchor, and allows users to access the primary content of the page directly.
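The effect of the role and importance annotations on fragmentation can be summarized in a short Python sketch. It is not the page-splitting plugin itself: the `Block` record, the `fragment` function, and the rule that only blocks with the lowest importance (-1) are dropped are assumptions for illustration. The layout table is assumed to have been unwrapped already, so that only its content blocks remain.

```python
from dataclasses import dataclass

@dataclass
class Block:
    name: str
    role: str = "proper content"
    importance: float = 0.0

def fragment(blocks):
    """Split annotated blocks into a main page plus one page per auxiliary block.

    Header blocks are repeated at the top of every page, auxiliary blocks are
    moved to their own page and replaced by a link, and blocks with the lowest
    importance are dropped (an assumed threshold).
    """
    kept = [b for b in blocks if b.importance > -1.0]
    headers = [b.name for b in kept if b.role == "header"]
    main = list(headers)
    extra_pages = []
    for b in kept:
        if b.role == "header":
            continue
        if b.role == "auxiliary":
            main.append(f"[link: {b.name}]")
            extra_pages.append(headers + [b.name])
        else:
            main.append(b.name)
    return [main] + extra_pages

blocks = [
    Block("Header", role="header", importance=1.0),
    Block("Search form", importance=-1.0),
    Block("Side bar menu", role="auxiliary"),
    Block("News"),
]
pages = fragment(blocks)
print(pages)
```

With these inputs, the first page carries the header, a link to the side bar menu, and the news content, mirroring the previewer result described above.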
Figure 11 shows the contents displayed in a PalmOS emulator and an HTML browser. Figure 11(a) shows the news page without fragmentation, presenting only the top one-ninth of the original content. In contrast, Figure 11(b) shows the result of fragmentation by the page-splitting plugin. It is important to note here that the page splitting not only reduces the content to be delivered, but also places the primary content near the top of the fragmented page, which is provided with navigational features. This result of adaptation follows the design guidelines for reducing scrolling during interaction with small screens [11, p. 58]:
Small screens force users to employ frequent scrolling activities that may affect the accessibility of contents as well as the usability of devices. It has been reported that users with small screens were 50% less effective than those with large screens in completing retrieval tasks. Therefore, page fragmentation based on semantic annotation will be more appropriate than page transformation based solely on syntactic information, such as removing white space, shrinking or removing images, and so on. The inability to rearrange content semantically is one of the critical limitations of the syntactic transformation approach. The navigational features achieved by this semantic annotation are noteworthy from the perspective of Web content accessibility.
Since external annotations must be updated whenever a subject document is revised, it is necessary to provide a way of keeping them synchronized. Our annotation tool is especially helpful when an annotated document can be revised in parallel with a corresponding annotation. If an element is removed from or added to the document during WYSIWYG editing, the annotation file is re-created and automatically adjusted for the revision of the corresponding annotating description and for the differences in the XPath specification. This consistency is achieved because annotation source tags are updated whenever a user switches to the source tag editor or previewer in the annotation tool. Annotation source tags are re-created from internal DOMs for XML annotation by referring to annotated portions of a subject HTML document. In contrast, when an annotation author is not allowed to modify an original document at all, content synchronization can be weakly implemented for consistency checking. For example, by using a digest value (a hash value such as MD5 or SHA-1), it is possible to verify that a subject document has not been changed. If the MD5 value of an entire subject file is included in an annotation file, a transcoding proxy can check whether a given file is an up-to-date version of the subject file.
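The digest-based check can be sketched in a few lines of Python using the standard hashlib module. The function names and the idea of passing the recorded digest as a plain hexadecimal string are assumptions for illustration; how the digest is actually stored in an annotation file is not specified here. SHA-1 is used as the default, and "md5" can be passed instead where the local Python build provides it.

```python
import hashlib

def digest(document_bytes, algorithm="sha1"):
    """Return the hexadecimal digest of a subject document."""
    return hashlib.new(algorithm, document_bytes).hexdigest()

def is_up_to_date(document_bytes, recorded_digest, algorithm="sha1"):
    """Check a document against the digest recorded when it was annotated."""
    return digest(document_bytes, algorithm) == recorded_digest.lower()

original = b"<HTML><BODY><P>news</P></BODY></HTML>"
recorded = digest(original)   # hypothetically stored alongside the annotation
revised = b"<HTML><BODY><P>news, updated</P></BODY></HTML>"
print(is_up_to_date(original, recorded), is_up_to_date(revised, recorded))
```

A mismatch only signals that the document has changed; it does not say which annotated portions are affected, which is why this form of synchronization is described as weak.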
An automatic re-authoring process has been proposed for device-independent access to Web contents. The re-authoring is conducted by a heuristic planner, which searches a document transformation space and selects the most promising state with the smallest display area. It is reported, however, that in the worst case the planner produces 80 versions of the document during the search process. If the planner is provided with meta-information (namely, an annotation), the search space will be pruned more effectively. To put it another way, it is not an issue of whether meta-information is embedded or external. The important point is that meta-information must be provided explicitly rather than implicitly in an adaptation procedure.
Meta-information is used in a multimedia presentation system, which creates a customized view of the presentation. In particular, a vocabulary of meta-information is defined there for synchronized multimedia content adaptation. The vocabulary provides ways of representing alternative contents, content descriptions, and content predicates. By using that vocabulary, meta-information can be associated with a subject document in two ways: as short inline descriptions or as RDF-like embedded descriptions. However, the vocabulary is defined as proprietary extensions to the HTML specification. In contrast, our external annotation approach allows such application-specific meta-information to be specified separately from the HTML specification.
Finally, we remark on the extensibility issue related to the employment of meta-information for content adaptation. Annotation-based transcoding is a way of realizing content adaptation, and a transcoding module employs external annotations, which are described in a markup language rather than a procedural programming language. On the basis of the same intermediary-based transcoding platform, other approaches are also conceivable. One is to provide a custom-tailored transcoding module that runs without any external meta-information. Another is to use a general-purpose transformation engine, such as XSLT, which employs externally provided transformation rules, or XSL style sheets.
Figure 12 illustrates the three approaches mentioned above. Although these three are not exhaustive, they represent major design decisions necessary for the realization of a content adaptation system. The custom-tailored module relies on heuristics or empirical knowledge of the content to be adapted [Figure 12 (a)]. It can be applied without additional customization of the transcoding module. However, the scope of application must be carefully selected, because it will suffer a steep performance degradation when applied to domains that are not well suited to the system's purpose. In addition, modification using a conventional programming language requires considerably more effort than authoring in a markup language. Although it is possible to remedy the limitations of this approach by parameterizing the possibilities of customization, the scope of applicability is still limited. Bickmore's above-mentioned system falls into this category. In contrast to a custom-tailored module, a general-purpose transformation engine can be used so that contents can be adapted in various ways by means of application-specific transformation rules [Figure 12 (c)]. The transformation rules are assumed to be written in a markup language, and the rule authors rely on the solid basis of a generic transformation engine. The advantage of this approach lies in the clear distinction between the roles of those responsible for the development of a run-time module (namely, a generic transformation engine) and those responsible for the application-specific programming in a markup language.
The annotation-based approach discussed in this paper relies on a task-specific transcoding module, such as a page-splitting plugin, and employs declarative meta-information supplied externally [Figure 12 (b)]. The advantage of this task-specific approach over the use of a custom-tailored module lies in the extent of the customizability achieved by programming in a markup language without any modification of the task-specific module. On the other hand, the task-specific approach is relatively limited as regards customization using a markup language. However, its advantage over the approach of using a generic transformation engine is that the semantics are made explicit according to the features of the task at hand. At the same time, this design decision involves trading the scope of applicability in the generic approach for articulated semantics in the specification of the meta-information. In the case of the page-splitting plugin, roles such as header, auxiliary, and layouter supplement semantics that cannot be fully prescribed in the definitions of Web documents. In this sense, the importance of external annotation lies in the role of the mediating representation, which articulates the semantics to be shared between meta-content authors and a content adaptation engine.
We thank the following people who have contributed to the elaboration of the annotation framework and the transcoding vocabulary: David Fallside, Kazushi Kuse, Chung-Sheng Li, Hiroshi Maruyama, Rakesh Mohan, Bob Schloss, John R. Smith, and Naohiko Uramoto.
Masahiro Hori is an advisory researcher at IBM Tokyo Research Laboratory. He received his B.E. in Biophysical Engineering, M.E. and Ph.D. in Computer Science from Osaka University. His research interests include knowledge engineering methodologies, object-oriented software reuse, and Web content authoring and adaptation. Dr. Hori is a member of Industrial Advisory Board of the IBROW project conducted under the IST program of the European Commission. He received the Research Awards in 1992 and 1997 from Japanese Society for Artificial Intelligence (JSAI). He is a member of JSAI and Information Processing Society of Japan.
Goh Kondoh received his B.S. in Mathematics and M.E. in Computer Science from Keio University in 1996 and 1998, respectively. His research interests include web application development framework and tools. He is currently working at IBM Tokyo Research Laboratory. He is a member of Information Processing Society of Japan, and Japan Society for Software Science and Technology.
Kouichi Ono received his B.S.E. and M.S.E. degrees, both in electronics from Waseda University, Japan in 1987 and 1989, respectively. From 1990 to 1992, he was an assistant at the university. His research interests include formal development methods, object-oriented analysis/design, software development tools, and mobile agent programming. He is currently working at IBM Tokyo Research Laboratory. He is a member of the IEEE Computer Society.
Shin-ichi Hirose is an advisory researcher in IBM Tokyo Research Laboratory. He received his B.S. and M.S. in Information Science from Tokyo University. His research interests include object-oriented systems, software components, and knowledge-based systems. He is a member of Information Processing Society of Japan, and Japan Society for Software Science and Technology.
Sandeep Singhal is a Senior Technical Staff Member in IBM and Senior Architect in IBM's Pervasive Computing Division. He is responsible for middleware, server, and tools architecture, product definition, and technology development to enable information access from non-PC devices. Dr. Singhal is also an Adjunct Assistant Professor on the Graduate faculty at North Carolina State University in Raleigh, North Carolina. He holds M.S. and Ph.D. degrees in Computer Science from Stanford University, as well as B.S. degrees in Computer Science and in Mathematical Sciences and a B.A. in Mathematics from Johns Hopkins University.