
XML.Gov Demos: XHTML Processed by XSLT

by Ken Sall Consulting

Purpose: This XHTML and XSLT Efforts Sorting demo shows that converting HTML pages to XHTML has benefits beyond validation. Since XHTML is an XML dialect, XSLT stylesheets can process XHTML to extract and manipulate the elements of interest.

For this demonstration, the inner table of the original page that lists XML government efforts is processed, while the rest of the XHTML page is ignored. The table entries are sorted in two ways: by effort name and by organization name. (Although validation is possible, it is not important to this demonstration.) The fact that XSLT can operate on XHTML input to extract portions of a page (the inner table, in this case) is a major advantage over plain HTML input, which is generally not well-formed XML and so cannot be read by an XSLT processor.
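The extraction and sort described above might look roughly like the following XSLT 1.0 sketch. It is not the stylesheet used in this demo. It assumes the efforts list is the first table nested inside another table's cell, that its first row is a header, that column 1 holds the effort name and column 2 the organization, and that the XHTML elements are in no namespace. The sortcol parameter is a hypothetical name introduced only for this illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" indent="yes"/>

  <!-- Hypothetical parameter: 1 = sort by effort name, 2 = sort by organization.
       The column order is an assumption about the page layout. -->
  <xsl:param name="sortcol" select="1"/>

  <xsl:template match="/">
    <!-- Assumption: the efforts list is the first table nested in a table cell. -->
    <xsl:variable name="efforts" select="(//td//table)[1]"/>
    <html>
      <head>
        <title>XML.Gov Efforts (sorted)</title>
      </head>
      <body>
        <table border="1">
          <!-- Assumption: the first row is the header; copy it unchanged. -->
          <xsl:copy-of select="$efforts/tr[1]"/>
          <!-- Copy the remaining rows, ordered by the text of the chosen cell. -->
          <xsl:for-each select="$efforts/tr[position() &gt; 1]">
            <xsl:sort select="normalize-space(td[position() = $sortcol])"/>
            <xsl:copy-of select="."/>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

Everything outside the selected table is simply never copied, which is how the rest of the page gets ignored. If the XHTML did carry the XHTML namespace, the element names in the select expressions would need a prefix bound to that namespace.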

File: Explanation
Original HTML Page: This is a local copy of the XML.Gov Efforts page before any modifications. The images are missing since they were not copied to my server. (To validate the original page, go to http://validator.w3.org/, paste in the URL http://xml.gov/efforts.htm, and select "HTML 4.01 Transitional".)
Base Added: This version differs from the original only by the addition of a <base href="http://xml.gov/"> reference so that the images would be visible during later stages.
Tidy Version: The next step is to convert the file from HTML to XHTML using HTML Tidy. Although the result looks very much like the previous one in your browser, the conversion results in source that is significantly different from the original. (We can validate it by going to http://validator.w3.org/, pasting in the URL http://kensall.com/gov/efforts/efforts-tidy-no-ns.xml, and selecting "XHTML 1.0 Transitional".)
Organization Sort: HTML output resulting from server-side transformation of the XHTML produced by Tidy. The XSLT stylesheet sorts data alphabetically by Organization.
Effort Sort: HTML output resulting from server-side transformation of the XHTML produced by Tidy. The XSLT stylesheet sorts data alphabetically by Effort name.

One additional feature of this demo is that while we are processing the XHTML, we can detect certain irregularities in the input and fix them. For example, the original page uses both relative and absolute URLs for the XML.Gov internal links. We can add a few lines to the XSLT stylesheet to remove the references to the site's URL (but not to external links) and generate an XHTML base reference (just so that the links will work on this demo site).
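As an illustration of that kind of cleanup, here is a minimal XSLT 1.0 sketch, not the stylesheet actually used for this demo. It assumes the site prefix is exactly http://xml.gov/, that the XHTML elements are in no namespace, and that internal links are a elements with an href attribute. It copies the page unchanged except for two fixes: it strips the site prefix from internal links and adds a base element if the head does not already have one.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="no"/>

  <!-- Assumed site prefix; XSLT 1.0 does not allow variables in match
       patterns, so the same literal is repeated in the pattern below. -->
  <xsl:variable name="site" select="'http://xml.gov/'"/>

  <!-- Identity template: copy every node and attribute as-is by default. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Internal absolute links: drop the site prefix so they become relative.
       Links to other sites do not match and fall through to the identity copy. -->
  <xsl:template match="a/@href[starts-with(., 'http://xml.gov/')]">
    <xsl:attribute name="href">
      <xsl:value-of select="substring-after(., $site)"/>
    </xsl:attribute>
  </xsl:template>

  <!-- Add a base reference so the relative links resolve against the site. -->
  <xsl:template match="head">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:if test="not(base)">
        <base href="{$site}"/>
      </xsl:if>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The more specific templates take precedence over the identity template through XSLT's default priority rules, so only the matching href attributes and the head element are rewritten.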

Potential Enhancements: (1) Create a DTD or XML Schema that supports the efforts registration form; (2) Create another XSLT stylesheet that takes form results (as XML) and produces CSS-styled, tabular HTML output.



Last Updated: July 31, 2002

Copyright © 2002 Kenneth B. Sall. All Rights Reserved.