
XML.Gov Demos: Per Diem Rates as XML Schema

by Ken Sall, SiloSmashers

Purpose: This demo shows how DoD Per Diem rate tables can be converted to XML, how a simple XML Schema can be created to validate per diem data, how the XML can be displayed in browsers using XSLT stylesheets, and how slight variations in stylesheets can produce differently sorted tables. (Future work may show how per diem data can be treated as a Web service.) Both server-side and client-side transformations of XML data are illustrated, using the same XSLT stylesheets in both cases. Note: Browsers are slow to render large tables. This per diem data results in 600- to 1100-line tables, so please be patient as the pages load.

Note: Client-side transformation requires Internet Explorer 5.x or higher, or Netscape 6.x. It is therefore not the recommended solution when your user base may favor older browsers.

File(s) Explanation
Data Source [zip]
(data captured May 3, 2003)
The DoD Per Diem Rates page includes semicolon-delimited ASCII rate tables. This demo uses the zip file available from the link in the left column, which contains three different sets of data described by this file structure. The 3 data files, each with 10 columns of data, are:
  • Connow.txt (Includes Military Installations; 1136 entries)
  • Conusnm.txt (No Military Installations; 690 entries)
  • Conusmil.txt (Military Installations only; 446 entries)
Note that Connow.txt = Conusnm.txt + Conusmil.txt. Therefore, Connow.txt is the complete per diem rate data.
Connow.xml
Conusnm.xml
Conusmil.xml
XMLSpy was used to generate XML for each of the above 3 rate tables. Column names were used as element names. The element names and their relationships are illustrated in this generated diagram. (Use Internet Explorer, rather than Netscape, to view these raw XML files.) The XML files are over 5 times as large as the corresponding text files; for example, Connow.xml is 447 KB, while Connow.txt is 83 KB. However, XML compresses very well, so Connow.xml is only 23 KB compressed (compared to 17 KB for Connow.txt compressed).
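The size trade-off described above can be explored with a small sketch. It uses synthetic rows and hypothetical column names (not the real per diem data or its 10 columns), and it assumes zlib as the compressor, which may differ from the tool used for the figures quoted above. The point is only that the repeated element names make the XML much larger raw but highly compressible.

```python
import zlib

# Hypothetical column names and synthetic rows, for illustration only.
COLUMNS = ["State", "Locality", "County", "MaxPerDiem"]
ROWS = [[f"S{i % 50}", f"Locality{i}", f"County{i % 200}", str(50 + i % 100)]
        for i in range(1000)]

def as_text(rows):
    """Serialize rows as semicolon-delimited text, one row per line."""
    return "\n".join(";".join(row) for row in rows).encode("ascii")

def as_xml(rows, columns=COLUMNS):
    """Serialize rows as XML, using the column names as element names."""
    parts = ["<PerDiemRates>"]
    for row in rows:
        cells = "".join(f"<{c}>{v}</{c}>" for c, v in zip(columns, row))
        parts.append(f"<Rate>{cells}</Rate>")
    parts.append("</PerDiemRates>")
    return "\n".join(parts).encode("ascii")

def sizes(data):
    """Return (raw size, zlib-compressed size) in bytes."""
    return len(data), len(zlib.compress(data, 9))

for label, data in (("text", as_text(ROWS)), ("xml", as_xml(ROWS))):
    raw, packed = sizes(data)
    print(f"{label}: {raw} bytes raw, {packed} bytes compressed")
```

The exact numbers depend on the data and the compressor, so this sketch will not reproduce the sizes quoted for Connow.xml and Connow.txt.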

If this were to be done as a Web service, a small and relatively simple program could do what XMLSpy did, reading in the text files and writing out an XML data representation. The Web service would request the text data files from the DoD Per Diem site, and the DoD would respond with the text files. Better yet, DoD could send back the XML versions as shown on the left based on the agreed-upon XML Schema shown below.
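As a rough idea of how small such a conversion program could be, here is a minimal sketch in Python. It is not what XMLSpy does internally. The column names, the root element name PerDiemRates, and the row element name Rate are hypothetical; the real files have 10 columns whose names come from the DoD file structure description. The sample row is invented.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical column names; the real files have 10 columns.
COLUMNS = ["State", "Locality", "County", "MaxPerDiem"]

def rows_to_xml(text, columns=COLUMNS, root_tag="PerDiemRates", row_tag="Rate"):
    """Convert semicolon-delimited text to an XML string.

    Each input line becomes one row element, and each field becomes a
    child element named after its column.
    """
    root = ET.Element(root_tag)
    for fields in csv.reader(io.StringIO(text), delimiter=";"):
        if not fields:  # skip blank lines
            continue
        row = ET.SubElement(root, row_tag)
        for name, value in zip(columns, fields):
            ET.SubElement(row, name).text = value.strip()
    return ET.tostring(root, encoding="unicode")

# Invented sample row, not actual per diem data.
print(rows_to_xml("XX;Anytown;Any County;100\n"))
```

A Web service wrapper would add fetching the text file and returning the XML, which is omitted here.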

Generated XML Schema
A first-cut XML Schema was generated by XMLSpy using Connow.xml as the sample instance. The generated version contains enumerations rather than new types. (Use Internet Explorer, rather than Netscape, to view these raw XML files.)
Per Diem XML Schema
The second cut improves on the generated version mainly by introducing a few reusable data types. This version is 3 KB, compared to 13 KB for the generated version. (Use Internet Explorer, rather than Netscape, to view these raw XML files.)
Invalid Data
Interestingly, introducing the XML Schema exposed several irregularities in the DoD per diem rate data. Three of the military locations had no County designation, although they did contain the semicolon delimiter, indicating an empty value. Rather than making the County element optional (since it does occur in 1133 of the 1136 cases), we inserted the words "missing-county-data" in our XML for those 3 cases. Another data file omitted the semicolon separator altogether in 3 County cases, so those rows had only 9 columns instead of the expected 10. The words "missing-county-data" were inserted for these cases as well. Finally, only one of the 1136 cases included a decimal rate (70.25); the other 1135 were whole-number rates. We therefore adjusted the XML Schema to require a double value, which accepts both decimal and whole numbers, to handle this irregularity. This shows the usefulness of XML Schema in detecting invalid (or at least irregular) data.
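The three kinds of irregularity described above can also be caught before the XML is generated. The following Python sketch is not part of the demo, which relied on schema validation. The column positions COUNTY_INDEX and RATE_INDEX are hypothetical, and the sketch assumes that any 9-field row is missing its County field, as was the case in the data described here.

```python
EXPECTED_COLUMNS = 10
COUNTY_INDEX = 2   # hypothetical position of the County column
RATE_INDEX = 9     # hypothetical position of a rate column
PLACEHOLDER = "missing-county-data"

def check_row(line):
    """Return (fields, problems) for one semicolon-delimited row."""
    fields = [f.strip() for f in line.rstrip("\n").split(";")]
    problems = []
    if len(fields) == EXPECTED_COLUMNS - 1:
        # Assumption: a 9-field row lost its County field and separator.
        fields.insert(COUNTY_INDEX, PLACEHOLDER)
        problems.append("missing county separator")
    elif len(fields) != EXPECTED_COLUMNS:
        problems.append(f"expected {EXPECTED_COLUMNS} columns, found {len(fields)}")
        return fields, problems
    elif fields[COUNTY_INDEX] == "":
        # Separator present but no value between the delimiters.
        fields[COUNTY_INDEX] = PLACEHOLDER
        problems.append("empty county")
    try:
        rate = float(fields[RATE_INDEX])
    except ValueError:
        problems.append("rate is not numeric")
    else:
        if not rate.is_integer():
            problems.append("rate is not a whole number")
    return fields, problems
```

Calling check_row on every line and collecting the rows with a non-empty problems list gives a quick report of what needs attention.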
Connow.xml + XSLT
Conusnm.xml + XSLT
Conusmil.xml + XSLT
These are client-side transformations of the XML to HTML using XSLT. This means that your browser is sent XML data which refers to an external XSLT stylesheet (the same one for all 3 data files). The stylesheet is applied to the data on the fly by your browser (client), resulting in an HTML display. Note that the URL that appears in the address area is XML, not HTML, showing that your browser is handling XML. Because the data files are large, it may take up to 45 seconds for your browser to perform the transformation and display the resulting table, so please be patient. Note: Either Netscape or Internet Explorer can be used for this.
Connow.html
Conusnm.html
Conusmil.html
These are server-side transformations of the XML to HTML using XSLT. The same stylesheet was applied to 3 different XML data files, resulting in 3 different HTML tables. Although the results are the same as in the previous client-side case, the transformations were run beforehand, so only HTML needs to be displayed. This is somewhat faster, but is still bound by table-rendering speed. The disadvantage of this technique is that generating the HTML on the server is a batch process, in contrast to the real-time, dynamic transformations in the client-side case. Note: Either Netscape or Internet Explorer can be used for this.
Connow.xml State Sort
Connow.xml Locality Sort
Connow.xml Max Per Diem Sort
These server-side transformations of the XML to HTML using XSLT illustrate 3 different sorts of the same XML data (Connow.xml). This was accomplished with 3 slightly different XSLT stylesheets, each with different sort criteria. A red border indicates the current sort criterion; 2 other column headers have links so you can switch to different sorts. For example, the Max Per Diem sort indicates that the 5 highest rates are in Manhattan, Ocean City, Vail, Boston, and Cambridge. Note: Either Netscape or Internet Explorer can be used for this. It would be possible to perform these sorts in real time, using client-side transformations so the work is done in your browser. This would be a little slower but not much.
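The demo performs these sorts in XSLT. As a language-neutral illustration of the same idea (one data set, several sort keys), here is a Python sketch. It is an analog, not the demo's stylesheets, and the element names State, Locality, and MaxPerDiem are assumed from the column headers mentioned above. Note that the rate sort must compare numbers rather than strings; otherwise "120" would sort before "50".

```python
import xml.etree.ElementTree as ET

def sort_rows(xml_text, key_tag, numeric=False, descending=False):
    """Return the row elements of xml_text sorted by the child named key_tag."""
    root = ET.fromstring(xml_text)

    def key(row):
        text = row.findtext(key_tag, default="")
        # Compare rates as numbers; compare everything else as text.
        return float(text or 0) if numeric else text

    return sorted(root, key=key, reverse=descending)

# Invented sample data; element names are assumptions.
doc = (
    "<PerDiemRates>"
    "<Rate><State>B</State><Locality>Y</Locality><MaxPerDiem>50</MaxPerDiem></Rate>"
    "<Rate><State>A</State><Locality>Z</Locality><MaxPerDiem>120</MaxPerDiem></Rate>"
    "</PerDiemRates>"
)

by_state = sort_rows(doc, "State")
by_rate = sort_rows(doc, "MaxPerDiem", numeric=True, descending=True)
```

Only the key and the direction change between the three sorts, which mirrors how the three stylesheets differ only slightly from one another.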

In the File column above, the term "server" means the file was (previously) transformed server-side (in this case by the Saxon XSLT processor). The transformed result is served to the browser as HTML (or text). Results should be invariant across all browsers, including 3rd and 4th generation.

The term "client" means the file is sent directly as XML to the browser with an external reference (i.e., a Processing Instruction) to an XSLT stylesheet residing on the server. The XML file is therefore transformed client-side by the browser to HTML (or text).
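To make the Processing Instruction concrete, the sketch below adds an xml-stylesheet instruction to an XML document. The stylesheet name perdiem.xsl and the root element PerDiemRates are hypothetical. The instruction belongs in the prolog, after the XML declaration (if any) and before the root element.

```python
import xml.etree.ElementTree as ET

def add_stylesheet_pi(xml_text, href):
    """Insert an xml-stylesheet processing instruction into the prolog."""
    pi = f'<?xml-stylesheet type="text/xsl" href="{href}"?>\n'
    if xml_text.startswith("<?xml "):
        # Keep the XML declaration first; the PI goes right after it.
        end = xml_text.index("?>") + 2
        return xml_text[:end] + "\n" + pi + xml_text[end:].lstrip()
    return pi + xml_text

# Hypothetical document and stylesheet name.
out = add_stylesheet_pi('<?xml version="1.0"?>\n<PerDiemRates/>', "perdiem.xsl")
print(out)
ET.fromstring(out)  # parses, so the result is still well-formed
```

A browser that supports client-side XSLT reads this instruction, fetches the stylesheet from the given location, and applies it to the XML.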

Use View Source in your browser on the client-side examples to confirm that the browser is receiving XML, which is transformed on the fly to HTML through the reference to the external XSLT stylesheet. In other words, the URL in your Location or Address bar is an XML file and View Source shows XML, even though you see HTML in the browser window.



Last Updated: May 4, 2003

Copyright © 2003 Kenneth B. Sall. All Rights Reserved.