Volume 3, Issue 5 - September/October 2003
 
   
 

In this monthly column, an industry expert will answer common questions about VoiceXML and related technologies. Readers are encouraged to submit questions about VoiceXML, including development, voice-user interface design, and speech technology in general, or how VoiceXML is being used commercially in the marketplace. If you have a question about VoiceXML, e-mail it to and be sure to read future issues of VoiceXML Review for the answer.

Q: How can I check the syntax of my VoiceXML documents before I attempt to execute them?

A: A VoiceXML document is a specialized XML document, that is, an XML document that conforms to a specific document type definition (DTD) and schema.
Numerous tools are available to XML developers to validate XML documents, ensuring that they are syntactically correct and that they conform to a
specified DTD or schema.
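
To make the second kind of check concrete, here is a minimal Java sketch of validating against a DTD. It rests on assumptions: the DtdCheck class and the one-element "greeting" DTD are invented for illustration and are not the VoiceXML DTD. The sketch turns on validation with setValidating(true) and rethrows validity errors, which the default handler would otherwise ignore.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; the name is illustrative only.
class DtdCheck
{
   // Toy DTD (an assumption, NOT the VoiceXML DTD):
   // a <greeting> element may contain only text.
   static final String DTD =
      "<!DOCTYPE greeting [ <!ELEMENT greeting (#PCDATA)> ]>";

   // Returns true if the text is well-formed and valid against its DTD.
   static boolean isValid(String xml)
   {
      try {
         SAXParserFactory factory = SAXParserFactory.newInstance();
         factory.setValidating(true);   // request DTD validation
         SAXParser parser = factory.newSAXParser();
         parser.parse(new InputSource(new StringReader(xml)),
            new DefaultHandler() {
               // Validity errors are recoverable and ignored by default,
               // so rethrow them to make the parse fail.
               public void error(SAXParseException ex) throws SAXException
               {
                  throw ex;
               }
            });
         return true;
      }
      catch (SAXException ex) {
         return false;                      // not well-formed or not valid
      }
      catch (Exception ex) {
         throw new RuntimeException(ex);    // configuration or I/O problem
      }
   }

   public static void main(String[] args)
   {
      // Conforms to the toy DTD.
      System.out.println(isValid(DTD + "<greeting>Hello, world.</greeting>"));
      // Well-formed, but <block> is not allowed by the toy DTD.
      System.out.println(isValid(DTD + "<greeting><block/></greeting>"));
   }
}
```

The second document would pass a plain syntax check; it fails here only because the DTD does not permit a "block" element inside "greeting".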

All XML tools are built on top of an XML parser. At the most basic level, the parser reads in the text of a document and verifies the following:

1. Every start tag has a matching end tag in the same case.
2. Tags do not overlap (e.g., <b><i>whacky</b></i> is not allowed).
3. Every attribute has an associated value, and the value is quoted.
4. No tag has duplicate attributes.
5. The document contains a single root element.
6. The reserved characters "<" and "&" are expressed as entities (&lt;, &amp;) or are protected in a CDATA section; ">" may be written as &gt; and is usually escaped as well.
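
The checks above can be tried out with a short Java sketch. The WellFormedCheck class and its sample strings are hypothetical, written for illustration only; the sketch feeds each string to the SAX parser that ships with Java and reports whether the parser accepted it.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; the name is illustrative only.
class WellFormedCheck
{
   // Returns true if the parser accepts the text as well-formed XML.
   static boolean isWellFormed(String xml)
   {
      try {
         SAXParserFactory factory = SAXParserFactory.newInstance();
         factory.newSAXParser().parse(
            new InputSource(new StringReader(xml)), new DefaultHandler());
         return true;
      }
      catch (SAXException ex) {
         return false;                      // the parser rejected the text
      }
      catch (Exception ex) {
         throw new RuntimeException(ex);    // configuration or I/O problem
      }
   }

   public static void main(String[] args)
   {
      String[] samples = {
         "<a><b>ok</b></a>",          // well-formed
         "<a><B>text</b></a>",        // end tag differs in case
         "<b><i>whacky</b></i>",      // overlapping tags
         "<a x=1/>",                  // attribute value not quoted
         "<a x=\"1\" x=\"2\"/>",      // duplicate attribute
         "<a/><b/>",                  // more than one root element
         "<a>fish & chips</a>"        // unescaped ampersand
      };
      for (int i = 0; i < samples.length; i++)
      {
         String verdict = isWellFormed(samples[i]) ? "ok   " : "error";
         System.out.println(verdict + "  " + samples[i]);
      }
   }
}
```

Only the first sample should be reported as well-formed; each of the others breaks one of the rules listed above.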

Given the following sample document, let's see how Expat, the mother of all XML parsers, behaves.

<vxml version="2.0"
     xmlns="http://www.w3.org/2001/vxml">
<form>
     <block>
          Hello, world.
</form>
</vxml>

Here's the output from Expat:

mismatched tag at line 6, column 2, byte 101 at expat2.pl line 4

That's because Expat encountered a closing "form" tag where it expected a closing "block" tag, thereby violating our first rule.

If you want to play with Expat yourself, you can find it at http://www.libexpat.org/. Expat itself is a C library. If you are not fluent in C, most languages include support for parsing XML, either natively or through an external library. For example, Perl includes a module, XML::Parser::Expat, that interfaces with Expat.

Here's some rudimentary code that imports and utilizes that module for basic syntax checking:

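#!/usr/local/bin/perl -w
use strict;
use XML::Parser::Expat;

use constant USAGE => "Usage: validate.pl <xml>\n";

# Require the name of the file to check.
if (@ARGV < 1)
{
   print STDERR USAGE;
   exit(1);
}

# parsefile() dies on the first well-formedness error,
# so call it inside eval and inspect $@ afterward.
my $p = XML::Parser::Expat->new();
eval {
   $p->parsefile($ARGV[0]);
};
my $err = $@;
$p->release();   # break the parser's circular references

if ($err)
{
   print STDERR $err;
   exit(1);
}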

Documentation for XML::Parser::Expat can be found at:

http://www.perldoc.com/perl5.6.1/lib/XML/Parser/Expat.html

Here's the equivalent in Java:

import java.io.File;
import javax.xml.parsers.*;
import org.xml.sax.helpers.*;
import org.xml.sax.*;

// Checks that the file named on the command line is well-formed XML.
public class validate extends DefaultHandler
{
   public static final String USAGE = "Usage: java validate <xml>";

   public static void main(String [] args)
   {
      if (args.length < 1)
      {
        System.err.println(USAGE);
        System.exit(1);
      }

      try {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser saxParser = factory.newSAXParser();
        // DefaultHandler ignores content events; a fatal error such as
        // a mismatched tag surfaces as a SAXParseException.
        saxParser.parse(new File(args[0]), new validate());
      }
      catch(Exception ex)
      {
        System.err.println("Error: " + ex);
        System.exit(1);
      }
   }
}

If you compile and run this code using the Java Development Kit (JDK) and Java Runtime Environment (JRE) 1.4 or later
and pass a file containing the VoiceXML document listed above as the first argument, you'll get the following:

Error: org.xml.sax.SAXParseException:
The element type "block" must be terminated by the matching end-tag "</block>".
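
Unlike the Expat message, this output does not say where the problem is. The position is available from the exception itself, as the following sketch suggests. The ErrorLocation class is hypothetical and embeds the sample document as a string so that it is self-contained; getLineNumber() and getColumnNumber() are standard methods of SAXParseException.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; the name is illustrative only.
class ErrorLocation
{
   // The sample document from this column, with its missing </block>.
   static final String SAMPLE =
        "<vxml version=\"2.0\"\n"
      + "     xmlns=\"http://www.w3.org/2001/vxml\">\n"
      + "<form>\n"
      + "     <block>\n"
      + "          Hello, world.\n"
      + "</form>\n"
      + "</vxml>\n";

   // Returns the first fatal parse error, or null if the text is well-formed.
   static SAXParseException firstError(String xml)
   {
      try {
         SAXParserFactory.newInstance().newSAXParser().parse(
            new InputSource(new StringReader(xml)), new DefaultHandler());
         return null;
      }
      catch (SAXParseException ex) {
         return ex;                         // carries line and column
      }
      catch (Exception ex) {
         throw new RuntimeException(ex);    // configuration or I/O problem
      }
   }

   public static void main(String[] args)
   {
      SAXParseException ex = firstError(SAMPLE);
      if (ex != null)
      {
         System.err.println("Line " + ex.getLineNumber()
            + ", column " + ex.getColumnNumber()
            + ": " + ex.getMessage());
      }
   }
}
```

The exact column may differ from the one Expat reports, since each parser decides for itself where in the offending tag to flag the error.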

To learn more about Java's XML support, see http://java.sun.com/xml/

If you develop on the Microsoft Windows platform, and you have Microsoft Internet Explorer installed, you can put the following JavaScript code in a text file with a .js file extension, for example, validate.js:

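var USAGE = "Usage: cscript validate.js <xml>";

// Require the name of the file to check.
var args = WScript.Arguments;
if (args.length < 1)
{
   WScript.Echo(USAGE);
   WScript.Quit(1);
}

// Load the file synchronously so that parseError is
// populated by the time load() returns.
var objParser = WScript.CreateObject("MSXML.DOMDocument");
objParser.async = false;
objParser.load(args(0));

// A nonzero error code means the document failed to parse.
var objErr = objParser.parseError;
if (objErr.errorCode != 0)
{
   WScript.Echo("Line " + objErr.line + ": " + objErr.reason);
   WScript.Quit(1);
}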

To learn more about Microsoft's XML parser, MSXML, search for "MSXML SDK" on MSDN (http://msdn.microsoft.com/).

To learn more about the Windows Script Host (WSH),
see http://msdn.microsoft.com/library/en-us/script56/html/wsconwshbasics.asp.

Continued...



Copyright © 2001-2002 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the
IEEE Industry Standards and Technology Organization (IEEE-ISTO).