In this monthly column, an industry expert will answer common questions about VoiceXML and related technologies. Readers are encouraged to submit questions about VoiceXML, including development, voice-user interface design, and speech technology in general, or how VoiceXML is being used commercially in the marketplace. If you have a question about VoiceXML, e-mail it to and be sure to read future issues of VoiceXML Review for the answer.
Q: How can I check the syntax of my VoiceXML documents before I attempt to execute them?
A: A VoiceXML document is a specialized XML document, that is, an XML document that conforms to a specific document type definition (DTD) and schema.
Numerous tools are available to XML developers to validate XML documents, ensuring that they are syntactically correct and that they conform to a specified DTD or schema.
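To make the difference between syntax checking and validation concrete, here is a minimal Java sketch that parses a document with DTD validation switched on. The class name DtdCheck, the method validityErrors, and the two-element "greeting" DTD are hypothetical and exist only for this illustration; the DTD is a toy and is not the VoiceXML DTD.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of any library.
final class DtdCheck {
    // Toy DTD in an internal subset. It is NOT the VoiceXML DTD.
    static final String TOY_DTD =
        "<!DOCTYPE greeting [\n"
      + "  <!ELEMENT greeting (line+)>\n"
      + "  <!ELEMENT line (#PCDATA)>\n"
      + "]>\n";

    /** Parses with DTD validation enabled and returns the validity errors found. */
    static List<String> validityErrors(String xml) {
        final List<String> errors = new ArrayList<String>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true); // check against the DTD, not just the syntax
            factory.newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void error(SAXParseException e) {
                        // Validity errors are recoverable, so collect them and go on.
                        errors.add(e.getMessage());
                    }
                });
        } catch (Exception e) {
            // Syntax errors and setup problems end up here in this sketch.
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        // Syntactically correct and valid against the toy DTD.
        System.out.println(validityErrors(
            TOY_DTD + "<greeting><line>Hello</line></greeting>"));
        // Syntactically correct, but <shout> is not declared in the toy DTD.
        System.out.println(validityErrors(
            TOY_DTD + "<greeting><shout>Hello</shout></greeting>"));
    }
}
```

Both documents in main are syntactically correct; only the second one breaks the rules that the toy DTD lays down, so only the second one produces validity errors.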
All XML tools are built on top of an XML parser. At the most basic level, the parser reads in the text of a document and verifies the following:
1. Every start tag has a matching end tag in the same case.
2. Tags do not overlap (e.g. <b><i>whacky</b></i> is illegal).
3. Every attribute has an associated value, and the value is quoted.
4. No tag has duplicate attributes.
5. The document contains a single root document element.
6. Reserved characters such as "<" and "&" are expressed as entities (&lt;, &amp;) or are protected in a CDATA section; ">" is usually escaped as &gt; as well.
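The checks above can be tried out one at a time. The following Java sketch feeds a handful of tiny documents to the SAX parser that ships with the JDK and reports which ones it accepts. The class name WellFormedRules and the method isWellFormed are hypothetical names chosen for this illustration, and the sample strings are made up.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of any library.
final class WellFormedRules {
    /** Returns true if the parser accepts the text as syntactically correct XML. */
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (Exception e) {
            throw new IllegalStateException(e); // parser setup or I/O problem
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "<a><b>ok</b></a>",                 // nothing wrong
            "<a><b>text</B></a>",               // end tag differs in case
            "<b><i>whacky</b></i>",             // overlapping tags
            "<a x=1/>",                         // attribute value not quoted
            "<a x=\"1\" x=\"2\"/>",             // duplicate attribute
            "<a/><b/>",                         // two root elements
            "<a>1 < 2</a>",                     // unescaped "<" in text
            "<a>1 &lt; 2 <![CDATA[ & ]]></a>"   // entity and CDATA section
        };
        for (String sample : samples) {
            System.out.println((isWellFormed(sample) ? "accept " : "reject ") + sample);
        }
    }
}
```

Only the first and last samples should be accepted; each of the others breaks exactly one of the checks listed above.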
Given the following sample document, let's see how Expat, the mother of all XML parsers, behaves.
<vxml version="2.0"
  xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      Hello, world.
  </form>
</vxml>
Here's the output from Expat:
mismatched tag at line 6, column 2, byte 101 at expat2.pl line 4
That's because Expat encountered a closing "form" tag where it expected a closing "block" tag, which violates the first rule above: every start tag needs a matching end tag.
If you want to experiment with Expat yourself, you can find it at http://www.libexpat.org/. Expat is a C library, but if you are not fluent in C, most languages support XML parsing, either natively or through an external library. For example, Perl offers a module, XML::Parser::Expat, that interfaces with Expat.
Here's some rudimentary code that imports and utilizes that module for basic syntax checking:
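#!/usr/local/bin/perl -w
# validate.pl: report the first XML syntax error in a file.
use strict;
use XML::Parser::Expat;

use constant USAGE => "Usage: validate.pl <xml>\n";

if (@ARGV < 1)
{
    print STDERR USAGE;
    exit(1);
}

# Expat dies on the first syntax error, so trap the error with eval.
my $p = XML::Parser::Expat->new();
eval {
    $p->parsefile($ARGV[0]);
};
my $err = $@;
$p->release();    # break the parser's circular references
if ($err)
{
    print STDERR $err;
    exit(1);
}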
Documentation for XML::Parser::Expat can be found at:
http://www.perldoc.com/perl5.6.1/lib/XML/Parser/Expat.html
Here's the equivalent in Java:
import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Checks the syntax of an XML file; it does not validate against a DTD.
public class validate extends DefaultHandler
{
    public static final String USAGE = "Usage: java validate <xml>";

    public static void main(String[] args)
    {
        if (args.length < 1)
        {
            System.err.println(USAGE);
            System.exit(1);
        }
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();
            // DefaultHandler rethrows fatal (syntax) errors as SAXParseException.
            saxParser.parse(new File(args[0]), new validate());
        }
        catch (Exception ex)
        {
            System.err.println("Error: " + ex);
            System.exit(1);
        }
    }
}
If you compile and run this code using the Java Development Kit (JDK) and Java Runtime Environment (JRE) 1.4 or later, and pass a file containing the VoiceXML document listed above as the first argument, you'll get the following:
Error: org.xml.sax.SAXParseException:
The element type "block" must be terminated by the matching end-tag "</block>".
To learn more about Java's XML support, see http://java.sun.com/xml/
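Expat reports where the error occurred, and a Java program can do the same, because SAXParseException carries the line and column at which the parser gave up. The sketch below is one way to surface that information. The class name LocateError and the method firstError are hypothetical, and the document is passed in as a string only to keep the example self-contained.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of any library.
final class LocateError {
    /** Returns "line N, column M: reason" for the first syntax error, or null if there is none. */
    static String firstError(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            // The exception records where the parser stopped.
            return "line " + e.getLineNumber() + ", column " + e.getColumnNumber()
                + ": " + e.getMessage();
        } catch (Exception e) {
            throw new IllegalStateException(e); // parser setup or I/O problem
        }
    }

    public static void main(String[] args) {
        // The sample document from this column, with its missing </block> tag.
        String broken =
            "<vxml version=\"2.0\"\n"
          + "  xmlns=\"http://www.w3.org/2001/vxml\">\n"
          + "  <form>\n"
          + "    <block>\n"
          + "      Hello, world.\n"
          + "  </form>\n"
          + "</vxml>\n";
        System.out.println(firstError(broken));
    }
}
```

The exact wording of the message and the reported column depend on the parser implementation bundled with your JDK.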
If you develop on the Microsoft Windows platform, and you have Microsoft Internet Explorer installed, you can put the following JavaScript code in a text file with a .js file extension, for example, validate.js:
// validate.js: report the first XML syntax error in a file.
var USAGE = "Usage: cscript validate.js <xml>";
var args = WScript.Arguments;
if (args.length < 1)
{
    WScript.Echo(USAGE);
    WScript.Quit(1);
}
var objParser = WScript.CreateObject("MSXML.DOMDocument");
objParser.async = false;    // load synchronously so parseError is ready
objParser.load(args(0));
var objErr = objParser.parseError;
if (objErr.errorCode != 0)
{
    WScript.Echo("Line " + objErr.line + ": " + objErr.reason);
    WScript.Quit(1);
}
To learn more about Microsoft's XML parser, MSXML, search for "MSXML SDK" on MSDN (http://msdn.microsoft.com/).
To learn more about Windows Script Host (WSH), see http://msdn.microsoft.com/library/en-us/script56/html/wsconwshbasics.asp.
Copyright © 2001-2002 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the
IEEE Industry Standards and Technology Organization (IEEE-ISTO).