Welcome to First Words, VoiceXML Review's column that teaches you about VoiceXML and how you can use it. We hope you enjoy this first lesson.
VoiceXML brings a number of useful capabilities into the Web development space. Developers can use these capabilities to build rich user experiences that let callers reach information and transaction services over the telephone. VoiceXML ties them together with a markup language that is defined as an application of XML.
Some of the capabilities of VoiceXML include the following:
Telephony dialog control;
Automatic Speech Recognition (ASR);
DTMF (touch-tone) keypad recognition;
Text-to-speech (TTS) playback; and
Pre-recorded audio playback.
Although VoiceXML is often demonstrated by providing access to Web-based information content, its most powerful link to Web technology is that dynamic content generation technologies such as CGI, ASP, and JSP can construct and deliver personalized VoiceXML pages from a regular Web or application server.
This article will focus on static VoiceXML pages in order to demonstrate some of the concepts, but readers should keep in mind that they can also use their favorite server-side technologies to build compelling dynamic applications.
Your First VoiceXML Application
If you develop software, you're probably familiar with the venerable 'Hello World' demonstration. Example 1 is a 'Hello World' program that we shamelessly borrowed from the VoiceXML Specification.
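Example 1, sketched here after the Hello World document in the VoiceXML 1.0 Specification. The comments and layout are assumptions for illustration and may differ from the published listing:

```xml
<?xml version="1.0"?>
<!-- Example 1: Hello World (a sketch; assumes a VoiceXML 1.0 interpreter) -->
<vxml version="1.0">
  <form>
    <!-- A block holds content to execute; plain text inside it is spoken via TTS -->
    <block>Hello World!</block>
  </form>
</vxml>
```

Once the block has run and no other form items remain, the interpreter exits and the call ends.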
Example 1 will answer your telephone call, use text-to-speech to say "Hello world" to you, and then hang up. Not terribly exciting, but there are a few interesting things to note:
The document is obviously well-formed, using opening and closing XML-type tags;
We didn't have to do any extra work to play our message via TTS; and
Comments are wrapped with "<!--" and "-->".
If you would rather provide pre-recorded audio to the user, then Example 1 would change to something resembling Example 2:
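Example 2, sketched under the same assumptions as Example 1. The file name hello.wav is hypothetical; in practice it would be a URI that the voice platform can fetch. The text inside the audio element serves as a fallback that is spoken via TTS if the recording cannot be played:

```xml
<?xml version="1.0"?>
<!-- Example 2: Hello World with pre-recorded audio (a sketch) -->
<vxml version="1.0">
  <form>
    <block>
      <!-- hello.wav is a hypothetical recording; the enclosed text is the TTS fallback -->
      <audio src="hello.wav">Hello World!</audio>
    </block>
  </form>
</vxml>
```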