VoiceXML Review - Columns - Speak & Listen

Volume 4, Issue 1 - January/February 2004

In this monthly column, an industry expert will answer common questions about VoiceXML and related technologies. Readers are encouraged to submit questions about VoiceXML, including development, voice-user interface design, and speech technology in general, or how VoiceXML is being used commercially in the marketplace. If you have a question about VoiceXML, e-mail it to and be sure to read future issues of VoiceXML Review for the answer.

Q. How do I make my voice application bullet-proof, providing my callers with application-specific prompts even in the event of an unexpected failure?

A. To make your application bullet-proof you need to understand how it can fail. The failure modes can be summed up as follows:

+ A network or Web server error.
+ A syntax or run-time error in your voice application code.

A first step, and by no means a trivial one, is to make sure your network is reliable. VoiceXML interpreters communicate with document servers via HTTP, and regardless of whether the interpreter running your voice application is located across the country or across a room, hardware can fail and connections can be severed. You'll need to check your purse strings to determine if your voice application is mission critical enough to mitigate these risks by hosting your voice application on multiple document servers in geographically redundant locations. These same requirements hold true for the VoiceXML interpreters and their associated telephony, recognition, and audio resources used to process your application.

Given network reliability, let's assume that the interpreter is able to fetch your voice application's start page, application root document, and any resources (e.g. script libraries, grammars) referenced by these documents. You need to ensure that your VoiceXML code properly handles all run-time errors. Errors of this sort are typically HTTP or semantic errors, and the VoiceXML interpreter maps these errors to events which it throws to your application, for example, "error.badfetch.http.404" or "error.semantic". If you neglect to handle these events, the interpreter typically handles them in a generic fashion, sometimes by playing a Text-To-Speech (TTS) message, or, worse yet, by hanging up on the caller. In a truly professional application, both are to be avoided, and it's easy to do so in your application root document.

The following application root document implements a number of event handlers. The first two are nomatch and noinput handlers. You typically handle these events local to the input item (e.g. field) where they are thrown so that you can provide context-sensitive prompts to aid the user. If, however, one of your VoiceXML developers neglects to handle these events, it would be a shame to hang up on the caller, so you include some generic ones in the application root.

The next handler, for connection.disconnect events, handles situations where the session is terminated, either because the user hung up, was disconnected via the <disconnect/> element, or was blind transferred. The final handler is a "catch-all", that is, a handler that catches any event that is not specifically handled elsewhere in your application. This handler should always appear last in your application root for reasons described in section 5.2.4 of the VoiceXML 2.0 specification, "Catch Element Selection".

<vxml version="2.0">

<nomatch>
I'm sorry. I didn't get that.
<reprompt/>
</nomatch>

<noinput>
I'm sorry. I didn't hear you.
<reprompt/>
</noinput>

<!-- Prefix match: catches connection.disconnect.hangup and connection.disconnect.transfer -->
<catch event="connection.disconnect">
<log>caller hung up or was disconnected</log>
</catch>

<catch>
   <log>app catch-all caught <value expr="_event"/></log>
   I'm sorry. An unexpected error occurred. Please try again later.
   <disconnect/>
</catch>

</vxml>
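
The root document above can also carry handlers for the specific error events mentioned earlier. The following is a minimal sketch rather than part of the original listing: the prompt wording and log text are assumptions, and the specific handlers are placed ahead of the catch-all so that document order selects them first.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">

  <!-- Prefix match: also catches error.badfetch.http.404 and similar events -->
  <catch event="error.badfetch">
    <log>fetch failed: <value expr="_event"/></log>
    <prompt>I'm sorry. We're having trouble reaching our system. Please try again later.</prompt>
    <disconnect/>
  </catch>

  <!-- Run-time problems in the application code, e.g. an undeclared variable -->
  <catch event="error.semantic">
    <log>semantic error: <value expr="_event"/></log>
    <prompt>I'm sorry. Something went wrong on our end. Please try again later.</prompt>
    <disconnect/>
  </catch>

  <!-- Hangups and disconnects: log and leave quietly -->
  <catch event="connection.disconnect">
    <log>caller hung up or was disconnected</log>
    <exit/>
  </catch>

  <!-- Catch-all stays last so the handlers above are selected first -->
  <catch>
    <log>app catch-all caught <value expr="_event"/></log>
    <prompt>I'm sorry. An unexpected error occurred. Please try again later.</prompt>
    <disconnect/>
  </catch>

</vxml>
```

Keeping separate handlers for fetch and semantic errors mainly helps with diagnosis, since the log entries then distinguish a document server problem from a coding problem.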

Let's revisit network reliability. Specifically, what happens if the VoiceXML interpreter, upon answering a call, is unable to fetch the start page of your voice application? This can happen if the document servers hosting your voice application are unavailable or you have a coding error in the start page, the application root, or another resource referenced by these documents. In that case, it's up to the interpreter to provide default handling. Some interpreters may allow you to customize that default handling. Consult your VoiceXML platform documentation, or contact your provider to find out.

Q. Playing a custom prompt in the event of a failure is great, but hanging up on the caller seems less than customer-focused. In the event of application failure, how do I properly transfer my user to a live agent?

A. As you know, the transfer element is a form item, so you'll need to navigate from the catch handler to a form where the transfer can be executed. Easy enough:

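<!-- Application root document: the catch-all navigates to a separate transfer page -->
<vxml version="2.0">

<catch>
<log>app catch-all caught <value expr="_event"/></log>
<goto next="transfer.vxml"/>
</catch>

</vxml>

<!-- transfer.vxml: blind transfer to the live agent -->
<vxml version="2.0">
<form id="xfer">
<transfer dest="" bridge="false"/>
</form>
</vxml>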

What if, in spite of all your attempts at network reliability, the link to your document servers is down? You won't be able to fetch another document containing the transfer form. You'll need to include that form in a document that's currently loaded. During execution of a voice application, two documents are typically loaded at any given time: the current leaf document and the application root document. Alas, you won't know ahead of time which of the leaf documents in your application will be loaded when the failure occurs, so you have a couple of options:

- Include a transfer form with a consistent name in every single leaf document.
- Include a single transfer form in your application root document.

The former seems like an onerous option to maintain, unless you're using a template engine to generate your pages, either on the server at run-time or at build-time before publishing your application to your document servers. The latter, on the other hand, seems straightforward and far easier to maintain. Before jumping to that conclusion, however, you need to keep the following in mind from 5.2 of the VoiceXML 2.0 specification:

- Event handlers are executed "as if by copy".
- "Relative URI references in a catch element are resolved against the active document and not relative to the document in which they were declared."

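Although the catch-all handler appears in your application root document, it will be executed as if it were copied into the document where the unexpected event occurred. For example, the following is invalid, because the fragment-only reference #xfer is resolved against the active leaf document, which contains no form named xfer: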

<vxml version="2.0">

<catch>
<log>app catch-all caught <value expr="_event"/></log>
<goto next="#xfer"/>
</catch>

</vxml>

So, if the documents that comprise your voice application are organized in a complex directory hierarchy, you need to either use an absolute URL, which has its own associated maintenance issues, or use relative paths and make sure you get the path to the application root right regardless of the leaf document that's loaded when the error occurs. If you keep all the VoiceXML documents in your application in the same directory as the application root, for example, navigation is simple:

<catch>
<log>app catch-all caught <value expr="_event"/></log>
<goto next="application_root.vxml#xfer"/>
</catch>
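
Putting the handler and the transfer form together, a complete application root might look like the following. This is a sketch under stated assumptions: the file name application_root.vxml comes from the listing above, the transfer destination is left empty as in the earlier listing, and the nested error handler with its prompt wording is an addition, not part of the original column.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- application_root.vxml -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">

  <catch>
    <log>app catch-all caught <value expr="_event"/></log>
    <!-- Resolved against the active document, so name the root file explicitly -->
    <goto next="application_root.vxml#xfer"/>
  </catch>

  <form id="xfer">
    <!-- dest is left empty here; supply your agent's destination -->
    <transfer name="agent" dest="" bridge="false">
      <!-- Assumption: handle transfer failures locally so they do not
           re-enter the catch-all above and loop back into this form -->
      <catch event="error">
        <log>transfer failed: <value expr="_event"/></log>
        <prompt>I'm sorry. We can't connect you right now. Please call back later.</prompt>
        <exit/>
      </catch>
    </transfer>
  </form>

</vxml>
```

The nested handler is a design choice worth considering: without it, an error thrown by the transfer itself would be caught by the catch-all, which would navigate straight back to the same form.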

If your developers insist on breaking up the modules of your application into separate directories, another strategy is to require them to include a consistently named variable at document scope in each leaf document of the application. The value of the variable must be set to the relative path from the leaf document's directory up to the top-level directory of the application (for example, '../../' for a document two levels down). Given the location of the application root document relative to the application's top-level directory, you can easily compose the path from any leaf document to the root.

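<!-- Leaf document located two directories below the application root -->
<vxml version="2.0" application="../../application_root.vxml">
<var name="depth" expr="'../../'"/>
</vxml>

<!-- application_root.vxml: prefix the target with depth when the leaf document declares it -->
<vxml version="2.0">

<catch>
<log>app catch-all caught <value expr="_event"/></log>
<goto expr="('string' == typeof depth ? depth : '') + 'application_root.vxml#xfer'"/>
</catch>

</vxml>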

Q. I've implemented a subdialog to obtain a credit card number, and I want to use it in multiple applications. Each application has its own error handling which includes transferring to a distinct agent. In addition, if the user hangs up in the middle of credit card collection, each application performs its own hangup post-processing. What's the most efficient way to handle unexpected errors and hangups from within the subdialog that is consistent with the handling implemented within the calling application?

A. You're probably aware that when a VoiceXML interpreter executes a subdialog, it creates a new context isolated from the application that called it. This means that, typically, you don't reference the application root document of your application from within the subdialog implementation. This maximizes reusability of the subdialog within multiple applications. It also allows your subdialog to execute more efficiently since it doesn't need to reload and reinitialize the application root document. By the same token, it also means that your subdialog doesn't inherit all the robust error handling you undoubtedly included in your application root or in the calling document. To answer your question, the most efficient way to handle unexpected errors in your subdialog is to return them and to let the caller deal with them. The following code illustrates:
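
A minimal sketch of such a subdialog follows. It assumes the event name is handed back through the return element's eventexpr attribute, that the card number is collected with the builtin digits type (platform support for builtin grammars varies), and that the form and field names match those used later in this column.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- No application attribute: the subdialog stays independent of any caller's root -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">

  <!-- Catch-all: hand every unhandled event back to the caller.
       _event holds the name of the event being handled. -->
  <catch>
    <return eventexpr="_event"/>
  </catch>

  <form id="get_credit_card">
    <!-- type="digits" assumes the platform supports the builtin digits grammar -->
    <field name="cnum" type="digits">
      <prompt>Say your card number</prompt>
      <filled>
        <!-- Normal completion: return the collected value to the caller -->
        <return namelist="cnum"/>
      </filled>
    </field>
  </form>

</vxml>
```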

Regardless of the event thrown by the interpreter, the catch-all handler returns it to the calling dialog. You can, of course, handle the events your subdialog is prepared to handle by overriding the catch-all. In the following example, the subdialog explicitly handles nomatch and noinput events.

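<vxml version="2.0">

  <!-- Catch-all (assumed form): return any event not handled below to the caller -->
  <catch>
    <return eventexpr="_event"/>
  </catch>

  <form id="get_credit_card">
    <field name="cnum">
      <prompt>Say your card number</prompt>
      <!-- Field-level handlers override the catch-all for these two events -->
      <nomatch>
        I'm sorry. I didn't get that.
        Please speak or touch tone your credit card number.
      </nomatch>
      <noinput>
        I'm sorry. I didn't hear you.
        Please speak or touch tone your credit card number.
      </noinput>
    </field>
  </form>

</vxml>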

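The field-level handlers take precedence over the document-level catch-all because they are declared in a more local scope. See the catch element selection algorithm described in section 5.2.4 of the VoiceXML 2.0 specification.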
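
On the calling side, an event returned by the subdialog is thrown at the subdialog element, so the handlers in the calling document and its application root apply. The sketch below is illustrative only: the file name get_credit_card.vxml, the form and item names, and the log text are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
      application="application_root.vxml">

  <form id="payment">
    <subdialog name="card" src="get_credit_card.vxml">

      <!-- Application-specific hangup post-processing for this call site -->
      <catch event="connection.disconnect">
        <log>caller hung up during card collection</log>
        <exit/>
      </catch>

      <filled>
        <!-- The returned value is available as card.cnum; avoid logging it -->
        <log>card number collected</log>
      </filled>

      <!-- Any other returned event falls through to the handlers in the
           application root, including its catch-all and transfer -->
    </subdialog>
  </form>

</vxml>
```

Because each application supplies its own root document, the same subdialog ends up transferring to a different agent in each application without knowing anything about the caller.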
