Volume 3, Issue 2 - March/April 2003
   
 

OpenVXI: Fostering VoiceXML via Open Source

By Brian Eberman
1.3 THE VOICEXML INTERPRETER

The VoiceXML interpreter provides support for the October 2001 draft of VoiceXML 2.0. As a document is parsed, its elements are converted into a common internal representation. The language continued to evolve as the interpreter was developed and tested. A few of the resulting challenges are explored below.

The current language combines elements of declarative and procedural models, resulting in a paradigm unfamiliar to many C, C++, and Java developers. Many elements are resolved in 'document order', so handlers may appear before or after the associated code. Others, like embedded ECMAScript functions and expressions, are procedural. Subdialogs even provide explicit parameter lists and return values. Event handlers are scoped 'as-if-by-copy' and by name. This differs from Java and C++ in that generic event handlers defined locally have lower precedence than specific handlers defined at a lower level.

The processing for subdialog and recognition return values added considerable complexity to the VXI. Subdialogs execute in an independent ECMAScript context, requiring that root documents and associated links be reprocessed (and potentially reloaded) and that the original state be preserved. Recognition result processing depends on which of the multiple parallel grammars produced the highest confidence hypothesis. This grammar may be defined at several levels within a document or in a separate document entirely. Because of this, the interpretation process following recognition may proceed from where the grammar was defined, not where the recognition occurred. This forces the VXI to hold the recognition result until the relevant part of the document is reinitialized.
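The selection step described above can be illustrated with a minimal sketch. It is not taken from the OpenVXI sources: the `Hypothesis` record, its field names, the location labels, and the `plan_transition` helper are all hypothetical, and confidence is assumed to be a single number per hypothesis. The sketch only shows why the interpreter has to hold on to a result when the winning grammar was defined somewhere other than where recognition ran.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Hypothesis:
    """One recognition hypothesis (all field names are hypothetical)."""
    grammar_id: str    # which of the parallel grammars matched
    defined_in: str    # label for the place where that grammar was defined
    utterance: str     # recognized text
    confidence: float  # assumed to be one score per hypothesis


def select_result(hypotheses: List[Hypothesis]) -> Optional[Hypothesis]:
    """Return the highest-confidence hypothesis, or None if nothing matched."""
    if not hypotheses:
        return None
    return max(hypotheses, key=lambda h: h.confidence)


def plan_transition(current_location: str, result: Hypothesis) -> Tuple[str, bool]:
    """Return where interpretation resumes and whether that location
    differs from the place where recognition occurred. When it differs,
    the result must be held until the target has been reinitialized."""
    must_hold_result = result.defined_in != current_location
    return result.defined_in, must_hold_result


# Two grammars were active in parallel: one on a field in the current
# document, one on a link defined in the application root document.
hypotheses = [
    Hypothesis("field_city", "doc.vxml#order/city", "boston", 0.62),
    Hypothesis("link_help", "root.vxml#link-help", "help", 0.91),
]
best = select_result(hypotheses)
target, must_hold_result = plan_transition("doc.vxml#order/city", best)
```

In this example the link grammar from the root document wins, so interpretation resumes at the link's defining location rather than at the field where the caller was prompted.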

Tight integration with ECMAScript is vital because VoiceXML is explicitly linked to it. Many elements create ECMAScript variables or have attributes containing expressions to evaluate. VoiceXML requires a very specific chain of variable scopes. All scope chain management and ECMAScript variable management are delegated to the VXIjs JavaScript interface. This interface was also used to extend the ECMAScript type system to support audio variables stored as ECMAScript variables.
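To make the idea of a scope chain concrete, here is a small sketch of chained variable lookup. It is an illustration only and does not reflect the VXIjs API: the `Scope` class and its methods are hypothetical. The scope names (session, application, document, dialog, anonymous) are the ones VoiceXML 2.0 defines.

```python
from typing import Any, Dict, Optional


class Scope:
    """One link in a chain of variable scopes (hypothetical helper)."""

    def __init__(self, name: str, parent: Optional["Scope"] = None) -> None:
        self.name = name
        self.parent = parent
        self.variables: Dict[str, Any] = {}

    def declare(self, var: str, value: Any = None) -> None:
        """Create or overwrite a variable in this scope only."""
        self.variables[var] = value

    def lookup(self, var: str) -> Any:
        """Resolve an unqualified name, innermost scope first."""
        scope: Optional[Scope] = self
        while scope is not None:
            if var in scope.variables:
                return scope.variables[var]
            scope = scope.parent
        raise NameError(f"undeclared variable: {var}")

    def lookup_qualified(self, scope_name: str, var: str) -> Any:
        """Resolve a name such as document.x in one named enclosing scope."""
        scope: Optional[Scope] = self
        while scope is not None:
            if scope.name == scope_name:
                if var in scope.variables:
                    return scope.variables[var]
                raise NameError(f"undeclared variable: {scope_name}.{var}")
            scope = scope.parent
        raise NameError(f"no enclosing scope named: {scope_name}")


# Build the chain from outermost to innermost.
session = Scope("session")
application = Scope("application", session)
document = Scope("document", application)
dialog = Scope("dialog", document)
anonymous = Scope("anonymous", dialog)

document.declare("city", "Boston")
dialog.declare("city", "Chicago")    # shadows the document-level variable
application.declare("greeting", "hello")
```

An unqualified reference to `city` from the innermost scope finds the dialog-level value, while the qualified form still reaches the shadowed document-level variable.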

1.4 ADDING CALL CONTROL

The Call Control sub-group of the W3C Voice Browser Working Group is moving forward with a call control markup language. The goal of this language is to enable conference calling services, personal assistant applications, and other applications requiring advanced call control. An example use case for a personal assistant might be having the assistant place a call and then interrupt the user when the call is set up. The interrupt might be delivered while the user is listening to voice mail through a VoiceXML browser. The call control process, which interprets the call control markup language, is designed to run as a master of the VoiceXML interpreter. The call control process treats the VoiceXML interpreter as a potential end-point in call control legs.

As the call control mechanism is fleshed out, it will have an effect on VoiceXML implementations, but currently the impact is minimal. For the OpenVXI, the issue is that call control events, which can be generic events, must be delivered during one of the blocking function calls. For example, a "telephony.newcall" event might be a future event that must be delivered to the VoiceXML layer while the system is playing a long email through text-to-speech. In the OpenVXI, this might be handled by having the Wait function in VXIprompt or the recognition function return this event to the VoiceXML layer. We are considering adding an event insertion mechanism to the interpreter so that the call control component can directly insert an event to be processed by the interpreter on the next form interpretation traversal.
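One way such an event insertion mechanism could look is sketched below. This is a guess at the shape of the idea, not the OpenVXI design: `EventInbox`, `run_traversals`, the step names, and the `arrivals` map that simulates call control activity are all hypothetical. The sketch uses a thread-safe queue because the call control component would presumably run outside the interpreter's own thread.

```python
import queue
from typing import Dict, List


class EventInbox:
    """Thread-safe holding area for externally inserted events (hypothetical)."""

    def __init__(self) -> None:
        self._events: "queue.Queue[str]" = queue.Queue()

    def insert(self, event_name: str) -> None:
        """Called by the call control component; never blocks."""
        self._events.put(event_name)

    def drain(self) -> List[str]:
        """Return all pending events in arrival order and empty the inbox."""
        pending: List[str] = []
        while True:
            try:
                pending.append(self._events.get_nowait())
            except queue.Empty:
                return pending


def run_traversals(inbox: EventInbox, steps: List[str],
                   arrivals: Dict[str, str]) -> List[str]:
    """Simulate successive traversals, handling inserted events first.

    `arrivals` maps a step name to an event that call control inserts
    while that step is blocking (for example, during a long prompt).
    """
    log: List[str] = []
    for step in steps:
        for event in inbox.drain():
            log.append("handle " + event)
        log.append("execute " + step)
        if step in arrivals:
            inbox.insert(arrivals[step])  # simulated insertion during the step
    for event in inbox.drain():           # events inserted during the last step
        log.append("handle " + event)
    return log


inbox = EventInbox()
log = run_traversals(
    inbox,
    steps=["play_email", "collect_reply"],
    arrivals={"play_email": "telephony.newcall"},
)
```

Here the event inserted while the email is playing is handled at the start of the next traversal, before the following step runs. Interrupting the blocking call itself, as in the Wait-function approach, would need separate support in the prompt and recognition components.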

2. CONCLUSION

The OpenVXI interpreter is widely used, both as a starting point for platforms and as a basis for experiments in multi-modal architectures and research systems. SpeechWorks is developing a new version of the interpreter to support the VoiceXML candidate recommendation and important extensions for call control and distributed implementations. SpeechWorks believes that with this implementation, OpenVXI will continue to play an important role in the adoption of VoiceXML.


 

Copyright © 2001-2003 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the
IEEE Industry Standards and Technology Organization (IEEE-ISTO).