Volume 1, Issue 1 - January 2001
   
 

Answers to Your Questions About VoiceXML

By Jeff Kunins

In this monthly column, an industry expert will answer common questions about VoiceXML and related technologies. Readers are encouraged to submit questions about VoiceXML, including development, voice-user interface design, and speech technology in general, or how VoiceXML is being used commercially in the marketplace. If you have a question about VoiceXML, e-mail it to and be sure to read future issues of VoiceXML Review for the answer.

For this first issue, we'll tackle some common misconceptions about VoiceXML and voice technology.

Q: Isn't VoiceXML still immature? Is it ready for "prime time" deployment in the enterprise?

A: VoiceXML is a proven technology currently being used to deliver voice services such as 1-800-555-TELL, Shoptalk, Indicast, BeVocal and others. Ready for enterprise deployment, VoiceXML combines proven technologies in speech recognition, telephony, and Internet services. Companies such as IBM, Motorola, Lucent and AT&T have been honing the concept of simple declarative markup languages for speech applications for more than five years. Similarly, the Internet has achieved commonplace acceptance in the enterprise market and will process over $30 billion in secure transactions this year. There is an overwhelming trend in enterprise deployments to more deeply embrace technologies such as XML, XSL, and HTTP as the universal transport. This paradigm allows companies to preserve flexibility and work more efficiently by cleanly separating data from the user interface. Developers build shared business logic once, then use standardized markup languages such as HTML (Web), VoiceXML (speech), and WML (WAP) to create the appropriate user interface for each device. VoiceXML is simply a commitment by technology leaders to adopt a universal open standard for Internet-powered speech applications (see Figure 1).

Figure 1: VoiceXML as a Universal, Open Standard

Traditional Web services use technologies such as Perl, ASP, or Java Servlets to dynamically generate HTML by executing database and application logic on the server. VoiceXML brings this paradigm to the IVR market, giving developers an easy, integrated way to extend their services to the phone. VoiceXML applications are literally a new set of "pages" on a Web site that happen to describe a conversation rather than a visual interface. Companies can leverage VoiceXML to make voice an integrated component of their mobile strategy, authoring shared business logic once and investing new effort only in the specific user interface for each device they support.
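As a rough illustration of this pattern, the sketch below renders one piece of shared business logic as both an HTML page and a VoiceXML page. It uses Python in place of the Perl, ASP, or Java Servlets mentioned above. The function names, the account numbers, and the in-memory lookup are hypothetical stand-ins for real database and application logic, and the VoiceXML shown is a minimal page rather than a production dialogue.

```python
from xml.sax.saxutils import escape


def get_account_balance(account_id):
    """Shared business logic (hypothetical): one lookup used by every channel."""
    balances_in_cents = {"1001": 25075, "1002": 1350}  # stand-in for a database query
    return {"account_id": account_id, "cents": balances_in_cents[account_id]}


def render_html(data):
    """Visual interface: an HTML page for a Web browser."""
    dollars, cents = divmod(data["cents"], 100)
    return (
        "<html><body>"
        f"<p>Account {escape(data['account_id'])}: ${dollars}.{cents:02d}</p>"
        "</body></html>"
    )


def render_voicexml(data):
    """Voice interface: a VoiceXML page that speaks the same data."""
    dollars, cents = divmod(data["cents"], 100)
    return (
        '<?xml version="1.0"?>\n'
        '<vxml version="1.0">\n'
        "  <form>\n"
        "    <block>\n"
        f"      <prompt>Your balance is {dollars} dollars and {cents} cents.</prompt>\n"
        "    </block>\n"
        "  </form>\n"
        "</vxml>\n"
    )


if __name__ == "__main__":
    data = get_account_balance("1001")
    print(render_html(data))
    print(render_voicexml(data))
```

Only the two render functions differ between channels. The lookup is written once, which is the separation of data from user interface described above.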

Q: Does VoiceXML support robust telephony applications or call center integration?

A: The full range of call center applications and IVR tasks in deployment today can be built using a VoiceXML platform. For example, 1-800-555-TELL is a sophisticated voice application built entirely using VoiceXML. More advanced features, such as blind and bridged call transfers, outbound notifications, and intelligent call routing and agent "screen-pop" through integration with CTI middleware, can all be supported using a combination of VoiceXML and accompanying platform services also built using open Internet standards.

The key lens through which to consider this issue is the analogy of the Web and HTML. HTML is explicitly designed to specify the visual presentation of an interactive application via the PC. Therefore, it contains appropriate constructs for tasks such as table layout, form input, and embedded images. It does not cover tasks such as database access, credit card processing, or personalization, which are typically handled by code running on the Web server. Products such as Microsoft Passport™ expose simple APIs to advanced shared services using open standards such as HTTP, cookies, and SSL.

VoiceXML plays exactly the same role in the world of Internet-powered voice applications. VoiceXML has intrinsic constructs for tasks such as dialogue flow, grammars, call transfers, and embedding audio files. It even supports Voice over IP-based call transfers through the Session Initiation Protocol (SIP). Companies can build upon this foundation to provide additional shared services such as outbound notifications and call center integration through simple URL-based APIs analogous to Microsoft Passport™ or affiliate programs from Amazon and MapQuest.
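To give a feel for these constructs, the following Python sketch assembles a small VoiceXML page containing a form, a field with an external grammar, a prompt that plays an audio file, and a bridged call transfer. The grammar file, audio file, and telephone number are made-up placeholders, and the attributes a given platform accepts on each element may vary, so treat this as an illustration rather than a reference dialogue.

```python
import xml.etree.ElementTree as ET


def build_support_dialog(grammar_url, greeting_audio_url, agent_dest):
    """Build a minimal VoiceXML page (all URLs and names are hypothetical)."""
    vxml = ET.Element("vxml", version="1.0")

    # Dialogue flow: a form is visited item by item, field first, then transfer.
    form = ET.SubElement(vxml, "form", id="support")

    # Grammar: tells the recognizer what the caller is allowed to say.
    field = ET.SubElement(form, "field", name="department")
    ET.SubElement(field, "grammar", src=grammar_url)

    # Embedded audio: a recorded prompt, with text to speak if it cannot be played.
    prompt = ET.SubElement(field, "prompt")
    audio = ET.SubElement(prompt, "audio", src=greeting_audio_url)
    audio.text = "Which department would you like?"

    # Call transfer: bridge the caller to an agent at the given destination.
    ET.SubElement(form, "transfer", name="agent_call", dest=agent_dest, bridge="true")

    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_support_dialog("department.gram", "greeting.wav", "tel:+18005550100"))
```

On a platform that supports SIP-based transfers, the destination would presumably be a SIP address instead of a telephone number, although the exact form is platform-specific.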

Q: Why do open standards matter for speech applications?

A: Traditional speech recognition and IVR platforms lock businesses into closed, proprietary APIs that have little or no cross-vendor portability or integration options. Paired with the almost legendary complexity and non-interoperability of telephony systems, this has been a primary reason that major corporations have, until now, been slow to adopt speech technology.

Open or de facto industry standards have repeatedly proven to be the key to broad enterprise adoption of new technologies, because open standards protect investments and preserve flexibility in a multi-vendor environment. Demonstrable examples of these benefits can be found again and again: in the database, operating system, languages, mail, and directory server markets, and more recently in the explosion of Web-related technologies such as HTTP, SSL, cookies, and XML.

For VoiceXML, the benefits of standardization are particularly strong. Because it is a thin layer that sits on top of the entire existing Web technology stack, it inherits complete and immediate interoperability with all existing infrastructure, software, and other standards that have been built to make enterprise Web deployments practical and efficient. Examples include security (SSL, VPNs, cookies), application servers (Java Servlets, Perl, IBM WebSphere™, Microsoft Active Server Pages™), data abstraction (XML, XSL), database connectivity (ODBC, SQL), and streaming media (WAV, Real, MP3).
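To make the "thin layer" idea concrete, here is a minimal sketch of a VoiceXML page served the same way as any Web page, written as a plain WSGI callable in Python. The cookie name `caller`, the greeting, and the `text/xml` media type are assumptions made for illustration; a real deployment would follow whatever its VoiceXML platform and application server expect.

```python
from http.cookies import SimpleCookie
from xml.sax.saxutils import escape


def voice_app(environ, start_response):
    """Hypothetical WSGI app: answers an HTTP request with a VoiceXML page."""
    # Read an ordinary HTTP cookie, exactly as a Web application would.
    cookies = SimpleCookie(environ.get("HTTP_COOKIE", ""))
    caller = cookies["caller"].value if "caller" in cookies else "guest"

    body = (
        '<?xml version="1.0"?>\n'
        '<vxml version="1.0">\n'
        "  <form>\n"
        f"    <block><prompt>Welcome back, {escape(caller)}.</prompt></block>\n"
        "  </form>\n"
        "</vxml>\n"
    ).encode("utf-8")

    headers = [
        ("Content-Type", "text/xml; charset=utf-8"),  # assumed media type
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

Because this is a standard WSGI callable, it can be tried locally with `wsgiref.simple_server.make_server("", 8000, voice_app).serve_forever()` from the Python standard library. The same server, cookies, and transport security that protect an HTML site would apply to it unchanged.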

Open standards also create a healthy and interoperable ISV market for development tools, application templates, enterprise software integration, and other complementary technologies and services. Companies that invest in open standards can leverage these opportunities; those that do not often find themselves tied to isolated, non-interoperable systems that have no cost-effective upgrade path to keep pace with the market.

Rarely has a new standards effort received such broad and immediate adoption as VoiceXML. In less than six months, every relevant vendor has joined the VoiceXML Forum and committed to deploying VoiceXML-powered systems. As of December 2000, at least eight VoiceXML platform implementations are commercially available. Businesses that wish to place their bets alongside all global leaders in speech, telephony, and Internet technologies should clearly embrace VoiceXML as their platform for voice application development.

Continued...


Copyright © 2001 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the IEEE Industry Standards and Technology Organization (IEEE-ISTO).