VoiceXML Review - Columns - Speak & Listen

Q. How does the inputmodes property affect the universals property in VoiceXML 2.0?
For example, if I set universals to "all", and I set inputmodes to "dtmf", are voice universal grammars still active?

A. In the definition of the inputmodes property, the VoiceXML 2.0 specification states that the inputmodes property "does not control the activation of grammars". The specification goes on to state, however, that when the property is set to "dtmf", a voice grammar "would not be matched, because the voice input modality is not active." Thus, if universals are enabled (they're disabled by default), the universal voice grammars are still active in the sense of the specification, but a universal voice command would not be matched while inputmodes is set to "dtmf".

Q. I want to use the VoiceXML submit tag to post recorded audio data. What do I have to do in Java Servlets or Microsoft Active Server Pages (ASP) on the server side to be able to write the posted audio data to a file?

A. For such a seemingly simple task, server-side frameworks such as Servlets and ASP provide lots of flexibility in allowing you to access that data. In fact, aside from exposing the stream of uploaded bytes, they leave it entirely up to you to make sense of the data. RFC 1867 (http://www.ietf.org/rfc/rfc1867.txt), "Form-based File Upload in HTML", describes a mechanism for encoding a file so that it can be uploaded from a Web browser to an HTTP server. The encoding type, "multipart/form-data", is more generally described in RFC 2388 (http://www.ietf.org/rfc/rfc2388.txt).
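To give a feel for what "making sense of the data" involves, here is a minimal Java sketch of the two basic steps: pulling the boundary out of the Content-Type header and cutting the first part's payload out of the request body. The class and method names (MultipartSketch, boundaryOf, firstPartPayload) are hypothetical, and the code assumes a well-formed body with CRLF line endings. It is an illustration of the layout, not a complete RFC 2388 parser.

```java
import java.nio.charset.StandardCharsets;

/** Illustrates the multipart/form-data layout; hypothetical helper, not a full parser. */
final class MultipartSketch {

    /** Extracts the boundary parameter from a Content-Type header value, or null if absent. */
    static String boundaryOf(String contentType) {
        for (String param : contentType.split(";")) {
            String p = param.trim();
            if (p.regionMatches(true, 0, "boundary=", 0, 9)) {
                String b = p.substring(9).trim();
                // The boundary value may be wrapped in double quotes.
                if (b.length() >= 2 && b.startsWith("\"") && b.endsWith("\"")) {
                    b = b.substring(1, b.length() - 1);
                }
                return b;
            }
        }
        return null;
    }

    /** Returns the payload bytes of the first part, or null if the body is malformed. */
    static byte[] firstPartPayload(byte[] body, String boundary) {
        // ISO-8859-1 maps every byte to exactly one char, so binary audio survives the round trip.
        String text = new String(body, StandardCharsets.ISO_8859_1);
        String delimiter = "--" + boundary;
        int start = text.indexOf(delimiter);
        if (start < 0) {
            return null;
        }
        // A blank line separates the part headers from the payload.
        int headersEnd = text.indexOf("\r\n\r\n", start);
        if (headersEnd < 0) {
            return null;
        }
        int payloadStart = headersEnd + 4;
        // The payload runs up to the CRLF that precedes the next delimiter.
        int payloadEnd = text.indexOf("\r\n" + delimiter, payloadStart);
        if (payloadEnd < 0) {
            return null;
        }
        return text.substring(payloadStart, payloadEnd).getBytes(StandardCharsets.ISO_8859_1);
    }
}
```

In practice you would rarely write this yourself; a tested library handles multiple parts, large uploads, and malformed input far more robustly.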

Because a VoiceXML interpreter is essentially a Web browser, it employs the same Internet standards to send data to an HTTP server. In fact, the VoiceXML elements used to send data to an HTTP server, submit and subdialog, support a special attribute, enctype, so that you can indicate the type of encoding that should be used. When sending audio data recorded using the record element, set the enctype attribute to "multipart/form-data", and set the method attribute to "post". Given these hints, the VoiceXML interpreter does the real work of making sure the data passed via the namelist attribute of the submit or subdialog element is encoded correctly.
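To make the encoding concrete, the following Java sketch builds the kind of request body an interpreter might produce for a single recorded variable. The class name, the method name, and the file name and media type used in the usage note are assumptions for illustration only; what a given VoiceXML platform actually puts in the part headers may differ.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that shows the shape of a one-part multipart/form-data body. */
final class MultipartBodySketch {

    static byte[] encode(String boundary, String fieldName, String fileName,
                         String mediaType, byte[] audio) {
        // Each part starts with "--boundary", followed by its headers and a blank line.
        String head = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"" + fieldName
                + "\"; filename=\"" + fileName + "\"\r\n"
                + "Content-Type: " + mediaType + "\r\n\r\n";
        // The closing delimiter carries two extra trailing hyphens.
        String tail = "\r\n--" + boundary + "--\r\n";

        byte[] h = head.getBytes(StandardCharsets.US_ASCII);
        byte[] t = tail.getBytes(StandardCharsets.US_ASCII);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(h, 0, h.length);
        out.write(audio, 0, audio.length); // raw audio bytes, copied without re-encoding
        out.write(t, 0, t.length);
        return out.toByteArray();
    }
}
```

Calling encode("b42", "recording", "greeting.wav", "audio/x-wav", bytes) yields a body whose single part is named after the variable listed in namelist; the matching request header would be Content-Type: multipart/form-data; boundary=b42.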

Here's a short example that uses the record element to record a greeting in the form item variable named recording. Once the recording is made, the submit element is used to post it to a script, upload.cgi, running on an HTTP server.

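<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
<form id="record_and_submit">
     <record name="recording"
       maxtime="60s" dtmfterm="true" beep="true">
          <prompt>
               At the tone, please record your personal greeting.
               When you're done, press pound.
          </prompt>
     </record>
     <block>
          <!-- Post the recording to upload.cgi, as described above -->
          <submit next="upload.cgi" namelist="recording"
            method="post" enctype="multipart/form-data"/>
     </block>
</form>
</vxml>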

Getting back to your question, once your voice application has submitted the audio data, you have a couple of options:

To access audio data using Java Servlets, you'll need to write a Servlet that parses the data out of the HTTP request. The Servlet API won't perform the parsing for you, but you can download a Java class that does. The class, MultipartRequest, written by Jason Hunter, is available for download at http://www.servlets.com/cos/index.html. Complete documentation for the class is available at http://www.servlets.com/cos/javadoc/com/oreilly/servlet/MultipartRequest.html. Here's a small code example that shows you how to use it:

import javax.servlet.*;
import javax.servlet.http.*;
import java.io.*;
import com.oreilly.servlet.MultipartRequest;

public class UploadServlet extends HttpServlet {

     public void doPost(HttpServletRequest req, HttpServletResponse res)
          throws ServletException, IOException
     {
          // Get output stream.
          ServletOutputStream out = res.getOutputStream();

          // Establish the directory in which to save recordings and
          // a 5MB upload limit.
          MultipartRequest multi = new MultipartRequest(req, "/var/myapp/recordings/", 5242880);

          // Send a response back to VXML client.
          res.setContentType("text/xml");
          out.println("<?xml version=\"1.0\"?>");
          out.println("<vxml version=\"2.0\" xmlns=\"http://www.w3.org/2001/vxml\"><form><block>");
          out.println("<prompt>Your greeting was saved.</prompt><exit/></block></form></vxml>");
          out.flush();
          out.close();
     }
}

When the servlet's doPost method is executed, it creates an instance of the MultipartRequest class, passing in a reference to the HttpServletRequest, a directory in which to store the uploaded file(s), and the maximum number of uploaded bytes allowed. This servlet doesn't do any error checking; it just sends a simple VoiceXML document back to the VoiceXML interpreter. The document plays a TTS prompt telling the user that the greeting was saved, and then exits.
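If you want to add some of the missing error checking, one option is to validate the request before handing it to MultipartRequest. The sketch below is a hypothetical helper (UploadPreflight is not part of the com.oreilly.servlet package) that uses only the JDK. In a servlet, the first two arguments would typically come from req.getContentType() and req.getContentLength().

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

/** Hypothetical pre-flight checks to run before parsing an upload. */
final class UploadPreflight {

    /** Returns null when the upload looks acceptable, otherwise a short reason for rejecting it. */
    static String check(String contentType, long contentLength, Path saveDir, long maxBytes) {
        // The request must use the encoding that the parser expects.
        if (contentType == null
                || !contentType.toLowerCase(Locale.ROOT).startsWith("multipart/form-data")) {
            return "not a multipart/form-data request";
        }
        // Reject oversized uploads before reading the body.
        if (contentLength > maxBytes) {
            return "upload exceeds size limit";
        }
        // The target directory must exist and be writable by the server process.
        if (!Files.isDirectory(saveDir) || !Files.isWritable(saveDir)) {
            return "save directory is not writable";
        }
        return null;
    }
}
```

When check returns a non-null reason, the servlet could respond with a VoiceXML document that tells the caller the greeting was not saved, instead of the success prompt.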

Various solutions have been developed for ASP developers. Your options for reusable code include purchasing a third-party COM/.NET component. Alternatively, you can download free ASP code written in JavaScript or VBScript and integrate it into your server-side script. A search for "asp file upload" yielded a number of solutions. Regardless of the one you choose, be sure to test it thoroughly before using it in a production environment.

Here's some code that uses a COM-based solution from Software Artisans, SA-FileUp (http://www.softwareartisans.com/softartisans/saf.html). The code instantiates a "SoftArtisans.FileUp" object, calls the object's SaveAs method to save the uploaded file, and sends a simple response back to the VoiceXML interpreter.

Here's some code that uses a freely downloadable VBScript class written by Jacob Gilley. You can download the class from http://www.asp101.com/articles/jacob/scriptupload.asp. The following code assumes that the FileUploader class has been saved to a file named "upload.asp" located in the same directory.

<%@ Language=VBScript %>
<%Option Explicit%>
<!-- #include file="upload.asp" -->

<%
' Create the FileUploader
Dim Uploader, File, Msg
Set Uploader = New FileUploader

' Parse the uploaded data
Uploader.Upload()

If Uploader.Files.Count = 0 Then
     Msg = "We were unable to save your greeting."
Else
     For Each File In Uploader.Files.Items
          File.SaveToDisk "c:\myapp\recordings\"
     Next
     Msg = "Your greeting was saved."
End If

Response.ContentType = "text/xml"
%>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
     <form>
          <block>
               <prompt><% =Msg %></prompt>
          </block>
     </form>
</vxml>

If you decide to use the FileUploader class, consider converting its Subs to Functions that return status codes indicating success or failure.

By Matt Oshry