VoiceXML::Client: Supported VoiceXML

A brief VoiceXML tutorial and a detailed description of the supported elements with examples.

VoiceXML Introduction

VoiceXML (VXML) provides telephone user interface (TUI) forms, similar to the standard HTML <form />. A form is "displayed" to callers through recorded prompts, accepts input that fills the fields within it, and can send the collected data back to the server.

Each input field is set from the caller's DTMF input. The VoiceXML specification also allows fields to be filled by voice input, but this is not yet supported. Other variables can be created and manipulated within the VXML.

Conditionals can check the value entered by the user and jump to other forms within the page (or to other pages). Simple arithmetic can also be performed, as demonstrated by the invalidcount counter.
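
As a minimal sketch of such a conditional, a <filled /> block might check the caller's entry and keep count of invalid attempts. The field name selection and the form id optionone are hypothetical, chosen only for illustration; invalidcount is assumed to have been declared and set to 0 beforehand.

 <filled>
     <if cond="selection == 1">
         <!-- valid entry: jump to another form -->
         <goto nextitem="optionone" />
     <else />
         <!-- simple arithmetic: bump the counter and ask again -->
         <assign name="invalidcount" expr="invalidcount + 1" />
         <reprompt />
     </if>
 </filled>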

More complex actions may be performed by submitting the collected data through a <submit /> element. A submit whose next attribute points back at the generating script fetches the page output by that same script, and also passes along the values of each of the variables specified in the namelist. This is very useful when dynamically generating the VoiceXML.
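
For illustration, a submit of this kind could look like the following sketch. The URL is the one used in the example later in this document; the variable names in the namelist are placeholders and are assumed to exist in the page already.

 <!-- sends the current values of action, box and step to the script -->
 <submit next="http://voicexml.psychogenic.com/vocp.cgi" namelist="action box step" />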

Finally, please note that though many of the examples here are just snippets of VoiceXML, the actual VoiceXML passed to VoiceXML::Client needs to be well formed and have the standard

 <?xml version="1.0" ?>
 <vxml version="2.1">
 
 <!-- whatever else -->
 
 </vxml>

around it to actually work. See "A warning concerning your VoiceXML" below for a bit more info on this front.

A Concrete Example

We'll look at a portion of an example VXML page, one generated by the VOCP system to function as a guest audio bulletin board. The full example is available as exampleBBSBox.vxml in the VoiceXML::Client distribution.

 <?xml version="1.0" ?>
 
 <vxml version="2.1">
 
 	<var name="invalidcount" />
	<var name="haverecordedmessage" />
	<var name="msgtoplay" />
                 
	<assign name="invalidcount" expr="0" />
	<assign name="haverecordedmessage" expr="0" />
 
	<form id="guestbox">
	        <field name="movetosel" type="digits" numdigits="1">
	        
		       <prompt timeout="5s">
			      
			       <audio src="/path/to/vocpmessages/guestbox500.rmd" />
			      
			       <audio src="/path/to/vocpmessages/system/guestinstructions.rmd" />
		       </prompt>
		       
		       <filled>
 
			      <if cond="movetosel == 1">
				     <goto nextitem="listennext" />
				     
			       <elseif cond="movetosel == 2" />
				     <goto nextitem="leaveyourown" />
				     
			       <else />
				     <assign name="invalidcount" expr="invalidcount + 1" />
				     <audio src="/path/to/vocpmessages/system/invalidselection.rmd" />
				     <reprompt />
			      </if>
		       </filled>
	        </field>
	</form>
	
	<!-- continued below... -->

The engine starts from the top of the page, sets up the variables according to the <var /> and <assign /> tag instructions, then enters the first field in the first form found (the movetosel field in the guestbox form in this example). It plays the prompts in order and then awaits user input (or the specified 5 second timeout) before proceeding into the <filled /> element.

The value entered by the caller is assigned to the variable corresponding to the field, movetosel. If the value is equal to 1, the engine is instructed to jump to the listennext form, whereas it will jump to leaveyourown if it is equal to 2.

If another value (or none at all) is entered, the <else /> condition is triggered. In this case, the invalidcount variable is incremented, an error message is played and the process starts over at the beginning of the field.

The leaveyourown form, which allows callers to record their own message, looks like:

	
 	<form id="leaveyourown">
 		
 		<block>
 			<if cond="haverecordedmessage">
 				<audio src="/path/to/vocpmessages/system/messagealreadyleft.rmd" />
 				<goto nextitem="guestbox" />
 			</if>
 		</block>
 		
 		<assign name="haverecordedmessage" expr="1" />
 		
 		<block>
 			<audio src="/path/to/vocpmessages/system/recordyourmessage.rmd" />
 		</block>
 			
 		<record beep="true" name="/path/to/vocpspool/somefilename.rmd" />
 			
 		
 		
 		<block>
 			<audio src="/path/to/vocpmessages/system/thanksformessage.rmd" />
 		</block>
 		
 		<assign name="leavemsgfname" expr="'/path/to/vocpspool/somefilename.rmd'" />
 		<assign name="action" expr="'msgrecorded'" />
 		<submit next="http://voicexml.psychogenic.com/vocp.cgi" namelist="leavemsgfname action box step"/>
 	</form>
	
	
	<!-- continued below... -->

In this page, the server actually keeps track of whether you've already left a message and won't allow you to leave more than one. This is done with the haverecordedmessage variable: when a message has already been left, the variable is initialized to a true value, and the test in the first <block /> of the form plays a message and then shunts you back to the top-level form.
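
As a hedged sketch of that initialization (an assumption about how a generating script might do it, not taken from VOCP itself), a page produced for a caller who has already left a message could differ from the listing above only in the starting value of the variable:

 <var name="haverecordedmessage" />
 <!-- assumption: the server writes 1 here instead of 0
      when this caller has already left a message -->
 <assign name="haverecordedmessage" expr="1" />

With that value in place, the cond="haverecordedmessage" test in the first block of leaveyourown would succeed on entry to the form.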

If the caller has passed this test (no message left yet), haverecordedmessage is set to 1 and the message is recorded after prompting the caller to speak up.

Once the message is saved, the caller is thanked and the VoiceXML client notifies the server of the new message's arrival by submitting to the (VOCP) CGI.


	<form id="listentoothers">
		
		<assign name="invalidcount" expr="0" />
		
		<field name="listenselection" type="digits" numdigits="1">
			
			<prompt timeout="4s">
				<audio expr="msgtoplay" />
				<audio src="/path/to/vocpmessages/system/listenothersmenu.rmd" />
			</prompt>
			
			<noinput>
				<audio src="/path/to/vocpmessages/system/listenothersmenu.rmd" />
				<reprompt />
			</noinput>
			<noinput count="3">
				<goto nextitem="notworkingout" />
			</noinput>
			
			<filled>
				<if cond="listenselection == 1">
					<goto nextitem="listenprev" />
				<elseif cond="listenselection == 3" />
					<goto nextitem="listennext" />
				<elseif cond="listenselection == 4" />
					<reprompt />
				<elseif cond="listenselection == 6" />
					<goto nextitem="ratemessage" />
				<elseif cond="listenselection == 8" />
					<goto nextitem="guestbox" />
				<else />
					<audio src="/path/to/vocpmessages/system/invalidselection.rmd" />
					<assign name="msgtoplay" expr="''" />
					<reprompt />
				</if>
			</filled>
		</field>
	</form>

	
	<form id="listennext">
		<block>
			<if cond="currentmessage >= 5">
				<if cond="msgmoreavailable" >
					<assign name="nextdirection" expr="'next'" />
					<goto nextitem="getmoremessages" />
				<else />
					<audio src="/path/to/vocpmessages/system/thisisthelastmessage.rmd" />
					<assign name="msgtoplay" expr="''" />
					<goto nextitem="listentoothers" />
				</if>
			</if>
		</block>
		
		<assign name="currentmessage" expr="currentmessage + 1" />
		<goto nextitem="setmsgtoplay" />
	</form>
	
 </vxml>

Supported Elements

Currently, VoiceXML::Client does not support the full Voice Extensible Markup Language specification, but it does enough to allow for some powerful processing and dynamic interaction. The following VoiceXML elements are recognized and supported, at least partially:

Interpreter

Variables may be set and manipulated while executing the contents of a VoiceXML page. In this regard, VoiceXML::Client is currently far from respecting the Voice Extensible Markup Language specification: instead of using a full ECMAScript interpreter, it relies on a simple interpreter written in Perl that is, to date, sufficient for most of our requirements.

Note that, unlike numeric values, strings need to be 'quoted'. Comparison operators may be used in the cond attribute of if/elseif elements, and simple arithmetic may be performed in expr attributes.
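
The following fragment sketches these rules using variables from the example page above (action, invalidcount and movetosel). It is illustrative only and deliberately stays within the forms of expression already shown in that example.

 <!-- strings are quoted inside the expr attribute -->
 <assign name="action" expr="'msgrecorded'" />

 <!-- numeric values are not -->
 <assign name="invalidcount" expr="0" />

 <!-- simple arithmetic in expr -->
 <assign name="invalidcount" expr="invalidcount + 1" />

 <!-- comparison in cond -->
 <if cond="movetosel == 2">
     <goto nextitem="leaveyourown" />
 </if>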

Variables may also be used in places like <audio> tags: instead of explicitly setting the src, you can supply an expr attribute that names a variable holding the file to play.
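
A small sketch of this, reusing the msgtoplay variable from the example page. The file path is only a placeholder borrowed from that example.

 <var name="msgtoplay" />
 <!-- the path is a string, so it is quoted inside expr -->
 <assign name="msgtoplay" expr="'/path/to/vocpspool/somefilename.rmd'" />

 <prompt timeout="4s">
     <!-- plays whichever file msgtoplay currently holds -->
     <audio expr="msgtoplay" />
 </prompt>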

This implementation is far from complete and has a number of limitations (e.g. though expr="invalidcount + 1" currently works, changing the order of the operands or adding two variables together would likely fail), but a number of examples can be seen within the included exampleBBSBox.vxml file and on http://voicexml.psychogenic.com/

A warning concerning your VoiceXML

MiniXML, the pure Perl parser used to grok the VoiceXML, uses regular expressions and recursion to do its thing. The examples here are all well formed, meaning unary tags have the <closingslashes /> and every <tag> has a matching closing </tag>.
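
To make the distinction concrete, here is a well-formed fragment, using a prompt taken from the example page:

 <!-- the unary audio tag carries its closing slash,
      and prompt has a matching closing tag -->
 <prompt timeout="5s">
     <audio src="/path/to/vocpmessages/system/guestinstructions.rmd" />
 </prompt>

By contrast, the following is not well formed, since the audio tag lacks its closing slash and prompt is never closed:

 <prompt timeout="5s">
     <audio src="/path/to/vocpmessages/system/guestinstructions.rmd">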

If your XML is messy, who knows what'll happen... Actually, MiniXML has recently been updated such that it will refuse to parse XML that is patently invalid, so you can be sure your computer won't explode. If MiniXML is unhappy, it will output a list of tags to give you a clue where to look.