Author Message
MichaelNorman
Joined: Jun 3, 2015
Messages: 448
Offline
Is speech recognition (Speech to Text) supported in 3.4? If so, are there any samples to look at?

The use case is to play a prompt to a caller, collect the audio, and either convert it to text or fill a grammar, as in Experience Portal. Nuance is the speech engine in use.

BobBraudes
Joined: Dec 19, 2013
Messages: 34
Offline
Hi Michael,

The best (and probably only) way to do this is to write a VoiceXML script, which will allow you to use either a prompt or TTS for the announcement and ASR to collect the response and select the best match from a grammar. Is this possible for your use case?

Thanks,
Bob
MichaelNorman
Joined: Jun 3, 2015
Messages: 448
Offline
Thinking through this, it seems like a roundabout way to do it. Does the API not allow for this directly? If VoiceXML is the only way to accomplish it, what would the code look like?
BobBraudes
Joined: Dec 19, 2013
Messages: 34
Offline
Here is an example of a VXML script that requests the name of a Beatles song and validates the response against a grammar of recognized songs:

<?xml version="1.0"?>

<vxml xmlns="http://www.w3.org/2001/vxml"
xmlns:xsd="http://www.w3.org/2001/XMLSchema-instance"
xmlns:conf="http://www.w3.org/2002/vxml-conformance"
version="2.0">

<catch event="connection.disconnect.hangup">
<log>Caught the event <value expr="_event"/></log>
</catch>

<!-- send digits 1 -->

<form id="MainMenu">
<field name="song">

<prompt timeout="15s">
Please say your favorite Beatles Song.
<audio src="builtin://senddigit/123"/>
</prompt>

<!-- Define the grammar. -->
<grammar xml:lang="en-US" version="1.0" root = "myrule">

<rule id="myrule">
<one-of>
<item> Please Please Me </item>
<item> Eleanor Rigby </item>
<item> Hey Jude </item>
<item> Love me do </item>
<item> Twist and shout </item>
<item> Let It Be </item>
<item> Drive My Car </item>
<item> Come Together </item>
<item> All My Loving </item>
</one-of>
</rule>
</grammar>

<!-- The user was silent, restart the field.
<noinput>
I did not hear anything. Please try again.
<reprompt/>
</noinput>
-->

<!-- The user said something that was not defined in our grammar.
<nomatch>
I did not recognize that song. Please try again.
<reprompt/>
</nomatch>
-->

<!-- The "song" field was filled: exit and return the recognized utterance to the caller of this script. -->
<filled>
<!--
<prompt>
I heard <value expr="song$.utterance" /> goodbye
</prompt>
-->
<exit namelist="song$.utterance" />
</filled>

<!--
<nomatch>
<var name="noMatchVar" expr="'No Match Found'" />
<exit namelist="noMatchVar"/>
</nomatch>

<error>
<var name="errorVar" expr="'Error Received'" />
<exit namelist="errorVar"/>
</error>
-->
</field>
</form>
</vxml>


Breeze doesn't have a direct API for ASR or STT.
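
The script above hands its result back through <exit namelist="song$.utterance"/>. As a rough illustration of picking that value out on the Java side, here is a minimal sketch. It assumes the namelist arrives as a Map<String, Object> and that the "$" in the key may show up URL-encoded as "%24" (that encoded form appears in the listener code later in this thread). The class and method names are hypothetical and not part of the Breeze API.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the Breeze SDK.
class NamelistReader {
    // Name used in the script's <exit namelist="..."/>.
    static final String UTTERANCE_KEY = "song$.utterance";

    // Returns the trimmed utterance, or null if the map has no such entry.
    static String findUtterance(Map<String, Object> names) {
        if (names == null) {
            return null;
        }
        for (Map.Entry<String, Object> entry : names.entrySet()) {
            String key;
            try {
                // Assumption: keys may be URL-encoded ("song%24.utterance").
                key = URLDecoder.decode(entry.getKey(), "UTF-8");
            } catch (UnsupportedEncodingException | IllegalArgumentException ex) {
                key = entry.getKey(); // fall back to the raw key
            }
            if (UTTERANCE_KEY.equals(key) && entry.getValue() != null) {
                return entry.getValue().toString().trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Object> names = new LinkedHashMap<>();
        names.put("msml.dialog.exit", "msml.dialog.exit");
        names.put("song%24.utterance", "Hey Jude");
        System.out.println(findUtterance(names)); // prints: Hey Jude
    }
}
```

Looking the entry up by key avoids building one string from the whole map and stripping the unwanted pieces out of it afterwards.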

MichaelNorman
Joined: Jun 3, 2015
Messages: 448
Offline
I guess I should have clarified: how would I invoke this from a Breeze service?
BobBraudes
Joined: Dec 19, 2013
Messages: 34
Offline
Not a problem. You won't need everything here, but this is the base source file for the service that runs the VXML I listed:

package com.avaya.ReceiveAsrTestService;

import java.net.URI;
import java.util.Date;
import java.util.Map;
import java.util.UUID;

import com.avaya.collaboration.call.Call;
import com.avaya.collaboration.call.CallListener;
import com.avaya.collaboration.call.CallTerminationCause;
import com.avaya.collaboration.call.MediaType;
import com.avaya.collaboration.call.Participant;
import com.avaya.collaboration.call.TheCallListener;
import com.avaya.collaboration.call.speech.search.SpeechFactory;
import com.avaya.collaboration.call.speech.search.SpeechService;
import com.avaya.collaboration.call.speech.voicexml.VoiceXMLDialogCause;
import com.avaya.collaboration.call.speech.voicexml.VoiceXMLDialogItem;
import com.avaya.collaboration.call.speech.voicexml.VoiceXMLDialogListener;
import com.avaya.collaboration.util.logger.Logger;
import com.avaya.collaboration.eventing.EventMetaData;
import com.avaya.collaboration.eventing.EventProducer;
import com.avaya.collaboration.eventing.EventingFactory;
import com.avaya.zephyr.platform.dal.api.ServiceDescriptor;
import com.avaya.zephyr.platform.dal.api.ServiceUtil;
import com.avaya.collaboration.dal.factory.CollaborationDataFactory;
import com.avaya.collaboration.businessdata.api.ServiceData;


@TheCallListener
public class AsrCallListener implements CallListener, VoiceXMLDialogListener {
private final Logger logger;
private SpeechService speechService;
private int maxPrompts = 1;

public AsrCallListener() {
logger = Logger.getLogger(AsrCallListener.class);
}

@Override
public void callAlerting(final Participant alertingParty) {
logger.finest("callAlerting invoked. AlertingParty=" + alertingParty);
}

@Override
public void callAnswered(final Call call) {
final String phase = call.isCallingPhase() ? " calling phase "
: " called phase ";
if (logger.isFinestEnabled()) {
logger.finest("callAnswered ENTER invoked in" + phase);
}
maxPrompts = readNumberOfPromptsAttribute();

final UUID requestIdVxmlCalled = voiceXMLDialog(call.getCallingParty());

logger.finer("call answered adding call=" + call
+ " requestId=" + requestIdVxmlCalled);
VxmlCallData.addCall(requestIdVxmlCalled, call);

logger.finer("call answered adding counter call=" + call
+ " counter=1");
VxmlCallData.addCounter(call, 1);

logger.finer("starting dialog with request ID= "
+ requestIdVxmlCalled);
logger.finest("callAnswered EXIT");
}

@Override
public void callTerminated(Call call, CallTerminationCause cause) {
logger.finest("callTerminated ENTER");

if (cause != CallTerminationCause.ABANDONED
&& cause != CallTerminationCause.AFTER_ANSWER) {
callFailed(call, cause);
}
logger.finest("callTerminated EXIT");
}

@Override
public void callOriginated(Call call) {

}

private void callFailed(final Call call, CallTerminationCause cause) {
logger.finest("callFailed ENTER");
logger.finest("Call failed with cause "
+ (cause == null ? "unknown" : cause.getValue()));

logger.finest("callFailed EXIT");
}

@Override
public void callIntercepted(final Call call) {
logger.finest("callIntercepted ENTER");

call.enableMediaBeforeAnswer();
logger.fine("callIntercepted from "
+ call.getCallingParty().getAddress() + " (name = "
+ call.getCallingParty().getDisplayName() + ") to "
+ call.getCalledParty().getAddress() + " (name = "
+ call.getCalledParty().getDisplayName() + ")");
logger.finest("callIntercepted EXIT");
}

private UUID voiceXMLDialog(final Participant participant) {
logger.finer("vxmlDialog ENTER " + participant.getDisplayName());

URI voiceXMLScript = null;

try {
voiceXMLScript = new URI(
"http://10.135.47.16/services/ReceiveAsrTestService/scripts/TTSASRtest.vxml");
// "http://info.dr.avaya.com/~ikes/vxml/TTSASRtest.vxml");
} catch (Exception e) {
logger.warn("Failed to create URI for VoiceXML script");
return null;
}

logger.finest("set script to http://10.135.47.16/services/ReceiveAsrTestService/scripts/TTSASRtest.vxml");
// logger.finest("set script to http://info.dr.avaya.com/~ikes/vxml/TTSASRtest.vxml");

final VoiceXMLDialogItem voiceXMLDialogItem = SpeechFactory
.createVoiceXMLDialogItem().setVoiceXMLScript(voiceXMLScript);
UUID requestId = getSpeechService().startVoiceXMLDialog(participant,
voiceXMLDialogItem, this);

logger.finest("voiceXMLDialog new request ID is " + requestId);

logger.finer(" vxmlDialog EXIT");
return requestId;
}

private void sendEvent(String body) {
final EventMetaData filterData = EventingFactory.createEventMetaData();
final String eventBody = body;
final String eventVersion = "1.0.0.0";
final EventProducer publisher =
EventingFactory.createEventProducer("ASR",
"ASR_RECEIVED",
filterData, eventBody, eventVersion);

publisher.publish();
}

@Override
public void dialogEvent(UUID requestId, Map<String, Object> names,
VoiceXMLDialogCause cause) {
logger.finest("dialogEvent ENTER, cause is " + cause
+ " request ID = " + requestId);

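final Call call = VxmlCallData.getCallByRequestId(requestId);
if (call == null) {
// Unknown request ID: nothing to log, publish, or drop.
logger.warn("dialogEvent: no call found for request ID " + requestId);
return;
}
Integer count = VxmlCallData.getCounterByCall(call);
Date date = new Date(System.currentTimeMillis());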

if (names != null) {
logger.finest("names is not null");

final StringBuilder sb = new StringBuilder();
for (String key : names.keySet()) {
sb.append("\n\t" + key + " " + names.get(key));
}

String songName = sb.toString().replace("song%24.utterance ", "");
songName = songName
.replace("msml.dialog.exit msml.dialog.exit", "");
songName = songName.replaceAll("\n", "");
songName = songName.replaceAll("\r", "");
songName = songName.replaceAll("\t", "");
logger.finest("Received requestId = " + requestId
+ " for call = " + call + " ucid = " + call.getUCID()
+ " count = " + count + " namelist = " + songName
+ " cause = " + cause
);
logger.info(";" + date + ";" + songName + ";" + call.getUCID()
+ ";" + count + ";" + requestId);
logger.finest("Publishing event");
sendEvent(songName + "," + call.getUCID());
} else {
logger.info("names is null");
}

logger.finer("dialog event removing call=" + call
+ " requestId=" + requestId);
VxmlCallData.removeCall(requestId);

logger.finer("dialogEvent EXIT UUID is: " + requestId);

logger.finest("COUNT is " + count);

if (count < maxPrompts) {
final UUID newRequestId = voiceXMLDialog(call.getCallingParty());
logger.finest("starting dialog with request ID= "
+ newRequestId + " count=" + count + " call= " + call);

logger.finest("dialog event adding call=" + call
+ " requestId=" + newRequestId);
VxmlCallData.addCall(newRequestId, call);

count++;

logger.finer("dialog event removing counter call ="
+ call);
VxmlCallData.removeCounter(call);

logger.finer("dialog event adding counter call=" + call
+ " counter=" + count);
VxmlCallData.addCounter(call, count);

} else {
logger.fine(""
+ "Dropping call= " + call + " count = " + count);
logger.info(";" + date + ";Dropping call;" + call.getUCID() + ";"
+ count + "; ;");
call.drop();
}

}

@Override
public void mediaDetected(Participant partySendingMedia,
MediaType mediaTypeDetected) {
logger.finest("mediaDetected ENTER");
logger.finer(partySendingMedia.getDisplayName()
+ " mediaTypeDetected = " + mediaTypeDetected);
logger.finest("mediaDetected EXIT");
}

@Override
public void addParticipantFailed(Call call, Participant failedParticipant,
CallTerminationCause cause) {

}

private SpeechService getSpeechService() {
if (speechService == null) {
speechService = SpeechFactory.createSpeechService();
}
return speechService;
}

private int readNumberOfPromptsAttribute() {

String numPromptsStr = "1";
int numPrompts = 1;

ServiceDescriptor svc = ServiceUtil.getServiceDescriptor();
if (svc == null)
{
throw new IllegalStateException("Couldn't get service descriptor");
}
final ServiceData svcData = CollaborationDataFactory.getServiceData(svc.getName(), svc.getVersion());
try
{
numPromptsStr = svcData.getServiceAttribute("numPrompts");
numPrompts = Integer.parseInt(numPromptsStr);
}
catch(NumberFormatException nfe)
{
logger.error("NumberFormatException, so using default numPrompts = " + numPrompts);

}
catch (Exception e)
{
logger.error("Exception while getting service attribute numPrompts ", e);
}
logger.info("numPrompts is " + numPrompts);
return numPrompts;
}
}
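
The listener above relies on a VxmlCallData helper that is not included in this thread. Here is a minimal sketch of what such a helper might look like, assuming it only needs to map dialog request IDs to calls and calls to prompt counters. The name DialogRegistry is hypothetical. It is written with a generic type parameter and instance methods so that it compiles without the Breeze SDK, whereas the listener calls static methods on VxmlCallData.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for VxmlCallData; C would be the SDK's Call type.
final class DialogRegistry<C> {
    // Thread-safe maps, on the assumption that call and dialog callbacks
    // may arrive on different threads.
    private final Map<UUID, C> callsByRequestId = new ConcurrentHashMap<>();
    private final Map<C, Integer> countersByCall = new ConcurrentHashMap<>();

    void addCall(UUID requestId, C call) {
        // ConcurrentHashMap rejects nulls; a failed dialog start can yield a null ID.
        if (requestId != null && call != null) {
            callsByRequestId.put(requestId, call);
        }
    }

    C getCallByRequestId(UUID requestId) {
        return requestId == null ? null : callsByRequestId.get(requestId);
    }

    void removeCall(UUID requestId) {
        if (requestId != null) {
            callsByRequestId.remove(requestId);
        }
    }

    void addCounter(C call, int count) {
        if (call != null) {
            countersByCall.put(call, count);
        }
    }

    // Returns 0 for an unknown call, so callers never unbox a null.
    int getCounterByCall(C call) {
        return call == null ? 0 : countersByCall.getOrDefault(call, 0);
    }

    void removeCounter(C call) {
        if (call != null) {
            countersByCall.remove(call);
        }
    }

    public static void main(String[] args) {
        DialogRegistry<String> registry = new DialogRegistry<>();
        UUID requestId = UUID.randomUUID();
        registry.addCall(requestId, "call-1");
        registry.addCounter("call-1", 1);
        System.out.println(registry.getCallByRequestId(requestId)); // prints: call-1
        System.out.println(registry.getCounterByCall("call-1"));    // prints: 1
    }
}
```

Entries should be removed when the dialog completes or the call terminates. Otherwise the maps grow for the lifetime of the service.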

JoelEzell
Joined: Nov 15, 2013
Messages: 780
Offline
One other note: if you have Engagement Designer in your environment, it has an easy-to-use IVR Task. This task is ASR-capable, and it uses the API that Bob referenced.