MacKry
Joined: Jan 18, 2016
Messages: 12
Offline
Hi
I'm not familiar with ASR/TTS (I'm using pure DTMF apps), but I'm wondering if there is a way to force ASR to accept any speech input. Let's say I don't want to specify any grammar; I just would like to allow the customer to say whatever they want, then get a transcription of what the customer said from the ASR and process it further in the application to achieve a sort of natural language input. Is it possible?
IvanFontalvo
Joined: Mar 22, 2012
Messages: 61
Offline
Yes, it can be done, but you need more than just ASR; you need to integrate with a conversational system, something like Nuance Vocalizer or similar.


Regards.
RossYakulis
Joined: Nov 6, 2013
Messages: 2652
Offline
We have been doing some prototyping using Amazon Lex and Polly. In Lex, instead of building strict grammars, you build models with sample utterances. However, you only need a sample, not all possible utterances.

The basic model is to record what the user says with a small silence timeout (250-500 ms), then send the recording for analysis to a backend system like Lex or some other engine.
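
If it helps, here is a rough sketch of that last step using the AWS SDK for Java (v1) Lex runtime client. The bot name, alias, file name and the exact PCM content-type string are placeholders - check them against your own bot and the recording format your platform actually produces:

import com.amazonaws.services.lexruntime.AmazonLexRuntime;
import com.amazonaws.services.lexruntime.AmazonLexRuntimeClientBuilder;
import com.amazonaws.services.lexruntime.model.PostContentRequest;
import com.amazonaws.services.lexruntime.model.PostContentResult;

import java.io.FileInputStream;
import java.io.InputStream;

public class LexAudioSketch {
    public static void main(String[] args) throws Exception {
        // Region and credentials come from the default provider chain.
        AmazonLexRuntime lex = AmazonLexRuntimeClientBuilder.defaultClient();

        // The utterance captured by the IVR record node (assumed 8 kHz linear PCM here).
        try (InputStream audio = new FileInputStream("caller-utterance.raw")) {
            PostContentRequest request = new PostContentRequest()
                    .withBotName("MyBot")        // placeholder bot name
                    .withBotAlias("Prod")        // placeholder alias
                    .withUserId("session-1234")  // any stable per-caller id
                    // Must match the format of the recorded audio you send.
                    .withContentType("audio/lpcm; sample-rate=8000; sample-size-bits=16; channel-count=1; is-big-endian=false")
                    .withAccept("text/plain; charset=utf-8") // ask for a text reply instead of audio
                    .withInputStream(audio);

            PostContentResult result = lex.postContent(request);

            System.out.println("Transcript: " + result.getInputTranscript()); // what Lex heard
            System.out.println("Intent:     " + result.getIntentName());      // matched intent, if any
            System.out.println("Reply:      " + result.getMessage());         // what the bot says back
        }
    }
}

getInputTranscript() gives you the "dictation" style text of what the caller said, and getMessage() is whatever the Lex bot decided to answer, which you can then play back with TTS.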
MacKry
Joined: Jan 18, 2016
Messages: 12
Offline
Thanks. I assume from your answer that OD itself doesn't allow getting "natural language" voice input? It's not technically possible to get a full transcription of the voice input from the ASR engine - am I right?
RossYakulis
Joined: Nov 6, 2013
Messages: 2652
Offline
OD is based on VXML, and VXML does not directly support "dictation". The work I mentioned above does use OD, but instead of Nuance to process the voice we used AWS Lex. This should also be possible with other similar speech processors: you record the audio and send it via a web service to be processed.
MacKry
Joined: Jan 18, 2016
Messages: 12
Offline
Well, that's strange - is it a limitation of VoiceXML itself? I thought the flow was as follows:
-> customer's voice goes through AAEP to ASR engine
-> ASR translates speech to text
-> AAEP sends translated input to application server
-> application applies grammar and decides whether the input is valid or not
In that case the application would be able to process any user input. If it doesn't work like this, then how does speech recognition work?


IvanFontalvo
Joined: Mar 22, 2012
Messages: 61
Offline
The process is similar to what you said, but with some changes:

-> customer's voice goes through AAEP to the IVR app
-> according to the predefined grammar, the app validates whether it is a valid voice input (internally, the grammar is used to parse the voice to text)
-> if valid, go forward; if not, the app plays a message or similar

The application will not be able to process arbitrary user input, just the predefined input (grammars). To process any input, you need, as Ross said before, to include an external engine where you can process everything the user said: record the user input, send it via web service to the engine, and get the answer back from the engine.

Example: the user says "Hi, how's it going?", you send that to the web service, and the web service answers with "fine, and what about you?". You can prompt that with TTS, or if the web service can send you the audio directly, you can prompt that too.
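
A very small sketch of that round trip in plain Java (the endpoint URL and the plain-text request/response format are made up for illustration - a real engine such as Lex has its own SDK and protocol):

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class DialogWebServiceSketch {

    // POST the caller's transcribed text to a conversational backend and return its text reply.
    public static String askBackend(String userText) throws IOException {
        URL url = new URL("https://example.com/dialog"); // hypothetical endpoint
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "text/plain; charset=utf-8");

        try (OutputStream out = conn.getOutputStream()) {
            out.write(userText.getBytes(StandardCharsets.UTF_8));
        }

        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // e.g. caller said "Hi, how's it going?" -> backend might answer "fine, and what about you?"
        String reply = askBackend("Hi, how's it going?");
        System.out.println(reply); // hand this string to a TTS prompt in the IVR app
    }
}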

Regards
AADave
Joined: Jun 6, 2017
Messages: 1
Offline
Is there any further development with 'We have been doing some prototyping using Amazon Lex and Polly. In Lex, instead of building strict grammars, you build models with sample utterances. However, you only need a sample, not all possible utterances.'?

Very curious how this is proceeding.
RossYakulis
Joined: Nov 6, 2013
Messages: 2652
Offline
In 7.2.1 we have added a recording format that is compatible with Lex, and we have a demo app available on request that provides an example of using Lex/Polly. Without 7.2.1 you have to create a record page in VXML with the required recording format. In 8.0 we plan more official support.
TheMiker
Joined: Jan 23, 2018
Messages: 1
Offline
RossYakulis wrote: In 7.2.1 we have added a recording format that is compatible with Lex, and we have a demo app available on request that provides an example of using Lex/Polly. Without 7.2.1 you have to create a record page in VXML with the required recording format. In 8.0 we plan more official support.


How do I request the demo app?

WilsonYu
Joined: Nov 6, 2013
Messages: 3950
Offline
It comes with the 7.2.1 bundle after it's released in March.
Ivalberto
Joined: Sep 5, 2018
Messages: 45
Offline
Hi Wilson, what's the app with the demo?

I checked the Orchestration Designer Sample Applications.doc but I did not see a reference.

Thanks in advance.
WilsonYu
Joined: Nov 6, 2013
Messages: 3950
Offline
Please contact our field consultant, Tore Christensen.
Ivalberto
Joined: Sep 5, 2018
Messages: 45
Offline
Thanks Wilson. Another question:

you said "With out 7.2.1 you have to create a record page in vxml to the recording format." , i´m working with a old version of OD the 7.0.x

Can I create the audio file with the new format on that version? If yes, how can I do that?

Best regards.
WilsonYu
Joined: Nov 6, 2013
Messages: 3950
Offline
I don't think you really need to create the VXML manually. You can override the updateRecord method in the Java class for the existing Record node to pass in the special audio type. For example:

public void updateRecord(Record record, SCESession mySession) {
    ....

    // You would copy some code from the existing getRecord method.
    // The main idea is to pass in the type "audio/L16;rate=8000" so the
    // recording is captured in a format the Lex integration can consume.
    record = new com.avaya.sce.runtime.Record("Talk", false, "60s", true, "2s", true,
            "audio/L16;rate=8000", com.avaya.sce.runtime.Record.RECORDMODE_AUDIO,
            promptNames, grammarInfo, events);

    super.updateRecord(record, mySession);
}