Overview
The DynGrammar Control is used with the Nuance speech recognition engine.
The DynGrammar control allows you to manipulate (select, load and create) Nuance Dynamic Grammars at runtime, from your program or via a process of voice-based enrolment. The grammars are then used for recognition in a VoiceRec control.
This control has four input nodes:
- Select: selects a dynamic grammar to be inserted into a static grammar placeholder. The VoiceRec control may then be used to recognize on the selected grammar.
- Add Text Phrase: adds text to a dynamic grammar database.
- Remove Phrase: removes a specified phrase from a dynamic grammar database.
- Enroll: starts an enrolment session, which collects sample utterances until the enrolment process generates a consistent pronunciation. Once a phrase has been enrolled, it is immediately compiled and ready for recognition.
Important: This section refers heavily to the Nuance technology called Dynamic Grammars and expects the reader to understand the relevant concepts, configuration, and architecture. Please review the Nuance documentation before proceeding. You also need to be familiar with the VoiceRec control, which is used in tandem with the DynGrammar control for speech recognition.
Dynamic Grammar
Nuance Dynamic Grammars differ from regular static grammars mainly in that they are not fixed at design time. This type of grammar can be created and modified dynamically by an application at runtime. Dynamic grammars should be used when the complete set of items to be recognized cannot be determined until runtime. Dynamic grammars are created and stored in a database along with a unique, user-provided ID, and are ready to be used immediately.
Enrolment Session
A user enrolls a new phrase during an enrolment session. A session consists of a number of utterance enrolments, ending when the number of consistent pronunciations is reached, the maximum error count is reached or when the application decides to abort. Typically, at least two consistent pronunciations are required (the minimum number may be specified by setting the MinNumConsistentProns property).
During the enrolment session, the application plays the EnrollGreeting (asking the user to speak the new phrase) and waits for an utterance. After the spoken utterance is processed by the recognition server, an EnrollUtteranceDone event is fired.
Once a robust pronunciation is obtained, the EnrollSessionDone event is fired. If the application does not abort, the phrase is committed to the grammar (along with other information such as its natural language interpretations and its probability). If the commit succeeds, the ConfirmGreeting is played and the call exits through the OK node. If the commit fails (for example, because of a large number of clashes), the InvalidGreeting is played and the call exits through the BadDigits node.
If a digit is pressed during enrolment, the enrolment session is aborted, the call exits through the DTMF node and the digit is returned in the Words property.
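The session flow described above can be summarized as a small simulation. This is only an illustrative sketch written in Python, not VBVoice or Nuance code: the function name, the "consistent"/"inconsistent" labels, the `max_errors` parameter and the outcome names "MaxErrors" and "Incomplete" are all hypothetical. It assumes the session stops as soon as the error count reaches the maximum, and it does not model what happens next on that path or on an application abort.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class SessionResult:
    outcome: str             # "OK", "BadDigits", "DTMF", "MaxErrors" or "Incomplete"
    greeting: Optional[str]  # greeting played on exit, if the text names one
    words: str = ""          # digit returned in Words on a DTMF exit


def run_enrolment(utterances: Iterable[str],
                  min_consistent_prons: int = 2,
                  max_errors: int = 3,
                  commit_ok: bool = True) -> SessionResult:
    """Simulate one enrolment session over a scripted list of caller inputs.

    Each input is "consistent", "inconsistent", or a single digit key press.
    """
    good = 0
    errors = 0
    for utterance in utterances:
        if len(utterance) == 1 and utterance.isdigit():
            # A key press aborts the session; the digit is returned in Words.
            return SessionResult("DTMF", None, utterance)
        if utterance == "consistent":
            good += 1
        else:
            errors += 1
        if errors >= max_errors:
            # Assumed: the session ends once the maximum error count is reached.
            return SessionResult("MaxErrors", None)
        if good >= min_consistent_prons:
            # Enough consistent pronunciations: try to commit the phrase.
            if commit_ok:
                return SessionResult("OK", "ConfirmGreeting")
            return SessionResult("BadDigits", "InvalidGreeting")
    return SessionResult("Incomplete", None)


result = run_enrolment(["consistent", "inconsistent", "consistent"])
print(result.outcome, result.greeting)
```

With the default minimum of two consistent pronunciations, the scripted session above commits after the third utterance.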
Note that the pronunciations generated through enrolment are heavily dependent on characteristics of both the utterance and the speaker. Therefore, pronunciations obtained through enrolment will not necessarily be accurate pronunciations for the same utterance when spoken by a different user. Typically, your applications should be designed so that pronunciations generated through voice enrolment are used to recognize utterances only from the speaker who enrolled them.
Errors and Silence - (Invalid digits, No digits handling)
If the recognition fails during enrolment (for example, no speech is received), the control allows the user to try again up to a preset number of attempts. The number of retries on error and the number of retries on silence can be set at design time. If the error count exceeds the number of retries set, the DynGrammar control will perform error handling as described below.
If the error count has not been exceeded and an invalid word has been received, VBVoice plays the EnrollGreeting and increments the error count. If no words have been received, VBVoice plays the SilenceGreeting and increments the error on silence count. The default SilenceGreeting is empty. If retry on silence is set to False, the call exits through the NoDigits node after the first silence error.
If the recognition fails due to no speech and the silence retry count is exceeded, the call is transferred to the control connected to the NoDigits connection. If the recognition fails for other reasons, the call is transferred to the control connected to the Invalid Digits connection, if this has been enabled using the Use default error handler check box. If the Invalid Digits and No Digits nodes have not been set, the control attempts to invoke one of the default error handlers.
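The retry rules above can be written as a single decision function. The sketch below is a Python model for illustration only, not part of VBVoice: the function name, the action strings and the parameter names are hypothetical. It assumes a count is "exceeded" when it becomes greater than the configured number of retries, and it leaves out the fallback to the default error handlers.

```python
def handle_failure(kind, error_count, silence_count, *,
                   max_retries, silence_retries, retry_on_silence):
    """Decide what happens after one failed recognition attempt.

    kind is "silence" (no speech received) or "invalid" (any other failure).
    Returns (action, error_count, silence_count) with the updated counters.
    """
    if kind == "silence":
        silence_count += 1
        if not retry_on_silence or silence_count > silence_retries:
            # Retry disabled, or silence retry count exceeded.
            return "exit:NoDigits", error_count, silence_count
        return "play:SilenceGreeting", error_count, silence_count

    error_count += 1
    if error_count > max_retries:
        # Error retry count exceeded.
        return "exit:InvalidDigits", error_count, silence_count
    return "play:EnrollGreeting", error_count, silence_count


# First silence with retries enabled: the SilenceGreeting is played.
print(handle_failure("silence", 0, 0,
                     max_retries=3, silence_retries=2, retry_on_silence=True))
```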
Example
Use of this control is shown in the example VBVoice project DynGram.
Licensing
See VoiceRec Control.
Initial Setup Properties
BeepBeforeSpeech
(Boolean)
See BeepBeforeSpeech.
ClearDigits
(Boolean)
See ClearDigits.
DBAuth
(String)
This property contains the username and password needed to connect to a relational database. The string should be in the username:password format.
DBFormat
(String)
This property contains a string identifying the data types for the database provider. Your database provider should support variable length binary data that can be fetched and written piece by piece. For example, Microsoft SQL Server 7.0 supports IMAGE data types. If you do not specify this option, the data type LONG RAW is used by default, which may not be the right data type for your database provider.
DBName
(String)
This property contains the name of the database, either file system based or a relational database.
DBProvider
(String)
This property contains a string identifying the database provider. The supported values are fs for file system, oci for Oracle and odbc for Microsoft SQL Server 7.0.
DBRoot
(String)
This property contains a string identifying the database root directory. Used for file system only.
DBServer
(String)
This property contains a string identifying the database alias used to connect to the database over the network. Used for relational databases (ODBC or Oracle) only.
DisableHelp
(Boolean)
Set to TRUE to disable the help digit handler. If not set (default), then if a help digit is detected (as defined in the LineGroup control), the call transfers to either the control set in the Connections property page or the LineGroup help digit output. See Help Digit. This property can be set in the Terminations page.
DisconnectControl
(String)
See Responding to Caller Hangup.
GlobalToneControl
(String)
See Global Tone Handling.
GrammarFile
(String)
GrammarName
(String)
See GrammarName.
HelpDigitControl
See Help Digit.
IBargeIn
(Boolean)
See IBargeIn.
IDBKey
(String)
This property contains a string used to identify a dynamic grammar in the specified database.
ILabel
(String)
This property contains a string specifying the point of insertion in the static grammar. Please refer to the Nuance documentation for an explanation of labels and insertion points.
IMaxSil
(Integer)
See IMaxSil.
IMaxKeys
(Integer)
Not used.
IMinNumConsistentProns
(Integer)
This property contains an integer specifying the minimum number of consistent pronunciations needed for a valid enrolment session.
IInsertDuration
(Enumeration)
This property contains an enumeration specifying how long the insertion of the dynamic grammar persists (how long to recognize). If set to vbvInsertPerCall, the inserted dynamic grammar is cleared after a hang-up occurs. If set to vbvInsertPermanent, the setting persists until the next call to InsertDynamicGrammar with the same label.
InvalidErrorControl
(String)
See Invalid Digit, No Digits and Silence Timeout.
IRecordDirectory
(String)
See IRecordDirectory.
IRecordFilename
(String)
The name of the file containing the valid utterances. This setting is used when IRecordUtterance is checked.
IRecordUtterance
(Boolean)
See IRecordUtterance.
IReleaseEngineOnExit
(Boolean)
See IReleaseEngineOnExit.
ITermDtmf
(Integer)
Not used.
MaxRetries
(Integer)
See MaxRetries.
NoDigitsErrorControl
(String)
See Invalid Digit, No Digits and Silence Timeout.
NumRetriesOnSilence
(Integer)
This property sets the maximum number of retries on silence before the error handler is invoked. This property can be set in the Setup property page.
RetryOnSilence
(Boolean)
See RetryOnSilence.
UseDefaultError
(Boolean)
See UseDefaultError.
Runtime Properties
BargeIn
(Channel as Integer)Boolean
See BargeIn.
CompilationConfig
(Channel as Integer)String
This property contains a string specifying the name of a running compilation-server process. May be NULL if only one recognition package and no named compilation options have been specified. Please refer to the Nuance documentation for an explanation of Compilation Config parameters.
DBKey
(Channel as Integer)String
This property contains a string used to identify a dynamic grammar in a specified database.
GotoNode
(Integer)
This property will transfer a call to another control. See GotoNode.
InsertDuration
(Channel as Integer)Enumeration
This property contains an enumeration specifying how long the insertion of the dynamic grammar persists (how long to recognize). If set to vbvInsertPerCall, the inserted dynamic grammar is cleared after a hang-up occurs. If set to vbvInsertPermanent, the setting persists until the next call to InsertDynamicGrammar with the same label.
Label
(Channel as Integer)String
This property contains a string specifying the point of insertion in the static grammar. Please refer to the Nuance documentation for an explanation of labels and insertion points.
MaxKeys
(Channel as Integer) Integer
See IMaxKeys.
MaxSil
(Channel as Integer) Integer
See MaxSil.
MinNumConsistentProns
(Channel as Integer)String
This property contains an integer specifying the minimum number of consistent pronunciations needed for a valid voice enrolment session (default is 2).
NuanceGrammar
(Channel as Integer)String
See NuanceGrammar.
OKToOverWrite
(Channel as Integer)Boolean
This property contains a Boolean. When set to True, any grammar that already exists at the specified key is overwritten. If you specify False and the given record already exists, the operation fails and the database is unchanged.
PhraseID
(Channel as Integer)String
This property contains a string. You select the value of this string. To delete this phrase or to retrieve its contents, refer to it by this ID. Multiple phrases can have the same ID, in which case a delete operation will delete all of them.
WARNING: This string may not contain white space or capital letters! Use underscore instead.
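Because the restriction in the warning above is easy to violate, it can help to check phrase IDs before they are used. The helpers below are a hedged sketch in Python, shown only to illustrate the rule; the function names are hypothetical and are not part of VBVoice. They implement exactly the stated rule (no white space, no capital letters) and nothing more.

```python
def is_valid_phrase_id(phrase_id: str) -> bool:
    """True if the ID contains no white space and no capital letters."""
    return not any(ch.isspace() or ch.isupper() for ch in phrase_id)


def to_phrase_id(label: str) -> str:
    """Lower-case a label and replace runs of white space with underscores."""
    return "_".join(label.lower().split())


print(is_valid_phrase_id("city_boston"))   # valid ID from the examples in this document
print(is_valid_phrase_id("City Boston"))   # contains a capital letter and a space
print(to_phrase_id("New York City"))
```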
PhraseNL
(Channel as Integer)String
This property contains a string specifying the natural language statement to execute when this phrase is recognized. May be NULL. Please refer to the Nuance documentation for an explanation of Natural Language.
PhraseText
(Channel as Integer)String
This property contains a string specifying the initial contents of a dynamic grammar.
Probability
(Channel as Integer)String
This property contains a double specifying the probability to apply to this branch of the grammar. The probabilities are normalized to sum to one at compilation time, so use the default 1 if you want all branches to have the same probability.
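To make the effect of normalization concrete, the following Python sketch scales a set of branch probabilities so that they sum to one. It is an illustration only: it assumes plain proportional scaling, which is the simplest reading of "normalized to sum to one", and the function name and phrase IDs are hypothetical. The actual computation is performed by the Nuance compiler.

```python
def normalize(probabilities: dict) -> dict:
    """Scale branch probabilities proportionally so that they sum to one."""
    total = sum(probabilities.values())
    if total <= 0:
        raise ValueError("at least one probability must be positive")
    return {phrase_id: value / total for phrase_id, value in probabilities.items()}


# Leaving every branch at the default of 1 gives each branch the same share.
print(normalize({"city_boston": 1.0, "city_chicago": 1.0}))

# A branch given 3 is three times as likely as a branch left at 1.
print(normalize({"city_boston": 3.0, "city_chicago": 1.0}))
```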
RecordDirectory
(Channel as Integer)String
See RecordDirectory.
RecordFilename
(Channel as Integer)String
See RecordFilename.
ReleaseEngineOnExit
(Channel as Integer)Boolean
See ReleaseEngineOnExit.
Words
(Channel as Integer) String
See Words.
Greetings
ConfirmGreeting
This greeting is played after a successful enrolment session (the phrase has been added to the Dynamic grammar).
EnrollGreeting
This greeting is played after each utterance enrolment. After playing this greeting, another utterance enrolment is started.
EntryGreeting
The EntryGreeting is only ever played when entering the DynGrammar control via the Enroll entry node. This greeting is used to prompt the caller to say a phrase. For example, "Please say your name."
InvalidGreeting
This greeting is played after an enrolment session failure.
SilenceGreeting
This greeting is played if a response is not heard from the caller. After playing this greeting, the EnrollGreeting is played and a new utterance enrolment is started.
UnrecognizedGreeting
Not used.
Methods
All the methods need a recognition engine, so they should be called only after the system has started and at least one Nuance engine has been created. The first parameter is the number of the channel to be used. If the methods are used during a call, in a control's event, it is recommended to use the current channel as the channel parameter.
AddTextPhrase
The syntax is:
AddTextPhrase(channel as Integer, DBKey as String, Phrase_ID as String, Phrase_Text as String, Phrase_NL as String, Probability as Single, Compilation_Config as String) As Integer
This method adds a single text phrase to an existing dynamic grammar in the database.
The DBKey parameter is the string used to identify the dynamic grammar in the specified database. Phrase_ID is an identifier that you choose for the phrase and use to refer to it later. Phrase_Text is the actual contents of the grammar (can be a GSL expression). Phrase_NL is the natural language statement to execute when this phrase is recognized (may be NULL). Probability specifies the probability to apply to this branch of the grammar. The probabilities are normalized to sum to one at compilation time, so use the default 1 if you want all branches to have the same probability. Compilation_Config is the name of a running compilation-server process. May be NULL if only one recognition package and no named compilation options have been specified. Returns 0 if successful or nonzero if unsuccessful.
EXAMPLE
To add the city Boston to the dynamic grammar with the DBKey Cities, using the phrase ID city_boston and a probability of 1.0:
' Empty strings are passed for Phrase_NL and Compilation_Config.
result = DynGrammar1.AddTextPhrase(channel, "Cities", "city_boston", "boston", "", 1.0, "")
DeleteDynGrammar
The syntax is:
DeleteDynGrammar(channel as Integer, DBKey as String) As Integer
This method attempts to delete a dynamic grammar from the database.
The DBKey parameter is the string that identifies the dynamic grammar to delete from the specified database. Returns 0 if successful or nonzero if unsuccessful. Note that all phrases stored under DBKey are deleted from the dynamic grammar database when you call this method.
EXAMPLE
To delete the dynamic grammar with the DBKey Cities:
result = DynGrammar1.DeleteDynGrammar(channel, "Cities")
NewDynGrammar
The syntax is:
NewDynGrammar (channel as Integer, DBKey as String, GSL_Expression as String, Compilation_Config as String, OkToOverWrite as Boolean) As Integer
This method attempts to create a new dynamic grammar and add it to an open database. It creates either an empty dynamic grammar or a dynamic grammar containing a specified GSL expression, depending on the contents of GSL_Expression. If a GSL expression is specified, you cannot modify or extend the created grammar; instead, delete it and create a new one with the same key.
The DBKey parameter is the string used to identify the dynamic grammar in the specified database. The GSL_Expression parameter is a string specifying the initial contents of the dynamic grammar; it can be a GSL expression or an empty string. Compilation_Config is the name of a running compilation-server process. May be NULL if only one recognition package and no named compilation options have been specified. The OkToOverWrite parameter indicates whether or not to overwrite any grammar already existing at the specified key. Returns 0 if successful or nonzero if unsuccessful.
EXAMPLE
To create a new, empty dynamic grammar with the DBKey Cities, overwriting any grammar that already exists at that key:
result = DynGrammar1.NewDynGrammar(channel, "Cities", "", "", True)
RemovePhrase
The syntax is:
RemovePhrase(channel as Integer, DBKey as String, Phrase_ID as String, Compilation_Config as String) As Integer
This method removes a phrase from a dynamic grammar. You can remove any phrase as long as it was not added via a GSL expression. Note that if the specified phrase ID is associated with multiple phrases, all those phrases are removed.
The DBKey parameter is the string used to identify the dynamic grammar in the specified database. Phrase_ID is the identifier of the phrase to remove. Compilation_Config is the name of a running compilation-server process. May be NULL if only one recognition package and no named compilation options have been specified. Returns 0 if successful or nonzero if unsuccessful.
EXAMPLE
To remove the phrase identified by city_boston from the dynamic grammar with the DBKey Cities:
result = DynGrammar1.RemovePhrase(channel, "Cities", "city_boston", "")
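The rules stated for NewDynGrammar, AddTextPhrase, RemovePhrase and DeleteDynGrammar interact, so a small in-memory model may help when reasoning about return codes. The Python class below is only a sketch for illustration and is not the VBVoice API: the class and method names are hypothetical, the failure code 1 is an arbitrary stand-in for "nonzero", and treating the removal of an unknown phrase ID as a failure is an assumption. The channel, Phrase_NL, Probability and Compilation_Config arguments are omitted.

```python
class DynGrammarModel:
    """Toy in-memory stand-in for a dynamic grammar database."""

    OK = 0
    FAILED = 1  # the methods are documented only as returning "nonzero"

    def __init__(self):
        # DBKey -> {"gsl": str, "phrases": list of (phrase_id, phrase_text)}
        self._grammars = {}

    def new_dyn_grammar(self, db_key, gsl_expression="", ok_to_overwrite=False):
        if db_key in self._grammars and not ok_to_overwrite:
            return self.FAILED  # the existing record is left unchanged
        self._grammars[db_key] = {"gsl": gsl_expression, "phrases": []}
        return self.OK

    def add_text_phrase(self, db_key, phrase_id, phrase_text):
        grammar = self._grammars.get(db_key)
        # A grammar created from a GSL expression cannot be modified or extended.
        if grammar is None or grammar["gsl"]:
            return self.FAILED
        grammar["phrases"].append((phrase_id, phrase_text))
        return self.OK

    def remove_phrase(self, db_key, phrase_id):
        grammar = self._grammars.get(db_key)
        if grammar is None:
            return self.FAILED
        kept = [p for p in grammar["phrases"] if p[0] != phrase_id]
        if len(kept) == len(grammar["phrases"]):
            return self.FAILED  # assumption: an unknown phrase ID is an error
        grammar["phrases"] = kept  # every phrase sharing the ID is removed
        return self.OK

    def delete_dyn_grammar(self, db_key):
        if db_key not in self._grammars:
            return self.FAILED
        del self._grammars[db_key]  # all phrases under the key go with it
        return self.OK

    def phrases(self, db_key):
        return list(self._grammars[db_key]["phrases"])


store = DynGrammarModel()
store.new_dyn_grammar("Cities", ok_to_overwrite=True)
store.add_text_phrase("Cities", "city_boston", "boston")
print(store.phrases("Cities"))
```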
TakeCall
See TakeCall.
Events
Disconnect
See Disconnect Event.
EnrollSessionDone
Sub xx_EnrollSessionDone(ByVal channel As Integer, Abort As Boolean)
This event occurs after the EnrollUtteranceDone event and before the Exit event. The Abort parameter allows you to end the current enrolment session without adding the enrolment contents to a database. To abort the session, set this value to True.
EnrollUtteranceDone
Sub xx_EnrollUtteranceDone(ByVal channel As Integer, ByVal NumGoodRepetitions As Integer, ByVal NumRepetitionsStillNeeded As Integer, ByVal NumClashes As Integer, Abort As Boolean)
This event occurs after every enrolment of an utterance and before the EnrollSessionDone event. The NumGoodRepetitions parameter is the number of good repetitions obtained so far. The NumRepetitionsStillNeeded parameter is the number of good repetitions still needed for a valid enrolment session (default is set to 2). The NumClashes parameter is the number of clashes with other utterances in the same grammar. The Abort parameter allows you to abort the enrolment session by setting its value to True.
Enter
See Enter, EnterB Events.
Exit
See Exit Event.
NoLicenseAvailable
See NoLicenseAvailable.
PhraseError
See PhraseError Event.
PlayRequest
See PlayRequest Event.
VoiceError
See VoiceError Event.
DynamicGrammar Terminations Property Page
Use default error handler
(UseDefaultError property)
This check box is set by default. When the maximum number of retries has been exceeded, the system checks for an error handling control or a connection on the LineGroup error output. If neither is present, the ERROR.WAV file is played and the call is terminated. If this check box is not checked, standard error and silence processing is used. See Global Events.
Retry on silence
(RetryOnSilence property)
This check box is set by default. If this box is checked, silence time-out events use the retry on silence count. If this box is unchecked, a silence time-out causes the call to exit via the silence output immediately, regardless of the number of retries set. Other errors increment the error retry count as usual.
Clear digits on entry
(ClearDigits property)
Check this check box if you want to clear all previously collected digits from the VBVoice digit buffer on entry to the control.
Termination Conditions
Maximum silence
(IMaxSil property)
This field specifies the number of seconds that DynGrammar will wait for a word. If a word is not received in that time, recognition is terminated.
Number of retries on error
(Retry OnError property)
This field specifies the number of invalid or unrecognized recognition attempts, or silence errors that can occur before DynGrammar passes the call to the NoDigits or Invalid nodes, or invokes the default error handler.
Maximum time for speech
This field specifies the maximum time for which the control will listen to and analyze speech. After this time the control will stop listening and attempt to analyze the speech heard up to this point. The default 0 means that there is no maximum time.
DynamicGrammar Setup Property Page (Nuance)
Beep before speech
(BeepBeforeSpeech property)
If this box is checked, a beep will be played before starting recognition.
Insert Per Call
(InsertDuration property)
This property specifies how long the insertion of the dynamic grammar persists. If checked, the inserted dynamic grammar is cleared after a hang-up occurs. If this box is not checked, the setting persists until the next call to InsertDynamicGrammar with the same label. The same functionality is available to the application at runtime by setting the InsertDuration property.
Grammar To Load
(GrammarName property)
This field provides the name of the grammar to load from the Grammar File. If no name is set, the default grammar is used.
Label
(Label property)
This field provides the name of the label specifying the point of insertion in the static grammar. Please refer to the Nuance documentation for an explanation of labels and insertion points.
DBKey
(DBKey property)
This field provides the name specifying a dynamic grammar in a specified database.
Minimum Number of Consistent Pronunciations
(MinNumConsistentProns property)
This field specifies the minimum number of consistent pronunciations needed for a valid voice enrolment session (default is 2).
Maximum Number of Retries
(MaxRetries property)
This field provides the maximum number of retries before the error handler (No digits or Invalid digit) is invoked (default is 3). See also RetryOnSilence.
DynamicGrammar Database Setup Property Page (Nuance)
DB Provider
(DBProvider property)
This property contains a string identifying the database provider. The only supported values are fs for file system, oci for Oracle and odbc for Microsoft SQL Server 7.0.
DB Name
(DBName property)
This field contains the name of the database, either file system based or relational database used for the Nuance Dynamic Grammar.
DB Server
(DBServer property)
This field contains the database alias used to connect to the database over the network. Used for relational databases (ODBC or Oracle) only.
DB Auth
(DBAuth property)
This field contains the username and password needed to connect to a relational database. The string should be in username:password format.
DB Format
(DBFormat property)
This property contains a string identifying the data types for the database provider. Your database provider should support variable length binary data that can be fetched and written piece by piece. For example, Microsoft SQL Server 7.0 supports IMAGE data types. If you do not specify this option, the data type LONG RAW is used by default, which may not be the right data type for your database provider.
DB Root
(DBRoot property)
This field contains the name identifying the database root directory. Used for File System only.
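The way the database fields depend on the provider can be expressed as a simple consistency check. The Python sketch below is illustrative only and is not part of VBVoice: the function name and warning texts are hypothetical, and the example values such as orcl and scott:tiger are placeholders. It encodes only what this page states: the three provider values, DBRoot applying to the file system provider, DBServer applying to relational databases, and the username:password format of DBAuth.

```python
SUPPORTED_PROVIDERS = {
    "fs": "file system",
    "oci": "Oracle",
    "odbc": "Microsoft SQL Server 7.0",
}


def check_db_settings(provider, root="", server="", auth=""):
    """Return a list of warnings for settings that do not fit the provider."""
    warnings = []
    if provider not in SUPPORTED_PROVIDERS:
        warnings.append(f"DBProvider '{provider}' is not one of fs, oci, odbc")
        return warnings
    if provider == "fs":
        if server:
            warnings.append("DBServer is used for relational databases only")
    else:
        if root:
            warnings.append("DBRoot is used for the file system provider only")
        if auth:
            user, separator, _password = auth.partition(":")
            if not separator or not user:
                warnings.append("DBAuth should be in username:password format")
    return warnings


print(check_db_settings("oci", server="orcl", auth="scott:tiger"))
print(check_db_settings("fs", server="orcl"))
```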
DynamicGrammar Nuance Property Page (Nuance)
Barge-in
(IBargeIn property)
If checked, the prompts will be stopped when the speech starts. This check box sets the IBargeIn parameter.
Release engine on exit
(ReleaseEngineOnExit property)
If checked, the Nuance recognition engine is released on exit from the control. It is recommended to release the engine when leaving the last DynGrammar control, when no more voice recognition is needed for the rest of the call. For best performance, the engine should be allocated for the duration of the call (AllocEnginePerCall=1 in VBVOICE.INI, Nuance section) and released when no more recognition is needed.
Record recognized utterances
(IRecordUtterance property)
If checked, recognized utterances are recorded.
Recording directory
(IRecordDirectory property)
This field sets the directory where the recognized utterances are to be saved.
Recording filename
(IRecordFilename property)
This field sets the name of the file containing the recognized utterances.