Using Text-to-Speech in VBVoice

Supported Engines

TTS engines vary widely in performance, price, and available languages. The quality of the generated voice is also highly subjective, and how it is perceived depends heavily on the application. Even different languages within one product family from a single vendor may perform quite differently.

Therefore, to meet the wide range of customer expectations, VBVoice is integrated with a number of engines:

Depending on the engine, different integration technologies have been used to achieve optimum voice quality and performance. MRCP support uses MRCPv2, which is SIP-based, while all other engines are integrated through their native APIs.

For an up-to-date list of the supported languages and versions for each TTS engine, please check the VBVoice Release Notes.

Licensing

Text-to-speech is licensed by the number of concurrent TTS sessions. The total number of sessions used on the network is controlled by the Runtime Manager (RTM). At startup, every VBVoice machine on the network contacts the RTM to negotiate its own limit of concurrent TTS sessions. Once started, all channels on that machine dynamically share the pool of TTS sessions, up to the negotiated maximum. The number of sessions that the machine requests during startup is the value of the INI setting NumberOfEngines in [TTS] (default is 1) or the number of channels defined in all linegroup controls in the system, whichever is less.
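The startup request described above can be sketched as a small calculation. This is only an illustration in Python, not VBVoice code: the function name requested_tts_sessions and the way the channel counts are passed in are assumptions made for the example. Only the [TTS] section, the NumberOfEngines setting, and its default of 1 come from the text above.

```python
import configparser

DEFAULT_NUMBER_OF_ENGINES = 1  # documented default for NumberOfEngines in [TTS]


def requested_tts_sessions(ini_text, linegroup_channels):
    """Return the number of TTS sessions a machine would request at startup.

    ini_text           -- contents of the INI file (hypothetical input for this sketch)
    linegroup_channels -- channel count of each linegroup control (hypothetical input)
    """
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # Fall back to the documented default when the section or key is missing.
    configured = parser.getint("TTS", "NumberOfEngines",
                               fallback=DEFAULT_NUMBER_OF_ENGINES)
    total_channels = sum(linegroup_channels)
    # The request is whichever of the two values is less.
    return min(configured, total_channels)


ini_text = """
[TTS]
NumberOfEngines=8
"""

# Two linegroups with 4 and 2 channels: 6 channels in total, so 6 sessions are requested.
print(requested_tts_sessions(ini_text, [4, 2]))
```

With NumberOfEngines left unset, the same two linegroups would request a single session, so on systems that play SayText phrases on many channels at once the setting usually needs to be raised explicitly.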

In VBVoice, a TTS session is associated with playing a single TTS phrase. When a call starts to play a greeting containing a SayText phrase, one TTS session is used for the duration of the play. The MRCP engine is an exception to this rule: the session is created the first time it is needed in a call and is kept until that call ends. This avoids the high overhead of repeatedly creating and ending SIP sessions.
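The difference between the two session lifetimes can be made concrete with a rough model. The Python sketch below is not part of VBVoice; the function name, the "mrcp" and "native" labels, and the timing inputs are all assumptions chosen for illustration. It estimates how long one call keeps a TTS session occupied under each rule.

```python
def session_hold_seconds(engine, call_seconds, phrase_plays):
    """Estimate how long one call holds a TTS session.

    engine       -- "mrcp" or "native" (labels invented for this sketch)
    call_seconds -- total length of the call
    phrase_plays -- list of (start, duration) pairs, one per SayText play,
                    measured in seconds from the start of the call
    """
    if not phrase_plays:
        return 0  # no SayText phrase was played, so no session was needed
    if engine == "mrcp":
        # Session is created on first use and kept until the call ends.
        first_start = min(start for start, _ in phrase_plays)
        return call_seconds - first_start
    # Native engines: a session is held only while each phrase plays.
    return sum(duration for _, duration in phrase_plays)


plays = [(5, 3), (30, 4)]  # two SayText plays in a 60-second call
print(session_hold_seconds("native", 60, plays))
print(session_hold_seconds("mrcp", 60, plays))
```

Under these assumptions the MRCP call occupies its session for most of the call, which is worth keeping in mind when sizing the number of concurrent sessions.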

If the system tries to play a greeting containing a SayText phrase when the maximum number of concurrent TTS sessions is already in use, a VoiceError event occurs with an error code of 252. In this event, the TakeCall method can be used to route the call to another control.
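The behavior when the pool is exhausted can be modeled as follows. This is a toy simulation in Python, not the VBVoice API: the class TtsSessionPool, the function play_greeting, and the fallback control name are hypothetical, and the string returned on failure merely stands in for handling the VoiceError event and calling TakeCall. Only the error code 252 is taken from the text above.

```python
TTS_SESSIONS_EXCEEDED = 252  # error code reported with the VoiceError event


class TtsSessionPool:
    """Toy model of one machine's negotiated pool of TTS sessions."""

    def __init__(self, negotiated_max):
        self.negotiated_max = negotiated_max
        self.in_use = 0

    def acquire(self):
        """Take a session. Return None on success, or 252 if the pool is full."""
        if self.in_use >= self.negotiated_max:
            return TTS_SESSIONS_EXCEEDED
        self.in_use += 1
        return None

    def release(self):
        """Give a session back when the play ends."""
        if self.in_use > 0:
            self.in_use -= 1


def play_greeting(pool, channel, fallback_control):
    """Try to start a SayText play; route the call elsewhere if no session is free."""
    error = pool.acquire()
    if error == TTS_SESSIONS_EXCEEDED:
        # Stand-in for handling VoiceError and routing the call with TakeCall.
        return f"channel {channel}: routed to {fallback_control}"
    return f"channel {channel}: playing SayText"


pool = TtsSessionPool(negotiated_max=1)
print(play_greeting(pool, 1, "RecordedGreeting"))  # gets the only session
print(play_greeting(pool, 2, "RecordedGreeting"))  # pool is full, call is rerouted
```

A fallback control that plays a prerecorded greeting instead of a SayText phrase is one plausible routing target, since it does not need a TTS session.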

System Phrases

TTS is used by VBVoice to say text when it encounters the System Phrase type SayText. Any text specified in the phrase definition will be vocalized by the Text-to-Speech engine. This text can be set at runtime using the Greeting and Phrase properties, and it can also include properties from other VBVoice controls using the %% syntax.