Aldebaran documentation What's new in NAOqi 2.4.3?

ALSpeechRecognition API


Namespace : AL

#include <alproxies/alspeechrecognitionproxy.h>

Methods

std::vector<std::string> ALSpeechRecognitionProxy::getAvailableLanguages()

Returns the list of the languages currently installed on the system.

Example: [‘French’, ‘Chinese’, ‘English’, ‘German’, ‘Italian’, ‘Japanese’, ‘Korean’, ‘Portuguese’, ‘Spanish’]

Returns: List of installed languages (language names are given in English)
std::string ALSpeechRecognitionProxy::getLanguage()

Returns the language currently used by the speech recognition system.

Example: ‘French’

The value is one of the available languages.

For further details, see: ALSpeechRecognitionProxy::getAvailableLanguages.

Returns: Current language used by the speech recognition engine
void ALSpeechRecognitionProxy::setLanguage(const std::string& language)

Sets the language used by the speech recognition system. The setting does not persist: each NAOqi restart resets it to the default language, which can be set on NAO’s web page.

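Parameters:
  • language – Name of the language to use. It should be one of the available languages; see ALSpeechRecognitionProxy::getAvailableLanguages.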
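
Since the language has to be one of the installed ones, it can be worth checking before switching. The following is a minimal sketch, not the SDK's own code: `trySetLanguage` is a hypothetical helper name, and the function is written as a template so that it compiles without the NAOqi SDK. `Proxy` is assumed to be `AL::ALSpeechRecognitionProxy` or any object exposing the two methods documented above.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: switches the recognition language only if it is installed.
// Proxy is assumed to expose getAvailableLanguages() and setLanguage()
// with the signatures documented above.
template <typename Proxy>
bool trySetLanguage(Proxy& asr, const std::string& language)
{
  const std::vector<std::string> installed = asr.getAvailableLanguages();
  if (std::find(installed.begin(), installed.end(), language) == installed.end())
    return false;  // not installed: leave the current language untouched
  asr.setLanguage(language);
  return true;
}
```

The helper returns false instead of calling setLanguage with an unknown name, so the caller decides how to handle a missing language.
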
float ALSpeechRecognitionProxy::getParameter(const std::string& parameter)

Gets a parameter of the speech recognition engine.

Parameters:
  • parameter – Name of the parameter
Returns: Value of the parameter

void ALSpeechRecognitionProxy::setParameter(const std::string& parameter, const float& value)

Sets parameters of the speech recognition engine.

Parameters:
  • parameter – Name of the parameter.
  • value – Value of the parameter.

Supported parameters:
  • Sensitivity – Value between 0 and 1 setting the sensitivity of the voice activity detector used by the engine.
  • NbHypotheses – Number of hypotheses returned by the engine. Default: 1.
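
As an illustration of the two supported parameters, here is a minimal sketch. `configureEngine` is a hypothetical helper name, and `Proxy` is assumed to be `AL::ALSpeechRecognitionProxy` or any object exposing setParameter as documented above. Clamping the values before sending them is a choice of this sketch, not documented engine behavior.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: applies the two documented engine parameters.
template <typename Proxy>
void configureEngine(Proxy& asr, float sensitivity, int nbHypotheses)
{
  // Sensitivity is documented as a value between 0 and 1, so clamp it.
  const float clamped = std::min(1.0f, std::max(0.0f, sensitivity));
  asr.setParameter("Sensitivity", clamped);

  // setParameter takes a float, so the hypothesis count is converted.
  // Requesting at least one hypothesis is an assumption of this sketch.
  const float count = static_cast<float>(std::max(1, nbHypotheses));
  asr.setParameter("NbHypotheses", count);
}
```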

void ALSpeechRecognitionProxy::loadVocabulary(const std::string& pathToGrammarfile)

Deprecated since version 1.20: use ALSpeechRecognitionProxy::setVocabulary instead.

Loads the vocabulary to recognize contained in a .lcf or .fcf file (NUANCE grammar file format).

Parameters:
  • pathToGrammarfile – Path to the .lcf or .fcf file containing the vocabulary
bool ALSpeechRecognitionProxy::getAudioExpression()

Gets the value of the AudioExpression parameter. This parameter indicates whether the recognition process plays a beep.

void ALSpeechRecognitionProxy::setAudioExpression(const bool& setOrNot)

When set to true, a beep is played at the beginning of the recognition process and another at the end. This lets the user know when it is appropriate to speak.

Parameters:
  • setOrNot – Enable (true) or disable it (false)
void ALSpeechRecognitionProxy::setVisualExpression(const bool& setOrNot)

Enables or disables the LED animations showing the state of the recognition engine during the recognition process.

Parameters:
  • setOrNot – Enable (true) or disable it (false).
void ALSpeechRecognitionProxy::setVocabulary(const std::vector<std::string>& vocabulary, const bool& enableWordSpotting)

Sets the list of words/phrases (vocabulary) that should be recognized by the speech recognition engine. If word spotting is disabled (default), the engine expects to hear one of the specified words, nothing more, nothing less. If it is enabled, the specified words can be pronounced in the middle of a longer utterance, and the engine will try to spot them. The enableWordSpotting parameter also changes the results returned by the speech recognition; please refer to ALSpeechRecognition for details.

Parameters:
  • vocabulary – List of words that should be recognized
  • enableWordSpotting – Enable (true) or disable it (false)
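
A minimal sketch of a setVocabulary call follows. `useYesNoVocabulary` is a hypothetical helper name and the two words are arbitrary examples. `Proxy` is assumed to be `AL::ALSpeechRecognitionProxy` or any object exposing setVocabulary as documented above.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: installs a two-word vocabulary.
template <typename Proxy>
void useYesNoVocabulary(Proxy& asr, bool enableWordSpotting)
{
  std::vector<std::string> vocabulary;
  vocabulary.push_back("yes");
  vocabulary.push_back("no");

  // false: the engine expects exactly "yes" or "no", nothing more.
  // true: the words may be spotted inside a longer utterance.
  asr.setVocabulary(vocabulary, enableWordSpotting);
}
```
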
void ALSpeechRecognitionProxy::setWordListAsVocabulary(const std::vector<std::string>& vocabulary)

Deprecated since version 1.20: use ALSpeechRecognitionProxy::setVocabulary instead.

Sets the list of words/phrases (vocabulary) that should be recognized by the speech recognition engine. To enable “word spotting”, please use ALSpeechRecognitionProxy::setVocabulary instead.

Parameters:
  • vocabulary – List of words that should be recognized
void ALSpeechRecognitionProxy::compile(const std::string& pathToInputBNFFile, const std::string& pathToOutputLCFFile, const std::string& language)

Converts a BNF file to an LCF file. The LCF file is a binary file which contains the same content as the BNF file. Pass this file to ALSpeechRecognitionProxy::addContext.

Parameters:
  • pathToInputBNFFile – Path to a BNF input file. This BNF file is a set of rules that should be recognized by the speech recognition engine.
  • pathToOutputLCFFile – Path where the LCF file will be generated.
  • language – Name of the language of the BNF file.
void ALSpeechRecognitionProxy::addContext(const std::string& pathToLCFFile, const std::string& contextName)

Adds the context contained in the LCF file. This LCF file contains the set of rules that should be recognized by the speech recognition engine.

Parameters:
  • pathToLCFFile – Path to LCF file to use.
  • contextName – Name of the context.
void ALSpeechRecognitionProxy::removeContext(const std::string& contextName)

Removes one context from the speech recognition engine.

Parameters:
  • contextName – Name of the context to remove.
void ALSpeechRecognitionProxy::removeAllContext()

Removes all contexts from the speech recognition engine.

float ALSpeechRecognitionProxy::saveContextSet(const std::string& saveName)

Saves the current context set under the name saveName.

Saved context sets are lost when restarting NAOqi.

float ALSpeechRecognitionProxy::loadContextSet(const std::string& saveName)

Replaces the currently loaded context set by the one previously saved under the name saveName.

Note: reloading a saved context set does not reset its state, i.e. changes made to its activated rules or slots since the save are not erased.

float ALSpeechRecognitionProxy::eraseContextSet(const std::string& saveName)

Erases the save named saveName. This will not remove any currently loaded contexts.

float ALSpeechRecognitionProxy::activateRule(const std::string& contextName, const std::string& ruleName)

Activates a rule contained in the specified context.

Parameters:
  • contextName – Name of the context to modify.
  • ruleName – Name of the rule to activate.
float ALSpeechRecognitionProxy::deactivateRule(const std::string& contextName, const std::string& ruleName)

Deactivates a rule contained in the specified context.

Parameters:
  • contextName – Name of the context to modify.
  • ruleName – Name of the rule to deactivate.
float ALSpeechRecognitionProxy::activateAllRules(const std::string& contextName)

Activates all rules contained in the specified context.

Parameters:
  • contextName – Name of the context to modify.
float ALSpeechRecognitionProxy::deactivateAllRules(const std::string& contextName)

Deactivates all rules contained in the specified context.

Parameters:
  • contextName – Name of the context to modify.
float ALSpeechRecognitionProxy::addWordListToSlot(const std::string& contextName, const std::string& slotName, const std::vector<std::string>& wordList)

Adds a list of words to a slot. A slot is a part of a context which can be modified: you can add to it a list of words that should be recognized by the speech recognition engine.

Parameters:
  • contextName – Name of the context to modify.
  • slotName – Name of the slot to modify.
  • wordList – List of words to insert in the slot.
float ALSpeechRecognitionProxy::removeWordListFromSlot(const std::string& contextName, const std::string& slotName)

Removes all words from a slot.

Parameters:
  • contextName – Name of the context to modify.
  • slotName – Name of the slot to modify.
std::vector<std::string> ALSpeechRecognitionProxy::getRules(const std::string& contextName, const std::string& typeName)

Gets rules corresponding to the specified type. Type can be:

  • “start”: provides entry points into a context
  • “active”: state of a rule, indicates whether the rule is activated or not
  • “activatable”: specifies a rule which can be activated or deactivated
  • “slot”: rules that can be changed at runtime
Parameters:
  • contextName – Name of the context.
  • typeName – Type of the rules to return.
float ALSpeechRecognitionProxy::pause(const bool& isPaused)

Stops or restarts the speech recognition engine according to the input parameter. Pausing can be used to add contexts, activate or deactivate rules of a context, or add words to a slot.

Parameters:
  • isPaused – True (stops ASR) or False (restarts ASR).
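
Because every pause(true) needs a matching pause(false), a scope guard is one way to keep the two calls paired. The following is a minimal sketch under stated assumptions: `ScopedPause` and `refillSlot` are hypothetical names, and `Proxy` is assumed to be `AL::ALSpeechRecognitionProxy` or any object exposing the methods documented above.

```cpp
#include <string>
#include <vector>

// Hypothetical RAII guard: pauses the engine on construction and restarts it
// on destruction, including when the code in between throws.
template <typename Proxy>
class ScopedPause
{
public:
  explicit ScopedPause(Proxy& asr) : asr_(asr) { asr_.pause(true); }
  ~ScopedPause()
  {
    // A destructor must not throw, so errors on restart are swallowed here.
    try { asr_.pause(false); } catch (...) {}
  }

private:
  ScopedPause(const ScopedPause&);             // non-copyable
  ScopedPause& operator=(const ScopedPause&);  // non-copyable
  Proxy& asr_;
};

// Hypothetical helper: replaces the content of a slot while the engine is paused.
template <typename Proxy>
void refillSlot(Proxy& asr,
                const std::string& contextName,
                const std::string& slotName,
                const std::vector<std::string>& words)
{
  ScopedPause<Proxy> paused(asr);  // engine stopped from here...
  asr.removeWordListFromSlot(contextName, slotName);
  asr.addWordListToSlot(contextName, slotName, words);
}                                  // ...and restarted here
```
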
void ALSpeechRecognitionProxy::subscribe(const std::string& name)

Subscribes to ALSpeechRecognition. This causes the module to start writing information to the “WordRecognized” key of ALMemory, where it can be read using ALMemoryProxy::getData.

Parameters:
  • name – Name to identify the subscriber
void ALSpeechRecognitionProxy::unsubscribe(const std::string& name)

Unsubscribes from ALSpeechRecognition. This causes the module to stop writing information to ALMemory in “WordRecognized”.

Parameters:
  • name – Name to identify the subscriber

Events

Event: "WordRecognized"
callback(std::string eventName, AL::ALValue value, std::string subscriberIdentifier)

Raised when one of the words specified with ALSpeechRecognitionProxy::setVocabulary has been recognized. When no word is currently recognized, this value is reinitialized.

Parameters:
  • eventName (std::string) – “WordRecognized”
  • value – Information about the recognized words. Please refer to ALSpeechRecognition for details.
  • subscriberIdentifier (std::string) –
Event: "WordRecognizedAndGrammar"
callback(std::string eventName, AL::ALValue value, std::string subscriberIdentifier)

Raised when the engine produces a result. Same as WordRecognized, with one additional piece of information: the name of the grammar used for the recognition.

Parameters:
  • eventName (std::string) – “WordRecognizedAndGrammar”
  • value – Information about the recognized words. Please refer to ALSpeechRecognition for details.
  • subscriberIdentifier (std::string) –
Event: "LastWordRecognized"
callback(std::string eventName, AL::ALValue value, std::string subscriberIdentifier)

Deprecated since version 1.20.

Raised when one of the words specified with ALSpeechRecognitionProxy::setWordListAsVocabulary has been recognized. This value is kept unchanged until a new word has been recognized.

Parameters:
  • eventName (std::string) – “LastWordRecognized”
  • value – Information about the last recognized words. Please refer to ALSpeechRecognition for details.
  • subscriberIdentifier (std::string) –
Event: "SpeechDetected"
callback(std::string eventName, bool value, std::string subscriberIdentifier)

Raised when the automatic speech recognition engine has detected a voice activity.

Parameters:
  • eventName (std::string) – “SpeechDetected”
  • value – True if voice activity detected.
  • subscriberIdentifier (std::string) –
Event: "ALSpeechRecognition/IsRunning"
callback(std::string eventName, bool value, std::string subscriberIdentifier)

Raised when the speech recognition engine is started.

Parameters:
  • eventName (std::string) – “ALSpeechRecognition/IsRunning”
  • value – True if speech recognition engine is started.
  • subscriberIdentifier (std::string) –
Event: "ALSpeechRecognition/Status"
callback(std::string eventName, AL::ALValue status, std::string subscriberIdentifier)

Raised when the status of the speech recognition engine changes.

Parameters:
  • eventName (std::string) – “ALSpeechRecognition/Status”
  • status – Can be “Idle”, “ListenOn”, “SpeechDetected”, “EndOfProcess”, “ListenOff” or “Stop”.

    Note: “ListenOn” status does not necessarily mean ready to process. For further details, see the ALSpeechRecognition/ActiveListening event.

  • subscriberIdentifier (std::string) –
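
Callback code often finds it easier to switch on an enum than to compare strings. A minimal sketch of such a mapping follows. `AsrStatus` and `parseAsrStatus` are hypothetical names, not part of the API. The sketch also assumes the status has already been extracted from the AL::ALValue as a std::string.

```cpp
#include <string>

// Hypothetical enum mirroring the documented status strings.
enum AsrStatus
{
  AsrIdle,
  AsrListenOn,
  AsrSpeechDetected,
  AsrEndOfProcess,
  AsrListenOff,
  AsrStop,
  AsrUnknown  // not part of the API: fallback for unexpected strings
};

// Maps a status string to the enum; the comparison is case-sensitive.
inline AsrStatus parseAsrStatus(const std::string& status)
{
  if (status == "Idle")           return AsrIdle;
  if (status == "ListenOn")       return AsrListenOn;
  if (status == "SpeechDetected") return AsrSpeechDetected;
  if (status == "EndOfProcess")   return AsrEndOfProcess;
  if (status == "ListenOff")      return AsrListenOff;
  if (status == "Stop")           return AsrStop;
  return AsrUnknown;
}
```
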
Event: "ALSpeechRecognition/ActiveListening"
callback(std::string eventName, bool value, std::string subscriberIdentifier)

Experimental

Raised with the value true when the engine is not only listening but also ready to process data (i.e. not raised when the ASR engine is only recording sound to be processed).

Parameters:
  • eventName (std::string) – “ALSpeechRecognition/ActiveListening”
  • value – True if the engine is listening and processing data, False otherwise.
  • subscriberIdentifier (std::string) –