Tuesday, January 18, 2011

Using HTML5 Speech Recognition with Pandorabots



We found a simple, interesting demo of HTML5 speech recognition on slide 25 of this presentation. Making the demo work with ALICE and Pandorabots required only a slight change to our custom HTML input files.

Here is a demo of the Fake Captain Kirk bot using speech recognition: Talking Animated Fake Kirk with Voice Input (Requires Chrome browser).


Note: The demo uses the Google Speech API. At the time of this writing, not all browsers support this HTML5 feature. We have found that it works in Google Chrome, and we have tested it successfully only on Windows 7.

The only change was to replace 

<input type="text" size="60" name="input"/> 
with 
<input type="text" size="60" name="input" x-webkit-speech /> 
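
For context, here is a minimal sketch of how the modified input might sit inside a complete form. The form's action URL is a placeholder we made up for illustration, not the real address of any bot, so substitute whatever your own custom HTML file already uses. The "Say" button label matches the demo described in this post.

```html
<!-- Sketch only: the action URL below is a hypothetical placeholder.
     Keep the form action from your existing custom HTML file. -->
<form method="post" action="https://example.com/your-bot-endpoint">
  <!-- The x-webkit-speech attribute asks supporting Chrome builds to show
       a mic icon. Browsers that do not know the attribute ignore it and
       render an ordinary text box. -->
  <input type="text" size="60" name="input" x-webkit-speech />
  <!-- The recognized text is only sent to the bot when this is clicked. -->
  <input type="submit" value="Say" />
</form>
```

Because browsers ignore attributes they do not recognize, the same page should still accept typed input where speech is not supported.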

There are a couple of "tricks" you need to know to test the speech API:

1. Click the little mic icon to begin speaking. The speech API detects when you have finished speaking and, once it has finished processing, displays the recognized text in the text field.

2. If you are satisfied with what the speech API detected, click the "Say" button to transmit it to the bot.


The quality of the speech recognition may not seem very high. A number of factors affect its accuracy: the quality of the microphone, background noise, the accent of the speaker, the type of sound card in your computer, and so on. Pandorabots has no control over the Google voice recognition software, but the results should be comparable to what you get using the Google voice API with any other application.

Enabling the Chrome Voice API

The Chrome voice recognition API may not be enabled by default. Follow these steps to create a shortcut to Chrome with voice recognition enabled.

1. Create a desktop shortcut to Google Chrome using copy and paste. (You may also want to rename this shortcut to something like "Chrome Voice".)

2. Right-click the shortcut and select "Properties".

3. On the Shortcut tab, modify the Target field to include the flag "--enable-speech-input". For example, if the Target was originally
C:\Users\drwallace\AppData\Local\Google\Chrome\Application\chrome.exe 
change it to 
C:\Users\drwallace\AppData\Local\Google\Chrome\Application\chrome.exe --enable-speech-input

4. Click OK, then launch Chrome from this Chrome Voice shortcut to use speech recognition.

    1 comment:

    1. FYI... I've gotten it to work with Chrome on WinXP. I had a nice, brief conversation with Fake Captain Kirk. :)
