voiceCmdr - voice commands in the Browser

Date: January 23, 2015

voice recognition

Recently I discovered the Web Speech API. I was already talking to the browser through Google Hangouts and Google Translate, but I had never thought about adding voice support to my own website.

I did some research and found a demo. Based on that, I put up a simple demo website (say "show website blog" and it will take you directly to a subpage that otherwise takes three mouse clicks to reach). For now, speech recognition works only in Google Chrome and Safari. In Chrome the interface is not exposed as SpeechRecognition but under the prefixed name webkitSpeechRecognition. I hope other browsers will implement it in the near future, especially Spartan, which is integrated with Cortana.
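
Because of the vendor prefix, a page usually has to check both names before using the API. Here is a minimal sketch of such a check. The helper name getSpeechRecognition is my own invention for illustration, and it takes the global object as an argument so it can be tried outside a browser too:

```javascript
// Returns the speech recognition constructor exposed by the given global
// object, or null when it is not supported.
// `getSpeechRecognition` is a hypothetical helper, not part of any library.
function getSpeechRecognition(globalObject) {
  return (
    globalObject.SpeechRecognition ||
    globalObject.webkitSpeechRecognition ||
    null
  );
}

// In a browser, pass `window`; outside a browser there is nothing to detect.
if (typeof window !== "undefined") {
  var Recognition = getSpeechRecognition(window);
  if (Recognition) {
    console.log("Speech recognition is available");
  } else {
    console.log("Speech recognition is not supported in this browser");
  }
}
```

Checking the unprefixed name first means the same code should keep working if a browser later drops the prefix.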

I noticed that while the API is flexible, it is not easy to use. I think that in the most common scenarios a developer wants to register commands with callback functions and control the recognition state with start/stop actions.

I created a JavaScript library called voiceCmdr. It is a single .js file without any dependencies. You can install it with npm or bower (check the README on GitHub).

You can add commands:

voiceCmdr.addCommand("go home", function () {
  // go to home page
});

The callback function can take a parameter, which is everything you said after the command. For example:

voiceCmdr.addCommand("search", function (param) {
  // search for phrase specified in param
});
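
To make the "everything you said after the command" idea concrete, here is a rough sketch of how a recognized phrase could be split into a command and its parameter. This is not voiceCmdr's actual source. The commands registry and the dispatch function are hypothetical names, and the longest-prefix matching rule is my own assumption:

```javascript
// Hypothetical registry: maps a lowercased command phrase to its callback.
var commands = {};

function addCommand(phrase, callback) {
  commands[phrase.toLowerCase()] = callback;
}

// Finds the longest registered phrase that the transcript starts with and
// calls its callback with the remaining (lowercased) words.
// Returns true when a command matched, false otherwise.
function dispatch(transcript) {
  var text = transcript.trim().toLowerCase();
  var best = null;
  Object.keys(commands).forEach(function (phrase) {
    // Match whole words only, so "searching" does not trigger "search".
    var matches = text === phrase || text.indexOf(phrase + " ") === 0;
    if (matches && (best === null || phrase.length > best.length)) {
      best = phrase;
    }
  });
  if (best === null) {
    return false;
  }
  commands[best](text.slice(best.length).trim());
  return true;
}
```

With a "search" command registered, dispatch("search voice recognition") would call its callback with "voice recognition", and a phrase that matches no command is simply ignored.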

You can also remove commands:

voiceCmdr.removeCommand("go home");

In order to start listening for commands:

voiceCmdr.start();

To stop listening:

voiceCmdr.stop();

You can also listen for just a single command:

voiceCmdr.getCommand();
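
For comparison, this is roughly what continuous versus single-command listening looks like when you use the underlying API directly. It is only a sketch under my own assumptions: createListener and onTranscript are made-up names, and voiceCmdr may well do this differently. The continuous, interimResults, and onresult members are standard Web Speech API properties:

```javascript
// Creates a recognizer from the given constructor (in Chrome, pass
// window.webkitSpeechRecognition) and reports each final transcript.
// `createListener` and `onTranscript` are hypothetical names for this sketch.
function createListener(Recognition, continuous, onTranscript) {
  var recognition = new Recognition();
  recognition.continuous = continuous; // true: keep listening; false: one phrase
  recognition.interimResults = false;  // deliver final results only

  recognition.onresult = function (event) {
    // event.results accumulates over a session; start at the first new entry.
    for (var i = event.resultIndex; i < event.results.length; i++) {
      if (event.results[i].isFinal) {
        onTranscript(event.results[i][0].transcript);
      }
    }
  };
  return recognition;
}
```

In a browser you would then call start() on the returned object to begin listening and stop() to end it. Passing false for continuous gives the single-command behavior.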

Check the examples, and be aware that the Web Speech API works only over http and https (it will not work if you open a static HTML file directly from disk). The easiest way to run a local server is Python 2's SimpleHTTPServer module (on Python 3, use the http.server module instead):

python -m SimpleHTTPServer 8080

This can also go the other way (the browser talking to you) with the speech synthesis part of the Web Speech API.
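
Here is a minimal sketch of the browser talking back, using the standard speechSynthesis object and SpeechSynthesisUtterance constructor. The helper name say and the hard-coded "en-US" language are my own assumptions:

```javascript
// Speaks `text` using the speech synthesis objects found on `globalObject`
// (in a browser, pass `window`). Returns false when synthesis is unavailable.
// `say` is a hypothetical helper name.
function say(text, globalObject) {
  if (!globalObject.speechSynthesis || !globalObject.SpeechSynthesisUtterance) {
    return false;
  }
  var utterance = new globalObject.SpeechSynthesisUtterance(text);
  utterance.lang = "en-US"; // assumed language; adjust as needed
  globalObject.speechSynthesis.speak(utterance);
  return true;
}
```

In a page, say("Welcome to my blog", window) would read the text aloud in browsers that support synthesis.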

I am curious what you think. Are we ready for voice commands in the browser? Are you concerned about your privacy (check Scott Hanselman's tweet)?

Categories: programming

Tags: javascript Open Source voice recognition