Speech recognition in BGE

Hello!

I tried moving objects in the BGE with speech recognition, and here are the results:

Microsoft:

http://www.mediafire.com/download/d4813jhdykskh29/speechRecognition2.rar Windows 32 bits

http://www.mediafire.com/download/8b1787eiki2zvf9/speechRecognition64.rar Windows 64 bits

http://www.mediafire.com/download/6s6ngrm1et83a6x/speechRecognition1Thread1Loop.blend updated testing .blend (works fine now)

http://www.mediafire.com/download/08y8wuwnea1qsow/text2speech.blend (text to speech testing .blend)

(For text to speech on Linux, you can install espeak from the synaptic package manager and call it from Blender in a thread.

Default testing code:

import subprocess

# Pass the text as its own argument so no shell quoting is needed.
text = "Hello world"
subprocess.call(["espeak", text])

       http://www.mediafire.com/download/7wfjsy3lb5z88hy/text2speechLinux.blend )

Google:

http://www.mediafire.com/download/aglmwkrw6d0ejeg/speechRecognition_google32.rar Windows 32 bits

http://www.mediafire.com/download/d8fjn00cf3h4c9r/speechRecognition_googleWindowsX64.rar Windows 64 bits

http://www.mediafire.com/download/fe0x5rmxmiojp5a/speechRecognition_google_Linux_x64.tar.gz Linux 64 bits (testing .blend updated in this version)

(For better results on Linux, you can play with alsamixer settings (sudo alsamixer))

I hope it will entertain you!

(You just have to replace the words “Avan”, “Recul”, “Droit”, “Gauche” (French words) with “up”, “down”, “left”, “right” or something similar. Mind the upper/lower case, or adapt the script to lowercase the recognized text with str.lower().

It uses pyspeech (which uses Microsoft speech recognition), some pywin32 libraries, and PyAudio, which pyspeech needs.)

The second version uses speech recognition 1.1.4 and works with Google speech. It also needs PyAudio.

Tested on Windows 8.1 64-bit with the 32-bit and 64-bit versions of Blender, and on Linux Mint 17.1 64-bit with a 64-bit version of Blender.

The installation procedure (in case it needs updating for future versions of Blender) is described in a post below.

You mean you have to talk to the bge to get it to do something?

Is there any way to build it so that it doesn't depend on Windows libs?

@Lostscience: Yes! Tell your microphone to go forward, and the cube will move forward, etc…

A modified version for English Windows:

http://www.mediafire.com/download/d4813jhdykskh29/speechRecognition2.rar

EDIT: Sorry, on line 14 you have to replace phrase.lower() with phrase = phrase.lower()

Place the .blend in the modules directory, test it, and tell me if that works! (I’ve written “forward” to go forward, “backward”, “left” and “right”, but you can write whatever you want.)

if "jump" in own["action"]: do something…
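About the phrase = phrase.lower() fix: Python strings are immutable, so lower() returns a new string and leaves the original untouched, which is why the result has to be assigned back. Here is a minimal sketch of matching a recognized phrase against command words. The names COMMANDS and phrase_to_action are hypothetical, made up for illustration, and are not taken from the .blend:

```python
# Hypothetical command table: spoken word -> (x, y) movement direction.
COMMANDS = {
    "forward": (0, 1),
    "backward": (0, -1),
    "left": (-1, 0),
    "right": (1, 0),
}

def phrase_to_action(phrase):
    """Return the movement for the first known command word in phrase, or None."""
    # str.lower() returns a new string, so the result must be assigned.
    phrase = phrase.lower()
    for word in phrase.split():
        if word in COMMANDS:
            return COMMANDS[word]
    return None

print(phrase_to_action("Go FORWARD"))
```

Swapping the keys for “avan”, “recul” and so on should be enough to switch language.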

@BPR: Pyspeech uses Microsoft Speech Recognition, so it relies on the Windows API (pywin32). That means you have to run it on Windows to test it. But there are certainly other Python speech recognition modules that work with other speech recognition software on other operating systems.

EDIT: I still have some errors. I’ll check if I can do something.

I do not have a microphone. I don’t know if many people do.

Awesome.
I tried working with audio input in the past, but I could only get volume-based input (i.e. shout louder at a car to make it go faster). It’s amazing if you can get real speech recognition working.

Thanks! Now I'm trying with Google speech, but I have some issues… I’ll update the post if I get it working :)

EDIT: update of the testing .blend: http://www.mediafire.com/download/se3faid3nis4pdm/speechRecognition1Thread1Loop.blend

(just 1 thread with a while loop. Works better on my computer)
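For anyone curious what "1 thread with a while loop" can look like, here is a rough sketch (not the actual script from the .blend). It assumes a blocking listen_once() function that you supply yourself; start_listener and poll_phrase are hypothetical names. The worker thread does the slow listening, and the game logic only peeks at a queue on each tick, so it never blocks:

```python
import queue
import threading

def start_listener(listen_once, phrases, stop_event):
    """Run listen_once() in a single background thread until stop_event is set."""
    def worker():
        while not stop_event.is_set():
            phrase = listen_once()  # blocking call, e.g. microphone + recognizer
            if phrase:
                phrases.put(phrase)
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread

def poll_phrase(phrases):
    """Call once per logic tick: return a pending phrase or None, never block."""
    try:
        return phrases.get_nowait()
    except queue.Empty:
        return None
```

The stop event is only checked between two listen_once() calls, so the thread ends after the current phrase, not immediately.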

I’d be interested to see how Google’s API performs. Never bothered with it because I figured the latency would be too high for my needs.
For Linux I’ve been piping voice commands through a socket from a separate process using Julius, but if it could be done with a cross-platform Python module, that would be much more convenient.

That sounds interesting too. I’d like to see how that’s done.
Often with this sort of thing, because everything needs to be contained within Blender, there are big problems every time you upgrade Blender (which seems to happen about 6 times a year). It’d be better to just pipe data in and deal with it as dictionaries or lists or whatever.
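To make the "pipe data in as dictionaries" idea concrete, here is one possible sketch, assuming the external recognizer sends each command as a small JSON object over UDP on localhost. The message format, the function names and the choice of UDP are all assumptions for illustration; this is not how the Julius setup mentioned above is actually wired:

```python
import json
import socket

def open_command_socket(host="127.0.0.1", port=0):
    """Open a non-blocking UDP socket; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.setblocking(False)
    return sock

def poll_command(sock):
    """Return one decoded command dict, or None if nothing has arrived."""
    try:
        data, _addr = sock.recvfrom(4096)
    except BlockingIOError:
        return None
    return json.loads(data.decode("utf-8"))

def send_command(command, address):
    """What the external recognizer process would do for each phrase."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as out:
        out.sendto(json.dumps(command).encode("utf-8"), address)
```

The game side would call poll_command() once per logic tick. Since only the standard library is involved, nothing has to be rebuilt when Blender's Python version changes.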

I finally managed to get Google speech recognition working on Windows:

I’ll edit the post to explain how to make it work, and maybe somebody could make a build for Linux. Some settings are also needed in my .blend test file.

So apparently the Google Speech API key is not needed, contrary to what I said.

Steps to install:

  • Download and install Python 3.4.2 (the version currently supported by Blender) from https://www.python.org/ (the default download is 32-bit; for 64-bit versions, see all releases)
  • Download SpeechRecognition 1.1.4 from https://pypi.python.org/pypi/SpeechRecognition/
  • Create a SpeechRecognition folder, on your Desktop for example
  • Into this folder, extract speech_recognition and SpeechRecognition.egg-info from the SpeechRecognition 1.1.4 archive you’ve just downloaded
  • If you’re on Windows, download and install PyAudio for Python 3.4 (32 or 64 bits, matching your version of Python and Blender) from http://www.lfd.uci.edu/~gohlke/pythonlibs/#pyaudio Then go to your Python installation directory and, from Lib\site-packages, copy these 3 files into your SpeechRecognition directory: pyaudio.py, _portaudio.pyd, portaudio_x64.dll (or portaudio_x32.dll)
  • If you’re on Linux (tested on Linux Mint), first install portaudio19-dev from the Synaptic package manager, then download the PyAudio sources from http://people.csail.mit.edu/hubert/pyaudio/ and install them: in the pyaudio directory, run sudo python3.4 setup.py install (you also need Python 3.4, or whichever Python version your Blender uses). Then get pyaudio.py and _portaudio.cpython-34m.so from /usr/local/lib/python3.4/dist-packages/
  • Get the .blend from the archive I shared in this post and paste it into the SpeechRecognition folder

Open it and run some tests. You can change the threshold (100 to 10000) and the language (“fr-FR”, “en-EN”, “en-US”…). You can also have a look at the API and examples here: https://pypi.python.org/pypi/SpeechRecognition/
You also have to change “avan”, “recul”… according to your language.
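To show where the threshold and language settings end up, here is a rough sketch based on the SpeechRecognition 1.x API as I remember it (Recognizer(language), energy_threshold, listen(), and recognize() raising LookupError when nothing is understood). Treat those names as assumptions and check them against the 1.1.4 documentation linked above; newer releases use recognize_google() instead of recognize(). The function listen_for_phrase is hypothetical, and the module is passed in as a parameter so the sketch does not depend on what is installed:

```python
def listen_for_phrase(sr, language="en-US", energy_threshold=300):
    """Listen once on the default microphone and return lowercased text, or None.

    sr is the imported speech_recognition module (1.x API assumed).
    """
    recognizer = sr.Recognizer(language)             # language code, e.g. "fr-FR"
    recognizer.energy_threshold = energy_threshold   # higher = needs louder speech
    with sr.Microphone() as source:
        audio = recognizer.listen(source)            # blocks until a phrase ends
    try:
        return recognizer.recognize(audio).lower()   # assumed 1.x method name
    except LookupError:                              # assumed 1.x "not understood" error
        return None

# Usage (needs a microphone and PyAudio):
#   import speech_recognition as sr
#   print(listen_for_phrase(sr, language="fr-FR", energy_threshold=1000))
```

Because listen() blocks, a call like this belongs in a background thread and not directly in the game logic.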

It would be very cool if someone could make a build for Linux Mint or Ubuntu. That would make this resource more complete! Thanks!

It sounds great. I’ll give it a try once I’ve got some free time. Do you know if it can all be bundled in with an EXE file?

Sadly, I’ve only got Windows, so I can’t build it for Linux.

Thanks! Yes, I think it can be bundled with a .exe file. I’ll run a test to confirm :)

Confirmed: You just have to put the files and directories (libraries needed to use PyAudio and speech recognition) in 2.73\Python\lib\

I think it’s possible with Microsoft speech recognition too.

Speech recognition 1.1.4 also works on Linux! The results are not as good as on Windows, but it’s a matter of settings. Personally, I’ve tested it on Linux Mint 17.1 64-bit with 50% microphone amplification (I have a desktop microphone, but it should work better with other types of microphones) and some settings in the .blend you can see here:

In addition, I have to speak close to the microphone to make it work correctly. I also sometimes receive messages from ALSA audio, but it works nevertheless.

If somebody wants to test it, I’m curious about any feedback or suggestions to make it better.

Thanks!

If I get a microphone, I will test it. Thanks, youle, for the Linux version…

Can it recognize languages like Latvian, Russian, or even more complicated ones, like Chinese?

Hello! Don’t buy a microphone just for this. It works badly on Linux for the moment. I think all supported languages are here:

http://www.mediafire.com/view/8d1qctvd3a4dym5/languages.jpg

EDIT: To remove alsa audio messages (Linux x64 version): http://stackoverflow.com/questions/7088672/pyaudio-working-but-spits-out-error-messages-each-time
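As far as I recall, the approach in that Stack Overflow answer is to install a do-nothing error handler in libasound through ctypes. Here is a sketch along those lines. The library name libasound.so.2 and the helper name silence_alsa are assumptions, so adjust them for your system:

```python
import ctypes

# Signature of ALSA's error callback: (file, line, function, err, fmt).
# The C handler is variadic; the extra arguments are simply ignored here.
_HANDLER_TYPE = ctypes.CFUNCTYPE(
    None, ctypes.c_char_p, ctypes.c_int, ctypes.c_char_p, ctypes.c_int, ctypes.c_char_p
)

def _ignore_alsa_error(filename, line, function, err, fmt):
    pass  # swallow the message instead of printing it

# Keep a module-level reference so the callback is not garbage collected.
_c_handler = _HANDLER_TYPE(_ignore_alsa_error)

def silence_alsa(library="libasound.so.2"):
    """Install the no-op handler; return False if libasound is not available."""
    try:
        asound = ctypes.cdll.LoadLibrary(library)
    except OSError:
        return False
    asound.snd_lib_error_set_handler(_c_handler)
    return True
```

Call silence_alsa() once before opening the microphone. It only hides the messages and does not change how the audio behaves.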

Hi, youle. Thank you for this great resource.
Is it possible to make the cube talk instead? You input text and it speaks?

Hello! Thanks! With pyspeech (Microsoft speech recognition), it seems to be possible. I’ll try to make an example. But with speech recognition 1.1.4 (Google), I found nothing in the API to do that. Perhaps other Python modules can do that.

EDIT: Works fine! http://www.mediafire.com/download/i50cl6zaoiee7kf/text2speech.rar (pyspeech, Windows 8 64-bit… If you want to run it on 32 bits, just take the 32-bit files from the 1st post) (type whatever you want and press Return to hear what you have written) (oops, I wrote 2 lines too many… updated .blend in the first post)

EDIT2: On Linux, you can install espeak from the synaptic package manager and call it from Blender in a thread.

Default testing code:

import subprocess

# Pass the text as its own argument so no shell quoting is needed.
text = "Hello world"
subprocess.call(["espeak", text])
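Since subprocess.call() waits until espeak has finished speaking, calling it directly would freeze the game for that long, hence the thread. Here is a minimal sketch of the threaded version. The helper names espeak_command and speak_async are hypothetical, and it assumes espeak is installed and on the PATH:

```python
import subprocess
import threading

def espeak_command(text):
    """Build the argument list; a list avoids shell quoting problems."""
    return ["espeak", text]

def speak_async(text, runner=subprocess.call):
    """Run espeak in a background thread so the game loop is not blocked."""
    thread = threading.Thread(
        target=runner, args=(espeak_command(text),), daemon=True
    )
    thread.start()
    return thread

# Usage, assuming espeak is installed:
#   speak_async("Hello world")
```

The runner parameter is only there so the function can be tried without espeak, for example by passing print.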


Faaaaantastic Youle! You saved me!
Is it possible now to pack voices and make some variations in pitch and speed?

Once again thank you, I’ll check the speech module for more info!

Hi! I wanted to ask if anyone has tried to make a dialog system with this, so that it recognizes the speech, decides on the best answer, and then answers (also with multiple AIs, e.g. evil and assistant). Such a system would make for very realistic games with real-time dialogue, where the player has to think much harder about what to say…