The SpeechRecognition Package
The SpeechRecognition package provides a high-level interface to record and process audio inputs in Python.
Reference:
Prerequisites
This package depends on another Python package called "pyaudio", which itself depends on a lower-level library called "portaudio" (not a Python package). To install "portaudio":
On a Mac, use Homebrew (brew install portaudio).
On Windows, use pipwin within an active virtual environment (see installation steps below).
Installation
Do these installation steps after activating a virtual environment.
Windows:
pip install pipwin
pipwin install pyaudio # will install along with lower level binaries
pip install SpeechRecognition # depends on the "pyaudio" Python package
Mac:
pip install pyaudio
pip install SpeechRecognition
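After installing, it can help to confirm that both packages are visible from the active virtual environment. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical, and it assumes the import names are `pyaudio` and `speech_recognition`.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "pyaudio" and "speech_recognition" are the import names assumed here.
missing = missing_packages(["pyaudio", "speech_recognition"])
if missing:
    print("Not installed in this environment:", ", ".join(missing))
else:
    print("Both packages were found.")
```

If either name is reported as missing, the most likely cause is that the virtual environment was not active when running the installation steps.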
Usage
Recording Audio
Record audio using your computer's built-in microphone, and save that to a file:
import speech_recognition as sr

client = sr.Recognizer()

with sr.Microphone() as mic:
    print("Say something!")
    audio = client.listen(mic)

with open("my-recording.flac", "wb") as f:
    f.write(audio.get_flac_data())
Recognizing Speech
Record audio using your computer's built-in microphone, and recognize the spoken words:
import speech_recognition as sr

client = sr.Recognizer()

with sr.Microphone() as mic:
    print("Say something!")
    audio = client.listen(mic)

# returns the transcript with the highest confidence:
transcript = client.recognize_google(audio)
#> 'how old is the Brooklyn Bridge'

# returns all transcripts:
response = client.recognize_google(audio, show_all=True)
#> {
#>     'alternative': [
#>         {
#>             'transcript': 'how old is the Brooklyn Bridge',
#>             'confidence': 0.987629
#>         }
#>     ],
#>     'final': True
#> }
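When using show_all=True, the caller has to pick a transcript out of the response. Below is a minimal sketch of one way to do that, based only on the response shape shown above. The helper name `best_transcript` is hypothetical, and two behaviors are assumptions rather than documented guarantees: that an unrecognized utterance yields an empty or non-dictionary response, and that some alternatives may lack a 'confidence' value.

```python
def best_transcript(response):
    """Return (transcript, confidence) for the most confident alternative, or None."""
    # Assumption: nothing recognized means an empty or non-dict response.
    if not isinstance(response, dict):
        return None
    alternatives = response.get("alternative", [])
    if not alternatives:
        return None
    # Assumption: an alternative without 'confidence' is ranked as 0.0.
    best = max(alternatives, key=lambda alt: alt.get("confidence", 0.0))
    return best.get("transcript"), best.get("confidence")


example = {
    "alternative": [
        {"transcript": "how old is the Brooklyn Bridge", "confidence": 0.987629}
    ],
    "final": True,
}
print(best_transcript(example))
```

Returning None instead of raising keeps the calling code simple when the speaker was silent or unintelligible.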