Using Voice RSS to make a Raspberry Pi talk

Voice RSS is a text-to-speech API that can be used on a Raspberry Pi. Here we show how to use Voice RSS to make a Raspberry Pi talk.

Introduction

The basic idea behind text-to-speech (TTS) is to have a Raspberry Pi speak its output as sentences instead of displaying text on the screen.

The instructions in this post will enable you to use the tts command in the form of tts "Hello, world.", which will then be spoken.

Background

A while back I was experimenting with Steven Hickson’s PiAUISuite. It uses Google Speech to take basic voice commands through a microphone and runs basic logic that can give output in various forms, speech being one of them. Google changed to a paid speech-to-text service around November 2015 and the project was placed on hold.

Since I had PiAUISuite installed on a few Raspberry Pis, I thought to take its ‘tts’ command and show how to tweak it a little to at least have a good TTS ability using Voice RSS.

Assumptions / requirements

For this post, a fully installed Raspberry Pi Model B with the latest version of Raspbian was used. Default sound output from either the 3.5mm audio jack or HDMI cable needs to be audible.

During the installation process, a connection to the internet will be required. Without a screen, keyboard and mouse, PuTTY and/or WinSCP can be used to do the testing and coding.

A basic, free Voice RSS account will allow up to 350 requests per day.

Working sound output is required (ALSA at least). The latest Raspberry Pi B models have HDMI and a 3.5mm audio jack, either of which can be used for sound. By default, Raspbian should have most things installed for ALSA to work.

Limitations

Although not always a limitation per se, this system is mainly controlled by running terminal commands. It is actually great for Python and Bash scripts. As mentioned above, a free Voice RSS account will only give you a maximum of 350 requests per day.
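To stay under the daily limit from a script, one option is to count requests locally before calling tts. The following is only a sketch under a few assumptions: the helper name quota_take, the state file location and the idea that the quota resets at local midnight are all mine, not something Voice RSS documents here.

```bash
#!/bin/bash
# Hypothetical helper: count today's requests in a small state file.
# /dev/shm is cleared on reboot, so the count is too; pick another path if that matters.
QUOTA_FILE="${QUOTA_FILE:-/dev/shm/tts_quota}"
DAILY_LIMIT="${DAILY_LIMIT:-350}"

# Succeeds (and records one request) while today's count is below the limit.
quota_take() {
    local today saved_day="" count=""
    today=$(date +%F)
    if [ -f "$QUOTA_FILE" ]; then
        # file format: "<YYYY-MM-DD> <count>"
        read -r saved_day count < "$QUOTA_FILE" || true
    fi
    count=${count:-0}
    # assumption: a new local day resets the counter
    if [ "$saved_day" != "$today" ]; then
        count=0
    fi
    if [ "$count" -ge "$DAILY_LIMIT" ]; then
        return 1
    fi
    echo "$today $((count + 1))" > "$QUOTA_FILE"
}

# Example use once tts is set up:
#   quota_take && tts "Hello, world"
```

The counter only knows about requests made through this helper on this Pi, so treat it as a rough guard rather than an exact mirror of the account's usage.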

A modified version of PiAUISuite

PiAUISuite will install a nifty command called ‘tts’. This is what we’ll be using after some modification. Start by installing some packages and then PiAUISuite from GitHub by typing the following in your terminal on a freshly booted Raspbian:

sudo apt-get install git
sudo apt-get install mpg123
git clone https://github.com/StevenHickson/PiAUISuite.git
cd PiAUISuite/Install
./InstallAUISuite.sh
cd /home/pi

Say yes to install the dependencies – so initially you’ll be saying yes twice. Afterwards, PiAUISuite will one by one try to install and set up playvideo, downloader, gvapi, gtextcommand, youtube, youtube-safe and voicecommand. We will only be needing the last one on the list, voicecommand – so say no for installing the rest.

After all the dependencies and voicecommand are installed (which can take a while), the installer will automatically prompt to set up voicecommand. On a fresh install, there will be no commands found and it will ask you to try to set itself up. We won’t be using this, so say no. (You can do this later by using voicecommand -s.)
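Before moving on, it can be worth confirming that the commands the tts script relies on actually ended up on the PATH. This is a small sketch; the function name missing_commands is my own and is not part of PiAUISuite.

```bash
#!/bin/bash
# Print the name of every given command that cannot be found on the PATH.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Example: list anything the tts script needs that is not installed.
#   missing_commands tts wget mpg123
```

If the example prints nothing, all three commands were found.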

Next, we will be changing some code on the newly created tts file to use Voice RSS’s TTS service instead of Google’s.

To continue we will need a Voice RSS API key, so go and get one.

To edit the original tts file, use the following command from the Raspbian terminal:

sudo nano /usr/bin/tts

This opens the original code from Steven Hickson:

#!/bin/bash

#for the Raspberry Pi, we need to insert some sort of FILLER here since it cuts off the first bit of audio

string=$@
lang="en"
if [ "$1" == "-l" ] ; then
    lang="$2"
    string=`echo "$string" | sed -r 's/^.{6}//'`
fi

#empty the original file
echo "" > "/dev/shm/speak.mp3"

len=${#string}
while [ $len -ge 100 ] ;
do
    #lets split this up so that its a maximum of 99 characters
    tmp=${string:0:100}
    string=${string:100}

    #now we need to make sure there aren't split words, let's find the last space and the string after it
    lastspace=${tmp##* }
    tmplen=${#lastspace}

    #here we are shortening the tmp string
    tmplen=`expr 100 - $tmplen`
    tmp=${tmp:0:tmplen}

    #now we concatenate and the string is reconstructed
    string="$lastspace$string"
    len=${#string}

    #get the first 100 characters
    wget -q -U Mozilla -O "/dev/shm/tmp.mp3" "https://translate.google.com/translate_tts?tl=${lang}&q=$tmp&ie=UTF-8&total=1&idx=0&client=t"
    cat "/dev/shm/tmp.mp3" >> "/dev/shm/speak.mp3"
done
#this will get the last remnants
wget -q -U Mozilla -O "/dev/shm/tmp.mp3" "https://translate.google.com/translate_tts?tl=${lang}&q=$string&ie=UTF-8&total=1&idx=0&client=t"
cat "/dev/shm/tmp.mp3" >> "/dev/shm/speak.mp3"
#now we finally say the whole thing
cat "/dev/shm/speak.mp3" | mpg123 - 1>>/dev/shm/voice.log 2>>/dev/shm/voice.log

After getting your Voice RSS API key, replace the entire script with the following shorter version, as recommended by Keven:

#!/bin/bash
# Usage: tts [-l language-code] "text to speak"
#for the Raspberry Pi, we need to insert some sort of FILLER here since it cuts off the first bit of audio
lang="en-gb"
if [ "$1" == "-l" ] ; then
    lang="$2"
    #drop "-l" and the language code so only the text remains,
    #whatever the length of the code (en, en-gb, es-mx, ...)
    shift 2
fi
string="$*"

#empty the original file
: > "/dev/shm/speak.mp3"

wget -q -U Mozilla -O "/dev/shm/tmp.mp3" "http://api.voicerss.org/?key=MYAPIKEYGOESHERE&src=$string&f=22khz_16bit_mono&hl=$lang"
cat "/dev/shm/tmp.mp3" >> "/dev/shm/speak.mp3"
#now we finally say the whole thing
cat "/dev/shm/speak.mp3" | mpg123 - 1>>/dev/shm/voice.log 2>>/dev/shm/voice.log

Simply replace MYAPIKEYGOESHERE with your own API key, then exit and save (Ctrl + X, then Y).

The command can now be used from any directory without sudo like so:

tts "Hello, world"

This converts your text to speech and plays it through your default sound card and audio output. Voice RSS allows for up to 10 000 characters per call.
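One thing to watch: the script places the text into the request URL as-is. Characters such as & or # have a special meaning in a URL, so a sentence containing them may be cut short or misread. Below is a sketch of percent-encoding the text first. The function names urlencode and voicerss_url and the variable VOICERSS_KEY are my own, and I have only reasoned through plain ASCII input.

```bash
#!/bin/bash

# Percent-encode a string for use in a URL query parameter.
urlencode() {
    local LC_ALL=C   # treat the text as single bytes
    local text="$1" out="" ch hex i
    for (( i = 0; i < ${#text}; i++ )); do
        ch="${text:i:1}"
        case "$ch" in
            [a-zA-Z0-9.~_-]) out+="$ch" ;;       # unreserved characters stay as they are
            *) printf -v hex '%%%02X' "'$ch"    # everything else becomes %XX
               out+="$hex" ;;
        esac
    done
    printf '%s\n' "$out"
}

# Build the request URL with the same parameters the tts script uses.
# VOICERSS_KEY is a hypothetical variable holding your API key.
voicerss_url() {
    local text="$1" lang="${2:-en-gb}"
    echo "http://api.voicerss.org/?key=${VOICERSS_KEY}&src=$(urlencode "$text")&f=22khz_16bit_mono&hl=${lang}"
}
```

In the tts script, the wget line could then fetch "$(voicerss_url "$string" "$lang")" instead of the hand-built URL.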

(If you’re having formatting troubles with the code above, just comment and I will get back to you to help.)

You can go through the Voice RSS documentation yourself and see what languages they have available, but I was happy with the quality of the default English voice.
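Because tts is an ordinary command, it can be called from your own Bash scripts. As a minimal sketch (the function name time_sentence is made up for this example), here is a script that announces the time and skips the speech step if tts is not installed:

```bash
#!/bin/bash
# Build a sentence announcing a time given as HH:MM (defaults to the current time).
time_sentence() {
    local hhmm="${1:-$(date +%H:%M)}"
    echo "The time is ${hhmm}."
}

# Speak the sentence only when the tts command is available.
if command -v tts >/dev/null 2>&1; then
    tts "$(time_sentence)"
fi
```

Run from cron, something like this would count towards the daily request limit each time it speaks.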

2 thoughts on “Using Voice RSS to make a Raspberry Pi talk”

  1. I think I figured out what the unexpected sounds are. The input -l es-mx is 8 characters, not 6. Voicerss expects all language identifiers to be 4 characters long. So this line of script: sed -r 's/^.{6}//'
    is wrong. That '6' should be an 8: sed -r 's/^.{8}//'

  2. I’ve been using your code to speak in different languages. That means the if statement: [ “$1” == “-l” ] evaluates to true.

    To me it sounds like the first bit of audio is cut off in that case.
    There is a comment in the code that says some sort of filler needs to be inserted because of that cut off.

    The cut off doesn’t seem to be a problem unless -l is specified.

    I don’t understand what filler is getting inserted in the case where -l is not specified so I’m unable to reinsert that filler when language is specified.

    Can you clarify/explain why the cut off occurs when -l is specified and how I can fix that?
