Amazon Polly TTS alternative approach

Like many of you, I’ve struggled to find a decent TTS engine for Sonos after Microsoft ended their TTS service. As a parent of toddlers, I quickly noticed I wasn’t getting notified whenever the kids opened the doors and left the house.

I went looking for a fast, high-quality service and I settled on Amazon Polly. It’s free for the first year and then $4 per million characters after that. Given my use cases are limited to door and gate notifications I expect to pay less than $4/year after the first year.

This solution is external to Vera, but I’ve found it works well and integrates nicely with the VeraAlerts plugin that I was already using.

It uses a Raspberry Pi and a Node.js module named Sonos HTTP API (node-sonos-http-api) by jishi.

This approach completely bypasses the Vera Sonos plug-in but the two play nicely with one another. In VeraAlerts I set up a new “URL” alert and used the syntax: http://RaspberryPi/sayall/{Msg}

As a bonus the Sonos HTTP API locally caches, so routine announcements like door notifications start playing almost instantly.

Hope this info helps someone.

Hi

Is there a way for that service to be called by Vera itself (scene/Lua code/API), so you can send the required text in the required format to your Pi for that Node.js app to play via your Sonos? E.g.:

luup.inet.wget("http://raspberrypi/sayall/{this+is+my+message+Msg}")

[quote]I went looking for a fast, high-quality service and I settled on Amazon Polly.[/quote]

I’d like to say a big “Thank you!” for this info. It took me a couple of hours of not particularly hard work: I set up an AWS account, configured it for credentialed access to Polly, added nodejs and npm 8.x to my Raspbian Jessie Pi, configured the API node, and have since converted all of my spoken system messages to use Polly’s “Joanna” voice. Works great! Thank you! :smiley:

[quote=“parkerc, post:2, topic:196611”]Hi

Is there a way for that service to be called by Vera itself (scene/Lua code/API), so you can send the required text in the required format to your Pi for that Node.js app to play via your Sonos? E.g.:

luup.inet.wget("http://raspberrypi/sayall/{this+is+my+message+Msg}")[/quote]

I’ve had issues with luup.inet.wget() – it seems quirky about how it receives input (urlencoded or not) so I ended up using the socket.http method instead which worked out great:

local http = require("socket.http")
local _, code = http.request("http://192.168.1.xxx:5005/sayall/This is a test of Amazon Polly/Joanna/45")
-- note: the node-sonos-http-api default port is 5005

Good luck! :wink:

Many Thanks @kapstaad

Looking at your code, is it ok for the text part of the http.request to include spaces - does it not need to use ‘+’ instead ?

Also I followed your earlier link to the software’s website but could not find anything about the URL / POST structure to invoke TTS externally (e.g from Vera) - would you be able to direct me to exactly where that section is?

I did something similar using Polly, but instead of cloud-rendering on the fly, I wrote a cache system, so after it speaks a phrase, it’s cached and no re-rendering is needed.
It’s not like saying “the garage door is open” is going to change, so why render every time? Cache it and then play the cached file.
I do really like the whole AWS and Polly system. It’s well developed and has a ton of options.

[quote=“mvader, post:6, topic:196611”]I did something similar using Polly, but instead of cloud-rendering on the fly, I wrote a cache system, so after it speaks a phrase, it’s cached and no re-rendering is needed.
It’s not like saying “the garage door is open” is going to change, so why render every time? Cache it and then play the cached file.
I do really like the whole AWS and Polly system. It’s well developed and has a ton of options.[/quote]
Would you be willing to share more details about how you did this or even some of the code? It sounds interesting. Thanks!
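
For anyone wondering what such a cache might look like, here is a minimal Lua sketch of the idea. It is not mvader’s actual code: the names new_phrase_cache and fake_render are hypothetical, the cache lives only in memory, and the renderer is a stub that stands in for whatever really calls Polly and saves the MP3.

```lua
-- Hypothetical phrase cache: maps "voice:phrase" to an audio file path.
-- render is any function(phrase, voice) that returns a file path.
function new_phrase_cache(render)
  local files = {}
  local cache = { renders = 0 }

  function cache.get(phrase, voice)
    local key = voice .. ":" .. phrase
    if files[key] == nil then
      -- Cache miss: render once and remember the result.
      files[key] = render(phrase, voice)
      cache.renders = cache.renders + 1
    end
    return files[key]
  end

  return cache
end

-- Stub renderer for illustration only; it does not contact AWS.
local counter = 0
local function fake_render(phrase, voice)
  counter = counter + 1
  return "/tmp/tts/" .. voice .. "-" .. counter .. ".mp3"
end

local cache = new_phrase_cache(fake_render)
local first  = cache.get("The garage door is open", "Joanna")  -- rendered
local second = cache.get("The garage door is open", "Joanna")  -- served from the cache
print(first == second, cache.renders)
```

A real version would presumably keep the files on disk and key them on something like a hash of the phrase, so the cache survives a restart.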

Found this online as a way to create standalone (Polly created) mp3s.

[quote=“parkerc, post:5, topic:196611”]Many Thanks @kapstaad

Looking at your code, is it ok for the text part of the http.request to include spaces - does it not need to use ‘+’ instead ?[/quote]

The text part can contain spaces if you’re using http.request; it works, so I’d guess (too lazy/busy to check docs :P) that http.request does its own urlencoding internally.
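
If you would rather not rely on that guess, the phrase can be percent-encoded explicitly before it goes into the URL. This is a small sketch in plain Lua; the function name urlencode is my own and is not part of LuaSocket or luup.

```lua
-- Percent-encode every byte except the RFC 3986 unreserved characters
-- (letters, digits, "-", ".", "_" and "~").
function urlencode(text)
  return (text:gsub("[^%w%-%._~]", function(ch)
    return string.format("%%%02X", ch:byte())
  end))
end

print(urlencode("The garage door is open"))
```

The encoded string can then be dropped into the /sayall/ or /say/ URL in place of the raw phrase, which should behave the same whether it is fetched with http.request or luup.inet.wget.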

[quote=“parkerc, post:5, topic:196611”]Also I followed your earlier link to the software’s website but could not find anything about the URL / POST structure to invoke TTS externally (e.g from Vera) - would you be able to direct me to exactly where that section is?[/quote]

Documentation for the Node.js widget I’m using is here:

[url=https://github.com/jishi/node-sonos-http-api#say-tts-support]GitHub - jishi/node-sonos-http-api: An HTTP API bridge for Sonos easing automation. Hostable on any node.js capable device, like a raspberry pi or similar.[/url]

Here’s the lua code I’m using:

function Polly(phrase, vol, zone, voice)

  local http    = require "socket.http"
  http.TIMEOUT  = 3                -- seconds
  local server  = "192.168.1.xxx"  -- change to the local IP of the machine running node-sonos-http-api
  local port    = "5005"

  if(voice == nil) then voice = "Joanna" end
  if(vol == nil) then vol = 50 end
  -- no zone (or "ALL") plays everywhere; otherwise target the named zone
  if((zone == nil) or (zone == "ALL")) then zone = "sayall" else zone = zone .. "/say" end
  local polly = "http://" .. server .. ":" .. port
  local purl  = polly .. "/" .. zone .. "/" .. phrase .. "/" .. voice .. "/" .. vol

  local _, code = http.request(purl)

  if(code ~= 200) then
    -- on a connection failure, code holds an error string rather than a number
    luup.log("Call to Polly failed: result code (" .. tostring(code) .. ")")
  end

end

Example:

Polly("This is a test of Polly, the Amazon speech engine.", 60) -- plays in all zones at 60% volume

When called, the function assembles the following URL and calls it:

http://192.168.1.xxx:5005/sayall/This is a test of Polly, the Amazon speech engine./Joanna/60

Hope this helps :slight_smile:

I have this working now. Running the node-sonos-http-api in a Docker container on my QNAP NAS.

I do like the caching feature, makes repeat standard announcements very snappy. The Polly voices are a bit underwhelming considering there’s a similar technology underpinning Alexa.

One handy tip for using VeraAlerts with the URL profile. Make sure you delete the “undefined” field for MessagePrefix. I was getting nowhere with VeraAlerts triggering the sonos api until I specifically blanked that field.

Hi @Spanners

I like the idea of running it as Docker and I have a QNAP too…

To help me, would you be able to share the ‘Advanced Settings’ that you’ve used? E.g. do you run it as NAT, Bridge, or Host? I’m not too familiar with Container Station.

Update: I’ve installed it under network settings ‘NAT’, which makes the service available via “192.168.1.111:32768”, which I think then forwards to port 5005 based on what the console is showing. Not sure that is correct…

Update 2: OK, found out it will not work using NAT, so I have changed to ‘Bridge’, which allows it to get its own IP and use port 5005. (I understand this is because it requires multicast traffic as well as inbound and outbound connections.)

@Spanners - where do you go to add/edit the files required to set up Polly with the Docker container? E.g. the location of ‘settings.json’?

Hi @parkerc

I have the node-sonos-http-api running in a container using Bridge networking. I have ha-bridge running in another container on Host, but plan to move it to Bridge when I get the time.

To manipulate the files in the containers, I didn’t find an easy way to do it (QNAP dropped the ball a bit here I feel with Container Station). What you need to do is SSH into your QNAP and use these commands:

docker ps

That will give you the unique ID of the containers running. Then use

docker cp

to copy files in and out of the container. Recommend using the Public share on your QNAP, as containers have limited access. I used:

docker cp dccd7fa3c88e:/app/settings.json /share/Public/
docker cp /share/Public/settings.json dccd7fa3c88e:/app/

To copy the file out, edit it on the share and copy it back in. I guess you could use the advanced container settings to mount a volume you can externally access or something, but haven’t explored it.

Thanks so much @spanners

I have ha bridge and Home Assistant dockers running in ‘bridge’ mode and ‘host’ mode respectively. Containers just seem a great way to add extra functionality (but the hosting technology is not easy to get your head around).

FYI - I did a file search last night across my entire Qnap for key files e.g server.js (in the hope I might find the location where the source files for this particular docker are kept) - and I think I may have found something…

find . -name server.js

It took me to a location that I seem to recall had been set up as my virtual container/VM storage location, which was a share called… Backups/VM/

share/Backups/VM/container_station_data/lib/docker/devicemapper/mnt/08abcderfghitygh45444665ggf/rootfs/app/

… I edited the index.html file in the /static/ subfolder of the above and that updated the Sonos Http API web page that you access via http://Your_IP:5005

I would be interested to hear if you think that is the location where the Sonos HTTP API docker’s install files are stored?

Also please could you share a redacted copy of your settings.json file ?

Success! A Sunday morning well spent.

I have Polly up and running; I’m able to upload files directly to the container via the location I mentioned above. Rather than use Lua code to construct the required URL, you can call the URL directly, since the app knows what to do with it. The following seems to do the job (the IP is the one for your sonos-http-api server/Docker container):

This is for all zones

http://192.168.1.167:5005/sayall/This%20is%20a%20test%20of%20Polly,%20the%20Amazon%20speech%20engine./Joanna/60

This is for just one zone (kitchen)

http://192.168.1.167:5005/kitchen/say/This%20is%20a%20test%20of%20Polly,%20the%20Amazon%20speech%20engine./Joanna/60
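
For anyone who still wants to build these two URL shapes from Lua, here is a rough sketch. The function name polly_url and its argument order are my own invention, not part of node-sonos-http-api, and the defaults (Joanna, volume 50) are simply carried over from the Polly() function earlier in the thread. It encodes the zone name as well, so a zone such as “living room” should also work.

```lua
-- Hypothetical helper: builds a /sayall/ URL when no zone is given
-- (or zone is "ALL"), and a /<zone>/say/ URL otherwise.
function polly_url(host, port, phrase, voice, vol, zone)
  -- Percent-encode everything except RFC 3986 unreserved characters.
  local function enc(s)
    return (tostring(s):gsub("[^%w%-%._~]", function(ch)
      return string.format("%%%02X", ch:byte())
    end))
  end

  local action = "sayall"
  if zone ~= nil and zone ~= "ALL" then
    action = enc(zone) .. "/say"
  end

  return "http://" .. host .. ":" .. tostring(port) .. "/" .. action ..
         "/" .. enc(phrase) .. "/" .. enc(voice or "Joanna") ..
         "/" .. tostring(vol or 50)
end

print(polly_url("192.168.1.167", 5005, "Door open", "Joanna", 60))
print(polly_url("192.168.1.167", 5005, "Door open", "Joanna", 60, "kitchen"))
```

Note that this encoder is stricter than the hand-written URLs above: it also escapes commas, which the server should decode the same way.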

My settings.json is here

{
  "aws": {
    "credentials": {
      "region": "eu-west-1",
      "accessKeyId": "ADRT14NOTMINE1727/23",
      "secretAccessKey": "idnenshdbhebxiukdbeanyrhing/ezuT7ZSag"
    },
    "name": "Joanna"
  }
}

Which is placed here…

./share/CACHEDEV1_DATA/Backups/VM/container-station-data/lib/docker/devicemapper/mnt/994asastart08abcderfghitygh45444665ggf/rootfs/app/settings.json

Thank you so much for posting your Amazon Polly credentials so we can use it for free. 8)

Seriously though, you may want to blank those out ASAP to avoid possible abuse.

It’s ok @BOFH, I wouldn’t do that. they are just dummy credentials :slight_smile:

You’ll notice the words ‘NOT MINE’ hidden in the access key ID.

However, if it still works - BONUS !!

Nice one! I missed that…

Hi @kapstaad

Using your code as a global function, I can’t seem to get it to play in just one zone (my office) ? Can you ? The code I used is as follows.

Polly("This is a test of Polly, the Amazon speech engine.", 60, Office") 

I thought this should play in my office at 60% volume, but it’s not. Everything I’ve tried seems to default the http.request URL to “/sayall/” rather than /office/say/ ?

[quote=“parkerc, post:18, topic:196611”]Hi @kapstaad

Using your code as a global function, I can’t seem to get it to play in just one zone (my office) ? Can you ? The code I used is as follows.

Polly("This is a test of Polly, the Amazon speech engine.", 60, Office") 

I thought this should play in my office at 60% volume, but it’s not. Everything I’ve tried seems to default the http.request URL to “/sayall/” rather than /office/say/ ?[/quote]

Two possibilities; first, do you definitely have a Sonos unit called “Office”?

Second, it might be as simple as a typo… you’re missing a quote mark before the word “Office” in the code snippet:

Polly("This is a test of Polly, the Amazon speech engine.", 60, "Office") 
                                                                ^--- this quote mark is missing in your example.

Good luck!

Thanks @kapstaad

That was it, the name of the zone needed to be in quotes - more examples below :slight_smile:

[code]Polly("This is a test of Polly, the Amazon speech engine.", 60, "office")

Polly("This is a test of Polly, the Amazon speech engine.", 60, "living room")

Polly("This is a test of Polly, the Amazon speech engine.", 60, "kitchen")[/code]

What I have noticed as I walked through each of the above is that on the first request made to a zone, Polly does not say the whole of the submitted text. But if I try it again, it does.

Have you noticed this as you switch between specific zones one at a time?