Text To Speech

Converts text to speech audio using the Google Cloud Text-to-Speech API.

Common Properties

Name - The custom name of the node.
Color - The custom color of the node.
Delay Before (sec) - Waits in seconds before executing the node.
Delay After (sec) - Waits in seconds after executing the node.
Continue On Error - Automation will continue regardless of any error. The default value is false.

Info: If the ContinueOnError property is true, no error is caught when the project is executed, even if a Catch node is used.

Text - The text content to convert to speech. Supports both plain text and SSML (Speech Synthesis Markup Language) formats.
Path - The file path where the generated audio file will be saved. If empty, the file will be saved to a temporary path.

Audio Encoding - The audio file format for the generated speech:
- Wav - Uncompressed WAV audio format
- Mp3 - Compressed MP3 audio format
- Ogg Opus - Compressed OGG Opus audio format
Language Code - The BCP-47 language code of the desired language for the speech output. Default is en-US. See Google Cloud Text-to-Speech documentation for supported languages.
Voice Name - The name of the voice to be used for speech synthesis. Default is en-US-Studio-M. See Google Cloud Text-to-Speech documentation for available voices.
Sample Rate - The sample rate (in Hertz) for the generated audio. Default is 16000.
Credentials - Google Cloud credentials used to authenticate with the Text-to-Speech API.
SSML Text - If enabled, indicates that the input text is in SSML format rather than plain text.
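
As an illustration of the SSML option, the sketch below builds a small SSML string in Python. The `build_ssml` helper is hypothetical and not part of the node; `<speak>` and `<break>` are standard SSML elements. The resulting string is the kind of value that would go into the Text input with SSML Text enabled.

```python
from xml.sax.saxutils import escape


def build_ssml(sentences, pause_ms=500):
    """Join sentences into one SSML document with a pause between them.

    Hypothetical helper for illustration; not part of the node itself.
    """
    # Escape &, < and > so plain text cannot break the SSML markup.
    parts = [escape(s) for s in sentences]
    pause = f'<break time="{pause_ms}ms"/>'
    return "<speak>" + pause.join(parts) + "</speak>"


ssml = build_ssml(["Your order has shipped.", "Thanks for shopping at R&D Supplies."])
print(ssml)
```

Escaping matters because characters such as `&` are reserved in SSML, so raw user text should not be pasted between the tags unescaped.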

The Text to Speech node converts text to audio using the Google Cloud Text-to-Speech API. When executed, the node:

Validates the provided text input and file path
Authenticates with the Google Cloud Text-to-Speech API using the provided credentials
Configures the synthesis parameters (language, voice, encoding, sample rate, etc.)
Processes the text using either plain text or SSML input
Generates the audio file and saves it to the specified path
Returns the path to the generated audio file
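
The steps above can be sketched roughly as follows. This is not the node's actual implementation, which this page does not show. It is a minimal Python sketch that assumes the request and response shape of the public Text-to-Speech REST method `text:synthesize` (`input`, `voice`, `audioConfig`, and a base64 `audioContent` field in the response). The helper names `build_request` and `save_audio`, the `ENCODINGS` table, and the file extensions are hypothetical. The authenticated network call is left out so the sketch runs offline.

```python
import base64
import os
import tempfile

# Node option -> (API AudioEncoding name, file extension).
# The extensions are an assumption made for this sketch.
ENCODINGS = {
    "Wav": ("LINEAR16", ".wav"),
    "Mp3": ("MP3", ".mp3"),
    "Ogg Opus": ("OGG_OPUS", ".ogg"),
}


def build_request(text, encoding, ssml=False, language_code="en-US",
                  voice_name="en-US-Studio-M", sample_rate=16000):
    """Validate the inputs and build a text:synthesize request body."""
    if not text or not text.strip():
        raise ValueError("Text must not be empty")
    if encoding not in ENCODINGS:
        raise ValueError(f"Unsupported audio encoding: {encoding}")
    return {
        # SSML and plain text use different keys in the input object.
        "input": {"ssml": text} if ssml else {"text": text},
        "voice": {"languageCode": language_code, "name": voice_name},
        "audioConfig": {
            "audioEncoding": ENCODINGS[encoding][0],
            "sampleRateHertz": sample_rate,
        },
    }


def save_audio(response, encoding, path=""):
    """Decode the base64 audio and write it to path, returning that path.

    Falls back to a temporary file when path is empty.
    """
    audio = base64.b64decode(response["audioContent"])
    if not path:
        fd, path = tempfile.mkstemp(suffix=ENCODINGS[encoding][1])
        os.close(fd)
    with open(path, "wb") as f:
        f.write(audio)
    return path


body = build_request("Hello, world.", "Mp3")
# A real run would send `body` to the API with valid credentials.
# A stand-in response is used here instead.
fake_response = {"audioContent": base64.b64encode(b"not real audio").decode()}
out_path = save_audio(fake_response, "Mp3")
print(out_path)
```

Keeping validation and request building separate from file writing mirrors the order of the steps and makes each part easy to check on its own.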

The node will return specific errors in the following cases:

Usage Notes

The Text input supports both plain text and SSML formats for enhanced speech control
When using SSML, enable the "SSML Text" option to ensure proper processing
Different language codes support different voices and features
Voice names determine the specific voice characteristics (gender, tone, accent)
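Audio encoding affects file size and quality:
- WAV is uncompressed and provides the highest quality but the largest file size
- MP3 is compressed and provides good quality with a smaller file size
- OGG Opus is compressed and provides excellent quality with efficient compression
The Sample Rate affects audio quality and file size: higher rates capture more detail but generally produce larger files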
If no Path is specified, the audio file is saved to a temporary location
The output Path provides the location of the generated audio file for further processing
Supported languages and voices can be found in the Google Cloud Text-to-Speech documentation
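
To make the effect of Sample Rate on file size concrete, the sketch below estimates the size of an uncompressed WAV file. It assumes 16-bit mono PCM with a 44-byte header, which is an assumption about the output rather than something this page states. The `wav_size_bytes` function is a hypothetical helper. MP3 and OGG Opus sizes depend on the encoder's bitrate and are not estimated here.

```python
def wav_size_bytes(duration_sec, sample_rate=16000, bits_per_sample=16,
                   channels=1, header_bytes=44):
    """Estimate the size of an uncompressed PCM WAV file in bytes."""
    samples = int(duration_sec * sample_rate)
    data_bytes = samples * (bits_per_sample // 8) * channels
    return header_bytes + data_bytes


# Ten seconds of speech at several sample rates.
for rate in (8000, 16000, 24000, 48000):
    print(rate, wav_size_bytes(10, rate))
```

Under these assumptions, ten seconds at the default 16000 Hz comes to 320044 bytes, and tripling the rate to 48000 Hz roughly triples the file size.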