Text to Speech
Converts text to natural-sounding speech audio using ElevenLabs AI's advanced voice synthesis technology.
Common Properties
- Name - The custom name of the node.
- Color - The custom color of the node.
- Delay Before (sec) - Number of seconds to wait before executing the node.
- Delay After (sec) - Number of seconds to wait after executing the node.
- Continue On Error - Automation will continue regardless of any error. The default value is false.
Inputs
- Connection Id (String) - Connection ID from the Connect node. Optional if you provide API Key directly.
- Save Path (String) - File path where the generated audio will be saved (e.g., "output/speech.mp3").
- Voice ID (String) - ID of the voice to use for speech synthesis. Get voice IDs using the Get Voices node.
- Text (String) - The text content to convert to speech.
Options
- Stability (String) - Voice stability value from 0.0 to 1.0 (default: 0.5). Higher values make the voice more consistent and predictable.
- Similarity Boost (String) - Voice similarity boost from 0.0 to 1.0 (default: 0.75). Higher values make the voice more similar to the original voice sample.
- Model - Select the speech synthesis model:
  - Eleven Monolingual v1 - Optimized for English only
  - Eleven Multilingual v1 - Supports multiple languages
- API Key - Your ElevenLabs AI API key. Optional if using Connection ID.
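As a rough illustration of how these options might map onto a request, the sketch below builds a JSON body in the shape used by the public ElevenLabs text-to-speech REST API (`text`, `model_id`, `voice_settings`). The helper name `build_tts_payload` and the label-to-ID mapping are assumptions made for this example; the node's internal implementation is not documented here.

```python
# Assumed mapping from the node's Model option labels to ElevenLabs model IDs.
MODEL_IDS = {
    "Eleven Monolingual v1": "eleven_monolingual_v1",
    "Eleven Multilingual v1": "eleven_multilingual_v1",
}


def build_tts_payload(text, model, stability="0.5", similarity_boost="0.75"):
    """Build a request body from the node's inputs and options (hypothetical helper)."""
    return {
        "text": text,
        "model_id": MODEL_IDS[model],
        "voice_settings": {
            # The node accepts these options as strings, so convert them to floats.
            "stability": float(stability),
            "similarity_boost": float(similarity_boost),
        },
    }
```

The defaults for `stability` and `similarity_boost` mirror the documented defaults of 0.5 and 0.75.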
Outputs
This node does not have outputs. The audio file is saved to the specified path.
How It Works
The Text to Speech node generates audio from text using ElevenLabs AI. When executed, the node:
- Validates all required inputs (save path, voice ID, text)
- Checks that stability and similarity boost values are valid decimals between 0.0 and 1.0
- Uses the provided connection or, if none is given, creates a new client with the API key supplied directly
- Calls the ElevenLabs text-to-speech API with the specified parameters
- Streams the generated audio chunks and saves them to the specified file path
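The last step above, writing streamed chunks to disk, can be sketched as follows. This is a minimal illustration and not the node's actual code: `save_audio_chunks` is a hypothetical helper, and `chunks` stands in for whatever iterable of byte strings the API client yields.

```python
from pathlib import Path
from typing import Iterable


def save_audio_chunks(chunks: Iterable[bytes], save_path: str) -> int:
    """Write audio chunks to save_path as they arrive; return total bytes written."""
    total = 0
    with Path(save_path).open("wb") as out:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                out.write(chunk)
                total += len(chunk)
    return total
```

Because each chunk is written as soon as it is received, only one chunk is held in memory at a time, regardless of how long the generated audio is.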
Requirements
- Valid ElevenLabs API key (via Connect node or direct option)
- Valid voice ID from ElevenLabs
- Text to convert
- Writable file path for saving audio
- Stability and similarity boost must be decimal numbers between 0.0 and 1.0
Error Handling
The node will return specific errors in the following cases:
- Missing save path - "Save Path cannot be empty. Please specify a file path to save the audio."
- Missing voice ID - "Voice ID cannot be empty. Please provide a valid ElevenLabs voice ID."
- Missing text - "Text cannot be empty. Please provide the text to convert to speech."
- Missing stability - "Stability cannot be empty. Please provide a value between 0.0 and 1.0."
- Invalid stability format - "Stability must be a valid decimal number between 0.0 and 1.0."
- Missing similarity boost - "Similarity Boost cannot be empty. Please provide a value between 0.0 and 1.0."
- Invalid similarity boost format - "Similarity Boost must be a valid decimal number between 0.0 and 1.0."
- Generation failure - "Failed to generate speech: [error details]"
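The validation errors above could be produced by checks along these lines. This is a sketch under assumptions: the function name `validate_inputs` is hypothetical, and the order of checks simply follows the order of the list, which may differ from the node's real behavior. Only the message strings are taken from this page.

```python
def validate_inputs(save_path, voice_id, text, stability, similarity_boost):
    """Return the first matching error message, or None if all inputs are valid."""
    required = [
        (save_path, "Save Path cannot be empty. Please specify a file path to save the audio."),
        (voice_id, "Voice ID cannot be empty. Please provide a valid ElevenLabs voice ID."),
        (text, "Text cannot be empty. Please provide the text to convert to speech."),
    ]
    for value, message in required:
        if not value or not value.strip():
            return message

    for name, raw in (("Stability", stability), ("Similarity Boost", similarity_boost)):
        if not raw or not raw.strip():
            return f"{name} cannot be empty. Please provide a value between 0.0 and 1.0."
        try:
            number = float(raw)
        except ValueError:
            number = None
        # Non-numeric and out-of-range values get the same "invalid format" message.
        if number is None or not 0.0 <= number <= 1.0:
            return f"{name} must be a valid decimal number between 0.0 and 1.0."
    return None
```

Returning a message instead of raising keeps the sketch easy to test; a real node would surface the message as a node error.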
Usage Notes
- Stability controls voice consistency: higher values (0.7-1.0) produce a steadier, more predictable voice, while lower values (0.0-0.3) add more variation and emotion
- Similarity Boost controls how closely the output matches the original voice: higher values stay closer to the training samples
- The Monolingual v1 model is optimized for English content and may be slightly faster
- The Multilingual v1 model supports multiple languages but may be slightly slower
- Audio is streamed in chunks and written to disk, making it memory-efficient for long text
- The generated audio format depends on the ElevenLabs API default (typically MP3)
- Make sure the save path directory exists before running the node
- Voice IDs can be obtained from the Get Voices or Get Voice nodes
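If the save path is prepared by a script step before the node runs, the directory can be created ahead of time. The sketch below uses Python's standard `pathlib`; the helper name `ensure_parent_dir` is hypothetical.

```python
from pathlib import Path


def ensure_parent_dir(save_path: str) -> Path:
    """Create the parent directory of save_path if it is missing; return it."""
    parent = Path(save_path).parent
    # parents=True creates intermediate folders; exist_ok=True makes the call safe to repeat.
    parent.mkdir(parents=True, exist_ok=True)
    return parent
```

For example, `ensure_parent_dir("output/speech.mp3")` creates the `output` folder relative to the current working directory if it does not already exist.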
Example Use Cases
- Creating voiceovers for videos or presentations
- Generating audio narration for articles or blog posts
- Building voice assistants and chatbots
- Creating audiobooks from text content
- Generating automated phone system messages
- Converting product descriptions to audio for accessibility