Extract Text

Extracts all text content from HTML elements, removing HTML tags and returning plain text.

Common Properties

Name - The custom name of the node.
Color - The custom color of the node.
Delay Before (sec) - Time to wait, in seconds, before executing the node.
Delay After (sec) - Time to wait, in seconds, after executing the node.
Continue On Error - Automation will continue regardless of any error. The default value is false.

info

If the ContinueOnError property is true, no error is caught when the project is executed, even if a Catch node is used.

This node does not have any options.

The Extract Text node parses HTML content and extracts all text content while removing HTML tags and markup. The node returns specific errors in certain cases.

When executed, the node:

Extracts text from all elements within the provided HTML, including nested elements.
Removes all HTML tags, scripts, and styles.
Preserves text content and basic whitespace.
Does not preserve the original formatting or structure of the HTML.
Does not decode HTML entities; they appear as-is in the output.
Keeps the extracted text in the order the content appears in the HTML.
Handles complex HTML structures with multiple nested elements.
Works with partial HTML fragments as well as complete HTML documents.

The node is useful for cleaning HTML content into plain text for web scraping, content analysis, and other text processing tasks. The output text can be further processed by other text manipulation nodes.
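
The behavior described above can be approximated outside the node with a short Python sketch. This is not the node's actual implementation; it only illustrates the same rules (drop tags, skip script and style contents, keep document order, leave entities undecoded) using the standard library's `html.parser`. The names `_TextCollector` and `extract_text` are hypothetical and chosen for this example.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Hypothetical helper: collects text, skipping <script>/<style> contents."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        # convert_charrefs=False leaves entities such as &amp; undecoded,
        # mirroring the "does not decode HTML entities" behavior.
        super().__init__(convert_charrefs=False)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Text nodes arrive in document order; keep them unless inside a skipped tag.
        if self._skip_depth == 0:
            self.parts.append(data)

    def handle_entityref(self, name):
        # Re-emit named entities exactly as written, e.g. &amp;
        if self._skip_depth == 0:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        # Re-emit numeric entities exactly as written, e.g. &#169;
        if self._skip_depth == 0:
            self.parts.append(f"&#{name};")


def extract_text(html: str) -> str:
    """Return the text of an HTML document or fragment with tags removed."""
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

For example, `extract_text("<p>A &amp; B</p>")` returns `A &amp; B`, and a fragment such as `Hello <b>World</b>!` yields `Hello World!`. Whitespace is passed through unchanged in this sketch, so the node's own whitespace handling may differ.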