Count Words

Calculates the frequency of words in a text and returns statistics including count and ratio for each word.

Common Properties

  • Name - The custom name of the node.
  • Color - The custom color of the node.
  • Delay Before (sec) - Waits in seconds before executing the node.
  • Delay After (sec) - Waits in seconds after executing the node.
  • Continue On Error - Automation will continue regardless of any error. The default value is false.
Info: If the ContinueOnError property is true, no error is caught when the project is executed, even if a Catch node is used.

Inputs

  • Text - The input text string to analyze for word frequency.

Options

This node does not have any options.

Output

  • Result - An array of objects containing word frequency statistics, sorted by count in descending order. Each object includes:
    • word - The normalized word (lowercase)
    • count - The number of occurrences of the word
    • ratio - The word's share of the total word count as a percentage, calculated as (count / total words) * 100 and formatted to 2 decimal places
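
To make the shape concrete, here is a hand-worked illustration in Python of what Result could look like for the input "A a a, b B c." under the rules described on this page. The CountWordsRow type name is hypothetical, and storing ratio as a number rather than a formatted string is an assumption; the node's actual serialization may differ.

```python
from typing import TypedDict


class CountWordsRow(TypedDict):
    """Hypothetical type for one entry of Result."""
    word: str     # lowercase form of the word
    count: int    # number of occurrences
    ratio: float  # assumption: numeric percentage; the node may emit a string


# Hand-worked for the input "A a a, b B c." (6 words in total).
example_result: list[CountWordsRow] = [
    {"word": "a", "count": 3, "ratio": 50.0},   # 3 / 6 * 100
    {"word": "b", "count": 2, "ratio": 33.33},  # 2 / 6 * 100
    {"word": "c", "count": 1, "ratio": 16.67},  # 1 / 6 * 100
]
```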

How It Works

The Count Words node analyzes a text input and calculates the frequency of each word. When executed, the node:

  1. Retrieves the Text input variable
  2. Validates that the text is not empty
  3. Extracts words from the text using the regular expression [\p{L}\p{N}_]+, which matches one or more consecutive:
    • Letters from any language (\p{L})
    • Numbers (\p{N})
    • Underscore characters (_)
  4. Checks that at least one word was found in the text
  5. Normalizes all words to lowercase to ensure consistent counting
  6. Creates a frequency map counting occurrences of each normalized word
  7. Calculates the ratio of each word as a percentage of the total word count
  8. Converts the frequency map to an array of CountWordsData objects
  9. Sorts the array by word count in descending order
  10. Sets the sorted array as the output variable
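
The steps above can be sketched in Python roughly as follows. This is an illustration, not the node's actual source: the count_words name is hypothetical, ratio is kept as a rounded number (the node may format it as a string), and because Python's built-in re module does not support \p{L} or \p{N}, the sketch uses \w+ (Unicode letters, digits, and underscore) as a close approximation of [\p{L}\p{N}_]+.

```python
import re
from collections import Counter

# Python's built-in `re` has no \p{L}/\p{N} classes; \w (letters, digits and
# underscore in Unicode mode) is used here as a close stand-in.
WORD_PATTERN = re.compile(r"\w+")


def count_words(text: str) -> list[dict]:
    """Return per-word statistics sorted by count, highest first."""
    if not text:
        raise ValueError("Text input cannot be empty")

    words = WORD_PATTERN.findall(text)
    if not words:
        raise ValueError("No words found in the input text")

    # Lowercase so that "The" and "the" are counted together.
    counts = Counter(word.lower() for word in words)
    total = len(words)

    rows = [
        {"word": word, "count": count, "ratio": round(count / total * 100, 2)}
        for word, count in counts.items()
    ]
    # The sort is stable, so ties keep first-seen order here; how the node
    # itself orders ties is not documented.
    rows.sort(key=lambda row: row["count"], reverse=True)
    return rows
```

For example, count_words("The cat and the hat") finds five words, so its first entry is {'word': 'the', 'count': 2, 'ratio': 40.0}.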

Requirements

  • A non-empty text input string
  • The text must contain at least one word that matches the word pattern

Error Handling

The node will return specific errors in the following cases:

  • Empty or invalid Text input - "Text input cannot be empty"
  • No words found in the input text - "No words found in the input text"
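
A minimal Python sketch of these two checks is shown below. The extract_words helper is hypothetical, raising ValueError is just one way to surface the documented messages, and treating any non-string value as "empty or invalid" is an assumption. As Python's re module lacks \p{L} and \p{N}, \w+ stands in for the node's pattern.

```python
import re


def extract_words(text: object) -> list[str]:
    """Hypothetical helper mirroring the node's two validation checks."""
    # Assumption: anything that is not a non-empty string counts as
    # "empty or invalid".
    if not isinstance(text, str) or text == "":
        raise ValueError("Text input cannot be empty")

    # \w+ approximates [\p{L}\p{N}_]+, which Python's `re` cannot express.
    words = re.findall(r"\w+", text)
    if not words:
        raise ValueError("No words found in the input text")
    return words


# An empty string fails the first check; punctuation-only text fails the second.
for bad_input in ("", "?! ..."):
    try:
        extract_words(bad_input)
    except ValueError as err:
        print(f"{bad_input!r}: {err}")
```

In a flow, the equivalent would be catching the node's error with a Catch node, or setting Continue On Error, depending on whether the failure should stop the automation.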

Usage Notes

  • Words are normalized to lowercase for consistent counting
  • The regular expression [\p{L}\p{N}_]+ is used to extract words, which supports international characters and numbers
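  • Punctuation, whitespace, and other special characters are never part of a word; they act as separators, so a contraction such as don't is extracted as two words, don and t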
  • The output is sorted by word count in descending order, with the most frequent words appearing first
  • Each word's ratio is calculated as (word count / total words) * 100 and formatted to 2 decimal places
  • Useful for text analysis, keyword extraction, and content analysis tasks
  • The node can handle text in any language whose words consist of Unicode letters and numbers and are separated by spaces or punctuation
  • The output array can be processed by subsequent nodes for further analysis or filtering
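
As an illustration of such downstream processing, the Python sketch below filters rows shaped like the node's Result to pull out keywords. The top_keywords helper, its parameters, and the sample rows are all hypothetical; in a real flow the rows would come from the Result output variable.

```python
def top_keywords(rows, stop_words=frozenset(), min_count=1, limit=None):
    """Filter rows shaped like the node's Result (hypothetical helper)."""
    kept = [
        row for row in rows
        if row["word"] not in stop_words and row["count"] >= min_count
    ]
    # The input is already sorted by count, so slicing keeps the top entries.
    return kept if limit is None else kept[:limit]


# Sample rows written by hand in the documented shape (10 words in total).
rows = [
    {"word": "the", "count": 4, "ratio": 40.0},
    {"word": "robot", "count": 3, "ratio": 30.0},
    {"word": "runs", "count": 2, "ratio": 20.0},
    {"word": "fast", "count": 1, "ratio": 10.0},
]

keywords = top_keywords(rows, stop_words={"the"}, min_count=2)
print([row["word"] for row in keywords])  # ['robot', 'runs']
```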