Moderate

Check text content for policy violations using OpenAI's moderation models to detect harmful or inappropriate content.

Common Properties

  • Name - The custom name of the node.
  • Color - The custom color of the node.
  • Delay Before (sec) - Waits the given number of seconds before executing the node.
  • Delay After (sec) - Waits the given number of seconds after executing the node.
  • Continue On Error - The automation continues even if this node fails. Default: false.

Inputs

  • Connection Id - Connection identifier from Connect node.
  • Text - Text content to check for policy violations.
  • Use Robomotion AI Credits - Use Robomotion credits instead of your own API key.

Options

  • Model - Moderation model:
    • Omni Moderation (Latest) - Most accurate, multimodal (default)
    • Omni Moderation (2024-09-26) - Specific version
    • Text Moderation (Latest) - Text-only, auto-updates
    • Text Moderation (Stable) - Text-only, stable version
  • Include Raw Response - Include category scores. Default: false.

Outputs

  • Flagged - Boolean indicating if content was flagged for any category.
  • Categories - Map of category names to flagged status (true/false).
  • Raw Response - Full response with category scores (when enabled).

How It Works

The node analyzes the input text for policy violations:

  1. Validates the connection and input text
  2. Sends the text to the selected moderation model
  3. Checks the content against policy categories
  4. Returns the flagged status and per-category results
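
The node wraps OpenAI's moderation endpoint. Below is a minimal sketch of the equivalent call using the official OpenAI Python SDK; the model name and sample text mirror the examples later in this page, and this illustrates the underlying API rather than the node's internal implementation:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.moderations.create(
    model="omni-moderation-latest",
    input="This is a normal, safe comment.",
)

result = response.results[0]
print(result.flagged)          # False for safe content
print(result.categories.hate)  # per-category flags, e.g. False
```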

Moderation Categories

The model checks for these categories:

  • hate - Hateful content
  • hate/threatening - Hateful content with threats
  • harassment - Harassment
  • harassment/threatening - Harassment with threats
  • self-harm - Self-harm content
  • self-harm/intent - Intent to self-harm
  • self-harm/instructions - Instructions for self-harm
  • sexual - Sexual content
  • sexual/minors - Sexual content involving minors
  • violence - Violent content
  • violence/graphic - Graphic violent content
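
The Categories output maps each of these names to a boolean. A small sketch, assuming the map arrives as a plain JSON object/dict, that collects only the flagged category names for logging or routing:

```python
# "Categories" output from the node, assumed to be a dict of booleans
categories = {
    "hate": True,
    "hate/threatening": False,
    "harassment": False,
    "violence": False,
}

# Keep only the names that were actually flagged
flagged_categories = sorted(name for name, flagged in categories.items() if flagged)
print(flagged_categories)  # ['hate']
```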

Usage Examples

Example 1: Check User-Generated Content

Input:
- Text: "This is a normal, safe comment."
- Model: omni-moderation-latest

Output:
- Flagged: false
- Categories: {
    "hate": false,
    "harassment": false,
    "sexual": false,
    "violence": false,
    ...
  }

Example 2: Detect Inappropriate Content

Input:
- Text: "[inappropriate content]"
- Model: omni-moderation-latest

Output:
- Flagged: true
- Categories: {
    "hate": true,
    "harassment": false,
    ...
  }

Example 3: With Category Scores

Input:
- Text: "Borderline content"
- Include Raw Response: true

Output:
- Flagged: false
- Raw Response: {
    "category_scores": {
      "hate": 0.02,
      "violence": 0.01,
      ...
    }
  }

Use Cases

  • Content Moderation: Filter user-generated content before publishing
  • Chat Safety: Monitor chatbot conversations for inappropriate content
  • Form Validation: Check form submissions for policy violations
  • Compliance: Ensure content meets community guidelines

Requirements

  • Connection Id from Connect node
  • Non-empty text to check

Tips for RPA Developers

  • Preventive: Use before sending content to AI models or publishing user content
  • Category-Specific: Check specific categories based on your use case
  • Thresholds: Raw response includes per-category scores (0-1), so you can apply your own cutoffs; see the sketch after this list
  • Speed: Very fast, suitable for real-time moderation
  • Privacy: OpenAI does not use moderation API data for training
  • Automation: Integrate with content workflows to auto-reject flagged content
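
For the thresholds tip above, here is a sketch that applies custom per-category cutoffs to the scores from Raw Response; the threshold values and the `exceeds_threshold` helper are illustrative, not part of the node:

```python
# Hypothetical per-category cutoffs; tune these for your use case
THRESHOLDS = {
    "hate": 0.10,
    "violence": 0.20,
}

def exceeds_threshold(category_scores: dict) -> bool:
    """Return True if any score crosses its custom cutoff."""
    return any(
        category_scores.get(category, 0.0) >= cutoff
        for category, cutoff in THRESHOLDS.items()
    )

print(exceeds_threshold({"hate": 0.02, "violence": 0.01}))  # False
print(exceeds_threshold({"hate": 0.15, "violence": 0.01}))  # True
```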

Common Errors

"Text cannot be empty"

  • Provide text content to check
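
A minimal guard, assuming the text comes from an upstream variable (the name `user_input` is hypothetical), that avoids this error by skipping blank or whitespace-only input before calling the node:

```python
user_input = "   "  # hypothetical upstream value

text = (user_input or "").strip()
if not text:
    # Skip moderation instead of sending an empty string to the node
    print("No text to moderate; skipping.")
```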