
Chat Completion

Generates chat responses with conversation history support, enabling multi-turn conversational AI applications through OpenRouter.

Common Properties

  • Name - The custom name of the node.
  • Color - The custom color of the node.
  • Delay Before (sec) - Waits the specified number of seconds before executing the node.
  • Delay After (sec) - Waits the specified number of seconds after executing the node.
  • Continue On Error - Automation will continue regardless of any error. The default value is false.
info

If the ContinueOnError property is true, no error is caught when the project is executed, even if a Catch node is used.

Inputs

  • Connection Id - The connection identifier from Connect node (optional if API Key is provided directly).
  • System Prompt - System instructions to guide the AI assistant behavior. Default: "You are a helpful assistant."
  • User Prompt - The current user message or question to send to the AI model. Required and cannot be empty.
  • Chat History - Array of previous conversation messages for context. Format: [{role: "user", content: "..."}, {role: "assistant", content: "..."}]

Options

Authentication

  • API Key - OpenRouter API key credential (optional if using Connection Id). Allows using the node without Connect.
  • Use Robomotion AI Credits - Use Robomotion AI credits instead of your own API key.

Model Selection

  • Model - Select which AI model to use. Options include:
    • Gemini 2.5 Flash - Fast and efficient (default)
    • Gemini 2.5 Pro - Advanced capabilities
    • Gemini 3 Pro Preview - Latest Gemini preview
    • Claude Sonnet 4 - Balanced performance
    • Claude Sonnet 4.5 - Latest Sonnet version
    • Claude Opus 4 - Highly capable
    • Claude Opus 4.5 - Most capable Claude
    • GPT-4.1 - Latest GPT-4 generation
    • GPT-4.1 Mini - Faster GPT-4
    • GPT-5 - Latest GPT generation
    • GPT-5 Mini - Faster GPT-5
    • o3 - OpenAI reasoning model
    • o3 Mini - Smaller reasoning model
    • o4 Mini - Latest mini reasoning model
    • Grok 4 - xAI's Grok model
    • Grok 4.1 - Latest Grok version
    • DeepSeek Chat v3 - DeepSeek model
    • Custom Model - Specify any OpenRouter model
  • Custom Model - Custom model identifier when "Custom Model" is selected (e.g., "meta-llama/llama-3.3-70b-instruct").

Generation Settings

  • Number of Generations - Generate 1-4 different responses in a single request. Default: 1
  • Stream - Enable streaming for real-time token generation. Default: false
  • JSON Mode - Force the model to output valid JSON. Default: false
  • Temperature - Sampling temperature (0.0-2.0). Higher values make output more random. Default: model default
  • Top P - Nucleus sampling (0.0-1.0). Alternative to temperature for controlling randomness. Default: model default
  • Max Tokens - Maximum number of tokens to generate. Default: model default
  • Stop Sequences - Comma-separated sequences where generation will stop (e.g., "END,STOP").
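These settings map onto OpenRouter's OpenAI-compatible chat completions parameters. As a rough sketch (the field names follow the OpenAI-compatible schema OpenRouter accepts; the values are illustrative, not the node's defaults, and per-model support varies):

```javascript
// Illustrative request body showing how the Generation Settings above
// translate to OpenRouter's OpenAI-compatible API fields:
const body = {
  model: "google/gemini-2.5-flash",          // Model
  messages: [{ role: "user", content: "Hello" }],
  stream: false,                             // Stream
  temperature: 0.7,                          // Temperature (0.0-2.0)
  top_p: 0.9,                                // Top P (0.0-1.0)
  max_tokens: 512,                           // Max Tokens
  stop: "END,STOP".split(","),               // Stop Sequences (comma-separated)
  response_format: { type: "json_object" },  // JSON Mode (when enabled)
};
```

Check OpenRouter's API documentation for which parameters each model honors.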

Reasoning Mode

  • Reasoning Mode - Enable reasoning/thinking mode for compatible models (Claude, o-series, Gemini, Grok):
    • Off - No reasoning mode (default)
    • Low - Minimal reasoning effort
    • Medium - Moderate reasoning effort
    • High - Maximum reasoning effort

Structured Output

  • Response Schema - JSON schema for structured output. Requires JSON Mode to be enabled.

Advanced

  • Seed - Random seed for reproducible outputs.
  • Timeout (seconds) - Request timeout in seconds. Default: 120
  • Include Raw Response - Include full API response in output. Default: false

Outputs

  • Text - Generated response text. Returns a string if single generation, or an array of strings if multiple generations.
  • Chat History - Updated conversation history including the new exchange. Excludes system messages for cleaner history management.
  • Raw Response - Complete API response object (only when "Include Raw Response" is enabled).

How It Works

When executed, the node:

  1. Validates the connection or creates a temporary client using provided credentials
  2. Prepares the system prompt (defaults to "You are a helpful assistant" if empty)
  3. Validates that the user prompt is not empty
  4. Processes the input Chat History array (if provided)
  5. Builds the complete messages array:
    • Starts with the system message
    • Adds all history messages (maintains conversation context)
    • Appends the new user message
  6. Configures all generation parameters (streaming, JSON mode, reasoning, etc.)
  7. Makes the API request
  8. For non-streaming: Extracts text from response
  9. For streaming: Reads tokens in real-time and assembles complete response
  10. Updates the Chat History with the assistant's response
  11. Returns the text, updated history, and optional raw response
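Step 5 above can be sketched as follows (function and variable names are illustrative, not the node's internal API):

```javascript
// Sketch of how the node assembles the messages array:
// system message first, then history, then the new user message.
function buildMessages(systemPrompt, chatHistory, userPrompt) {
  const messages = [
    { role: "system", content: systemPrompt || "You are a helpful assistant." },
  ];
  for (const m of chatHistory || []) messages.push(m); // preserve context
  messages.push({ role: "user", content: userPrompt });
  return messages;
}

const msgs = buildMessages(
  "Be brief.",
  [
    { role: "user", content: "Hi" },
    { role: "assistant", content: "Hello!" },
  ],
  "What is RPA?"
);
// msgs now contains the system message, both history messages,
// and the new user message, in order.
```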

Chat History Format

The Chat History input and output use this format:

[
  { role: "user", content: "What is RPA?" },
  { role: "assistant", content: "RPA stands for Robotic Process Automation..." },
  { role: "user", content: "How does it work?" },
  { role: "assistant", content: "RPA works by automating repetitive tasks..." }
]

Notes:

  • System messages are automatically added and excluded from output history
  • Only user and assistant roles appear in the output history
  • History is cumulative - each response adds to it
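The update rule in the notes above can be sketched as (illustrative names, not the node's internal API):

```javascript
// Sketch of how the output Chat History is produced: the new user
// message and the assistant's reply are appended; the system message
// is never included.
function updateHistory(history, userPrompt, assistantText) {
  return [
    ...(history || []),
    { role: "user", content: userPrompt },
    { role: "assistant", content: assistantText },
  ];
}

const turn1 = updateHistory([], "What is RPA?", "RPA stands for...");
const turn2 = updateHistory(turn1, "How does it work?", "RPA works by...");
// turn2 now holds both exchanges - four messages, no system message.
```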

Requirements

  • Either a valid Connection Id from Connect node OR direct API Key credentials
  • Non-empty User Prompt

Error Handling

The node will return specific errors in the following cases:

  • Empty or missing User Prompt
  • Invalid Connection Id (when not using direct credentials)
  • Empty Custom Model name when Custom Model is selected
  • Invalid Chat History format
  • API authentication errors (401)
  • API rate limit errors (429)
  • Model not found errors (404)
  • API service errors (500, 502, 503, 504)
  • Request timeout errors
  • Streaming connection errors

Usage Notes

Conversation Management

  • Chat History maintains context across multiple turns
  • Feed the Chat History output back as input for the next turn
  • System prompt is applied to every turn but not included in history output
  • History grows with each turn - consider truncating for very long conversations

Streaming vs Non-Streaming

  • Non-streaming: Waits for complete response, then returns all text at once
  • Streaming: Receives tokens in real-time, assembles complete text before returning
  • Streaming provides a better user experience but currently returns the complete text (not progressive output)

Context Window Limits

  • Different models have different context window sizes
  • Long conversations may exceed the context window
  • Consider truncating old history or summarizing past conversations
  • Monitor token usage to stay within limits
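A crude way to monitor history size is a character-based estimate (a common rule of thumb is roughly 4 characters per token for English text; this is an approximation, not the model's real tokenizer):

```javascript
// Rough token estimate for a chat history: ~4 characters per token.
// Use it only as a guard rail before truncating or summarizing.
function estimateTokens(history) {
  const chars = history.reduce((sum, m) => sum + m.content.length, 0);
  return Math.ceil(chars / 4);
}

// Example: decide whether to truncate before the next turn.
// if (estimateTokens(msg.chat_history) > 6000) { /* truncate or summarize */ }
```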

Multi-Turn Conversations

  • Always pass the previous Chat History output to the next node
  • Store history in message scope (e.g., msg.chat_history)
  • Each turn builds upon previous context

Examples

Example 1: Simple Two-Turn Conversation

Turn 1:

  • Connection Id: msg.connection
  • User Prompt: "What is OpenRouter?"
  • Chat History: (empty or not provided)

Output 1:

  • Text: "OpenRouter is a unified API gateway that provides access to multiple AI models..."
  • Chat History: [{role: "user", content: "What is OpenRouter?"}, {role: "assistant", content: "OpenRouter is a unified API gateway..."}]

Turn 2:

  • Connection Id: msg.connection
  • User Prompt: "What models does it support?"
  • Chat History: (output from Turn 1)

Output 2:

  • Text: "Based on our previous discussion about OpenRouter, it supports 480+ models including..."
  • Chat History: (now has 2 exchanges)

Example 2: Customer Support Chatbot

System Prompt: "You are a helpful customer support agent for an e-commerce company. Be friendly and helpful."

Turn 1:

  • User Prompt: "I haven't received my order yet"
  • Chat History: []

Turn 2:

  • User Prompt: "Order number is #12345"
  • Chat History: (from Turn 1)

Turn 3:

  • User Prompt: "When will it arrive?"
  • Chat History: (from Turn 2)

The model maintains context throughout, remembering the order number and issue.


Example 3: Building a Loop-Based Chatbot

// Initialize
msg.chat_history = [];
msg.user_message = "Hello, I need help with automation";

// In a loop (could be triggered by user input):
while (msg.user_message !== "bye") {
  // Chat Completion node
  //   Input:  msg.user_message, msg.chat_history
  //   Output: msg.response, msg.chat_history

  // Display response to user
  console.log(msg.response);

  // Get next user input
  msg.user_message = getUserInput();
}

This pattern creates a continuous conversation loop.


Example 4: Multi-Language Support

System Prompt: "You are a multilingual assistant. Respond in the same language the user uses."

Turn 1:

  • User Prompt: "Hola, ¿cómo estás?"
  • Output: "¡Hola! Estoy bien, gracias. ¿En qué puedo ayudarte hoy?"

Turn 2:

  • User Prompt: "Can you switch to English?"
  • Output: "Of course! I'm happy to continue in English. How can I help you?"

The model adapts to language changes while maintaining context.


Example 5: JSON Mode with Conversation

System Prompt: "Extract key information from the conversation and return it as JSON."

JSON Mode: true

Response Schema:

{
  "type": "object",
  "properties": {
    "user_intent": { "type": "string" },
    "entities": { "type": "array", "items": { "type": "string" } },
    "sentiment": { "type": "string" }
  }
}

Turn 1:

  • User Prompt: "I want to book a flight to Paris next week"

Output:

{
  "user_intent": "book_flight",
  "entities": ["Paris", "next week"],
  "sentiment": "neutral"
}

Useful for extracting structured data from conversations.


Example 6: Streaming Chat for Responsive UI

Configuration:

  • Stream: true
  • Model: GPT-4.1

Usage: Although streaming assembles the complete text before returning, it's useful for:

  • Long responses that might timeout without streaming
  • Better server-side handling of large outputs
  • Future progressive output capabilities

Example 7: Conversation History Management

To prevent context window overflow:

// Keep only the last 10 messages
if (msg.chat_history.length > 10) {
  msg.chat_history = msg.chat_history.slice(-10);
}

// Or summarize old messages
if (msg.chat_history.length > 20) {
  // Use a separate Generate Text node to summarize,
  // then keep the summary + recent messages
}

Example 8: Multi-User Chat Management

For handling multiple users:

// Store separate histories per user
msg.user_histories = msg.user_histories || {};
msg.current_user_id = "user123";

// Get this user's history
msg.chat_history = msg.user_histories[msg.current_user_id] || [];

// After Chat Completion, save the updated history back
msg.user_histories[msg.current_user_id] = msg.chat_history;

Best Practices

  1. History Management:

    • Store Chat History in message scope for easy access
    • Always pass previous history to maintain context
    • Truncate very long histories to avoid context window limits
    • Consider summarizing old conversations for very long sessions
  2. System Prompt Design:

    • Use System Prompt to set the assistant's personality and role
    • Keep system prompts concise but clear
    • Update system prompts for different conversation phases if needed
  3. Model Selection:

    • Use Gemini 2.5 Flash for fast, cost-effective chat
    • Use Claude Sonnet 4.5 for high-quality conversations with long context
    • Use GPT models for consistency and reliability
    • Test different models for your specific use case
  4. Error Handling:

    • Handle network errors and timeouts gracefully
    • Implement retry logic for failed requests
    • Provide fallback responses when API is unavailable
    • Log errors with conversation context for debugging
  5. Performance Optimization:

    • Use streaming for long responses
    • Set appropriate Max Tokens to control response length
    • Consider caching common responses
    • Monitor token usage to manage costs
  6. Conversation Flow:

    • Design clear conversation paths
    • Handle off-topic user inputs gracefully
    • Implement conversation reset functionality
    • Provide exit commands for users
  7. Multi-Turn Context:

    • Use conversation history to avoid repeating information
    • Reference previous messages naturally
    • Maintain consistent personality across turns
    • Handle topic changes smoothly
  8. Production Deployment:

    • Implement proper user session management
    • Store conversation histories in databases for persistence
    • Add conversation analytics and monitoring
    • Implement rate limiting per user
    • Add content filtering for safety
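The retry logic recommended under Error Handling above can be sketched as follows (a minimal sketch: `fn` stands in for invoking the Chat Completion request, and the parameter names are illustrative):

```javascript
// Retry with exponential backoff for transient errors
// (429 rate limits and 5xx service errors). Non-retryable
// errors are rethrown immediately.
async function withRetry(fn, maxAttempts = 3, baseDelayMs = 1000) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const retryable = [429, 500, 502, 503, 504].includes(err.status);
      if (!retryable || attempt >= maxAttempts) throw err;
      // Exponential backoff: base, 2x base, 4x base, ...
      await new Promise(r => setTimeout(r, baseDelayMs * 2 ** (attempt - 1)));
    }
  }
}
```

Wrap the request in `withRetry`, and log the conversation context alongside any error that survives the retries.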