Chat Completion
Generates chat responses with conversation history support, enabling multi-turn conversational AI applications through OpenRouter.
Common Properties
- Name - The custom name of the node.
- Color - The custom color of the node.
- Delay Before (sec) - Waits in seconds before executing the node.
- Delay After (sec) - Waits in seconds after executing the node.
- Continue On Error - Automation will continue regardless of any error. The default value is false.
If Continue On Error is set to true, errors are not caught when the project is executed, even if a Catch node is used.
Inputs
- Connection Id - The connection identifier from Connect node (optional if API Key is provided directly).
- System Prompt - System instructions to guide the AI assistant behavior. Default: "You are a helpful assistant."
- User Prompt - The current user message or question to send to the AI model. Required and cannot be empty.
- Chat History - Array of previous conversation messages for context. Format:
  [{role: "user", content: "..."}, {role: "assistant", content: "..."}]
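For reference, the inputs can be prepared in flow script before the node runs. A minimal sketch, assuming the Chat History and User Prompt inputs are wired to msg.chat_history and msg.user_prompt (these property names are illustrative):

// Illustrative: preparing inputs in message scope before the node runs.
// msg.chat_history and msg.user_prompt are assumed wiring targets.
msg.chat_history = [
  { role: "user", content: "What is RPA?" },
  { role: "assistant", content: "RPA stands for Robotic Process Automation..." }
];
msg.user_prompt = "How does it work?";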
Options
Authentication
- API Key - OpenRouter API key credential (optional if using Connection Id). Allows using the node without Connect.
- Use Robomotion AI Credits - Use Robomotion AI credits instead of your own API key.
Model Selection
- Model - Select which AI model to use. Options include:
  - Gemini 2.5 Flash - Fast and efficient (default)
  - Gemini 2.5 Pro - Advanced capabilities
  - Gemini 3 Pro Preview - Latest Gemini preview
  - Claude Sonnet 4 - Balanced performance
  - Claude Sonnet 4.5 - Latest Sonnet version
  - Claude Opus 4 - Highly capable
  - Claude Opus 4.5 - Most capable Claude
  - GPT-4.1 - Latest GPT-4 generation
  - GPT-4.1 Mini - Faster GPT-4
  - GPT-5 - Latest GPT generation
  - GPT-5 Mini - Faster GPT-5
  - o3 - OpenAI reasoning model
  - o3 Mini - Smaller reasoning model
  - o4 Mini - Latest mini reasoning model
  - Grok 4 - xAI's Grok model
  - Grok 4.1 - Latest Grok version
  - DeepSeek Chat v3 - DeepSeek model
  - Custom Model - Specify any OpenRouter model
- Custom Model - Custom model identifier when "Custom Model" is selected (e.g., "meta-llama/llama-3.3-70b-instruct").
Generation Settings
- Number of Generations - Generate 1-4 different responses in a single request. Default: 1
- Stream - Enable streaming for real-time token generation. Default: false
- JSON Mode - Force the model to output valid JSON. Default: false
- Temperature - Sampling temperature (0.0-2.0). Higher values make output more random. Default: model default
- Top P - Nucleus sampling (0.0-1.0). Alternative to temperature for controlling randomness. Default: model default
- Max Tokens - Maximum number of tokens to generate. Default: model default
- Stop Sequences - Comma-separated sequences where generation will stop (e.g., "END,STOP").
Reasoning Mode
- Reasoning Mode - Enable reasoning/thinking mode for compatible models (Claude, o-series, Gemini, Grok):
  - Off - No reasoning mode (default)
  - Low - Minimal reasoning effort
  - Medium - Moderate reasoning effort
  - High - Maximum reasoning effort
Structured Output
- Response Schema - JSON schema for structured output. Requires JSON Mode to be enabled.
Advanced
- Seed - Random seed for reproducible outputs.
- Timeout (seconds) - Request timeout in seconds. Default: 120
- Include Raw Response - Include full API response in output. Default: false
Outputs
- Text - Generated response text. Returns a string for a single generation, or an array of strings for multiple generations.
- Chat History - Updated conversation history including the new exchange. Excludes system messages for cleaner history management.
- Raw Response - Complete API response object (only when "Include Raw Response" is enabled).
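Because Text is a string for a single generation but an array for multiple generations, downstream scripts may want to normalize it first. A short sketch, assuming the Text output is wired to msg.response:

// Sketch: normalize the Text output to an array regardless of
// Number of Generations (msg.response is an assumed wiring target).
const texts = Array.isArray(msg.response) ? msg.response : [msg.response];
for (const text of texts) {
  console.log(text);
}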
How It Works
When executed, the node:
- Validates the connection or creates a temporary client using provided credentials
- Prepares the system prompt (defaults to "You are a helpful assistant" if empty)
- Validates that the user prompt is not empty
- Processes the input Chat History array (if provided)
- Builds the complete messages array:
  - Starts with the system message
  - Adds all history messages (maintains conversation context)
  - Appends the new user message
- Configures all generation parameters (streaming, JSON mode, reasoning, etc.)
- Makes the API request:
  - For non-streaming: extracts the text from the response
  - For streaming: reads tokens in real-time and assembles the complete response
- Updates the Chat History with the assistant's response
- Returns the text, updated history, and optional raw response
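The message-assembly step can be pictured with a short sketch; this mirrors the steps above and is not the node's actual source:

// Illustrative sketch of how the messages array is assembled.
function buildMessages(systemPrompt, history, userPrompt) {
  const messages = [
    { role: "system", content: systemPrompt || "You are a helpful assistant." }
  ];
  for (const m of history || []) {
    messages.push({ role: m.role, content: m.content }); // keep prior context
  }
  messages.push({ role: "user", content: userPrompt }); // the new turn goes last
  return messages;
}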
Chat History Format
The Chat History input and output use this format:
[
  {
    role: "user",
    content: "What is RPA?"
  },
  {
    role: "assistant",
    content: "RPA stands for Robotic Process Automation..."
  },
  {
    role: "user",
    content: "How does it work?"
  },
  {
    role: "assistant",
    content: "RPA works by automating repetitive tasks..."
  }
]
Notes:
- System messages are automatically added and excluded from output history
- Only "user" and "assistant" roles appear in the output history
- History is cumulative - each response adds to it
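Because the node rejects a malformed history (see Error Handling below), validating the array before the call can save a failed request. A minimal sketch:

// Minimal sketch: check that a history array matches the expected shape.
function isValidHistory(history) {
  return Array.isArray(history) && history.every(m =>
    m &&
    (m.role === "user" || m.role === "assistant") &&
    typeof m.content === "string"
  );
}

if (!isValidHistory(msg.chat_history)) {
  msg.chat_history = []; // reset rather than send a malformed history
}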
Requirements
- Either a valid Connection Id from Connect node OR direct API Key credentials
- Non-empty User Prompt
Error Handling
The node will return specific errors in the following cases:
- Empty or missing User Prompt
- Invalid Connection Id (when not using direct credentials)
- Empty Custom Model name when Custom Model is selected
- Invalid Chat History format
- API authentication errors (401)
- API rate limit errors (429)
- Model not found errors (404)
- API service errors (500, 502, 503, 504)
- Request timeout errors
- Streaming connection errors
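Rate limit (429) and service errors (5xx) are usually transient and worth retrying with backoff. A hedged sketch of a retry wrapper, where callChatCompletion is a hypothetical stand-in for the node call and err.status is assumed to carry the HTTP status code:

// Sketch: exponential backoff for transient API errors.
// callChatCompletion is hypothetical; adapt to your flow's error shape.
async function withRetry(fn, maxAttempts = 3) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const transient = err.status === 429 || (err.status >= 500 && err.status <= 504);
      if (!transient || attempt === maxAttempts) throw err;
      await new Promise(r => setTimeout(r, 1000 * 2 ** attempt)); // 2s, 4s, 8s...
    }
  }
}

// const result = await withRetry(() => callChatCompletion(msg));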
Usage Notes
Conversation Management
- Chat History maintains context across multiple turns
- Feed the Chat History output back as input for the next turn
- System prompt is applied to every turn but not included in history output
- History grows with each turn - consider truncating for very long conversations
Streaming vs Non-Streaming
- Non-streaming: Waits for complete response, then returns all text at once
- Streaming: Receives tokens in real-time, assembles complete text before returning
- Streaming provides a better user experience but currently returns the complete text (not progressive)
Context Window Limits
- Different models have different context window sizes
- Long conversations may exceed the context window
- Consider truncating old history or summarizing past conversations
- Monitor token usage to stay within limits (a rough estimation sketch follows)
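An exact count requires the model's tokenizer, but a common rough heuristic for English text is about 4 characters per token. A sketch using that assumption (the 6000-token budget is an arbitrary example):

// Rough sketch: estimate tokens (~4 chars/token heuristic) and drop the
// oldest user/assistant pair until the history fits the budget.
function estimateTokens(history) {
  const chars = history.reduce((n, m) => n + m.content.length, 0);
  return Math.ceil(chars / 4);
}

while (estimateTokens(msg.chat_history) > 6000 && msg.chat_history.length > 2) {
  msg.chat_history.splice(0, 2);
}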
Multi-Turn Conversations
- Always pass the previous Chat History output to the next node
- Store history in message scope (e.g., msg.chat_history)
- Each turn builds upon previous context
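In flow-script terms, the loop-back is just reusing the same msg property on each turn (names illustrative):

// Sketch: carrying history across turns in message scope.
msg.chat_history = msg.chat_history || []; // first turn starts with no context
msg.user_prompt = "What models does it support?";
// The Chat Completion node reads msg.chat_history and msg.user_prompt,
// then writes the updated history back to msg.chat_history for the next turn.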
Examples
Example 1: Simple Two-Turn Conversation
Turn 1:
- Connection Id: msg.connection
- User Prompt: "What is OpenRouter?"
- Chat History: (empty or not provided)
Output 1:
- Text: "OpenRouter is a unified API gateway that provides access to multiple AI models..."
- Chat History:
[{role: "user", content: "What is OpenRouter?"}, {role: "assistant", content: "OpenRouter is a unified API gateway..."}]
Turn 2:
- Connection Id: msg.connection
- User Prompt: "What models does it support?"
- Chat History: (output from Turn 1)
Output 2:
- Text: "Based on our previous discussion about OpenRouter, it supports 480+ models including..."
- Chat History: (now has 2 exchanges)
Example 2: Customer Support Chatbot
System Prompt: "You are a helpful customer support agent for an e-commerce company. Be friendly and helpful."
Turn 1:
- User Prompt: "I haven't received my order yet"
- Chat History: []
Turn 2:
- User Prompt: "Order number is #12345"
- Chat History: (from Turn 1)
Turn 3:
- User Prompt: "When will it arrive?"
- Chat History: (from Turn 2)
The model maintains context throughout, remembering the order number and issue.
Example 3: Building a Loop-Based Chatbot
// Initialize
msg.chat_history = [];
msg.user_message = "Hello, I need help with automation";

// In a loop (could be triggered by user input):
while (msg.user_message !== "bye") {
  // Chat Completion Node
  //   Input:  msg.user_message, msg.chat_history
  //   Output: msg.response, msg.chat_history

  // Display response to user
  console.log(msg.response);

  // Get next user input (getUserInput() is a placeholder for however
  // your flow collects the next message)
  msg.user_message = getUserInput();
}
This pattern creates a continuous conversation loop.
Example 4: Multi-Language Support
System Prompt: "You are a multilingual assistant. Respond in the same language the user uses."
Turn 1:
- User Prompt: "Hola, ¿cómo estás?"
- Output: "¡Hola! Estoy bien, gracias. ¿En qué puedo ayudarte hoy?"
Turn 2:
- User Prompt: "Can you switch to English?"
- Output: "Of course! I'm happy to continue in English. How can I help you?"
The model adapts to language changes while maintaining context.
Example 5: JSON Mode with Conversation
System Prompt: "Extract key information from the conversation and return it as JSON."
JSON Mode: true
Response Schema:
{
  "type": "object",
  "properties": {
    "user_intent": {"type": "string"},
    "entities": {"type": "array", "items": {"type": "string"}},
    "sentiment": {"type": "string"}
  }
}
Turn 1:
- User Prompt: "I want to book a flight to Paris next week"
Output:
{
  "user_intent": "book_flight",
  "entities": ["Paris", "next week"],
  "sentiment": "neutral"
}
Useful for extracting structured data from conversations.
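Even with JSON Mode enabled, the Text output arrives as a string, so downstream steps typically parse it; guarding the parse keeps a malformed response from stopping the flow. A sketch (msg.response is an assumed wiring target):

// Sketch: safely parse the JSON-mode response.
let extracted = null;
try {
  extracted = JSON.parse(msg.response);
} catch (e) {
  // The model returned something that was not valid JSON.
}
if (extracted) {
  console.log(extracted.user_intent, extracted.entities, extracted.sentiment);
}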
Example 6: Streaming Chat for Responsive UI
Configuration:
- Stream: true
- Model: GPT-4.1
Usage: Although streaming assembles the complete text before returning, it's useful for:
- Long responses that might timeout without streaming
- Better server-side handling of large outputs
- Future progressive output capabilities
Example 7: Conversation History Management
To prevent context window overflow:
// Keep only last 10 messages
if (msg.chat_history.length > 10) {
  msg.chat_history = msg.chat_history.slice(-10);
}

// Or summarize old messages
if (msg.chat_history.length > 20) {
  // Use a separate Generate Text node to summarize
  // Keep summary + recent messages
}
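The summarization branch above can be made concrete by compressing the oldest messages into one synthetic exchange and keeping only the recent turns. A sketch, where summarize() is a hypothetical stand-in for a separate Generate Text node call:

// Sketch: fold old history into a single synthetic exchange.
// summarize() is hypothetical - in practice, a Generate Text node call.
if (msg.chat_history.length > 20) {
  const old = msg.chat_history.slice(0, -10);
  const recent = msg.chat_history.slice(-10);
  const summary = summarize(old); // e.g. "User asked about X; assistant explained Y."
  msg.chat_history = [
    { role: "user", content: "Summary of our conversation so far: " + summary },
    { role: "assistant", content: "Understood, I'll keep that context in mind." },
    ...recent
  ];
}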
Example 8: Multi-User Chat Management
For handling multiple users:
// Store separate histories per user
msg.user_histories = msg.user_histories || {};
msg.current_user_id = "user123";
// Get user's history
msg.chat_history = msg.user_histories[msg.current_user_id] || [];
// After Chat Completion
msg.user_histories[msg.current_user_id] = msg.chat_history;
Best Practices
History Management
- Store Chat History in message scope for easy access
- Always pass previous history to maintain context
- Truncate very long histories to avoid context window limits
- Consider summarizing old conversations for very long sessions
System Prompt Design
- Use System Prompt to set the assistant's personality and role
- Keep system prompts concise but clear
- Update system prompts for different conversation phases if needed
Model Selection
- Use Gemini 2.5 Flash for fast, cost-effective chat
- Use Claude Sonnet 4.5 for high-quality conversations with long context
- Use GPT models for consistency and reliability
- Test different models for your specific use case
Error Handling
- Handle network errors and timeouts gracefully
- Implement retry logic for failed requests
- Provide fallback responses when API is unavailable
- Log errors with conversation context for debugging
Performance Optimization
- Use streaming for long responses
- Set appropriate Max Tokens to control response length
- Consider caching common responses (see the sketch below)
- Monitor token usage to manage costs
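Caching can be as simple as keying on the prompt, though it only makes sense for prompts that do not depend on conversation history. An illustrative sketch (a real deployment would persist the cache and bound its size):

// Naive in-memory response cache keyed by the user prompt.
msg.response_cache = msg.response_cache || {};
const key = msg.user_prompt.trim().toLowerCase();

if (msg.response_cache[key] !== undefined) {
  msg.response = msg.response_cache[key]; // cache hit: skip the API call
} else {
  // Run the Chat Completion node, then store the result:
  // msg.response_cache[key] = msg.response;
}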
Conversation Flow
- Design clear conversation paths
- Handle off-topic user inputs gracefully
- Implement conversation reset functionality
- Provide exit commands for users
Multi-Turn Context
- Use conversation history to avoid repeating information
- Reference previous messages naturally
- Maintain consistent personality across turns
- Handle topic changes smoothly
Production Deployment
- Implement proper user session management
- Store conversation histories in databases for persistence
- Add conversation analytics and monitoring
- Implement rate limiting per user
- Add content filtering for safety