Generate Completion

Generates text completions using a local Ollama AI model. This node is ideal for single-prompt text generation tasks such as content creation, summarization, translation, and general text processing.

info

For conversational AI with context and message history, use Generate Chat Completion instead.

Common Properties

  • Name - The custom name of the node.
  • Color - The custom color of the node.
  • Delay Before (sec) - The number of seconds to wait before executing the node.
  • Delay After (sec) - The number of seconds to wait after executing the node.
  • Continue On Error - Automation will continue regardless of any error. The default value is false.
info

If the ContinueOnError property is true, no error is caught when the project is executed, even if a Catch node is used.

Inputs

  • Client ID - The Client ID from the Connect node. Optional if Host URL is provided.
  • Model - The name of the Ollama model to use (e.g., llama3, mistral, codellama, gemma).
  • Prompt - The text prompt to send to the model. This is the instruction or question you want the AI to respond to.

Output

  • Response - The generated text response from the AI model.

Options

  • Options - A JSON object containing model parameters to control generation behavior:
    • temperature (0.0-2.0) - Controls randomness. Lower values (0.1-0.5) make output more focused and deterministic. Higher values (0.8-2.0) make output more creative and varied. Default: 0.8
    • top_p (0.0-1.0) - Nucleus sampling. Controls diversity. Lower values make output more focused. Default: 0.9
    • top_k (integer) - Limits vocabulary to top K tokens. Lower values make output more focused.
    • num_predict (integer) - Maximum number of tokens to generate. Default: 128; set to -1 for unlimited.
    • repeat_penalty (0.0-2.0) - Penalty for repeating tokens. Higher values reduce repetition. Default: 1.1
    • seed (integer) - Random seed for reproducible outputs
    • num_ctx (integer) - Context window size in tokens. Default: 2048
  • Host URL - Ollama server URL (optional). Use this instead of Client ID for direct connection. Example: http://localhost:11434
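
The node forwards the Options object to Ollama unchanged, so the keys above can be combined per task. The sketch below shows two illustrative presets written as Python dictionaries that serialize to the expected JSON; the specific values are assumptions to tune for your own model and task, not defaults of the node.

import json

# Illustrative presets only - adjust the values for your model and task.
extraction_options = {
    "temperature": 0.2,   # low randomness for factual, repeatable answers
    "top_p": 0.5,         # narrow nucleus sampling
    "num_predict": 64,    # short answers are enough for extraction
    "seed": 42,           # fixed seed for reproducible runs
}

creative_options = {
    "temperature": 1.0,     # more varied wording
    "top_k": 60,            # wider vocabulary
    "num_predict": 400,     # allow longer output
    "repeat_penalty": 1.15, # discourage repeated phrasing
}

# Either dict can be pasted into the Options input as JSON.
print(json.dumps(extraction_options, indent=2))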

How It Works

The Generate Completion node:

  1. Connects to the Ollama server (via Client ID or Host URL)
  2. Sends your prompt to the specified model
  3. Receives the generated text in a streaming fashion
  4. Concatenates all response chunks
  5. Returns the complete generated text
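
To make these steps concrete, the sketch below performs the same sequence directly against Ollama's /api/generate endpoint using Python's requests library. It assumes a local server at http://localhost:11434 and an already pulled model; inside the automation, the node does all of this for you.

import json
import requests

OLLAMA_URL = "http://localhost:11434"  # same value you would put in Host URL

def generate_completion(model, prompt, options=None):
    """Stream a completion from /api/generate and concatenate the chunks."""
    payload = {"model": model, "prompt": prompt, "stream": True}
    if options:
        payload["options"] = options

    chunks = []
    # Steps 1-2: connect to the server and send the prompt to the model.
    with requests.post(f"{OLLAMA_URL}/api/generate", json=payload,
                       stream=True, timeout=300) as resp:
        resp.raise_for_status()
        # Steps 3-4: read the streamed chunks and concatenate them.
        for line in resp.iter_lines():
            if not line:
                continue
            part = json.loads(line)
            chunks.append(part.get("response", ""))
            if part.get("done"):
                break
    # Step 5: return the complete generated text.
    return "".join(chunks)

print(generate_completion("llama3", "Write a haiku about automation"))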

Usage Examples

Example 1: Simple Text Generation

Inputs:
- Model: "llama3"
- Prompt: "Write a professional email subject line for a project update"

Output:
- Response: "Project Update: Q4 Milestones Achieved and Next Steps"

Example 2: Content Summarization

Inputs:
- Model: "mistral"
- Prompt: "Summarize this text in 3 bullet points: [long article text here]"

Output:
- Response: "
• Main point about topic A
• Key insight about topic B
• Conclusion about topic C
"

Example 3: Data Extraction

Inputs:
- Model: "llama3"
- Prompt: "Extract the email address from this text: 'Contact John at john.doe@example.com for more info'"

Output:
- Response: "john.doe@example.com"

Example 4: Code Generation

Inputs:
- Model: "codellama"
- Prompt: "Write a Python function to calculate factorial"

Output:
- Response: "
def factorial(n):
if n == 0 or n == 1:
return 1
return n * factorial(n - 1)
"

Example 5: Using Options for Controlled Output

Inputs:
- Model: "llama3"
- Prompt: "Generate a product description for a wireless mouse"
- Options: {
    "temperature": 0.3,
    "num_predict": 150,
    "repeat_penalty": 1.2
  }

Output:
- Response: "This ergonomic wireless mouse features precise optical tracking..."

Requirements

  • Ollama service must be running
  • The specified model must be pulled locally (use Pull Model)
  • A valid Client ID from the Connect node, or a Host URL

Common Use Cases

  • Automated content generation for reports and emails
  • Text summarization and condensation
  • Language translation
  • Data extraction from unstructured text
  • Code generation and completion
  • Text classification and categorization
  • Question answering
  • Creative writing assistance

Tips

Choosing the Right Model

  • llama3 - Best for general-purpose text generation, good balance of quality and speed
  • mistral - Excellent for instruction following and structured outputs
  • codellama - Optimized for code generation and programming tasks
  • gemma - Fast and efficient for simpler tasks
  • phi - Lightweight model for quick responses

Optimizing Temperature

  • 0.1-0.3 - Factual, consistent, deterministic (ideal for data extraction)
  • 0.5-0.7 - Balanced creativity and consistency (good for general use)
  • 0.8-1.2 - Creative and varied (good for content generation)
  • 1.5-2.0 - Highly creative and unpredictable (experimental use)

Prompt Engineering Best Practices

  • Be specific and clear in your instructions
  • Provide examples in your prompt when possible
  • Use system-style prompts: "You are an expert in..."
  • Break complex tasks into smaller, focused prompts
  • Include output format instructions: "Respond in JSON format"
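
To make these practices concrete, here is a hedged sketch of a prompt template that combines a system-style framing, an inline example, and an explicit output-format instruction; the wording and helper name are illustrative, not a required format.

def build_extraction_prompt(text):
    # System-style framing, an inline example, and a format instruction in one prompt.
    return (
        "You are an expert data-entry assistant.\n"
        "Extract every email address from the text below.\n"
        'Respond in JSON format: {"emails": ["..."]}.\n'
        'Example: "Mail bob@corp.com" -> {"emails": ["bob@corp.com"]}\n'
        f"Text: {text}"
    )

print(build_extraction_prompt("Contact John at john.doe@example.com for more info"))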

Performance Optimization

  • Use num_predict to limit response length and speed up generation
  • Smaller models respond faster but may be less accurate
  • Cache frequently used prompts and responses
  • Consider using seed parameter for reproducible results in testing
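
Caching is not built into the node, but a small wrapper is enough when the same prompts recur. The sketch below assumes the generate_completion helper from "How It Works" and fixes a seed with a low temperature so repeated runs stay reproducible.

from functools import lru_cache

@lru_cache(maxsize=256)
def cached_completion(model, prompt):
    # Identical (model, prompt) pairs are answered from the cache after the first call.
    return generate_completion(
        model, prompt,
        options={"seed": 42, "temperature": 0.2, "num_predict": 200},
    )

# The second call with the same arguments never hits the Ollama server.
print(cached_completion("llama3", "Summarize: Ollama runs models locally."))
print(cached_completion("llama3", "Summarize: Ollama runs models locally."))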

Error Handling

Common errors you might encounter:

  • "Model name cannot be empty" - Provide a valid model name
  • "Prompt cannot be empty" - Ensure your prompt input is not empty
  • "Either Host URL or Client ID must be provided" - Provide one connection method
  • "Failed to create client" - Verify Ollama service is running
  • Model not found - Pull the model using the Pull Model node
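
These messages are raised by the node itself. When calling Ollama directly, the closest equivalents surface as connection failures and HTTP 404 responses; the sketch below is a hedged illustration, again assuming the generate_completion helper defined earlier.

import requests

def safe_generate(model, prompt):
    if not model:
        raise ValueError("Model name cannot be empty")
    if not prompt:
        raise ValueError("Prompt cannot be empty")
    try:
        return generate_completion(model, prompt)
    except requests.exceptions.ConnectionError:
        # Mirrors "Failed to create client": the Ollama service is not reachable.
        raise RuntimeError("Failed to reach Ollama - verify the service is running")
    except requests.exceptions.HTTPError as err:
        if err.response is not None and err.response.status_code == 404:
            # Mirrors "Model not found": pull the model before generating.
            raise RuntimeError(f"Model '{model}' not found - pull it with the Pull Model node")
        raise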

Advanced Options Object Example

{
  "temperature": 0.7,
  "top_p": 0.9,
  "top_k": 40,
  "num_predict": 500,
  "repeat_penalty": 1.1,
  "seed": 42,
  "num_ctx": 4096
}