
Create Batch

Creates a new batch transaction for executing multiple SQL commands atomically.

Common Properties

  • Name - The custom name of the node.
  • Color - The custom color of the node.
  • Delay Before (sec) - Waits the given number of seconds before executing the node.
  • Delay After (sec) - Waits the given number of seconds after executing the node.
  • Continue On Error - The automation continues even if the node fails. The default value is false.
info

If the Continue On Error property is set to true, errors are not caught when the project is executed, even if a Catch node is used.

Inputs

  • Connection Id - The connection ID returned from the Connect node (optional if credentials are provided).

Options

  • Credentials - Database credentials (optional if using Connection ID). This allows creating batches without a Connect node.

Output

  • Batch Id - A unique identifier for the batch transaction.

How It Works

The Create Batch node starts a new transaction on the ClickHouse database. When executed, the node:

  1. Validates the inputs and determines the connection method (connection ID or credentials)
  2. Begins a new transaction on the database connection
  3. Generates a unique batch ID
  4. Stores the transaction in the internal batch pool
  5. Returns the batch ID for use in subsequent Execute Query / Execute Non Query operations

The transaction remains open until Send Batch is called.
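The steps above can be sketched as a small in-memory model. This is illustrative only: `BatchPool` and its fields are hypothetical names, not the node's actual implementation.

```python
import uuid

class BatchPool:
    """Toy model of the internal batch pool described above."""

    def __init__(self):
        self.batches = {}  # batch_id -> open transaction state

    def create_batch(self, conn):
        conn["in_transaction"] = True      # step 2: begin a transaction
        batch_id = str(uuid.uuid4())       # step 3: generate a unique batch ID
        self.batches[batch_id] = {         # step 4: store in the batch pool
            "conn": conn,
            "statements": [],              # filled by later Execute operations
        }
        return batch_id                    # step 5: hand the ID back to the flow
```

Each call yields a fresh UUID, which is why multiple batches can coexist in the pool at once.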

Requirements

  • Either a valid connection ID from Connect node OR database credentials
  • The connection must be active and not already have an uncommitted transaction

Error Handling

The node will return specific errors in the following cases:

  • ErrInvalidArg - Neither connection ID nor credentials were provided
  • ErrNotFound - Connection ID not found
  • Database transaction errors (connection lost, server issues)
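The first two error cases follow a simple validation order, sketched below. The error names match the list above, but the function and its arguments are illustrative only.

```python
def validate_create_batch(connection_id, credentials, known_connections):
    """Return the error name for invalid inputs, or None if valid."""
    if not connection_id and not credentials:
        return "ErrInvalidArg"   # neither connection ID nor credentials supplied
    if connection_id and connection_id not in known_connections:
        return "ErrNotFound"     # connection ID does not exist in the pool
    return None                  # inputs are valid; transaction errors can still occur later
```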

Usage Notes

  • The batch ID should be passed to Execute Query and Execute Non Query nodes
  • All operations using the same batch ID will be part of the same transaction
  • Transactions must be committed using the Send Batch node
  • If an error occurs before Send Batch, the transaction is automatically rolled back
  • Multiple batches can be active simultaneously on different connections
  • Batch transactions ensure atomicity - all operations succeed or all fail
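The all-or-nothing behaviour in the last point can be modelled in a few lines: statements are applied to a staged copy, and the original data survives untouched if any statement fails. The names here are illustrative, not the node's API.

```python
def send_batch(table, statements):
    """Apply all statements atomically: commit everything or nothing."""
    staged = list(table)             # work on a copy of the committed data
    try:
        for stmt in statements:
            stmt(staged)             # any failure aborts the whole batch
    except Exception:
        return table, False          # rollback: original data untouched
    return staged, True              # commit: all changes applied together
```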

Transaction Scope

What happens in a batch transaction:

  • All SQL commands are queued
  • No changes are visible to other connections
  • Changes are isolated until commit
  • Commands can be queries or non-queries
  • Transaction can be rolled back before commit

What is NOT included:

  • Operations outside the batch ID scope
  • DDL operations (may commit immediately depending on ClickHouse settings)
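The isolation described in this section can be sketched as a toy model: a batch buffers writes in a pending queue, and readers on other connections see them only after commit. `Table` and `Batch` are illustrative stand-ins.

```python
class Table:
    def __init__(self):
        self.rows = []       # committed rows, visible to every connection

    def read(self):
        return list(self.rows)

class Batch:
    def __init__(self, table):
        self.table = table
        self.pending = []    # queued commands, invisible to other connections

    def insert(self, row):
        self.pending.append(row)

    def commit(self):        # what Send Batch does, conceptually
        self.table.rows.extend(self.pending)
        self.pending = []
```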

Example 1: Basic Batch Transaction

Scenario: Insert multiple records as an atomic operation

Flow:

  1. Connect to database
  2. Create Batch → Get batch ID
  3. Execute Non Query with batch ID → Insert record 1
  4. Execute Non Query with batch ID → Insert record 2
  5. Execute Non Query with batch ID → Insert record 3
  6. Send Batch → Commit all inserts
  7. Disconnect

Inputs:

  • Connection Id: (from Connect node)

Output:

Batch Id: "7c9e6679-7425-40de-944b-e07fc1f90ae7"

Example 2: Transfer Data Between Tables

Scenario: Move data from a staging table to production table atomically

Create Batch Inputs:

  • Connection Id: (from Connect node)

Execute Non Query 1 (with batch_id):

INSERT INTO production_table
SELECT * FROM staging_table
WHERE processed = false

Execute Non Query 2 (with batch_id):

ALTER TABLE staging_table
UPDATE processed = true
WHERE processed = false

Send Batch:

  • Batch Id: (from Create Batch node)

Result: Data is moved and marked as processed in a single atomic transaction.

Example 3: Complex Analytics Update

Scenario: Update multiple aggregation tables consistently

Create Batch Inputs:

  • Connection Id: (from Connect node)

Execute Non Query 1 (with batch_id):

INSERT INTO daily_metrics
SELECT
    toDate(timestamp) AS date,
    count() AS events,
    uniq(user_id) AS unique_users
FROM raw_events
WHERE toDate(timestamp) = today()
GROUP BY date

Execute Non Query 2 (with batch_id):

ALTER TABLE raw_events
DELETE WHERE toDate(timestamp) = today()

Execute Non Query 3 (with batch_id):

INSERT INTO audit_log (timestamp, action, records_processed)
SELECT now(), 'daily_aggregation', count()
FROM raw_events
WHERE toDate(timestamp) = today()

Send Batch:

  • Batch Id: (from Create Batch node)

Result: Metrics are calculated, raw data is cleaned, and audit log is updated atomically.

Example 4: Using Credentials (No Connection)

Scenario: Execute a batch transaction without maintaining a persistent connection

Create Batch Inputs:

  • Connection Id: (empty)

Options:

  • Credentials: (select ClickHouse Database credential)

Output:

Batch Id: "8d1f7890-8536-41ef-955c-f18ad2a01bf8"

Then use this batch ID with Execute Query/Non Query nodes (also using credentials).

Best Practices

  • Atomic Operations: Group related operations in a single batch
  • Error Handling: Implement error handling to ensure batches are either committed or rolled back
  • Batch Size: Keep transactions reasonably sized - very large transactions can lock resources
  • Commit Frequency: Commit batches promptly to avoid holding locks
  • Rollback Strategy: Plan for rollback scenarios in case of errors
  • Resource Management: Always pair Create Batch with Send Batch
  • Connection Reuse: Reuse connections for multiple batches rather than creating new connections

Transaction Pattern

A typical batch transaction flow:

┌─────────────────┐
│     Connect     │
└────────┬────────┘
         │
┌────────▼────────┐
│  Create Batch   │ ← Returns batch_id
└────────┬────────┘
         │
┌────────▼────────┐
│  Execute Non    │ ← Uses batch_id
│ Query (Insert)  │
└────────┬────────┘
         │
┌────────▼────────┐
│  Execute Non    │ ← Uses batch_id
│ Query (Update)  │
└────────┬────────┘
         │
┌────────▼────────┐
│   Send Batch    │ ← Commits transaction
└────────┬────────┘
         │
┌────────▼────────┐
│   Disconnect    │
└─────────────────┘

Error-Safe Pattern with Try-Catch

┌─────────────────┐
│     Connect     │
└────────┬────────┘
         │
┌────────▼────────┐
│  Create Batch   │
└────────┬────────┘
         │
    ┌────▼────┐
    │   Try   │
    └────┬────┘
         │
[Execute operations with batch_id]
         │
  ┌──────▼──────┐
  │ Send Batch  │ ← Commit on success
  └─────────────┘

    ┌─────────┐
    │  Catch  │
    └────┬────┘
         │
[Handle error - batch auto-rollback]
         │
┌────────▼────────┐
│   Disconnect    │
└─────────────────┘
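The error-safe pattern above maps naturally onto a try/except structure. The sketch below is a hedged, self-contained model: `FakeDB` and its methods are in-memory stand-ins, not the platform's actual node API.

```python
class FakeDB:
    """In-memory stand-in for a connection with batch support."""

    def __init__(self):
        self.rows, self.staged = [], []

    def execute_non_query(self, batch_id, value):
        if value is None:
            raise RuntimeError("bad statement")
        self.staged.append(value)        # queued, not yet committed

    def send_batch(self, batch_id):
        self.rows.extend(self.staged)    # commit all queued work
        self.staged = []

    def rollback(self, batch_id):
        self.staged = []                 # discard uncommitted work

def run_batch_safely(db, values, batch_id="b1"):
    try:
        for v in values:                 # Try branch: queue operations
            db.execute_non_query(batch_id, v)
        db.send_batch(batch_id)          # commit on success
        return True
    except Exception:
        db.rollback(batch_id)            # Catch branch: auto-rollback
        return False
```

This pairs every Create Batch with exactly one Send Batch or rollback, matching the Resource Management best practice above.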

Performance Considerations

  • Batch commits are more efficient than individual commits for multiple operations
  • ClickHouse optimizes batch inserts by reducing merge operations
  • Large batches may increase memory usage - balance between efficiency and resources
  • Transaction isolation level ensures consistency but may impact concurrent queries
  • Avoid long-running transactions to prevent blocking other operations

Common Use Cases

  • Data Migration: Move data between tables atomically
  • ETL Operations: Extract, transform, and load data with consistency guarantees
  • Multi-table Updates: Keep related tables synchronized
  • Audit Logging: Record operations and their data changes together
  • Rollback Safety: Test operations before committing
  • Bulk Operations: Efficiently insert or update large datasets