Create Batch
Creates a new batch transaction for executing multiple SQL commands atomically.
Common Properties
- Name - The custom name of the node.
- Color - The custom color of the node.
- Delay Before (sec) - The number of seconds to wait before executing the node.
- Delay After (sec) - The number of seconds to wait after executing the node.
- Continue On Error - The automation continues regardless of any error. The default value is false.
If Continue On Error is true, no error is caught when the project is executed, even if a Catch node is used.
Inputs
- Connection Id - The connection ID returned from the Connect node (optional if credentials are provided).
Options
- Credentials - Database credentials (optional if using Connection ID). This allows creating batches without a Connect node.
Output
- Batch Id - A unique identifier for the batch transaction.
How It Works
The Create Batch node starts a new transaction on the ClickHouse database. When executed, the node:
- Validates the inputs and determines the connection method (connection ID or credentials)
- Begins a new database transaction on the connection
- Generates a unique batch ID
- Stores the transaction in the internal batch pool
- Returns the batch ID for use in subsequent Execute Query/Non Query operations
- Keeps the transaction open until Send Batch is called
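The steps above can be sketched as follows. This is an illustrative sketch only: the function signature, error strings, and in-memory batch_pool are assumptions, not the node's actual implementation.

```python
import uuid

# Illustrative in-memory pool of open batch transactions, keyed by batch ID.
batch_pool = {}

def create_batch(connection_id=None, credentials=None):
    """Sketch of the Create Batch logic: validate inputs, generate a
    batch ID, and register the (simulated) transaction in the pool."""
    # Validate inputs: either a connection ID or credentials is required.
    if connection_id is None and credentials is None:
        raise ValueError("ErrInvalidArg: neither connection ID nor credentials provided")

    # Generate a unique batch ID and register the transaction.
    batch_id = str(uuid.uuid4())
    batch_pool[batch_id] = {
        "connection": connection_id or "ad-hoc",  # ad-hoc when using credentials
        "operations": [],                         # SQL commands queued so far
        "open": True,                             # stays open until Send Batch
    }
    return batch_id
```

The returned batch ID is then passed to each Execute Query/Non Query node, and the pool entry is removed when Send Batch commits the transaction.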
Requirements
- Either a valid connection ID from Connect node OR database credentials
- The connection must be active and not already have an uncommitted transaction
Error Handling
The node will return specific errors in the following cases:
- ErrInvalidArg - Neither connection ID nor credentials were provided
- ErrNotFound - Connection ID not found
- Database transaction errors (connection lost, server issues)
Usage Notes
- The batch ID should be passed to Execute Query and Execute Non Query nodes
- All operations using the same batch ID will be part of the same transaction
- Transactions must be committed using the Send Batch node
- If an error occurs before Send Batch, the transaction is automatically rolled back
- Multiple batches can be active simultaneously on different connections
- Batch transactions ensure atomicity - all operations succeed or all fail
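The all-or-nothing behavior in the notes above can be illustrated with a minimal in-memory model. This is a sketch only; the real node delegates this behavior to the ClickHouse transaction.

```python
# Minimal in-memory model of batch atomicity: operations are queued against
# the batch and only become visible in the table on commit; a rollback (or
# an error before Send Batch) discards every queued operation together.
class Batch:
    def __init__(self, table):
        self.table = table   # the "database" state, a list of rows
        self.queued = []     # operations queued in this batch

    def execute(self, row):
        self.queued.append(row)          # queued, not yet visible to others

    def commit(self):
        self.table.extend(self.queued)   # all operations succeed together
        self.queued = []

    def rollback(self):
        self.queued = []                 # all operations are discarded together
```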
Transaction Scope
What happens in a batch transaction:
- All SQL commands are queued
- No changes are visible to other connections
- Changes are isolated until commit
- Commands can be queries or non-queries
- Transaction can be rolled back before commit
What is NOT included:
- Operations outside the batch ID scope
- DDL operations (may commit immediately depending on ClickHouse settings)
Example 1: Basic Batch Transaction
Scenario: Insert multiple records as an atomic operation
Flow:
- Connect to database
- Create Batch → Get batch ID
- Execute Non Query with batch ID → Insert record 1
- Execute Non Query with batch ID → Insert record 2
- Execute Non Query with batch ID → Insert record 3
- Send Batch → Commit all inserts
- Disconnect
Inputs:
- Connection Id: (from Connect node)
Output:
Batch Id: "7c9e6679-7425-40de-944b-e07fc1f90ae7"
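The flow above can be sketched in Python. The functions here are hypothetical stand-ins for the Connect, Create Batch, Execute Non Query, Send Batch, and Disconnect nodes, and the in-memory dictionaries mimic the transaction's queue; they are not the product's API.

```python
import uuid

# Hypothetical stand-ins for the nodes in the flow; each batch collects
# its queued statements until Send Batch commits them all together.
queued, committed = {}, {}

def connect():
    return "conn-" + uuid.uuid4().hex[:8]       # Connect -> connection ID

def create_batch(connection_id):
    batch_id = str(uuid.uuid4())                # Create Batch -> batch ID
    queued[batch_id] = []
    return batch_id

def execute_non_query(sql, batch_id):
    queued[batch_id].append(sql)                # queued inside the transaction

def send_batch(batch_id):
    committed[batch_id] = queued.pop(batch_id)  # all statements commit together

def disconnect(connection_id):
    pass                                        # Disconnect (no-op in this sketch)

conn_id = connect()
batch_id = create_batch(conn_id)                # one batch ID for all inserts
for i in (1, 2, 3):
    execute_non_query(f"INSERT INTO records VALUES ({i})", batch_id)
send_batch(batch_id)                            # commits all three inserts at once
disconnect(conn_id)
```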
Example 2: Transfer Data Between Tables
Scenario: Move data from a staging table to a production table atomically
Create Batch Inputs:
- Connection Id: (from Connect node)
Execute Non Query 1 (with batch_id):
INSERT INTO production_table
SELECT * FROM staging_table
WHERE processed = false
Execute Non Query 2 (with batch_id):
ALTER TABLE staging_table
UPDATE processed = true
WHERE processed = false
Send Batch:
- Batch Id: (from Create Batch node)
Result: Data is moved and marked as processed in a single atomic transaction.
Example 3: Complex Analytics Update
Scenario: Update multiple aggregation tables consistently
Create Batch Inputs:
- Connection Id: (from Connect node)
Execute Non Query 1 (with batch_id):
INSERT INTO daily_metrics
SELECT
toDate(timestamp) as date,
count() as events,
uniq(user_id) as unique_users
FROM raw_events
WHERE toDate(timestamp) = today()
GROUP BY date
Execute Non Query 2 (with batch_id):
INSERT INTO audit_log (timestamp, action, records_processed)
SELECT now(), 'daily_aggregation', count()
FROM raw_events
WHERE toDate(timestamp) = today()
Execute Non Query 3 (with batch_id):
ALTER TABLE raw_events
DELETE WHERE toDate(timestamp) = today()
Note: the audit insert runs before the delete so that count() still reflects the rows being processed; statements within a batch execute in order.
Send Batch:
- Batch Id: (from Create Batch node)
Result: Metrics are calculated, raw data is cleaned, and audit log is updated atomically.
Example 4: Using Credentials (No Connection)
Scenario: Execute a batch transaction without maintaining a persistent connection
Create Batch Inputs:
- Connection Id: (empty)
Options:
- Credentials: (select ClickHouse Database credential)
Output:
Batch Id: "8d1f7890-8536-41ef-955c-f18ad2a01bf8"
Then use this batch ID with Execute Query/Non Query nodes (also using credentials).
Best Practices
- Atomic Operations: Group related operations in a single batch
- Error Handling: Implement error handling to ensure batches are either committed or rolled back
- Batch Size: Keep transactions reasonably sized - very large transactions can lock resources
- Commit Frequency: Commit batches promptly to avoid holding locks
- Rollback Strategy: Plan for rollback scenarios in case of errors
- Resource Management: Always pair Create Batch with Send Batch
- Connection Reuse: Reuse connections for multiple batches rather than creating new connections
Transaction Pattern
A typical batch transaction flow:
┌─────────────────┐
│ Connect │
└────────┬────────┘
│
┌────────▼────────┐
│ Create Batch │ ← Returns batch_id
└────────┬────────┘
│
┌────────▼────────┐
│ Execute Non │ ← Uses batch_id
│ Query (Insert) │
└────────┬────────┘
│
┌────────▼────────┐
│ Execute Non │ ← Uses batch_id
│ Query (Update) │
└────────┬────────┘
│
┌────────▼────────┐
│ Send Batch │ ← Commits transaction
└────────┬────────┘
│
┌────────▼────────┐
│ Disconnect │
└─────────────────┘
Error-Safe Pattern with Try-Catch
┌─────────────────┐
│ Connect │
└────────┬────────┘
│
┌────────▼────────┐
│ Create Batch │
└────────┬────────┘
│
┌────▼────┐
│ Try │
└────┬────┘
│
[Execute operations with batch_id]
│
┌────▼────────┐
│ Send Batch │ ← Commit on success
└─────────────┘
│
┌────▼────┐
│ Catch │
└────┬────┘
│
[Handle error - batch auto-rollback]
│
┌────────▼────────┐
│ Disconnect │
└─────────────────┘
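In code form, the Try/Catch pattern above maps to committing on success and rolling back in the error path. This is a minimal sketch using an in-memory stand-in for the batch; the rollback here mimics the automatic rollback the node performs on error.

```python
# Sketch of the error-safe pattern: commit on success, roll back on error.
# Batch is a minimal in-memory stand-in for the batch transaction.
class Batch:
    def __init__(self, table):
        self.table, self.queued = table, []

    def execute(self, row):
        if row is None:
            raise RuntimeError("statement failed")  # simulated SQL error
        self.queued.append(row)

    def commit(self):
        self.table.extend(self.queued)
        self.queued = []

    def rollback(self):
        self.queued = []

def run_batch(table, rows):
    batch = Batch(table)           # Create Batch
    try:
        for row in rows:           # Execute operations with batch_id
            batch.execute(row)
        batch.commit()             # Send Batch: commit on success
    except RuntimeError:
        batch.rollback()           # Catch: batch is rolled back on error
```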
Performance Considerations
- Batch commits are more efficient than individual commits for multiple operations
- ClickHouse optimizes batch inserts by reducing merge operations
- Large batches may increase memory usage - balance between efficiency and resources
- Transaction isolation level ensures consistency but may impact concurrent queries
- Avoid long-running transactions to prevent blocking other operations
Common Use Cases
- Data Migration: Move data between tables atomically
- ETL Operations: Extract, transform, and load data with consistency guarantees
- Multi-table Updates: Keep related tables synchronized
- Audit Logging: Record operations and their data changes together
- Rollback Safety: Test operations before committing
- Bulk Operations: Efficiently insert or update large datasets