Split CSV

Splits a CSV file into smaller files based on the maximum number of rows specified.

Common Properties

  • Name - The custom name of the node.
  • Color - The custom color of the node.
  • Delay Before (sec) - Waits the specified number of seconds before executing the node.
  • Delay After (sec) - Waits the specified number of seconds after executing the node.
  • Continue On Error - The automation will continue regardless of any error. The default value is false.
info

If the Continue On Error property is true, no error is caught when the project is executed, even if a Catch node is used.

Input

  • File Path - Path of the CSV file to split.
  • Max Rows - The maximum number of rows that each split file should contain.

Output

  • Out Directory - Path of the directory where the split files are saved.

Options

  • Separator - The separator used in the CSV file. Comma (,) or Semicolon (;).
  • Headers - Specifies whether the CSV file includes a header row.

Example

Suppose you have a large CSV file called data.csv that contains 1000 rows and you want to split it into smaller files that contain a maximum of 100 rows each. You would use the following configuration:

  • File Path - "C:\myfolder\data.csv"
  • Max Rows - 100
  • Out Directory - "C:\myfolder\out"
  • Separator - ,
  • Headers - true

After the node has executed, you will have 10 split files in the "C:\myfolder\out" directory. Because 1000 divides evenly by 100, each file contains exactly 100 data rows (plus the header row, since Headers is true); if the total were not evenly divisible, the last file would contain the remainder. The split files are named data-000.csv, data-001.csv, ..., data-009.csv.
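
For reference, the file count and names in this example follow from a simple ceiling division over the data rows. The snippet below reproduces the arithmetic in plain Python; it is an illustration only, not part of the node.

```python
import math

total_rows = 1000   # data rows in data.csv from the example above
max_rows = 100      # the Max Rows setting

file_count = math.ceil(total_rows / max_rows)              # -> 10 split files
names = [f"data-{i:03d}.csv" for i in range(file_count)]   # data-000.csv ... data-009.csv
print(file_count, names[0], names[-1])
```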

How It Works

The Split CSV node performs the following steps (a minimal code sketch of the same logic follows the list):

  1. Validates Input File - Checks if the source CSV file exists at the specified path
  2. Creates Output Directory - Creates the output directory if it doesn't already exist
  3. Parses File Structure - Reads the CSV file to understand its structure and total row count
  4. Preserves Headers - If Headers option is enabled, captures the header row to include in each split file
  5. Calculates Splits - Determines how many output files will be created based on Max Rows setting
  6. Splits Data - Divides the data into chunks, each containing up to Max Rows
  7. Writes Split Files - Creates numbered CSV files (e.g., filename-000.csv, filename-001.csv) in the output directory
  8. Returns Directory Path - Outputs the path to the directory containing all split files
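
For readers who want to follow these steps in code, here is a minimal Python sketch of the same logic. It illustrates the behavior described above and is not the node's actual implementation; the function names split_csv and _write_chunk are invented for the example.

```python
import csv
import os

def split_csv(file_path, max_rows, out_dir, separator=",", headers=True):
    """Split file_path into numbered CSV files of at most max_rows data rows each."""
    if not os.path.isfile(file_path):                      # 1. validate input file
        raise FileNotFoundError(file_path)
    if max_rows < 1:                                       # guard against an invalid Max Rows value
        raise ValueError("Max Rows must be a positive integer")
    os.makedirs(out_dir, exist_ok=True)                    # 2. create output directory

    base = os.path.splitext(os.path.basename(file_path))[0]
    with open(file_path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src, delimiter=separator)      # 3. parse the file structure
        header = next(reader, None) if headers else None   # 4. preserve the header row

        index, chunk = 0, []
        for row in reader:                                 # 5-6. group rows into chunks of max_rows
            chunk.append(row)
            if len(chunk) == max_rows:
                _write_chunk(out_dir, base, index, header, chunk, separator)  # 7. write split file
                index, chunk = index + 1, []
        if chunk:                                          # the last file may hold fewer rows
            _write_chunk(out_dir, base, index, header, chunk, separator)
    return out_dir                                         # 8. return the output directory path

def _write_chunk(out_dir, base, index, header, rows, separator):
    """Write one numbered split file, e.g. data-000.csv."""
    path = os.path.join(out_dir, f"{base}-{index:03d}.csv")
    with open(path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst, delimiter=separator)
        if header is not None:
            writer.writerow(header)
        writer.writerows(rows)
```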

Requirements

  • File System Access - Read permissions for source file and write permissions for output directory
  • Valid CSV File - The source file must be a properly formatted CSV file
  • Disk Space - Sufficient disk space to store all split files (approximately the same total size as the original file)
  • Max Rows Setting - Max Rows must be a positive integer
  • Separator Match - The separator setting must match the actual delimiter in the source file

Error Handling

Error Code | Description | Solution
FILE_NOT_FOUND | The source CSV file does not exist | Verify the file path is correct
PERMISSION_DENIED | Insufficient permissions to read the source file or write to the output directory | Check file and directory permissions
INVALID_MAX_ROWS | Max Rows is not a valid positive number | Ensure Max Rows is set to a positive integer
DISK_FULL | Insufficient disk space for the split files | Free up disk space before splitting
INVALID_SEPARATOR | The separator doesn't match the file structure | Verify the correct delimiter is used in the CSV
OUTPUT_DIR_ERROR | Unable to create or write to the output directory | Ensure the output directory path is valid and writable
MALFORMED_CSV | The CSV file has corrupted or inconsistent data | Validate and fix the source CSV file structure
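
The first few of these conditions can be checked before any rows are read. The sketch below is a hypothetical pre-flight routine in Python that maps such checks onto the error codes in the table; the node performs its own validation internally, and codes such as DISK_FULL, INVALID_SEPARATOR, and MALFORMED_CSV can only surface while the file is being parsed or written.

```python
import os

def preflight_checks(file_path, max_rows, out_dir):
    """Hypothetical pre-flight checks mirroring some of the error codes above."""
    if not os.path.isfile(file_path):
        raise FileNotFoundError("FILE_NOT_FOUND: the source CSV file does not exist")
    if not os.access(file_path, os.R_OK):
        raise PermissionError("PERMISSION_DENIED: cannot read the source file")
    if not isinstance(max_rows, int) or max_rows < 1:
        raise ValueError("INVALID_MAX_ROWS: Max Rows must be a positive integer")
    try:
        os.makedirs(out_dir, exist_ok=True)
    except OSError as exc:
        raise OSError(f"OUTPUT_DIR_ERROR: cannot create the output directory ({exc})") from exc
    if not os.access(out_dir, os.W_OK):
        raise PermissionError("PERMISSION_DENIED: cannot write to the output directory")
```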

Usage Examples

Example 1: Split Large Customer Database

Split a large customer database into manageable chunks for processing:

Input:
- File Path: "C:\Database\customers.csv"
- Max Rows: 500
- Separator: Comma (,)
- Headers: true

Result:
- Out Directory: "C:\Database\customers_split\"
- Files created: customers-000.csv (500 rows + header), customers-001.csv (500 rows + header), customers-002.csv (remaining rows + header)

Example 2: Split for Parallel Processing

Split a large dataset for parallel processing across multiple automation instances:

Input:
- File Path: "/data/transactions.csv"
- Max Rows: 1000
- Separator: Semicolon (;)
- Headers: true

Result:
- Out Directory: "/data/transactions_split/"
- Each split file can be processed independently by different automation instances
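
Outside the flow itself, the same fan-out idea can be sketched in Python with a worker pool. This is an illustration under the assumptions of Example 2: the directory path comes from that example, process_file is a placeholder for whatever per-file work your automation performs, and the worker count of 4 is arbitrary.

```python
import glob
import os
from multiprocessing import Pool

def process_file(path):
    """Placeholder worker: replace with the real per-file processing for your flow."""
    return path, os.path.getsize(path)

if __name__ == "__main__":
    split_files = sorted(glob.glob("/data/transactions_split/*.csv"))  # directory from Example 2
    with Pool(processes=4) as pool:                                    # tune to your available cores
        for path, size in pool.map(process_file, split_files):
            print(f"processed {path} ({size} bytes)")
```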

Example 3: Split for Email Attachment Size Limits

Split a large report to comply with email attachment size limits:

Input:
- File Path: "D:\Reports\annual_report.csv"
- Max Rows: 2000
- Separator: Comma (,)
- Headers: true

Result:
- Out Directory: "D:\Reports\annual_report_split\"
- Smaller files that can be attached to emails without exceeding size limits

Usage Notes

  • Header Inclusion - When Headers is true, each split file includes the header row in addition to the Max Rows of data.
  • Automatic Numbering - Split files are automatically numbered with zero-padded indices (000, 001, 002, etc.).
  • Original File Preserved - The original CSV file remains unchanged after splitting.
  • Output Directory Creation - If the output directory doesn't exist, it will be created automatically.
  • Last File Size - The last split file may contain fewer rows than Max Rows if the total rows aren't evenly divisible.
  • Performance - Splitting very large files (over 1GB) may take considerable time. Monitor progress and plan accordingly.
  • File Naming - Split files use the original filename with a numeric suffix (e.g., data.csv becomes data-000.csv).

Tips

  • Optimal Chunk Size - Choose Max Rows based on your processing needs. For parallel processing, consider your available CPU cores.
  • Memory Management - Splitting is more memory-efficient than loading entire large files for processing.
  • Batch Processing Pattern - After splitting, use a Loop node to process each split file individually (a minimal sketch of this pattern follows the Tips list).
  • Cleanup Strategy - Plan to clean up split files after processing to avoid cluttering storage.
  • Test First - Test splitting with a smaller Max Rows value first to ensure correct configuration.
  • Progress Tracking - When processing split files, log which file is being processed for debugging and monitoring.
  • Parallel Execution - Split CSV is ideal when you need to distribute workload across multiple automation instances or servers.
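
The batch-processing pattern mentioned above looks roughly like this when expressed in plain Python rather than as flow nodes. The directory path follows the first example on this page, and the per-file work is a placeholder.

```python
import csv
import glob

# Loop over the split files in order, as a Loop node would inside the flow.
for path in sorted(glob.glob(r"C:\myfolder\out\*.csv")):    # output directory from the first example
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    print(f"{path}: {len(rows)} rows (including header)")   # replace with the real per-file work
```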

Related Nodes

  • Read CSV - Read individual split files for processing.
  • Write CSV - Merge processed split files back together if needed.
  • Append CSV - Combine processed split files into a single output.
  • Loop - Iterate through split files for batch processing.
  • Data Table - Work with data from split files.