Split CSV
Splits a CSV file into smaller files based on the maximum number of rows specified.
Common Properties
- Name - The custom name of the node.
- Color - The custom color of the node.
- Delay Before (sec) - The number of seconds to wait before executing the node.
- Delay After (sec) - The number of seconds to wait after executing the node.
- Continue On Error - Automation will continue regardless of any error. The default value is false.
If the ContinueOnError property is true, no error is caught when the project is executed, even if a Catch node is used.
Input
- File Path - Path of the CSV file to split.
- Max Rows - The maximum number of rows that each split file should contain.
Output
- Out Directory - Path of the directory where the split files are saved.
Options
- Separator - The separator used in the CSV file: Comma (,) or Semicolon (;).
- Headers - Specifies whether the CSV file includes a header row.
Example
Suppose you have a large CSV file called data.csv that contains 1000 rows and you want to split it into smaller files that contain a maximum of 100 rows each. You would use the following configuration:
- File Path - "C:\myfolder\data.csv"
- Max Rows - 100
- Out Directory - "C:\myfolder\out"
- Separator - ,
- Headers - true
After the node has executed, the "C:\myfolder\out" directory contains 10 split files named data-000.csv, data-001.csv, ..., data-009.csv, each holding 100 rows. If the row count were not an exact multiple of Max Rows, the last file would hold only the remaining rows.
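The file count and names in this example follow from a ceiling division of the row count by Max Rows. The Python sketch below only illustrates that arithmetic; `expected_split_names` is a hypothetical helper, not part of the node, and it assumes the zero-padded `name-NNN.csv` pattern described on this page.

```python
import math
from pathlib import PurePath


def expected_split_names(file_name: str, total_rows: int, max_rows: int) -> list[str]:
    """Return the split file names expected for a given row count (illustrative only)."""
    if max_rows <= 0:
        raise ValueError("max_rows must be a positive integer")
    stem = PurePath(file_name).stem  # "data" for "data.csv"
    # Ceiling division: a partial final chunk still needs its own file.
    count = math.ceil(total_rows / max_rows)
    return [f"{stem}-{i:03d}.csv" for i in range(count)]


names = expected_split_names("data.csv", total_rows=1000, max_rows=100)
print(len(names), names[0], names[-1])
```

With 1050 rows instead of 1000, the same call would yield 11 names, and the last file would hold the 50 leftover rows.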
How It Works
The Split CSV node performs the following steps:
- Validates Input File - Checks if the source CSV file exists at the specified path
- Creates Output Directory - Creates the output directory if it doesn't already exist
- Parses File Structure - Reads the CSV file to understand its structure and total row count
- Preserves Headers - If Headers option is enabled, captures the header row to include in each split file
- Calculates Splits - Determines how many output files will be created based on Max Rows setting
- Splits Data - Divides the data into chunks, each containing up to Max Rows
- Writes Split Files - Creates numbered CSV files (e.g., filename-000.csv, filename-001.csv) in the output directory
- Returns Directory Path - Outputs the path to the directory containing all split files
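The steps above can be approximated in a few lines of Python. This is a minimal sketch under stated assumptions, not the node's actual implementation: the function name `split_csv`, its parameters, and the UTF-8 encoding are all hypothetical. It also streams the file chunk by chunk instead of counting the total rows up front, so the "Calculates Splits" step is implicit.

```python
import csv
from itertools import islice
from pathlib import Path


def split_csv(file_path, max_rows, out_dir, separator=",", headers=True):
    """Split a CSV into numbered files holding at most max_rows data rows each."""
    src = Path(file_path)
    if not src.is_file():  # validate the input file
        raise FileNotFoundError(f"source CSV not found: {src}")
    if not isinstance(max_rows, int) or max_rows <= 0:
        raise ValueError("max_rows must be a positive integer")

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)  # create the output directory if missing

    with src.open(newline="", encoding="utf-8") as f:  # encoding is an assumption
        reader = csv.reader(f, delimiter=separator)
        # Capture the header row once so it can be repeated in every split file.
        header = next(reader, None) if headers else None
        index = 0
        while True:
            chunk = list(islice(reader, max_rows))  # next chunk of up to max_rows rows
            if not chunk:
                break
            target = out / f"{src.stem}-{index:03d}.csv"
            with target.open("w", newline="", encoding="utf-8") as g:
                writer = csv.writer(g, delimiter=separator)
                if header is not None:
                    writer.writerow(header)
                writer.writerows(chunk)
            index += 1
    return str(out)  # path of the directory containing the split files
```

Reading one chunk at a time keeps memory use bounded by Max Rows rather than by the size of the source file.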
Requirements
- File System Access - Read permissions for source file and write permissions for output directory
- Valid CSV File - The source file must be a properly formatted CSV file
- Disk Space - Sufficient disk space to store all split files (approximately the same size as original file)
- Max Rows Setting - Max Rows must be a positive integer
- Separator Match - The separator setting must match the actual delimiter in the source file
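Several of these requirements can be checked before the node runs. The sketch below is a hypothetical pre-flight helper (`preflight_problems` is not part of the node) that covers the source file, Max Rows, and Separator requirements. It does not check disk space or write permissions, and its separator check is a rough heuristic that only compares delimiter counts on the first line and ignores quoting.

```python
import os
from pathlib import Path


def preflight_problems(file_path, max_rows, separator):
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    src = Path(file_path)
    readable = src.is_file() and os.access(src, os.R_OK)
    if not readable:
        problems.append("source file is missing or not readable")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_rows, bool) or not isinstance(max_rows, int) or max_rows <= 0:
        problems.append("Max Rows must be a positive integer")
    if separator not in (",", ";"):
        problems.append("Separator must be ',' or ';'")
    elif readable:
        with src.open(newline="", encoding="utf-8") as f:  # encoding is an assumption
            first_line = f.readline()
        other = ";" if separator == "," else ","
        # Heuristic: the other candidate appearing more often suggests a mismatch.
        if first_line.count(other) > first_line.count(separator):
            problems.append("Separator may not match the delimiter used in the file")
    return problems
```

An empty result does not guarantee that the split will succeed; it only rules out the misconfigurations listed above.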
Error Handling
| Error Code | Description | Solution |
|---|---|---|
| FILE_NOT_FOUND | The source CSV file does not exist | Verify the file path is correct |
| PERMISSION_DENIED | Insufficient permissions to read source or write to output directory | Check file and directory permissions |
| INVALID_MAX_ROWS | Max Rows is not a valid positive number | Ensure Max Rows is set to a positive integer |
| DISK_FULL | Insufficient disk space for split files | Free up disk space before splitting |
| INVALID_SEPARATOR | The separator doesn't match the file structure | Verify that the Separator option matches the delimiter used in the CSV |
| OUTPUT_DIR_ERROR | Unable to create or write to output directory | Ensure the output directory path is valid and writable |
| MALFORMED_CSV | The CSV file has corrupted or inconsistent data | Validate and fix the source CSV file structure |
Usage Examples
Example 1: Split Large Customer Database
Split a large customer database into manageable chunks for processing:
Input:
- File Path: "C:\Database\customers.csv"
- Max Rows: 500
- Separator: Comma (,)
- Headers: true
Result:
- Out Directory: "C:\Database\customers_split\"
- Files created: customers-000.csv (500 rows + header),
customers-001.csv (500 rows + header),
customers-002.csv (remaining rows + header)
Example 2: Split for Parallel Processing
Split a large dataset for parallel processing across multiple automation instances:
Input:
- File Path: "/data/transactions.csv"
- Max Rows: 1000
- Separator: Semicolon (;)
- Headers: true
Result:
- Out Directory: "/data/transactions_split/"
- Each split file can be processed independently by different automation instances
Example 3: Split for Email Attachment Size Limits
Split a large report to comply with email attachment size limits:
Input:
- File Path: "D:\Reports\annual_report.csv"
- Max Rows: 2000
- Separator: Comma (,)
- Headers: true
Result:
- Out Directory: "D:\Reports\annual_report_split\"
- Smaller files that can be attached to emails without exceeding size limits
Usage Notes
- Header Inclusion - When Headers is true, each split file includes the header row in addition to the Max Rows of data.
- Automatic Numbering - Split files are automatically numbered with zero-padded indices (000, 001, 002, etc.).
- Original File Preserved - The original CSV file remains unchanged after splitting.
- Output Directory Creation - If the output directory doesn't exist, it will be created automatically.
- Last File Size - The last split file may contain fewer rows than Max Rows if the total rows aren't evenly divisible.
- Performance - Splitting very large files (over 1GB) may take considerable time. Monitor progress and plan accordingly.
- File Naming - Split files use the original filename with a numeric suffix (e.g., data.csv becomes data-000.csv).
Tips
- Optimal Chunk Size - Choose Max Rows based on your processing needs. For parallel processing, consider your available CPU cores.
- Memory Management - Splitting is more memory-efficient than loading entire large files for processing.
- Batch Processing Pattern - After splitting, use a Loop node to process each split file individually.
- Cleanup Strategy - Plan to clean up split files after processing to avoid cluttering storage.
- Test First - Test splitting with a smaller Max Rows value first to ensure correct configuration.
- Progress Tracking - When processing split files, log which file is being processed for debugging and monitoring.
- Parallel Execution - Split CSV is ideal when you need to distribute workload across multiple automation instances or servers.
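The batch processing, progress tracking, and cleanup tips can be combined into one loop over the output directory. The following Python sketch shows the pattern outside the automation tool; `process_split_files` and the `handle_rows` callback are hypothetical names, and the glob pattern assumes the three-digit numeric suffix described in the usage notes.

```python
import csv
import logging
from pathlib import Path


def process_split_files(out_dir, handle_rows, separator=",", headers=True, cleanup=False):
    """Process each split file in order, logging progress; optionally delete it afterwards."""
    processed = []
    # Zero-padded suffixes mean a plain lexicographic sort is also the numeric order.
    for path in sorted(Path(out_dir).glob("*-[0-9][0-9][0-9].csv")):
        logging.info("processing %s", path.name)  # progress tracking
        with path.open(newline="", encoding="utf-8") as f:  # encoding is an assumption
            reader = csv.reader(f, delimiter=separator)
            header = next(reader, None) if headers else None
            handle_rows(header, list(reader))
        processed.append(path.name)
        if cleanup:
            path.unlink()  # remove the split file once it has been handled
    return processed
```

Deleting a file only after `handle_rows` returns means that a failure part-way through leaves the unprocessed files in place for a retry.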
Related Nodes
- Read CSV - Read individual split files for processing
- Write CSV - Merge processed split files back together if needed
- Append CSV - Combine processed split files into a single output
- Iterate through split files for batch processing
- Data Table - Work with data from split files