Optimize

Optimizes a PDF file to reduce its file size by removing redundant data, compressing content, and streamlining the document structure.

Common Properties

  • Name - The custom name of the node.
  • Color - The custom color of the node.
  • Delay Before (sec) - Waits in seconds before executing the node.
  • Delay After (sec) - Waits in seconds after executing the node.
  • Continue On Error - Automation will continue regardless of any error. The default value is false.

Info: If the ContinueOnError property is true, no error is caught when the project is executed, even if a Catch node is used.

Inputs

  • PDF Path - Path to the PDF file to optimize.

Options

  • PDF Path to Save - Output path where the optimized PDF will be saved. If not provided, the original file is optimized in place.

Output

This node does not produce any output variables. The optimized PDF is saved to the specified path or overwrites the original.
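
The relationship between PDF Path and PDF Path to Save can be sketched in a few lines of Python. This is only an illustration of the documented rule, not the node's implementation; `resolve_output_path` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Optional


def resolve_output_path(pdf_path: str, save_path: Optional[str] = None) -> Path:
    """Return where the optimized PDF would be written.

    Hypothetical helper: mirrors the documented rule that an empty
    'PDF Path to Save' means the original file is optimized in place.
    """
    if not pdf_path or not pdf_path.strip():
        raise ValueError("PDF Path must not be empty")
    if save_path and save_path.strip():
        return Path(save_path)  # original file stays untouched
    return Path(pdf_path)       # original file is overwritten
```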

How It Works

The Optimize node reduces PDF file size through various optimization techniques. When executed, the node:

  1. Validates the input PDF path
  2. Opens and analyzes the PDF structure
  3. Applies optimization techniques:
    • Removes duplicate objects and resources
    • Compresses embedded images
    • Removes unused or redundant data
    • Streamlines the internal document structure
    • Optimizes fonts and metadata
  4. Saves the optimized PDF to the output path or overwrites the original

Optimization Techniques Applied

  • Object Deduplication: Removes duplicate images, fonts, and other objects
  • Image Compression: Compresses embedded images without significant quality loss
  • Stream Compression: Compresses content streams for smaller file size
  • Metadata Cleanup: Removes unnecessary metadata
  • Font Subsetting: Optimizes embedded fonts, keeping only the glyphs the document actually uses
  • Structure Optimization: Streamlines the PDF internal structure
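
Two of the techniques above, object deduplication and stream compression, can be illustrated on plain byte strings. The sketch below does not parse real PDFs and is not the node's code; `deduplicate` and `compress_stream` are hypothetical names. It uses zlib because deflate is the algorithm behind PDF's FlateDecode filter.

```python
import hashlib
import zlib


def deduplicate(objects: list[bytes]) -> tuple[list[bytes], list[int]]:
    """Keep one copy of each distinct object.

    Returns the unique objects plus, for every original object,
    the index of its surviving copy.
    """
    unique: list[bytes] = []
    index_by_digest: dict[str, int] = {}
    references: list[int] = []
    for obj in objects:
        digest = hashlib.sha256(obj).hexdigest()
        if digest not in index_by_digest:
            index_by_digest[digest] = len(unique)
            unique.append(obj)
        references.append(index_by_digest[digest])
    return unique, references


def compress_stream(data: bytes) -> bytes:
    """Deflate-compress a content stream at the highest compression level."""
    return zlib.compress(data, level=9)


# The same "image" embedded three times collapses to a single object.
logo = b"fake image bytes" * 100
objects = [logo, b"page 1 text", logo, b"page 2 text", logo]
unique, refs = deduplicate(objects)
print(len(objects), "->", len(unique), "objects; references:", refs)
# 5 -> 3 objects; references: [0, 1, 0, 2, 0]

# Repetitive content streams compress well and decompress losslessly.
stream = b"BT /F1 12 Tf (Hello) Tj ET\n" * 200
packed = compress_stream(stream)
print(len(stream), "->", len(packed), "bytes")
```

Both steps are lossless: every reference still resolves to identical bytes, and the compressed stream decompresses to the original.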

Use Cases

  • Email Attachments: Reduce file size to meet email attachment limits
  • Web Publishing: Optimize PDFs for faster web downloads
  • Archive Management: Reduce storage space for large PDF archives
  • Batch Processing: Optimize multiple PDFs to save disk space
  • Upload Requirements: Meet file size requirements for online portals
  • Performance: Speed up PDF loading and rendering times

Typical File Size Reductions

Optimization results vary based on the original PDF content:

  • Image-heavy PDFs: 30-70% reduction
  • Text-heavy PDFs: 10-30% reduction
  • Already optimized PDFs: 5-15% reduction
  • Scanned documents: 40-80% reduction
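
As a rough planning aid, the ranges above can be turned into an expected size range for a given file. This is a minimal sketch; the category keys and the `estimate_size_range` helper are made up for illustration, and actual results depend on the file.

```python
# Ranges copied from the list above, as fractions of the original size.
TYPICAL_REDUCTION = {
    "image-heavy": (0.30, 0.70),
    "text-heavy": (0.10, 0.30),
    "already-optimized": (0.05, 0.15),
    "scanned": (0.40, 0.80),
}


def estimate_size_range(original_mb: float, kind: str) -> tuple[float, float]:
    """Return (smallest, largest) expected size in MB after optimization."""
    low, high = TYPICAL_REDUCTION[kind]
    return round(original_mb * (1 - high), 2), round(original_mb * (1 - low), 2)


print(estimate_size_range(10.0, "scanned"))  # (2.0, 6.0)
```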

Example Workflows

Optimize Before Sending

  1. Generate or receive a PDF file
  2. Use the Optimize node to reduce file size
  3. Send optimized PDF via email or upload

Batch Optimization

  1. Loop through a directory of PDF files
  2. Optimize each file
  3. Save optimized versions to a new directory
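
The batch workflow amounts to pairing each input file with an output path in the new directory. A minimal Python sketch of that pairing follows; `plan_batch` and `run_batch` are hypothetical names, and the `optimize` callable is a stand-in for the Optimize node, not a real API.

```python
from pathlib import Path
from typing import Callable


def plan_batch(source_dir: str, target_dir: str) -> list[tuple[Path, Path]]:
    """Pair every PDF in source_dir with an output path in target_dir."""
    target = Path(target_dir)
    return [
        (pdf, target / pdf.name)
        for pdf in sorted(Path(source_dir).iterdir())
        if pdf.is_file() and pdf.suffix.lower() == ".pdf"
    ]


def run_batch(
    source_dir: str,
    target_dir: str,
    optimize: Callable[[Path, Path], None],
) -> int:
    """Call optimize(input, output) for each pair; return the number processed."""
    Path(target_dir).mkdir(parents=True, exist_ok=True)
    pairs = plan_batch(source_dir, target_dir)
    for pdf_in, pdf_out in pairs:
        optimize(pdf_in, pdf_out)  # stand-in for the Optimize node
    return len(pairs)
```

Writing to a separate directory keeps the originals intact, which matches the node's behavior when an output path is specified.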

In-Place Optimization

  1. Use the Optimize node without specifying an output path
  2. The original file is optimized and replaced
  3. Storage requirements drop because only the optimized file remains
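
Because in-place optimization overwrites the original, a backup step is worth adding around it. The sketch below shows one way to do that in Python; `optimize_in_place_with_backup` is a hypothetical helper, the `.bak` suffix is an arbitrary choice, and `optimize` stands in for the node run without an output path.

```python
import shutil
from pathlib import Path
from typing import Callable


def optimize_in_place_with_backup(
    pdf_path: str, optimize: Callable[[Path], None]
) -> Path:
    """Copy the file to '<name>.bak', then run optimize on the original."""
    original = Path(pdf_path)
    backup = original.with_name(original.name + ".bak")
    shutil.copy2(original, backup)  # copies content and file metadata
    try:
        optimize(original)  # stand-in for the in-place Optimize step
    except Exception:
        shutil.copy2(backup, original)  # restore the untouched copy on failure
        raise
    return backup
```

The returned backup path can be deleted once the optimized file has been checked.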

Error Handling

The node will return specific errors in the following cases:

  • Empty or invalid PDF path
  • PDF file not found at the specified path
  • PDF file is corrupted or invalid
  • Insufficient permissions to read the PDF file
  • Insufficient permissions to write to output path or overwrite original
  • PDF file is currently open in another application
  • Insufficient disk space for the optimization process
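
Several of these conditions can be checked before the node runs. The following Python sketch is a hypothetical pre-flight check, not the node's own validation: the messages are illustrative, the header test is deliberately simple, and the disk-space test assumes the output is no larger than the input. Detecting a file that is open in another application is platform-specific and is left out.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def preflight(pdf_path: str, save_path: Optional[str] = None) -> list[str]:
    """Return a list of problems found; an empty list means none detected."""
    if not pdf_path or not pdf_path.strip():
        return ["PDF path is empty"]
    source = Path(pdf_path)
    if not source.is_file():
        return [f"PDF file not found: {source}"]

    problems: list[str] = []
    if not os.access(source, os.R_OK):
        problems.append("No permission to read the PDF file")
    else:
        with source.open("rb") as handle:
            # Simple validity check: a PDF begins with the '%PDF-' marker.
            if not handle.read(5) == b"%PDF-":
                problems.append("File does not start with a PDF header")

    # Without a save path the original is overwritten, so check its folder.
    target_dir = (Path(save_path) if save_path else source).parent
    if not target_dir.is_dir():
        problems.append(f"Output directory does not exist: {target_dir}")
    elif not os.access(target_dir, os.W_OK):
        problems.append("No permission to write to the output location")
    elif shutil.disk_usage(target_dir).free < source.stat().st_size:
        problems.append("Not enough free disk space for the output file")
    return problems
```

An empty result does not guarantee success; a structurally corrupted PDF can still pass the header check.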

Usage Notes

  • Original PDF remains unchanged when an output path is specified
  • Without an output path, the original file is overwritten
  • Structural optimization is lossless; image compression may reduce image quality slightly
  • Processing time increases with PDF size and complexity
  • Large PDFs may require significant memory during optimization
  • Encrypted PDFs should be decrypted first for better optimization
  • Multiple optimization passes typically don't yield further reductions
  • The node preserves PDF functionality (forms, links, annotations)

Tips for Effective Use

  • Back Up First: When optimizing in place, back up important files first
  • Test Quality: Verify optimized PDFs maintain acceptable quality
  • Separate Output: Use output path to compare original vs optimized files
  • Batch Processing: Optimize multiple files in a loop for efficiency
  • Monitor Results: Track file size reduction to identify best candidates
  • Combine Operations: Optimize after merging multiple PDFs
  • Schedule Optimization: Run optimization during off-peak hours for large batches
  • Check Compatibility: Verify optimized PDFs work with target applications
  • Delete Originals: Remove original files after confirming optimized versions work

Before and After Comparison

Original PDF (5.2 MB)

  • High-resolution images
  • Embedded fonts
  • Metadata and comments
  • Duplicate resources

Optimized PDF (1.8 MB)

  • Compressed images
  • Optimized fonts
  • Cleaned metadata
  • Deduplicated resources
  • 65% size reduction
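
The 65% figure follows directly from the two sizes. A small Python helper (hypothetical name `percent_reduction`) makes the arithmetic explicit and can be reused when tracking results across files:

```python
def percent_reduction(original_mb: float, optimized_mb: float) -> float:
    """Size reduction as a percentage of the original size."""
    if original_mb <= 0:
        raise ValueError("original size must be positive")
    return (original_mb - optimized_mb) / original_mb * 100


# (5.2 - 1.8) / 5.2 = 3.4 / 5.2, which rounds to 65%
print(f"{percent_reduction(5.2, 1.8):.1f}%")  # 65.4%
```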

Performance Considerations

  • Small PDFs (under 1 MB): Processes in seconds
  • Medium PDFs (1-10 MB): Processes in 5-30 seconds
  • Large PDFs (10-50 MB): May take 1-5 minutes
  • Very Large PDFs (over 50 MB): Can take 5+ minutes

Related Nodes

  • Merge: Optimize after merging multiple PDFs
  • Decrypt: Decrypt encrypted PDFs before optimization for better results
  • Split: Split large PDFs before optimization for faster processing