Image To Text

Extracts text from images using Tesseract OCR (Optical Character Recognition). Supports both image file paths and base64-encoded image data with multi-language support.

Common Properties

  • Name - The custom name of the node.
  • Color - The custom color of the node.
  • Delay Before (sec) - Waits in seconds before executing the node.
  • Delay After (sec) - Waits in seconds after executing the node.
  • Continue On Error - Automation will continue regardless of any error. The default value is false.

Inputs

  • Image Path - Path to the image file to extract text from (string).
  • Base64 Data - Base64 encoded image data to extract text from (string).
Note: You must provide at least one of Image Path or Base64 Data.

Options

  • Language - Language to use for OCR text extraction (default: English). Select from over 100 supported languages including:

    • English (eng)
    • Spanish (spa)
    • French (fra)
    • German (deu)
    • Chinese Simplified (chi_sim)
    • Japanese (jpn)
    • Arabic (ara)
    • And many more...
  • Advanced Language - Enter multiple languages using language codes separated by plus signs (e.g., eng+ara for English and Arabic). This field overrides the Language option when provided. Find language codes at: https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html
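
As an illustration of the plus-separated format, the following minimal sketch joins a list of language codes into a value suitable for the Advanced Language field. The helper name `buildAdvancedLanguage` is hypothetical, and the check assumes plain codes made of letters, digits, and underscores (such as `eng` or `chi_sim`).

```javascript
// Hypothetical helper (not part of the node): joins Tesseract language
// codes into the plus-separated form used by the Advanced Language field.
function buildAdvancedLanguage(codes) {
  const cleaned = codes
    .map((code) => code.trim())
    .filter((code) => code.length > 0);

  // Assumption: plain codes contain only letters, digits, and underscores.
  const invalid = cleaned.filter((code) => !/^[A-Za-z0-9_]+$/.test(code));
  if (invalid.length > 0) {
    throw new Error(`Unexpected language code(s): ${invalid.join(", ")}`);
  }

  // Drop duplicates while keeping the original order.
  return [...new Set(cleaned)].join("+");
}

console.log(buildAdvancedLanguage(["eng", "ara"])); // prints "eng+ara"
```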

Outputs

  • Text - The extracted text from the image (string).

How It Works

The Image To Text node extracts text from images using Tesseract OCR technology. When executed, the node:

  1. Validates that Tesseract is installed and available in the system PATH
  2. Determines which input source to use (Image Path or Base64 Data)
  3. For base64 input:
    • Decodes the base64 string to an image
    • Overlays the image on a white background for better OCR accuracy
    • Processes the image with Tesseract
  4. For file path input:
    • Reads the image file using OpenCV
    • Converts the image to grayscale for improved OCR performance
    • Saves a temporary processed image
    • Processes the image with Tesseract
    • Cleans up the temporary file
  5. Returns the extracted text with leading and trailing whitespace removed
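
The Tesseract call and the final trimming step can be approximated outside the node with the Tesseract command-line tool. The sketch below is an assumption about an equivalent invocation, not the node's actual implementation: it omits the grayscale and white-background preprocessing, and the function names are hypothetical.

```javascript
const { spawnSync } = require("child_process");

// Builds the argument list for the Tesseract CLI:
//   tesseract <image> stdout -l <languages>
// Using "stdout" as the output base makes Tesseract print the text
// instead of writing a .txt file.
function buildTesseractArgs(imagePath, languages) {
  return [imagePath, "stdout", "-l", languages];
}

// Mirrors step 5: remove leading and trailing whitespace.
function cleanOcrOutput(rawText) {
  return rawText.trim();
}

// Runs Tesseract on an image file. Not called in this sketch, because it
// needs Tesseract installed and a real image on disk.
function extractText(imagePath, languages) {
  const result = spawnSync(
    "tesseract",
    buildTesseractArgs(imagePath, languages),
    { encoding: "utf8" }
  );
  if (result.error && result.error.code === "ENOENT") {
    throw new Error("Tesseract not found or not in PATH.");
  }
  if (result.error) {
    throw result.error;
  }
  if (result.status !== 0) {
    throw new Error(`Tesseract failed: ${result.stderr}`);
  }
  return cleanOcrOutput(result.stdout);
}

console.log(buildTesseractArgs("/path/to/invoice.png", "eng+ara").join(" "));
```

Keeping the argument builder separate from the process call makes the command easy to inspect or log before anything is executed.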

Requirements

  • Tesseract OCR must be installed on the system and available in PATH
  • Valid image file (for Image Path input) in supported formats (PNG, JPG, BMP, TIFF, etc.)
  • Valid base64-encoded image data (for Base64 Data input)
  • Language data files must be installed for the selected language
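
One way to check the language requirement ahead of time is to compare the requested codes against the output of `tesseract --list-langs`. The sketch below parses such output; the sample text is illustrative rather than captured from a real installation, the header line varies between Tesseract versions, and both function names are hypothetical.

```javascript
// Parses the output of `tesseract --list-langs` into an array of codes.
// Assumption: a header line starting with "List of" is followed by one
// language code per line.
function parseListLangs(output) {
  return output
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && !line.startsWith("List of"));
}

// Returns the requested codes (e.g. from "eng+ara") that are not installed.
function missingLanguages(requested, installed) {
  const available = new Set(installed);
  return requested
    .split("+")
    .map((code) => code.trim())
    .filter((code) => code.length > 0 && !available.has(code));
}

// Illustrative sample output, not taken from a real system.
const sampleOutput = [
  'List of available languages in "/usr/share/tessdata/" (3):',
  "eng",
  "osd",
  "spa",
].join("\n");

const installed = parseListLangs(sampleOutput);
console.log(missingLanguages("eng+ara", installed)); // prints [ 'ara' ]
```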

Error Handling

The node will return specific errors in the following cases:

  • Tesseract not found - "Tesseract not found or not in PATH. You can install it from https://tesseract-ocr.github.io/tessdoc/Installation.html"
  • File not found - "No such file" (when the specified image path doesn't exist)
  • Empty inputs - "Both Image Path and Base64 Data can not be empty. You have to give at least one of them"
  • Missing language - "Language can not be empty. You should select the language or provide advanced language"
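
The same checks can be run earlier in a flow so that bad input is caught before the node executes. This is a minimal sketch that reuses the documented messages; the function name and the order of the checks are assumptions, not the node's confirmed behavior.

```javascript
// Hypothetical pre-check that mirrors the documented error messages.
// Returns the effective language string when the inputs are acceptable.
function validateOcrInputs({
  imagePath = "",
  base64Data = "",
  language = "",
  advancedLanguage = "",
} = {}) {
  if (imagePath.trim() === "" && base64Data.trim() === "") {
    throw new Error(
      "Both Image Path and Base64 Data can not be empty. " +
        "You have to give at least one of them"
    );
  }

  // Advanced Language overrides Language when it is provided.
  const effectiveLanguage = advancedLanguage.trim() || language.trim();
  if (effectiveLanguage === "") {
    throw new Error(
      "Language can not be empty. " +
        "You should select the language or provide advanced language"
    );
  }
  return effectiveLanguage;
}

console.log(
  validateOcrInputs({
    imagePath: "/path/to/invoice.png",
    language: "eng",
    advancedLanguage: "eng+ara",
  })
); // prints "eng+ara"
```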

Usage Examples

Example 1: Extract Text from an Image File

// Extract text from a saved screenshot
$R.flow.imagePath = "/path/to/invoice.png";
// Language is set to English in the node options
// After execution, text will be available in $R.message.text

Example 2: Extract Text from Base64 Image

// Extract text from a base64 encoded image (e.g., from a web service)
$R.flow.imageData = "...";
// The node will decode and process the image
// Extracted text will be in $R.message.text
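
Base64 strings returned by web services often arrive as data URIs or contain line breaks. The documentation does not state whether the node accepts a `data:` prefix, so the sketch below takes the cautious route and strips it before the value is assigned to a flow variable. The helper name `toRawBase64` is hypothetical.

```javascript
// Hypothetical helper: removes an optional data-URI prefix and whitespace,
// then checks that the remainder looks like standard padded base64.
function toRawBase64(input) {
  const raw = input
    .replace(/^data:[^,]*;base64,/, "")
    .replace(/\s+/g, "");

  const looksValid =
    raw.length > 0 &&
    raw.length % 4 === 0 &&
    /^[A-Za-z0-9+/]+={0,2}$/.test(raw);
  if (!looksValid) {
    throw new Error("Input does not look like base64-encoded data");
  }
  return raw;
}

// The first four bytes of the PNG signature, encoded for demonstration only.
const pngMagic = Buffer.from([0x89, 0x50, 0x4e, 0x47]).toString("base64");
console.log(toRawBase64(`data:image/png;base64,${pngMagic}`)); // prints "iVBORw=="
```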

Example 3: Multi-Language Text Extraction

// Extract text from an image containing both English and Arabic
// Set Advanced Language option to: eng+ara
$R.flow.documentPath = "/path/to/multilingual-document.png";
// The OCR will recognize both English and Arabic text

Example 4: Processing Multiple Images in a Loop

// Get list of image files
const images = [
  "/path/to/receipt1.jpg",
  "/path/to/receipt2.jpg",
  "/path/to/receipt3.jpg"
];

// Loop through each image
for (const imagePath of images) {
  $R.flow.currentImage = imagePath;
  // After Image To Text node executes:
  // $R.message.text will contain the extracted text
  // You can save it to a variable or process it further
}

Usage Notes

  • Image Quality: Higher resolution images with clear text produce better results
  • Image Preprocessing: For Image Path input, the node automatically converts the image to grayscale for better OCR accuracy
  • White Background: For base64 images, the node overlays on a white background to improve text recognition
  • Language Selection: Choose the correct language for best results. Using the wrong language will result in poor text extraction
  • Multi-Language Support: Use the Advanced Language option to process documents with multiple languages
  • File Formats: Supports common image formats including PNG, JPG, JPEG, BMP, TIFF, and more
  • Temporary Files: When using Image Path, the node creates a temporary grayscale image that is automatically cleaned up after processing
  • Whitespace Handling: Extracted text is automatically trimmed of leading and trailing whitespace

Best Practices

  1. Optimize Image Quality:

    • Use images with at least 300 DPI for printed documents
    • Ensure good contrast between text and background
    • Avoid blurry or low-resolution images
  2. Preprocess Images When Needed:

    • Use image editing tools to enhance contrast before OCR
    • Rotate skewed images to horizontal alignment
    • Crop unnecessary areas to focus on text regions
  3. Language Configuration:

    • Always set the correct language for the document being processed
    • For multilingual documents, list all languages in the Advanced Language field
    • Ensure required language data files are installed
  4. Error Handling:

    • Use "Continue On Error" for batch processing to avoid stopping on individual failures
    • Validate that extracted text is not empty before further processing
    • Log or handle cases where the extracted text looks unreliable; the node outputs only the text, not an OCR confidence score
  5. Performance Optimization:

    • Process images in batches when dealing with multiple files
    • Consider image size - very large images may take longer to process
    • List only the languages that actually appear in the document; each additional language adds processing time
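
The advice to validate extracted text before further processing can be sketched as a simple heuristic. The function name, the thresholds, and the idea of using the share of letters and digits as a quality signal are all assumptions chosen for illustration; they are not provided by the node.

```javascript
// Hypothetical heuristic: flags OCR output that is empty, very short,
// or made up mostly of symbols instead of letters and digits.
function assessOcrText(text, { minLength = 3, minAlnumRatio = 0.5 } = {}) {
  const trimmed = (text || "").trim();
  if (trimmed.length < minLength) {
    return { ok: false, reason: "empty or too short" };
  }

  // Compare letters and digits against all non-whitespace characters.
  const visible = [...trimmed.replace(/\s/g, "")];
  const alnumCount = visible.filter((ch) => /[\p{L}\p{N}]/u.test(ch)).length;
  const ratio = alnumCount / visible.length;

  if (ratio < minAlnumRatio) {
    return { ok: false, reason: "mostly non-alphanumeric characters" };
  }
  return { ok: true, reason: "looks usable" };
}

console.log(assessOcrText("Invoice 1042 total 99.50").ok); // prints true
console.log(assessOcrText("~~|| ..--").ok); // prints false
```

In a batch run with Continue On Error enabled, a check like this gives a place to log and skip images whose output is unusable.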

Common Use Cases

  • Invoice Processing: Extract invoice numbers, dates, amounts, and vendor information from scanned invoices
  • Receipt Scanning: Digitize paper receipts for expense tracking and reporting
  • Form Data Extraction: Pull data from filled forms, applications, and surveys
  • Document Digitization: Convert scanned documents to searchable text (PDF pages would first need to be rendered as images, since the node accepts image input only)
  • ID Card Reading: Extract information from driver's licenses, passports, and ID cards
  • Screenshot Text Capture: Get text from application screenshots and error messages
  • Table Data Extraction: Extract data from tables in images (requires post-processing)
  • Handwritten Text Recognition: Limited support for clear handwriting (results may vary)
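
For the table use case, the post-processing step might look like the following minimal sketch. It assumes that the OCR output keeps table columns apart with tabs or runs of two or more spaces, which depends heavily on the image; the sample text and the function name are made up for illustration.

```javascript
// Hypothetical post-processing: splits OCR text into rows and cells.
// Assumption: columns are separated by a tab or by two or more spaces.
function parseTableText(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => line.split(/\s{2,}|\t/));
}

// Illustrative sample, not real OCR output.
const sampleText = [
  "Item       Qty   Price",
  "Paper A4   2     4.50",
  "",
  "Pens       10    3.20",
].join("\n");

for (const row of parseTableText(sampleText)) {
  console.log(row.join(" | "));
}
```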