Image To Text

Extracts text from images using Tesseract OCR (Optical Character Recognition). Accepts either an image file path or base64-encoded image data, and can recognize text in multiple languages.

Common Properties

Name - The custom name of the node.
Color - The custom color of the node.
Delay Before (sec) - Waits in seconds before executing the node.
Delay After (sec) - Waits in seconds after executing the node.
Continue On Error - Automation will continue regardless of any error. The default value is false.

Inputs

Image Path - Path to the image file to extract text from (string).
Base64 Data - Base64 encoded image data to extract text from (string).

Note

Provide at least one of Image Path or Base64 Data. If both are empty, the node returns an error.
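
A pre-flight check earlier in the flow can catch the empty-input case before the OCR node runs. The sketch below is illustrative only: `hasImageInput` is a hypothetical helper, not part of the node, and it assumes both inputs arrive as plain strings.

```javascript
// Hypothetical helper (not part of the node): returns true when at least
// one of the two inputs is a non-blank string.
function hasImageInput(imagePath, base64Data) {
  const isFilled = (value) => typeof value === "string" && value.trim().length > 0;
  return isFilled(imagePath) || isFilled(base64Data);
}

console.log(hasImageInput("/path/to/invoice.png", "")); // true
console.log(hasImageInput("", "   "));                  // false
```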

Options

Language - Language to use for OCR text extraction (default: English). Select from over 100 supported languages including:
- English (eng)
- Spanish (spa)
- French (fra)
- German (deu)
- Chinese Simplified (chi_sim)
- Japanese (jpn)
- Arabic (ara)
- And many more...
Advanced Language - Enter multiple languages using language codes separated by plus signs (e.g., eng+ara for English and Arabic). This field overrides the Language option when provided. Find language codes at: https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html
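
The Advanced Language string can be assembled from a list of codes rather than typed by hand. The following is a minimal sketch; `joinLanguageCodes` and `effectiveLanguage` are hypothetical helper names, and the second function only mirrors the override rule described above rather than the node's actual code.

```javascript
// Hypothetical helper: joins Tesseract language codes with "+",
// e.g. ["eng", "ara"] -> "eng+ara".
function joinLanguageCodes(codes) {
  const cleaned = codes.map((code) => code.trim()).filter((code) => code.length > 0);
  // Drop duplicates while keeping first-seen order.
  return [...new Set(cleaned)].join("+");
}

// Mirrors the documented rule: Advanced Language wins when it is provided,
// otherwise the Language option is used.
function effectiveLanguage(language, advancedLanguage) {
  const advanced = (advancedLanguage || "").trim();
  return advanced.length > 0 ? advanced : (language || "").trim();
}

console.log(joinLanguageCodes(["eng", " ara", "eng"])); // eng+ara
console.log(effectiveLanguage("eng", "eng+ara"));       // eng+ara
console.log(effectiveLanguage("eng", ""));              // eng
```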

Outputs

Text - The extracted text from the image (string).

How It Works

The Image To Text node extracts text from images using Tesseract OCR technology. When executed, the node:

Validates that Tesseract is installed and available in the system PATH
Determines which input source to use (Image Path or Base64 Data)
For base64 input:
- Decodes the base64 string to an image
- Overlays the image on a white background for better OCR accuracy
- Processes the image with Tesseract
For file path input:
- Reads the image file using OpenCV
- Converts the image to grayscale for improved OCR performance
- Saves a temporary processed image
- Processes the image with Tesseract
- Cleans up the temporary file
Returns the extracted text with leading and trailing whitespace removed
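
The first step of the base64 branch, decoding the string into image bytes, can be sketched in Node.js as follows. This is an illustration under assumptions, not the node's implementation: `decodeBase64Image` and `looksLikePng` are hypothetical helpers, and whether the node itself strips a `data:` URI prefix is not stated here. The white-background overlay and the Tesseract call are not shown.

```javascript
// Hypothetical helper: strips an optional "data:<mime>;base64," prefix
// and decodes the remaining payload into a Buffer.
function decodeBase64Image(input) {
  const trimmed = input.trim();
  const match = /^data:([^;,]+);base64,([\s\S]*)$/.exec(trimmed);
  const mimeType = match ? match[1] : null;
  const payload = match ? match[2] : trimmed;
  return { mimeType, bytes: Buffer.from(payload, "base64") };
}

// PNG files start with the 8-byte signature 89 50 4E 47 0D 0A 1A 0A.
function looksLikePng(bytes) {
  const signature = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
  return bytes.length >= 8 && signature.every((byte, i) => bytes[i] === byte);
}

// "iVBORw0KGgo=" is the base64 form of the PNG signature alone.
const decoded = decodeBase64Image("data:image/png;base64,iVBORw0KGgo=");
console.log(decoded.mimeType, looksLikePng(decoded.bytes)); // image/png true
```

A signature check like this is a cheap way to reject input that is valid base64 but not an image before handing it to OCR.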

Requirements

Tesseract OCR must be installed on the system and available in PATH
Valid image file (for Image Path input) in supported formats (PNG, JPG, BMP, TIFF, etc.)
Valid base64-encoded image data (for Base64 Data input)
Language data files must be installed for the selected language
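
To confirm that the language data requirement is met, the Tesseract CLI can list its installed languages with `tesseract --list-langs`. The sketch below parses that output and reports which requested codes are missing. `parseInstalledLanguages` and `missingLanguages` are hypothetical helpers, and the sample output is illustrative; the exact header line varies between Tesseract versions, so the parser only assumes it begins with "List of available languages".

```javascript
// Hypothetical helper: parses the text printed by `tesseract --list-langs`.
// Assumes a header line followed by one language code per line.
function parseInstalledLanguages(listLangsOutput) {
  return listLangsOutput
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && !line.startsWith("List of available languages"));
}

// Returns the codes from a "+"-separated language string that are not installed.
function missingLanguages(requested, installed) {
  const available = new Set(installed);
  return requested.split("+").filter((code) => code.length > 0 && !available.has(code));
}

// Illustrative sample, not captured from a real machine.
const sampleOutput = "List of available languages (3):\neng\nfra\nosd\n";
console.log(missingLanguages("eng+ara", parseInstalledLanguages(sampleOutput))); // [ 'ara' ]
```

In Node.js, the real output could be obtained with `require("child_process").execFileSync("tesseract", ["--list-langs"], { encoding: "utf8" })`, which throws if the `tesseract` executable cannot be found in PATH.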

Error Handling

The node will return specific errors in the following cases:

Tesseract not found - "Tesseract not found or not in PATH. You can install it from https://tesseract-ocr.github.io/tessdoc/Installation.html"
File not found - "No such file" (when the specified image path doesn't exist)
Empty inputs - "Both Image Path and Base64 Data can not be empty. You have to give at least one of them"
Missing language - "Language can not be empty. You should select the language or provide advanced language"
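
When these errors are caught downstream, matching on the message text makes it possible to react differently to each case. The sketch below is one possible approach: `classifyOcrError` and its labels are hypothetical, and how the error message reaches your code depends on how the flow handles errors.

```javascript
// Hypothetical mapping from the documented error messages to short labels.
const ERROR_PATTERNS = [
  { label: "tesseract_missing", pattern: /Tesseract not found/i },
  { label: "file_missing", pattern: /No such file/i },
  { label: "empty_inputs", pattern: /Both Image Path and Base64 Data can not be empty/i },
  { label: "language_missing", pattern: /Language can not be empty/i },
];

// Returns the label of the first matching pattern, or "unknown".
function classifyOcrError(message) {
  const hit = ERROR_PATTERNS.find(({ pattern }) => pattern.test(message));
  return hit ? hit.label : "unknown";
}

console.log(classifyOcrError("open /tmp/a.png: No such file or directory")); // file_missing
console.log(classifyOcrError("something else"));                             // unknown
```

Only `file_missing` is typically worth skipping in a batch run; the other three point to a configuration problem that will affect every image.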

Usage Examples

Example 1: Extract Text from an Image File

// Extract text from a saved screenshot
$R.flow.imagePath = "/path/to/invoice.png";
// Language is set to English in the node options
// After execution, text will be available in $R.message.text

Example 2: Extract Text from Base64 Image

// Extract text from a base64 encoded image (e.g., from a web service)
$R.flow.imageData = "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...";
// The node will decode and process the image
// Extracted text will be in $R.message.text

Example 3: Multi-Language Text Extraction

// Extract text from an image containing both English and Arabic
// Set Advanced Language option to: eng+ara
$R.flow.documentPath = "/path/to/multilingual-document.png";
// The OCR will recognize both English and Arabic text

Example 4: Processing Multiple Images in a Loop

// Get list of image files
const images = [
  "/path/to/receipt1.jpg",
  "/path/to/receipt2.jpg",
  "/path/to/receipt3.jpg"
];

// Loop through each image
for (const imagePath of images) {
  $R.flow.currentImage = imagePath;
  // After Image To Text node executes:
  // $R.message.text will contain the extracted text
  // You can save it to a variable or process it further
}

Usage Notes

Image Quality: Higher resolution images with clear text produce better results
Image Preprocessing: For Image Path input, the node automatically converts the image to grayscale for better OCR accuracy
White Background: For base64 images, the node overlays on a white background to improve text recognition
Language Selection: Choose the correct language for best results. Using the wrong language will result in poor text extraction
Multi-Language Support: Use the Advanced Language option to process documents with multiple languages
File Formats: Supports common image formats including PNG, JPG, JPEG, BMP, TIFF, and more
Temporary Files: When using Image Path, the node creates a temporary grayscale image that is automatically cleaned up after processing
Whitespace Handling: Extracted text is automatically trimmed of leading and trailing whitespace

Best Practices

Optimize Image Quality:
- Use images with at least 300 DPI for printed documents
- Ensure good contrast between text and background
- Avoid blurry or low-resolution images
Preprocess Images When Needed:
- Use image editing tools to enhance contrast before OCR
- Rotate skewed images to horizontal alignment
- Crop unnecessary areas to focus on text regions
Language Configuration:
- Always set the correct language for the document being processed
- For multilingual documents, list all languages in the Advanced Language field
- Ensure required language data files are installed
Error Handling:
- Use "Continue On Error" for batch processing to avoid stopping on individual failures
- Validate that extracted text is not empty before further processing
- Log or handle cases where the extracted text looks incomplete or garbled; the node outputs only the text, not a confidence score
Performance Optimization:
- Process images in batches when dealing with multiple files
- Consider image size - very large images may take longer to process
- List only the languages the document actually contains; each extra language adds processing time
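
The advice to validate extracted text before further processing can be sketched as a small post-processing step. `isUsableText` and `partitionResults` are hypothetical helpers, and the `{ path, text }` result shape is an assumption made for this example; the minimum-length threshold is arbitrary and should be tuned to the documents being processed.

```javascript
// Hypothetical check: text is usable when it contains at least `minChars`
// non-whitespace characters.
function isUsableText(text, minChars = 1) {
  if (typeof text !== "string") return false;
  return text.replace(/\s+/g, "").length >= minChars;
}

// Splits batch results into paths with usable text and paths needing review.
function partitionResults(results, minChars = 1) {
  const ok = [];
  const review = [];
  for (const item of results) {
    (isUsableText(item.text, minChars) ? ok : review).push(item.path);
  }
  return { ok, review };
}

const batch = [
  { path: "/path/to/receipt1.jpg", text: "TOTAL 12.50" },
  { path: "/path/to/receipt2.jpg", text: "" },
];
console.log(partitionResults(batch, 3));
// { ok: [ '/path/to/receipt1.jpg' ], review: [ '/path/to/receipt2.jpg' ] }
```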

Common Use Cases

Invoice Processing: Extract invoice numbers, dates, amounts, and vendor information from scanned invoices
Receipt Scanning: Digitize paper receipts for expense tracking and reporting
Form Data Extraction: Pull data from filled forms, applications, and surveys
Document Digitization: Convert scanned documents to searchable text (PDF pages need to be exported as images first, since the node accepts image input)
ID Card Reading: Extract information from driver's licenses, passports, and ID cards
Screenshot Text Capture: Get text from application screenshots and error messages
Table Data Extraction: Extract data from tables in images (requires post-processing)
Handwritten Text Recognition: Limited support for clear handwriting (results may vary)
