Image To Text
Extracts text from images using Tesseract OCR (Optical Character Recognition). Accepts either an image file path or base64-encoded image data, and supports multiple languages.
Common Properties
- Name - The custom name of the node.
- Color - The custom color of the node.
- Delay Before (sec) - Waits in seconds before executing the node.
- Delay After (sec) - Waits in seconds after executing the node.
- Continue On Error - Automation will continue regardless of any error. The default value is false.
Inputs
- Image Path - Path to the image file to extract text from (string).
- Base64 Data - Base64 encoded image data to extract text from (string).
Provide at least one of Image Path or Base64 Data. The node returns an error if both are empty.
Options
- Language - Language to use for OCR text extraction (default: English). Select from over 100 supported languages, including:
  - English (eng)
  - Spanish (spa)
  - French (fra)
  - German (deu)
  - Chinese Simplified (chi_sim)
  - Japanese (jpn)
  - Arabic (ara)
  - And many more...
- Advanced Language - Enter multiple languages using language codes separated by plus signs (e.g., eng+ara for English and Arabic). This field overrides the Language option when provided. Find language codes at: https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html
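The override rule between Language and Advanced Language can be sketched as a small helper. This is only an illustration, not the node's actual code: `resolveLanguage` is a hypothetical name, and the "Malformed language list" message is invented for the sketch (only the "Language can not be empty" message comes from this page).

```javascript
// Hypothetical helper: returns the language string that would be passed to Tesseract.
function resolveLanguage(language, advancedLanguage) {
  const advanced = (advancedLanguage || "").trim();
  // Advanced Language wins whenever it is non-empty.
  const chosen = advanced !== "" ? advanced : (language || "").trim();
  if (chosen === "") {
    throw new Error(
      "Language can not be empty. You should select the language or provide advanced language"
    );
  }
  // Split on "+" and reject empty segments such as "eng+" or "+ara".
  const codes = chosen.split("+").map((code) => code.trim());
  if (codes.some((code) => code === "")) {
    // Assumption: this message is made up for the sketch.
    throw new Error(`Malformed language list: "${chosen}"`);
  }
  return codes.join("+");
}
```

For example, `resolveLanguage("eng", "eng+ara")` yields `"eng+ara"`, while `resolveLanguage("eng", "")` falls back to `"eng"`.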
Outputs
- Text - The extracted text from the image (string).
How It Works
The Image To Text node extracts text from images using Tesseract OCR technology. When executed, the node:
- Validates that Tesseract is installed and available in the system PATH
- Determines which input source to use (Image Path or Base64 Data)
- For base64 input:
  - Decodes the base64 string to an image
  - Overlays the image on a white background for better OCR accuracy
  - Processes the image with Tesseract
- For file path input:
  - Reads the image file using OpenCV
  - Converts the image to grayscale for improved OCR performance
  - Saves a temporary processed image
  - Processes the image with Tesseract
  - Cleans up the temporary file
- Returns the extracted text with leading and trailing whitespace removed
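The flow above can be approximated outside the node with the Tesseract command-line tool. The sketch below is a rough Node.js analogue, not the node's implementation. It skips the grayscale and white-background steps (they need an image library), and it prefers Image Path when both inputs are given, which is an assumption, since this page does not state the precedence. The names `decodeBase64Image`, `buildTesseractArgs`, and `imageToText` are hypothetical.

```javascript
const { execFileSync } = require("node:child_process");
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Strip an optional "data:<mime>;base64," prefix and decode the payload to bytes.
function decodeBase64Image(data) {
  const payload = data.replace(/^data:[^;,]+;base64,/, "");
  return Buffer.from(payload, "base64");
}

// CLI form: tesseract <image> stdout -l <lang>
function buildTesseractArgs(imagePath, lang) {
  return [imagePath, "stdout", "-l", lang];
}

function imageToText({ imagePath, base64Data, lang = "eng" } = {}) {
  if (!imagePath && !base64Data) {
    throw new Error(
      "Both Image Path and Base64 Data can not be empty. You have to give at least one of them"
    );
  }
  let tempDir = null;
  let target = imagePath; // assumption: Image Path takes precedence
  try {
    if (!target) {
      // Write the decoded bytes to a temporary file for the CLI to read.
      tempDir = fs.mkdtempSync(path.join(os.tmpdir(), "ocr-"));
      target = path.join(tempDir, "input-image");
      fs.writeFileSync(target, decodeBase64Image(base64Data));
    }
    const output = execFileSync("tesseract", buildTesseractArgs(target, lang), {
      encoding: "utf8",
    });
    return output.trim(); // leading and trailing whitespace removed
  } finally {
    // Clean up the temporary file, if one was created.
    if (tempDir) fs.rmSync(tempDir, { recursive: true, force: true });
  }
}
```

Calling `imageToText({ imagePath: "/path/to/invoice.png", lang: "eng" })` requires Tesseract on the PATH. The two helper functions have no such dependency and can be checked in isolation.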
Requirements
- Tesseract OCR must be installed on the system and available in PATH
- Valid image file (for Image Path input) in supported formats (PNG, JPG, BMP, TIFF, etc.)
- Valid base64-encoded image data (for Base64 Data input)
- Language data files must be installed for the selected language
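One way to confirm the language requirement ahead of time is to compare the requested codes against the output of `tesseract --list-langs`. The sketch below assumes a Tesseract version that prints the list to stdout with a header line beginning "List of available languages". The function names are hypothetical.

```javascript
const { execFileSync } = require("node:child_process");

// Turn the raw `tesseract --list-langs` output into an array of language codes.
function parseLangList(output) {
  return output
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== "" && !line.startsWith("List of available languages"));
}

// Return the codes from a "+"-separated request that are not installed.
function missingLanguages(requested, installed) {
  return requested.split("+").filter((code) => !installed.includes(code));
}

// Runs the real CLI; throws if Tesseract is not on the PATH.
function installedLanguages() {
  const output = execFileSync("tesseract", ["--list-langs"], { encoding: "utf8" });
  return parseLangList(output);
}
```

With this, `missingLanguages("eng+ara", installedLanguages())` returns the codes that still need a language data file.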
Error Handling
The node will return specific errors in the following cases:
- Tesseract not found - "Tesseract not found or not in PATH. You can install it from https://tesseract-ocr.github.io/tessdoc/Installation.html"
- File not found - "No such file" (when the specified image path doesn't exist)
- Empty inputs - "Both Image Path and Base64 Data can not be empty. You have to give at least one of them"
- Missing language - "Language can not be empty. You should select the language or provide advanced language"
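When a downstream script needs to react differently to each failure, the messages above can be mapped to short labels. This is a sketch that assumes the error text reaches your script unchanged. `classifyOcrError` and its labels are hypothetical, not part of the node.

```javascript
// Map a documented error message to a short, hypothetical category label.
function classifyOcrError(message) {
  const text = String(message || "");
  if (text.includes("Tesseract not found")) return "tesseract-missing";
  if (text.includes("No such file")) return "file-not-found";
  if (text.includes("Both Image Path and Base64 Data can not be empty")) return "empty-inputs";
  if (text.includes("Language can not be empty")) return "missing-language";
  return "unknown";
}
```

Substring matching is used instead of exact equality so that extra context around the message (for example, a file path) does not break the mapping.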
Usage Examples
Example 1: Extract Text from an Image File
// Extract text from a saved screenshot
$R.flow.imagePath = "/path/to/invoice.png";
// Language is set to English in the node options
// After execution, text will be available in $R.message.text
Example 2: Extract Text from Base64 Image
// Extract text from a base64 encoded image (e.g., from a web service)
$R.flow.imageData = "...";
// The node will decode and process the image
// Extracted text will be in $R.message.text
Example 3: Multi-Language Text Extraction
// Extract text from an image containing both English and Arabic
// Set Advanced Language option to: eng+ara
$R.flow.documentPath = "/path/to/multilingual-document.png";
// The OCR will recognize both English and Arabic text
Example 4: Processing Multiple Images in a Loop
// Get list of image files
const images = [
  "/path/to/receipt1.jpg",
  "/path/to/receipt2.jpg",
  "/path/to/receipt3.jpg"
];
// Loop through each image
for (const imagePath of images) {
  $R.flow.currentImage = imagePath;
  // After Image To Text node executes:
  // $R.message.text will contain the extracted text
  // You can save it to a variable or process it further
}
Usage Notes
- Image Quality: Higher resolution images with clear text produce better results
- Image Preprocessing: For Image Path input, the node automatically converts the image to grayscale for better OCR accuracy
- White Background: For base64 images, the node overlays on a white background to improve text recognition
- Language Selection: Choose the correct language for best results. Using the wrong language will result in poor text extraction
- Multi-Language Support: Use the Advanced Language option to process documents with multiple languages
- File Formats: Supports common image formats including PNG, JPG, JPEG, BMP, TIFF, and more
- Temporary Files: When using Image Path, the node creates a temporary grayscale image that is automatically cleaned up after processing
- Whitespace Handling: Extracted text is automatically trimmed of leading and trailing whitespace
Best Practices
- Optimize Image Quality:
  - Use images with at least 300 DPI for printed documents
  - Ensure good contrast between text and background
  - Avoid blurry or low-resolution images
- Preprocess Images When Needed:
  - Use image editing tools to enhance contrast before OCR
  - Rotate skewed images to horizontal alignment
  - Crop unnecessary areas to focus on text regions
- Language Configuration:
  - Always set the correct language for the document being processed
  - For multilingual documents, list all languages in the Advanced Language field
  - Ensure required language data files are installed
- Error Handling:
  - Use "Continue On Error" for batch processing to avoid stopping on individual failures
  - Validate that extracted text is not empty before further processing
  - Log or handle cases where OCR confidence is low
- Performance Optimization:
  - Process images in batches when dealing with multiple files
  - Consider image size, since very large images may take longer to process
  - List only the languages that actually appear in the document, since each extra language adds processing time
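The error-handling advice above (keep going on individual failures, and check for empty text) can be sketched as a batch helper. It is illustrative only. `processBatch` is a hypothetical name, and `ocr` stands for any function that takes an image path and returns the extracted text.

```javascript
// Run `ocr` over every path, recording failures instead of stopping the batch.
function processBatch(paths, ocr) {
  const results = [];
  for (const imagePath of paths) {
    try {
      const text = String(ocr(imagePath)).trim();
      if (text === "") {
        // Empty output is treated as a failure so later steps can skip it.
        results.push({ path: imagePath, ok: false, reason: "empty text" });
      } else {
        results.push({ path: imagePath, ok: true, text });
      }
    } catch (err) {
      results.push({ path: imagePath, ok: false, reason: err.message });
    }
  }
  return results;
}
```

Collecting a result per image, instead of throwing on the first failure, makes it easy to log the failed paths and retry only those.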
Common Use Cases
- Invoice Processing: Extract invoice numbers, dates, amounts, and vendor information from scanned invoices
- Receipt Scanning: Digitize paper receipts for expense tracking and reporting
- Form Data Extraction: Pull data from filled forms, applications, and surveys
- Document Digitization: Convert scanned documents to searchable text (PDF pages must first be exported as images)
- ID Card Reading: Extract information from driver's licenses, passports, and ID cards
- Screenshot Text Capture: Get text from application screenshots and error messages
- Table Data Extraction: Extract data from tables in images (requires post-processing)
- Handwritten Text Recognition: Limited support for clear handwriting (results may vary)
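Most of these use cases need a post-processing step that pulls structured fields out of the raw text. As a minimal sketch for the invoice case, the helper below applies regular expressions to the OCR output. The field labels, the date format (YYYY-MM-DD), and the patterns are assumptions chosen for illustration, so real invoices will need patterns fitted to their layout.

```javascript
// Hypothetical post-processing: extract a few fields from OCR text of an invoice.
function extractInvoiceFields(text) {
  const number = text.match(/Invoice\s*(?:No\.?|Number|#)\s*[:\-]?\s*([A-Z0-9\-]+)/i);
  const date = text.match(/\b(\d{4}-\d{2}-\d{2})\b/);
  // \b keeps "Subtotal" from matching as "Total".
  const total = text.match(/\bTotal\s*[:\-]?\s*\$?\s*([\d,]+\.\d{2})/i);
  return {
    invoiceNumber: number ? number[1] : null,
    date: date ? date[1] : null,
    total: total ? Number(total[1].replace(/,/g, "")) : null,
  };
}
```

Fields that are not found come back as `null`, which lets the caller decide whether to retry with a preprocessed image or flag the document for manual review.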