Image To Text

Extracts text from images using Google Vision API's optical character recognition (OCR) feature.

Common Properties

  • Name - The custom name of the node.
  • Color - The custom color of the node.
  • Delay Before (sec) - Time to wait, in seconds, before executing the node.
  • Delay After (sec) - Time to wait, in seconds, after executing the node.
  • Continue On Error - Automation will continue regardless of any error. The default value is false.

Note: If the Continue On Error property is true, no error is caught when the project is executed, even if a Catch node is used.

Inputs

  • Vision Client Id - The unique identifier of the Vision API connection, typically obtained from the Connect node.
  • Image Path - The file path to the image from which to extract text.

Options

  • Credentials - Google Cloud service account credentials. Optional; use them instead of the Connect node. If provided, the node creates its own client connection and does not require a Vision Client ID.

Output

  • Confidence - An object that maps each detected page number to its confidence score (0.0 to 1.0).
  • Text - The extracted text from the image, or "No text found" if no text is detected.

How It Works

The Image To Text node extracts text from images with the optical character recognition (OCR) feature of Google Vision API. When executed, the node:

  1. Retrieves the Vision API client using the provided client ID, or creates one from the Credentials option if it is set
  2. Validates that the image path is not empty
  3. Opens and reads the image file from the specified path
  4. Creates a Vision API image object from the file
  5. Calls the DetectDocumentText method to extract text from the image
  6. Processes the results and returns both the extracted text and confidence scores
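
Step 6 can be sketched as follows. This is a minimal illustration, not the node's actual implementation: it assumes the API response carries a fullTextAnnotation object with a text string and a pages array whose entries each have a confidence value. The helper name buildOutputs is hypothetical. Pages are keyed by 1-based page number to match the Confidence values shown in the examples.

```javascript
// Hypothetical helper: turn a Vision-style document text response into
// the node's two outputs. The response shape is an assumption.
function buildOutputs(response) {
  const annotation = response && response.fullTextAnnotation;
  const text = annotation && annotation.text ? annotation.text : "No text found";

  // Key each page's confidence by its 1-based page number.
  const confidence = {};
  const pages = (annotation && annotation.pages) || [];
  pages.forEach((page, index) => {
    confidence[index + 1] = page.confidence;
  });

  return { text, confidence };
}

// Sample response with one page of detected text.
const sample = {
  fullTextAnnotation: {
    text: "INVOICE\nTotal: $1,250.00",
    pages: [{ confidence: 0.98 }],
  },
};
console.log(buildOutputs(sample));

// A response without any annotation falls back to the "No text found" output.
console.log(buildOutputs({}));
```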

Requirements

  • A valid connection to Vision API established with the Connect node, or service account credentials supplied through the Credentials option
  • Valid Google Cloud credentials with appropriate permissions
  • An image file accessible from the specified path
  • Vision API enabled in your Google Cloud project

Error Handling

The node will return specific errors in the following cases:

  • Empty or invalid Vision Client ID
  • Empty image path
  • Invalid image file path
  • File read errors
  • Invalid image format
  • Network connectivity issues
  • Vision API service errors
  • Authentication failures

Usage Notes

  • The Vision Client ID must come from a successful Connect node execution, unless credentials are provided directly through the Credentials option
  • The image file must be accessible from the specified path
  • Supported image formats include JPEG, PNG, GIF, BMP, TIFF, and WebP
  • Works with printed and handwritten text in many languages
  • The node returns the complete text found in the image
  • Confidence scores indicate the reliability of the text detection for each page (0.0 to 1.0)
  • If no text is found, the output will be "No text found"
  • Text extraction quality depends on image quality, font, and text clarity
  • For PDFs stored in Google Cloud Storage, use the PDF to Text node instead
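
Before passing a path to the node, a flow could filter out files whose extension does not match the formats listed above. The sketch below is one way to do that in an Evaluate-style script. The helper name hasSupportedExtension and the extension-to-format mapping are assumptions, and the check looks only at the file name, not the file contents.

```javascript
// Extensions assumed to correspond to JPEG, PNG, GIF, BMP, TIFF, and WebP.
const SUPPORTED_EXTENSIONS = new Set([
  ".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tif", ".tiff", ".webp",
]);

// Hypothetical helper: true when the path ends in a supported extension.
function hasSupportedExtension(imagePath) {
  const dot = imagePath.lastIndexOf(".");
  if (dot === -1) return false; // no extension at all
  return SUPPORTED_EXTENSIONS.has(imagePath.slice(dot).toLowerCase());
}

console.log(hasSupportedExtension("/invoices/invoice_march_2024.jpg")); // true
console.log(hasSupportedExtension("/docs/report.pdf")); // false
```

A file that fails this check, such as a PDF, can be routed to a different node instead of producing an invalid image format error.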

Example Use Cases

Extract Text from Invoice

Input:
Vision Client Id: (from Connect node)
Image Path: /invoices/invoice_march_2024.jpg

Output:
Text: "INVOICE\nInvoice #: INV-2024-001\nDate: March 15, 2024\nTotal: $1,250.00..."
Confidence: {1: 0.98}

Process Business Cards in Bulk

1. File System > List Files
Directory: /business_cards
Pattern: *.jpg
Output: file_list

2. Loop through file_list
For each file:
- Image to Text
Image Path: {{file}}
Output: contact_text
- Save to database or spreadsheet

Digitize Handwritten Forms

Input:
Vision Client Id: (from Connect node)
Image Path: /forms/application_form_001.png

Output:
Text: "Name: John Smith\nAddress: 123 Main St\nPhone: 555-0123..."
Confidence: {1: 0.87}

Note: A confidence score of 0.87 indicates fairly high confidence in the OCR result.

Extract Receipt Data for Expense Tracking

1. Image to Text
Image Path: /receipts/receipt_{{date}}.jpg
Output: receipt_text, confidence

2. Programming > Evaluate
Code:
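const text = message.receipt_text;
// Match "Total" as a whole word (not "Subtotal") and allow thousands separators, e.g. "$1,250.00"
const totalMatch = text.match(/\bTotal:?\s*\$?([\d,]+(?:\.\d+)?)/i);
// Strip the commas before parsing so "1,250.00" becomes 1250
message.amount = totalMatch ? parseFloat(totalMatch[1].replace(/,/g, '')) : 0;
message.vendor = text.split('\n')[0].trim(); // First line is usually the vendor name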

3. Excel > Append Row
Data: {{date}}, {{vendor}}, {{amount}}

Screenshot OCR for Automation

1. Browser > Screenshot
Selector: .invoice-section
Output: screenshot_path

2. Image to Text
Image Path: {{screenshot_path}}
Output: extracted_data

3. Use extracted_data in subsequent automation steps

Tips

  • Image Quality: Use high-resolution images (at least 300 DPI) for best OCR accuracy
  • Preprocessing: For poor-quality images, consider preprocessing (contrast adjustment, noise reduction)
  • Orientation: Keep text right-side up. Vision API tolerates slight rotation, but images rotated by 90 or 180 degrees may give poorer results
  • Multi-language: Vision API detects the language automatically; no configuration is needed
  • Confidence Scores: Use confidence scores to determine if manual review is needed (e.g., < 0.8)
  • File Paths: Use absolute paths or ensure relative paths are correctly resolved
  • Large Images: Vision API handles large images, but consider resizing extremely large files for faster processing
  • Handwriting: Works with neat handwriting; messy handwriting may have lower accuracy
  • Direct Credentials: For single-use scenarios, provide credentials directly instead of using Connect node
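
The confidence tip can be applied in an Evaluate-style script along these lines. This is a sketch under assumptions: the helper name pagesNeedingReview is hypothetical, the Confidence output is taken to be an object keyed by page number as in the examples, and 0.8 is only the illustrative threshold from the tip.

```javascript
// Hypothetical helper: list the page numbers whose confidence score
// falls below the review threshold.
function pagesNeedingReview(confidence, threshold = 0.8) {
  return Object.entries(confidence)
    .filter(([, score]) => score < threshold)
    .map(([page]) => Number(page));
}

console.log(pagesNeedingReview({ 1: 0.98, 2: 0.72 })); // [ 2 ]
console.log(pagesNeedingReview({ 1: 0.87 }));          // []
```

An empty result means every page met the threshold; any listed page can be sent to a manual review step.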

Common Errors and Solutions

Error: "Image Path cannot be empty"

Solution: Ensure the Image Path input is populated with a valid file path.

Error: "No such file or directory"

Solution: Verify the file path is correct and the file exists. Use absolute paths for reliability.

Error: "Invalid Client"

Solution: Ensure the Connect node ran successfully and the Vision Client ID is passed to this node, or provide credentials directly through the Credentials option.

Output: "No text found"

Solution:

  • Check if the image actually contains text
  • Verify image quality and clarity
  • Ensure text is visible and not too small
  • Try with a different image format
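
Because the node returns the literal string "No text found" rather than an empty value, downstream steps may want to check for it before parsing. The sketch below shows one possible guard. The helper name hasExtractedText is hypothetical, and an image that genuinely contains only the words "No text found" would be misclassified.

```javascript
// The value the node outputs when nothing is detected.
const NO_TEXT_SENTINEL = "No text found";

// Hypothetical helper: true only when the Text output holds real content.
function hasExtractedText(text) {
  if (typeof text !== "string") return false;
  const trimmed = text.trim();
  return trimmed.length > 0 && trimmed !== NO_TEXT_SENTINEL;
}

console.log(hasExtractedText("Name: John Smith")); // true
console.log(hasExtractedText("No text found"));    // false
```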

Low Confidence Scores

Solution:

  • Improve image quality (resolution, lighting, contrast)
  • Ensure text is clearly visible
  • Remove image noise or artifacts
  • Use original images instead of photocopies when possible