Extract Text

Extracts text content from documents using Google Document AI's OCR and text recognition capabilities, converting PDFs, images, and scanned documents into searchable, machine-readable text.

Common Properties

Name - The custom name of the node.
Color - The custom color of the node.
Delay Before (sec) - Waits in seconds before executing the node.
Delay After (sec) - Waits in seconds after executing the node.
Continue On Error - Automation will continue regardless of any error. The default value is false.

Info: If the ContinueOnError property is true, no error is caught when the project is executed, even if a Catch node is used.

Inputs

File Path - The local file path of the document to process. Supports PDF, PNG, JPG, JPEG, TIFF, GIF, BMP, and WEBP formats.
MIME Type - The MIME type of the document (e.g., application/pdf, image/png, image/jpeg). If left empty, the MIME type will be automatically detected from the file content.

Output

Text - The complete extracted text from all pages concatenated together as a single string. Preserves the reading order of the original document.
Pages - An array of page objects, where each object contains:
- page (number): The page number (starting from 1)
- text (string): The extracted text content for that specific page

Options

Credentials - Service Account credentials for Document AI API authentication. Select from Robomotion vault or provide the JSON key file content.
Project Id - The Google Cloud project ID where Document AI is enabled (e.g., "my-project-123").
Location - The processor location/region. Default is "us". Available options: "us", "eu", "asia". Choose the region where your processor is deployed.
Processor Id - The Document AI processor ID to use for text extraction (e.g., "a1b2c3d4e5f6g7h8"). Found in Google Cloud Console under Document AI processors.
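
To show how these three options fit together, the sketch below assembles the processor resource name and regional host following the naming pattern in Google's public Document AI documentation. The function name `processorTarget` is hypothetical, and whether the node builds these strings the same way internally is an assumption.

```javascript
// Build the processor resource name and regional API host from the node options.
// Assumption: the node follows the public Document AI naming pattern.
function processorTarget(projectId, location, processorId) {
  if (!projectId) throw new Error("Project ID cannot be empty");
  if (!processorId) throw new Error("Processor ID cannot be empty");
  const region = location || "us"; // "us" is the node's documented default
  return {
    host: `${region}-documentai.googleapis.com`,
    name: `projects/${projectId}/locations/${region}/processors/${processorId}`,
  };
}

console.log(processorTarget("my-project-123", "eu", "a1b2c3d4e5f6g7h8"));
```

A processor created in one location cannot be reached through another location's host, which is why the Location option has to match where the processor was deployed.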

How It Works

The Extract Text node integrates with Google Document AI to extract text from documents. When executed, the node:

1. Validates the provided file path and checks file accessibility
2. Detects MIME type automatically if not specified
3. Authenticates with Google Document AI using the provided Service Account credentials
4. Reads the document file content from the local file system
5. Sends the document to the specified Document AI processor
6. Processes the document using OCR and machine learning models
7. Extracts text while preserving the reading order and layout structure
8. Organizes results by page and returns both full text and page-level text
9. Sets output variables for use in subsequent flow nodes
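
For readers curious what "sends the document" can look like on the wire, here is a minimal sketch of a request body in the shape used by the public Document AI REST `process` method (a `rawDocument` with base64 `content` and a `mimeType`). This is not the node's actual implementation; `buildProcessRequest` is a hypothetical helper and the sample bytes are made up.

```javascript
// Wrap raw file bytes in the request shape used by the Document AI REST API.
// Assumption: shown for illustration only; the node handles this internally.
function buildProcessRequest(fileBytes, mimeType) {
  return {
    rawDocument: {
      content: fileBytes.toString("base64"), // binary content must be base64 in JSON
      mimeType: mimeType,
    },
  };
}

// Placeholder bytes standing in for a real file read from disk.
const sampleBytes = Buffer.from("%PDF-1.4 sample");
const request = buildProcessRequest(sampleBytes, "application/pdf");
console.log(Object.keys(request.rawDocument));
```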

Requirements

Google Cloud Setup:
- Active Google Cloud project with billing enabled
- Document AI API enabled in the project
- OCR processor created and deployed in Document AI
- Service Account with the Document AI API User role

Robomotion Setup:
- Service Account JSON key stored in Robomotion vault
- Document file accessible on the local file system
- File size under the 20MB limit

Practical Examples

Example 1: Extract Text from Scanned Invoice

// Extract text from a scanned invoice PDF
// Outputs: Full text in $text, pages array in $pages

// Access full text
const fullText = $text;
console.log("Extracted Text:", fullText);

// Search for invoice number in text
const invoiceMatch = fullText.match(/Invoice #:\s*(\w+)/);
if (invoiceMatch) {
  const invoiceNumber = invoiceMatch[1];
  console.log("Invoice Number:", invoiceNumber);
}

// Process each page separately
$pages.forEach(page => {
  console.log(`Page ${page.page}:`, page.text);
});

Example 2: Convert Multi-Page PDF to Text Files

// Extract text and save each page to separate file
const fs = require('fs');

$pages.forEach(page => {
  const fileName = `page_${page.page}.txt`;
  fs.writeFileSync(fileName, page.text);
  console.log(`Saved ${fileName}`);
});

// Also save the full document
fs.writeFileSync('full_document.txt', $text);

Example 3: Search and Validate Document Content

// Extract and validate required fields from a form
const requiredFields = ['Name:', 'Date:', 'Signature:'];
const missingFields = [];

requiredFields.forEach(field => {
  if (!$text.includes(field)) {
    missingFields.push(field);
  }
});

if (missingFields.length > 0) {
  throw new Error(`Missing fields: ${missingFields.join(', ')}`);
}

console.log("All required fields present");

Example 4: Extract Text from Image and Process

// Extract text from image document (e.g., photographed receipt)
// File Path: /downloads/receipt.jpg
// MIME Type: (auto-detected)

// Clean and process extracted text
const cleanText = $text
  .replace(/\n+/g, ' ')  // Replace newlines with spaces
  .replace(/\s+/g, ' ')   // Normalize whitespace
  .trim();

// Extract total amount using regex
const totalMatch = cleanText.match(/Total:?\s*\$?([\d,]+\.?\d*)/i);
if (totalMatch) {
  // Strip every thousands separator, not just the first one
  const total = parseFloat(totalMatch[1].replace(/,/g, ''));
  console.log("Total Amount:", total);
}

Tips for Effective Use

Document Preparation

Image Quality: Use high-resolution scans (300 DPI or higher) for best OCR accuracy
File Format: PDF format generally provides better results than images
Orientation: Ensure documents are properly oriented (not rotated or upside down)
Lighting: For photographed documents, use even lighting without shadows or glare

MIME Type Handling

Auto-detection works well for most cases
Explicitly specify MIME type for better performance when processing many files
Common MIME types:
- PDF: application/pdf
- PNG: image/png
- JPEG: image/jpeg
- TIFF: image/tiff
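
When many files are processed in a loop, the MIME Type input can be filled from the file extension. The sketch below is one way to do that; `mimeTypeForPath` and the lookup table are hypothetical helpers, and mapping by extension is a simplification compared with the node's content-based auto-detection.

```javascript
const path = require("path");

// Extension-to-MIME lookup for the formats the node lists as supported.
const MIME_BY_EXTENSION = {
  ".pdf": "application/pdf",
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".tif": "image/tiff",
  ".tiff": "image/tiff",
  ".gif": "image/gif",
  ".bmp": "image/bmp",
  ".webp": "image/webp",
};

// Return the MIME type for a path, or "" so the node falls back to auto-detection.
function mimeTypeForPath(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  return MIME_BY_EXTENSION[ext] || "";
}

console.log(mimeTypeForPath("/downloads/receipt.JPG")); // image/jpeg
console.log(mimeTypeForPath("/downloads/scan.tiff"));   // image/tiff
```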

Processing Large Documents

Documents with more than 15 pages may require splitting
Consider processing page ranges separately for very large documents
Monitor processing time (default timeout is 240 seconds)
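
A small sketch of how page ranges could be planned for a large document, assuming batches of at most 15 pages as suggested above. `pageRanges` is a hypothetical helper; it only computes the ranges, and splitting the PDF itself requires a separate tool that is not shown.

```javascript
// Split a page count into consecutive 1-based ranges of at most maxPerBatch pages.
function pageRanges(totalPages, maxPerBatch = 15) {
  const ranges = [];
  for (let start = 1; start <= totalPages; start += maxPerBatch) {
    const end = Math.min(start + maxPerBatch - 1, totalPages);
    ranges.push({ start, end });
  }
  return ranges;
}

console.log(pageRanges(40));
// [ { start: 1, end: 15 }, { start: 16, end: 30 }, { start: 31, end: 40 } ]
```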

Text Cleanup

Extracted text may contain extra whitespace or line breaks
Use string manipulation to normalize formatting
Consider using regular expressions for pattern matching
Handle special characters and encoding properly
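
The cleanup steps above might be combined as in the following sketch. `normalizeText` is a hypothetical helper; unlike a blanket whitespace collapse, it keeps single and double line breaks so paragraph structure survives.

```javascript
// Normalize OCR output while preserving line and paragraph breaks.
function normalizeText(raw) {
  return raw
    .normalize("NFC")            // consistent Unicode composition
    .replace(/\r\n?/g, "\n")     // unify line endings
    .replace(/[ \t]+/g, " ")     // collapse runs of spaces and tabs
    .replace(/ ?\n ?/g, "\n")    // drop spaces around line breaks
    .replace(/\n{3,}/g, "\n\n")  // keep at most one blank line
    .trim();
}

const sample = "  Name:   Jane \r\n\r\n\r\n\r\nDate:\t2024-01-01  ";
console.log(JSON.stringify(normalizeText(sample)));
// "Name: Jane\n\nDate: 2024-01-01"
```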

Error Prevention

Verify File Paths: Always check that the file exists before processing
Check File Size: Keep documents under the 20MB limit
Validate Credentials: Test credentials with a simple document first
Region Matching: Ensure the Location option matches the region where your processor was created
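
The file checks above can run in a step before the node executes. The sketch below is one possible pre-flight check; `checkDocument` is a hypothetical helper, and treating 20MB as 20 * 1024 * 1024 bytes is an assumption, since the exact byte limit is not stated here.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Assumption: "20MB" interpreted as 20 * 1024 * 1024 bytes.
const MAX_BYTES = 20 * 1024 * 1024;

// Return { ok, reason } describing whether the file looks safe to submit.
function checkDocument(filePath) {
  if (!filePath) return { ok: false, reason: "File path cannot be empty" };
  if (!path.isAbsolute(filePath)) return { ok: false, reason: "Use an absolute path" };
  let stats;
  try {
    stats = fs.statSync(filePath);
  } catch (err) {
    return { ok: false, reason: `Cannot access file: ${err.code}` };
  }
  if (!stats.isFile()) return { ok: false, reason: "Path is not a regular file" };
  if (stats.size >= MAX_BYTES) return { ok: false, reason: "File is not under the 20MB limit" };
  return { ok: true, reason: "" };
}

// Demo with a throwaway file in the system temp directory.
const demoPath = path.join(os.tmpdir(), "extract-text-demo.txt");
fs.writeFileSync(demoPath, "hello");
console.log(checkDocument(demoPath));
fs.unlinkSync(demoPath);
console.log(checkDocument(demoPath)); // no longer accessible after deletion
```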

Common Errors and Solutions

Error: "File path cannot be empty"

Cause: No file path provided to the node. Solution: Ensure the File Path input is populated with a valid path.

Error: "Failed to read document file"

Cause: File not found, permission denied, or path is incorrect. Solution:

Verify the file exists at the specified path
Check file permissions allow reading
Use absolute paths instead of relative paths
Ensure the file hasn't been moved or deleted

Error: "Project ID cannot be empty"

Cause: Project ID option is not configured. Solution: Add your Google Cloud project ID in the node options.

Error: "Processor ID cannot be empty"

Cause: Processor ID option is not configured. Solution:

Create a processor in Google Cloud Console
Copy the processor ID from the processor details page
Add it to the node options

Error: "Invalid credentials format: missing content field"

Cause: Credentials are not properly formatted or stored. Solution:

Re-download the Service Account JSON key from Google Cloud Console
Save the complete JSON content in Robomotion vault
Select the correct credential from the dropdown

Error: "Failed to process document: permission denied"

Cause: Service Account lacks necessary permissions. Solution:

Ensure the Service Account has the "Document AI API User" role
Verify Document AI API is enabled in the project
Check that the processor ID belongs to the specified project

Error: "Request timeout"

Cause: Document processing took longer than the timeout period. Solution:

Reduce document size or page count
Check network connectivity
Try processing during off-peak hours
Contact support if timeout persists with normal documents
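
When errors are handled in a flow, the messages listed above can be grouped so each kind gets a suitable response. The sketch below is one way to do that by matching the message text; `classifyError`, the category names, and the idea of retrying only timeouts are assumptions for illustration, not behavior of the node.

```javascript
// Map known error messages to a category and a retry hint.
// Assumption: only timeouts are worth retrying automatically.
const ERROR_HINTS = [
  { pattern: /cannot be empty/i, category: "configuration", retry: false },
  { pattern: /failed to read document file/i, category: "file", retry: false },
  { pattern: /invalid credentials format/i, category: "credentials", retry: false },
  { pattern: /permission denied/i, category: "permissions", retry: false },
  { pattern: /timeout/i, category: "timeout", retry: true },
];

// Return the first matching category, or "unknown" if nothing matches.
function classifyError(message) {
  const hit = ERROR_HINTS.find((hint) => hint.pattern.test(message));
  return hit
    ? { category: hit.category, retry: hit.retry }
    : { category: "unknown", retry: false };
}

console.log(classifyError("Request timeout"));
// { category: 'timeout', retry: true }
console.log(classifyError("Failed to process document: permission denied"));
// { category: 'permissions', retry: false }
```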

Use Cases

Document Digitization

Convert paper archives, historical documents, or legacy files into searchable digital text for indexing and retrieval.

Content Extraction

Extract text from contracts, agreements, or legal documents for analysis, comparison, or storage in databases.

Receipt Processing

Convert receipt images to text for expense tracking, accounting systems, or financial analysis.

Form Processing

Extract text from filled forms for data entry automation, validation, or integration with business systems.

Academic Research

Process research papers, books, or manuscripts for text analysis, citation extraction, or digital preservation.

Accessibility

Convert scanned documents or images to text for screen readers and accessibility tools.

Performance Considerations

Processing Time: Typically 2-10 seconds per page depending on complexity
Concurrent Requests: Document AI has rate limits; implement queuing for batch processing
File Size: Larger files take longer; consider splitting very large PDFs
Network Latency: Choose processor location near your automation deployment
Cost: Charged per page processed; monitor usage in Google Cloud Console
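
As a rough planning aid, the per-page timing above can be turned into an estimate and compared with the 240-second default timeout mentioned earlier. The sketch assumes the 2 to 10 seconds per page range holds linearly, which is a simplification, and `estimateProcessing` is a hypothetical helper.

```javascript
// Default timeout quoted in this document, in seconds.
const TIMEOUT_SECONDS = 240;

// Estimate a processing-time range from the 2-10 seconds per page guideline.
function estimateProcessing(pageCount) {
  const minSeconds = pageCount * 2;
  const maxSeconds = pageCount * 10;
  return {
    minSeconds,
    maxSeconds,
    mayExceedTimeout: maxSeconds > TIMEOUT_SECONDS,
  };
}

console.log(estimateProcessing(15));
// { minSeconds: 30, maxSeconds: 150, mayExceedTimeout: false }
console.log(estimateProcessing(30));
// { minSeconds: 60, maxSeconds: 300, mayExceedTimeout: true }
```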
