Process Text
Processes text fields from an image with advanced OCR settings for text type, marking, and writing style.
Common Properties
- Name - The custom name of the node.
- Color - The custom color of the node.
- Delay Before (sec) - Waits in seconds before executing the node.
- Delay After (sec) - Waits in seconds after executing the node.
- Continue On Error - Automation will continue regardless of any error. The default value is false.
Inputs
- Image Path - Path to the image file containing text to be recognized.
Options
- Text Type - Type of text in the image (default: normal). Options:
- Normal - Regular printed text
- Typewriter - Typewriter or monospaced fonts
- Matrix - Dot matrix printer text
- Index - Index or subscript text
- OCR-A - OCR-A font
- OCR-B - OCR-B font
- E13B - MICR E13B font (banking)
- CMC7 - MICR CMC7 font (banking)
- Gothic - Gothic or blackletter fonts
- Handprinted - Hand-printed text
- Language - Language of the text (default: English). Supports over 200 languages.
- Marking Type - Type of text marking/formatting (default: simpleText). Options:
- Simple Text - Unmarked text
- Underlined Text - Underlined characters
- Text In Frame - Text within a border
- Grey Boxes - Text on grey background
- Char Box Series - Each character in a box
- Simple Comb - Comb-like structure
- Comb In Frame - Comb structure with border
- Partitioned Frame - Divided frame structure
- Writing Style - Regional writing style for recognition (default: default). Options include American, German, Russian, Polish, Thai, Japanese, Arabic, and many more regional styles.
- Placeholders Count - Number of placeholders in the text field (default: 1).
- Region - Optional region coordinates to limit recognition area (format: "left,top,right,bottom").
- Letter Set - Optional allowed character set to restrict recognition.
- Regular Expression - Optional regex pattern that the recognized text should match.
- Description - Optional task description for reference.
- PDF Password - Password for encrypted PDF files.
- One Text Line - Treat the entire field as a single line of text (default: false).
- One Word per Text Line - Expect one word per line (default: false).
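The Region option takes a "left,top,right,bottom" string, so it can help to check the value before it reaches the node. The following is a minimal sketch of such a check. The `Region` type and `parse_region` helper are hypothetical names introduced for illustration, and the rules it enforces (integer, non-negative, right greater than left, bottom greater than top) are assumptions rather than documented behavior of the node.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """Rectangle in image pixels (hypothetical helper type)."""
    left: int
    top: int
    right: int
    bottom: int


def parse_region(value: str) -> Region:
    """Parse a "left,top,right,bottom" string into a Region.

    The validation rules below are assumptions, not the node's documented checks.
    """
    parts = [p.strip() for p in value.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 comma-separated values, got {len(parts)}")
    try:
        left, top, right, bottom = (int(p) for p in parts)
    except ValueError:
        raise ValueError(f"region values must be integers: {value!r}") from None
    if min(left, top, right, bottom) < 0:
        raise ValueError("region coordinates must be non-negative")
    if right <= left or bottom <= top:
        raise ValueError("right must exceed left and bottom must exceed top")
    return Region(left, top, right, bottom)


if __name__ == "__main__":
    print(parse_region("100,200,400,250"))
```

A value that fails this check would most likely surface in the flow as an invalid region error, so validating early keeps the failure close to where the coordinates were produced.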
Outputs
- Task - ABBYY task object containing text recognition results.
How It Works
The Process Text node recognizes text fields with advanced settings for specialized text types and formats. When executed, the node:
- Reads the image file from the specified path
- Applies text type, marking type, and writing style settings
- Optionally restricts recognition to a specific region
- Applies character set or regex constraints if specified
- Uploads the image and settings to ABBYY Cloud
- Returns a task object with recognition results
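The settings step above can be pictured as merging user choices over the documented defaults and attaching the optional constraints only when they are set. The sketch below illustrates that idea only. The key names (`textType`, `markingType`, `letterSet`, and so on) and the `build_settings` helper are assumptions modeled on the option labels and default values listed under Options; they are not taken from the node's implementation, and no network call is made.

```python
from typing import Dict, Optional

# Defaults as listed in the Options section; the key names are assumed.
DEFAULTS: Dict[str, str] = {
    "textType": "normal",
    "language": "English",
    "markingType": "simpleText",
    "writingStyle": "default",
    "placeholdersCount": "1",
    "oneTextLine": "false",
    "oneWordPerTextLine": "false",
}


def build_settings(
    overrides: Optional[dict] = None,
    region: Optional[str] = None,
    letter_set: Optional[str] = None,
    regexp: Optional[str] = None,
) -> Dict[str, str]:
    """Merge overrides onto the defaults and add optional constraints (hypothetical)."""
    settings = dict(DEFAULTS)
    for key, value in (overrides or {}).items():
        if key not in DEFAULTS:
            raise KeyError(f"unknown option: {key}")
        # Booleans become "true"/"false"; everything else is stringified.
        settings[key] = str(value).lower() if isinstance(value, bool) else str(value)
    optional = {"region": region, "letterSet": letter_set, "regExp": regexp}
    # Only include constraints that were actually provided.
    settings.update({k: v for k, v in optional.items() if v})
    return settings


if __name__ == "__main__":
    print(build_settings({"textType": "handprinted", "oneTextLine": True},
                         region="100,200,400,250"))
```

Leaving unset constraints out of the result, rather than sending empty strings, mirrors how the options are described as optional; whether the node behaves this way internally is not stated in this document.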
Requirements
- Valid ABBYY Cloud credentials
- Valid image file at the specified path
- Correct text type and marking type selections for your use case
- Optional: region coordinates, letter set, or regex pattern
Error Handling
The node will return specific errors in the following cases:
- Robomotion.ABBYYCloud.ErrImagePath - Image path is invalid or file not found
- Robomotion.ABBYYCloud.ErrImageData - Cannot read image file
- Robomotion.ABBYYCloud.ErrOption - Invalid option parameters
- Robomotion.ABBYYCloud.ErrRegion - Invalid region format
- Robomotion.ABBYYCloud.ErrLetterSet - Invalid letter set
- Robomotion.ABBYYCloud.ErrRegExp - Invalid regular expression
- Robomotion.ABBYYCloud.ErrDescription - Invalid description
Usage Example
Scenario: Recognize a hand-printed form field with specific constraints
1. Process Text node:
- Image Path: "C:/forms/application_001.jpg"
- Text Type: Handprinted
- Language: English
- Marking Type: Char Box Series
- Letter Set: "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
- Region: "100,200,400,250"
- One Text Line: true
2. Wait Task node:
- Task: {{ $.task }}
- Timeout: 60 seconds
Common Use Cases
- Form Field Recognition - Extract text from specific form fields with constraints
- Handwritten Forms - Recognize hand-printed text in form boxes
- Comb Fields - Extract text from comb-style input fields (passports, IDs)
- Banking Documents - Recognize MICR fonts (E13B, CMC7) on checks
- Restricted Input - Recognize text with known character sets or patterns
- Multi-Style Documents - Handle different text styles in the same document
- Region-Specific OCR - Extract text from specific areas of an image
Tips and Best Practices
- Text Type Selection: Choose the correct text type for best accuracy
- Use "Handprinted" for hand-filled forms
- Use "Matrix" for old dot matrix printouts
- Use "E13B" or "CMC7" for bank checks
- Marking Type: Match the marking type to your form design
- Use "Char Box Series" for individual character boxes
- Use "Comb In Frame" for passport-style comb fields
- Letter Set: Restrict to known characters for better accuracy
- Numbers only: "0123456789"
- Uppercase only: "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
- Alphanumeric: Combine letters and numbers
- Regular Expression: Validate format during recognition
- Phone: "\d{3}-\d{3}-\d{4}"
- Date: "\d{2}/\d{2}/\d{4}"
- ID: "[A-Z]{2}\d{6}"
- Region Coordinates: Use image editing tools to determine exact coordinates
- Writing Style: Select regional style for locale-specific formatting
- One Text Line: Enable for single-line fields to prevent line breaks
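Letter sets and patterns can also be re-checked on the recognized text once a task finishes, which catches fields that came back outside the expected format. The following is a minimal sketch using Python's standard `re` module. `check_field` is a hypothetical helper, and the patterns are written in Python regex syntax with explicit repetition counts; the regex dialect accepted by the recognition service may differ, so treat these as local validation only.

```python
import re
from typing import List


def check_field(text: str, letter_set: str = "", pattern: str = "") -> List[str]:
    """Return a list of problems found in a recognized field (hypothetical helper)."""
    problems = []
    if letter_set:
        # Any character not present in the allowed set is reported.
        unexpected = sorted(set(text) - set(letter_set))
        if unexpected:
            problems.append(f"characters outside letter set: {''.join(unexpected)}")
    # fullmatch requires the whole field to fit the pattern, not just a prefix.
    if pattern and re.fullmatch(pattern, text) is None:
        problems.append(f"text does not match pattern {pattern}")
    return problems


if __name__ == "__main__":
    print(check_field("555-123-4567", "0123456789-", r"\d{3}-\d{3}-\d{4}"))
    print(check_field("AB12345", pattern=r"[A-Z]{2}\d{6}"))
```

An empty list means the field passed both checks; a non-empty list can be routed to manual review instead of being accepted silently.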