Process Text

Processes text fields from an image with advanced OCR settings for text type, marking, and writing style.

Common Properties

Name - The custom name of the node.
Color - The custom color of the node.
Delay Before (sec) - Waits in seconds before executing the node.
Delay After (sec) - Waits in seconds after executing the node.
Continue On Error - Automation will continue regardless of any error. The default value is false.

Inputs

Image Path - Path to the image file containing text to be recognized.

Options

Text Type - Type of text in the image (default: normal). Options:
- Normal - Regular printed text
- Typewriter - Typewriter or monospaced fonts
- Matrix - Dot matrix printer text
- Index - Index or subscript text
- OCR-A - OCR-A font
- OCR-B - OCR-B font
- E13B - MICR E13B font (banking)
- CMC7 - MICR CMC7 font (banking)
- Gothic - Gothic or blackletter fonts
- Handprinted - Hand-printed text
Language - Language of the text (default: English). Supports over 200 languages.
Marking Type - Type of text marking/formatting (default: simpleText). Options:
- Simple Text - Unmarked text
- Underlined Text - Underlined characters
- Text In Frame - Text within a border
- Grey Boxes - Text on grey background
- Char Box Series - Each character in a box
- Simple Comb - Comb-like structure
- Comb In Frame - Comb structure with border
- Partitioned Frame - Divided frame structure
Writing Style - Regional writing style for recognition (default: default). Options include American, German, Russian, Polish, Thai, Japanese, Arabic, and many more regional styles.
Placeholders Count - Number of placeholders in the text field (default: 1).
Region - Optional region coordinates to limit recognition area (format: "left,top,right,bottom").
Letter Set - Optional allowed character set to restrict recognition.
Regular Expression - Optional regex pattern that the recognized text should match.
Description - Optional task description for reference.
PDF Password - Password for encrypted PDF files.
One Text Line - Treat the entire field as a single line of text (default: false).
One Word per Text Line - Expect one word per line (default: false).
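
The Region option expects edge coordinates, while many image editors report a selection as x, y, width, and height. A minimal Python sketch of the conversion, assuming pixel coordinates with the origin at the top-left corner of the image; `region_from_box` is a hypothetical helper, not part of the node:

```python
def region_from_box(x: int, y: int, width: int, height: int) -> str:
    """Convert an (x, y, width, height) selection to "left,top,right,bottom"."""
    if min(x, y) < 0 or width <= 0 or height <= 0:
        raise ValueError("box needs a non-negative origin and a positive size")
    left, top = x, y
    right, bottom = x + width, y + height
    return f"{left},{top},{right},{bottom}"


# A 300x50 selection whose top-left corner sits at (100, 200)
print(region_from_box(100, 200, 300, 50))  # prints 100,200,400,250
```

The resulting string can be pasted into the Region option as-is.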

Outputs

Task - ABBYY task object containing text recognition results.

How It Works

The Process Text node recognizes text fields with advanced settings for specialized text types and formats. When executed, the node:

Reads the image file from the specified path
Applies text type, marking type, and writing style settings
Optionally restricts recognition to a specific region
Applies character set or regex constraints if specified
Uploads the image and settings to ABBYY Cloud
Returns a task object with recognition results
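
The settings applied in these steps can be pictured as a set of request parameters. The sketch below is an illustration, not the node's implementation: it assumes the options map onto query parameters named after the ABBYY Cloud OCR SDK `processTextField` method (`textType`, `markingType`, `region`, and so on), and `build_text_field_query` is a hypothetical helper. It only builds a query string and makes no network call.

```python
from urllib.parse import urlencode


def build_text_field_query(text_type="normal", language="English",
                           marking_type="simpleText", region=None,
                           letter_set=None, reg_exp=None,
                           one_text_line=False):
    """Assemble a query string from the node's options (assumed names)."""
    # Settings that always have a value (the defaults listed under Options)
    params = {
        "textType": text_type,
        "language": language,
        "markingType": marking_type,
        "oneTextLine": str(one_text_line).lower(),
    }
    # Optional constraints are sent only when the user supplied them
    optional = {"region": region, "letterSet": letter_set, "regExp": reg_exp}
    params.update({k: v for k, v in optional.items() if v is not None})
    return urlencode(params)


query = build_text_field_query(
    text_type="handprinted",
    marking_type="charBoxSeries",
    region="100,200,400,250",
    one_text_line=True,
)
print(query)
```

Leaving unset options out of the request keeps the service defaults in effect instead of overriding them with empty values.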

Requirements

Valid ABBYY Cloud credentials
Valid image file at the specified path
Correct text type and marking type selections for your use case
Optional: region coordinates, letter set, or regex pattern

Error Handling

The node will return specific errors in the following cases:

Robomotion.ABBYYCloud.ErrImagePath - Image path is invalid or file not found
Robomotion.ABBYYCloud.ErrImageData - Cannot read image file
Robomotion.ABBYYCloud.ErrOption - Invalid option parameters
Robomotion.ABBYYCloud.ErrRegion - Invalid region format
Robomotion.ABBYYCloud.ErrLetterSet - Invalid letter set
Robomotion.ABBYYCloud.ErrRegExp - Invalid regular expression
Robomotion.ABBYYCloud.ErrDescription - Invalid description
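
Some of these failures can be caught before the node runs. The following is a hypothetical pre-check in Python, not the node's own validation logic: the rules (four non-negative integer coordinates, a pattern that compiles) are assumptions, and only the error identifiers come from the list above. Python's `re` syntax may also differ from the syntax the OCR service accepts.

```python
import re


def precheck_options(region=None, reg_exp=None):
    """Return the error identifier a bad option would likely trigger, or None."""
    if region is not None:
        parts = [p.strip() for p in region.split(",")]
        # Assumed rule: exactly four non-negative integers
        if len(parts) != 4 or not all(re.fullmatch(r"[0-9]+", p) for p in parts):
            return "Robomotion.ABBYYCloud.ErrRegion"
    if reg_exp is not None:
        try:
            re.compile(reg_exp)
        except re.error:
            return "Robomotion.ABBYYCloud.ErrRegExp"
    return None


print(precheck_options(region="100,200,400"))  # one coordinate is missing
print(precheck_options(reg_exp="[A-Z"))        # unterminated character set
```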

Usage Example

Scenario: Recognize a hand-printed form field with specific constraints

1. Process Text node:
   - Image Path: "C:/forms/application_001.jpg"
   - Text Type: Handprinted
   - Language: English
   - Marking Type: Char Box Series
   - Letter Set: "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
   - Region: "100,200,400,250"
   - One Text Line: true

2. Wait Task node:
   - Task: {{ $.task }}
   - Timeout: 60 seconds

Common Use Cases

Form Field Recognition - Extract text from specific form fields with constraints
Handwritten Forms - Recognize hand-printed text in form boxes
Comb Fields - Extract text from comb-style input fields (passports, IDs)
Banking Documents - Recognize MICR fonts (E13B, CMC7) on checks
Restricted Input - Recognize text with known character sets or patterns
Multi-Style Documents - Handle different text styles in the same document
Region-Specific OCR - Extract text from specific areas of an image

Tips and Best Practices

Text Type Selection: Choose the correct text type for best accuracy
- Use "Handprinted" for hand-filled forms
- Use "Matrix" for old dot matrix printouts
- Use "E13B" or "CMC7" for bank checks
Marking Type: Match the marking type to your form design
- Use "Char Box Series" for individual character boxes
- Use "Comb In Frame" for passport-style comb fields
Letter Set: Restrict to known characters for better accuracy
- Numbers only: "0123456789"
- Uppercase only: "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
- Alphanumeric: Combine letters and numbers
Regular Expression: Validate format during recognition
- Phone: "\d{3}-\d{3}-\d{4}"
- Date: "\d{2}/\d{2}/\d{4}"
- ID: "[A-Z]{2}\d{6}"
Region Coordinates: Use image editing tools to determine exact coordinates
Writing Style: Select regional style for locale-specific formatting
One Text Line: Enable for single-line fields to prevent line breaks
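
The letter set and pattern tips can also be applied as a post-check on recognized text. A minimal Python sketch, assuming Python's `re` syntax (which may differ from the syntax the OCR service accepts); the helper names and sample strings are hypothetical:

```python
import re
import string

# Alphanumeric letter set: uppercase letters combined with digits
UPPER_ALNUM = set(string.ascii_uppercase + string.digits)

# Format patterns with explicit repetition counts
PATTERNS = {
    "phone": r"\d{3}-\d{3}-\d{4}",
    "date": r"\d{2}/\d{2}/\d{4}",
    "id": r"[A-Z]{2}\d{6}",
}


def in_letter_set(text, letter_set=UPPER_ALNUM):
    """True when every character of text belongs to the allowed set."""
    return all(ch in letter_set for ch in text)


def matches(kind, text):
    """True when the whole text matches the named pattern."""
    return re.fullmatch(PATTERNS[kind], text) is not None


print(matches("phone", "555-123-4567"))  # True
print(matches("id", "AB12345"))          # False: only five digits
print(in_letter_set("AB123456"))         # True
```

Using `re.fullmatch` rather than `re.search` rejects values that merely contain a valid substring.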
