ABBYY FineReader SDK

The ABBYY FineReader SDK package provides desktop-based OCR and document processing using the ABBYY FineReader Engine. Use it to run OCR on documents, extract MRZ data, enhance camera images, and train and apply document classification models, all with enterprise-grade accuracy.

Prerequisites

Before using ABBYY FineReader SDK nodes, you need to:

Install ABBYY FineReader Engine on your Windows machine
Obtain a valid ABBYY FineReader Engine license
Configure the license in the config.yaml file
Ensure the FineReader Engine DLLs are accessible

Note: The ABBYY FineReader SDK package is currently available only on Windows.

System Requirements

Platform: Windows (64-bit)
ABBYY FineReader Engine: Version 12 or higher
.NET Framework: 4.7.2 or higher
License: Valid ABBYY FineReader Engine license

Available Nodes

Document Processing

Process Document - Perform OCR on documents with advanced options and export to various formats
Process MRZ - Extract Machine Readable Zone data from passports and ID cards
Split Document - Split multi-page documents into individual page image files

Image Enhancement

Camera OCR - Process camera photos with image enhancement and preprocessing

Document Classification

Classify Document - Classify documents using trained ABBYY classification models
Train Model - Train document classification models using labeled training data

Key Features

Advanced OCR

Multi-language support (200+ languages)
Multiple text types (normal, typewriter, handprinted, etc.)
High accuracy recognition with ABBYY's enterprise engine

Image Processing

Automatic orientation correction
Skew correction
Noise removal with multiple models
Geometric distortion correction
Motion blur removal

Export Formats

Documents: DOCX, XLSX, PPTX, RTF
PDF: Searchable PDF, PDF with text and images, PDF/A
Text: TXT (structured and unstructured)
Structured: XML

Document Classification

Train custom classification models
Classify documents by type
Support for image and text-based classification
Cross-validation for accuracy assessment

Common Workflow Patterns

Simple Document OCR

1. Process Document
   - Input: Scanned document image
   - Output: Searchable PDF or DOCX

Camera Image Enhancement

1. Camera OCR
   - Input: Photo taken with smartphone
   - Enhancement: Deskew, remove blur, correct orientation
   - Output: Clean image + OCR statistics

Passport/ID Processing

1. Process MRZ
   - Input: Passport or ID card image
   - Output: Structured MRZ data (XML/JSON)
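
Once the MRZ fields have been extracted, a flow can sanity-check them with the check-digit rule from ICAO Doc 9303: weights 7, 3, 1 repeat across the field, digits count as themselves, letters A to Z count as 10 to 35, and the filler character `<` counts as 0. The Python sketch below is independent of the FineReader Engine; the function name is illustrative and not part of the package.

```python
def mrz_check_digit(field: str) -> int:
    """Compute the ICAO Doc 9303 check digit for a single MRZ field."""
    weights = (7, 3, 1)
    total = 0
    for i, ch in enumerate(field):
        if "0" <= ch <= "9":
            value = int(ch)
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10  # A=10 ... Z=35
        elif ch == "<":
            value = 0  # filler character
        else:
            raise ValueError(f"invalid MRZ character: {ch!r}")
        total += value * weights[i % 3]
    return total % 10
```

For the ICAO specimen document number `L898902C3`, the function returns 6, which matches the check digit printed in the specimen MRZ. A mismatch on real data usually points to a misread character rather than an invalid document.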

Document Classification

1. Train Model
   - Input: Labeled training documents
   - Output: Classification model file

2. Classify Document
   - Input: Document + trained model
   - Output: Document category/label

Multi-Page Processing

1. Split Document
   - Input: Multi-page PDF or TIFF
   - Output: Individual page images

2. Process each page as needed
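
The split-then-process pattern above can be sketched in Python as follows. `process_page` is a hypothetical stand-in for whatever per-page step the flow runs; it is not a package API. The sketch also assumes zero-padded page file names so that sorting restores page order, and whether pages can really run concurrently depends on the engine and license in use.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Iterable, List


def process_pages(pages: Iterable[Path],
                  process_page: Callable[[Path], str],
                  max_workers: int = 4) -> List[str]:
    """Run process_page over every split page and return results in page order."""
    ordered = sorted(pages)  # page_001, page_002, ... (zero-padded names assumed)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order even when pages finish out of order
        return list(pool.map(process_page, ordered))
```

Setting `max_workers=1` turns this into a plain sequential loop, which is a reasonable starting point before measuring whether parallelism helps.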

Comparison: Cloud vs FineReader SDK

Feature	ABBYY Cloud	FineReader SDK
Deployment	Cloud-based API	On-premise desktop
Platform	Cross-platform	Windows only
Internet	Required	Not required
Cost Model	Pay per use (credits)	License-based
Processing	Server-side	Local machine
Privacy	Data sent to cloud	Data stays local
Performance	Depends on network	Local processing speed
Setup	Minimal (credentials)	Install engine + license
Use Case	Scalable cloud workflows	Desktop automation, offline

Configuration

config.yaml

# ABBYY FineReader Engine configuration
# License and engine settings are configured here

Engine Profiles

The package supports multiple processing profiles:

DocumentConversion_Accuracy - Best quality, slower processing
DocumentConversion_Speed - Faster processing, good quality
DocumentArchiving - Optimized for archival
TextExtraction - Extract text only
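
When the profile is chosen programmatically, for example from a flow variable, a small lookup keeps the profile names in one place. The profile names below come from the list above; the goal keywords and the `pick_profile` helper are assumptions made for this sketch.

```python
# Goal keywords are illustrative; profile names are the ones listed above.
PROFILES = {
    "accuracy": "DocumentConversion_Accuracy",
    "speed": "DocumentConversion_Speed",
    "archive": "DocumentArchiving",
    "text": "TextExtraction",
}


def pick_profile(goal: str) -> str:
    """Map a processing goal to an engine profile name."""
    key = goal.strip().lower()
    try:
        return PROFILES[key]
    except KeyError:
        expected = ", ".join(sorted(PROFILES))
        raise ValueError(f"unknown goal {goal!r}; expected one of: {expected}") from None
```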

Best Practices

License Management:
- Confirm the license is valid before processing
- Monitor license expiration
- Handle license errors gracefully

Image Quality:
- Use high-resolution images (300+ DPI)
- Ensure good lighting and contrast
- Preprocess images if needed

Language Selection:
- Always specify the language that matches the document content
- Specify several languages at once for mixed-language documents

Performance:
- Local processing is fast for single documents
- Large documents may require significant time
- Consider parallel processing for batches

Error Handling:
- Enable Continue On Error for batch processing
- Validate input files before processing
- Check that output files were generated
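
Validating input files before processing can be as simple as the pre-flight check sketched below in Python. The extension set is an assumption chosen for illustration, not the package's actual list of supported formats, and `validate_input` is a hypothetical helper.

```python
from pathlib import Path
from typing import List

# Assumed for illustration only; consult the node documentation for real formats.
ASSUMED_EXTENSIONS = {".pdf", ".tif", ".tiff", ".png", ".jpg", ".jpeg"}


def validate_input(path: Path) -> List[str]:
    """Return a list of problems; an empty list means the file looks usable."""
    if not path.is_file():
        return ["file not found"]
    problems: List[str] = []
    if path.stat().st_size == 0:
        problems.append("file is empty")
    if path.suffix.lower() not in ASSUMED_EXTENSIONS:
        problems.append(f"unexpected extension {path.suffix!r}")
    return problems
```

Returning a list of problems instead of raising on the first one lets a batch report every issue with a file in a single pass.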

Error Handling

Common errors:

ErrNotFound - Input file not found or MRZ not detected
ErrInvalidArg - Invalid option or parameter
Engine initialization errors - License or installation issues
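
One way to act on these categories in a batch is to record per-item errors and keep going, while letting engine-level failures stop the run, since they would affect every item. The Python sketch below illustrates that policy; `NodeError` and `run_batch` are hypothetical names, and only the error codes themselves come from the list above.

```python
from typing import Callable, Dict, Iterable, List, Tuple

RECOVERABLE = ("ErrNotFound", "ErrInvalidArg")  # per-item errors from the list above


class NodeError(Exception):
    """Hypothetical wrapper that carries an error code such as ErrNotFound."""

    def __init__(self, code: str, message: str = "") -> None:
        super().__init__(f"{code}: {message}")
        self.code = code


def run_batch(items: Iterable[str],
              step: Callable[[str], str]) -> Tuple[List[str], Dict[str, List[str]]]:
    """Process every item, grouping recoverable failures by error code."""
    results: List[str] = []
    failures: Dict[str, List[str]] = {}
    for item in items:
        try:
            results.append(step(item))
        except NodeError as err:
            if err.code not in RECOVERABLE:
                raise  # e.g. license or installation problems: stop the batch
            failures.setdefault(err.code, []).append(item)
    return results, failures
```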

Supported Languages

ABBYY FineReader Engine supports 200+ languages including:

All major European languages
Chinese (Simplified and Traditional)
Japanese, Korean
Arabic, Hebrew, Thai, Vietnamese
And many more...

Tips for Best Results

Document OCR

Use appropriate profile for your use case
Enable corrections (orientation, skew) for scanned documents
Match text type to document characteristics

Camera Images

Enable all enhancement options for smartphone photos
Use noise removal for low-light images
Correct geometric distortions for angled shots

MRZ Processing

Ensure entire MRZ is visible
Use high resolution for small text
Keep document flat during capture
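
To judge whether a capture is "high resolution", it helps to estimate the effective DPI: the number of pixels the document spans divided by its physical width in inches. The Python sketch below uses the 300 DPI guideline from Best Practices as a reference point; the helper name is illustrative.

```python
MM_PER_INCH = 25.4


def effective_dpi(pixels_across: int, physical_mm: float) -> float:
    """Estimate DPI from the pixel span and the physical width of the document."""
    if physical_mm <= 0:
        raise ValueError("physical_mm must be positive")
    return pixels_across * MM_PER_INCH / physical_mm
```

For example, a passport data page is 125 mm wide (ICAO TD3 format). If it spans 1500 pixels in the photo, `effective_dpi(1500, 125)` gives about 304.8 DPI, just above the guideline; at 1000 pixels it drops to roughly 203 DPI, and retaking the photo closer to the document is likely worthwhile.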

Classification

Provide diverse training samples
Use cross-validation to assess accuracy
Train with representative documents
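
To make the idea of cross-validation concrete, the Python sketch below partitions labeled samples into k folds, so that each sample is held out for validation exactly once. It illustrates the general technique only and makes no claim about how the Train Model node performs cross-validation internally; `k_fold_splits` is a hypothetical helper.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def k_fold_splits(samples: Sequence[T], k: int) -> List[Tuple[List[T], List[T]]]:
    """Return k (train, validation) pairs; each sample is validated exactly once."""
    if not 2 <= k <= len(samples):
        raise ValueError("k must be between 2 and the number of samples")
    folds = [list(samples[i::k]) for i in range(k)]  # round-robin assignment
    splits = []
    for i, held_out in enumerate(folds):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        splits.append((train, held_out))
    return splits
```

Shuffling the samples beforehand, ideally so that every fold contains each document category, keeps the accuracy estimate from being skewed by the order in which the training files were collected.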

Node Reference

Camera OCR - Robomotion.FineReader.CameraOCR
Classify Document - Robomotion.FineReader.Classify
Process Document - Robomotion.FineReader.ProcessDocument
Process MRZ - Robomotion.FineReader.ProcessMRZ
Split Document - Robomotion.FineReader.SplitDocument
Train Model - Robomotion.FineReader.Train
