Google Vision

The Google Cloud Vision API uses machine learning to analyze images at scale: it detects objects, extracts text, and rates content for safety.

Overview

The Robomotion Google Vision package integrates with the Google Cloud Vision API so that your flows can:

Extract text from images using OCR (Optical Character Recognition)
Extract text from PDF documents stored in Google Cloud Storage
Detect and extract labels describing image content
Analyze images for potentially unsafe content
Feed the analysis results into automation workflows

Key Features

Text Extraction (OCR)

Image to Text: Extract printed and handwritten text from images
PDF to Text: Extract text from multi-page PDF documents
Multi-language Support: Recognize text in multiple languages
High Accuracy: Industry-leading OCR accuracy
Confidence Scores: Per-page confidence metrics
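
As a rough illustration of what an OCR call looks like underneath, the sketch below builds the JSON body that the Vision REST endpoint `images:annotate` accepts. It does not show how the Robomotion nodes are implemented; the helper name `build_ocr_request` and its parameters are hypothetical, and the choice of `DOCUMENT_TEXT_DETECTION` for dense or handwritten text is an assumption about how you might want to switch modes.

```python
import base64
import json


def build_ocr_request(image_bytes, dense_or_handwritten=False, language_hints=None):
    """Build a JSON body for POST https://vision.googleapis.com/v1/images:annotate.

    Hypothetical helper: it only assembles the request and sends nothing.
    """
    feature = "DOCUMENT_TEXT_DETECTION" if dense_or_handwritten else "TEXT_DETECTION"
    request = {
        # The REST API expects image bytes as a base64 string.
        "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
        "features": [{"type": feature}],
    }
    if language_hints:
        # Optional hints; the API can usually detect the language on its own.
        request["imageContext"] = {"languageHints": list(language_hints)}
    return {"requests": [request]}


body = build_ocr_request(b"placeholder image bytes", language_hints=["en"])
print(json.dumps(body, indent=2))
```

The body is sent with an OAuth access token obtained from the service account; that step is left out here because it needs a real key.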

Image Understanding

Label Detection: Automatically identify objects, locations, activities, and concepts
Safe Search Detection: Detect adult, violent, medical, racy, and spoof content
Batch Processing: Analyze multiple images efficiently

Document Processing

PDF Processing: Extract structured text from PDF documents
Cloud Storage Integration: Work directly with files in Google Cloud Storage
Asynchronous Processing: Handle large documents efficiently
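
PDF extraction runs asynchronously against files in Cloud Storage. The sketch below assembles a request body for the Vision REST endpoint `files:asyncBatchAnnotate`; the helper name, the bucket paths, and the `pages_per_file` default are illustrative assumptions, not values taken from the Pdf To Text node.

```python
def build_pdf_ocr_request(source_uri, output_prefix, pages_per_file=20):
    """Build a JSON body for POST https://vision.googleapis.com/v1/files:asyncBatchAnnotate.

    Hypothetical helper: both URIs must point at Cloud Storage (gs://...).
    """
    for uri in (source_uri, output_prefix):
        if not uri.startswith("gs://"):
            raise ValueError(f"expected a gs:// URI, got {uri!r}")
    return {
        "requests": [
            {
                "inputConfig": {
                    "gcsSource": {"uri": source_uri},
                    "mimeType": "application/pdf",
                },
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
                "outputConfig": {
                    # Result JSON files are written under this prefix.
                    "gcsDestination": {"uri": output_prefix},
                    # How many pages go into each output JSON file.
                    "batchSize": pages_per_file,
                },
            }
        ]
    }


pdf_body = build_pdf_ocr_request(
    "gs://example-bucket/invoices/2024-01.pdf",  # placeholder bucket and object
    "gs://example-bucket/ocr-output/",
)
```

The call returns a long-running operation; the extracted text appears as JSON files under the output prefix once the operation finishes, which is why the bucket needs both read and write access.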

Authentication Options

The package supports two authentication methods:

Connect Node + Credentials: Establish a persistent connection using Google Cloud service account credentials
Direct Credentials: Provide credentials directly to each node (useful for one-off operations)
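
Both methods rely on a Google Cloud service account key. Before wiring a key into a flow, it can help to check that the JSON file is the right kind of key. The sketch below is a minimal pre-flight check; the function name is hypothetical, and the field list covers only the standard fields of a downloaded service account key, not anything specific to Robomotion.

```python
import json

# Fields present in a standard service account key file.
REQUIRED_KEY_FIELDS = ("type", "project_id", "private_key", "client_email")


def key_problems(key_json):
    """Return a list of human-readable problems found in a key file's JSON text."""
    try:
        data = json.loads(key_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["key file must contain a JSON object"]
    problems = [f"missing field: {name}" for name in REQUIRED_KEY_FIELDS if not data.get(name)]
    if data.get("type") and data["type"] != "service_account":
        problems.append(f"unexpected key type: {data['type']}")
    return problems


sample_key = json.dumps({
    "type": "service_account",
    "project_id": "example-project",        # placeholder values
    "private_key": "-----BEGIN PRIVATE KEY-----\n...",
    "client_email": "robot@example-project.iam.gserviceaccount.com",
})
print(key_problems(sample_key))  # prints []
```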

Getting Started

Basic Workflow

Connect: Establish a connection using the Connect node with your Google Cloud credentials
Analyze: Use any Vision node (Image To Text, Extract Image Labels, etc.) with the connection ID
Process Results: Access extracted data through output variables

Alternative Workflow (Direct Credentials)

Configure Node: Add credentials directly to the Vision node (Image To Text, Extract Image Labels, etc.)
Process: Run the node without a separate Connect node
Get Results: Access output variables

Common Use Cases

Document Automation

Digitize scanned invoices and receipts
Extract data from forms and applications
Process insurance claims documents
Archive physical documents as searchable text

Content Moderation

Filter user-uploaded images for inappropriate content
Automate content review workflows
Flag potentially unsafe images for manual review
Classify content by safety categories
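
A moderation flow typically turns the safe-search likelihoods into a flag-or-pass decision. The sketch below assumes the result has the shape of the Vision REST `safeSearchAnnotation` object (one likelihood string per category); the output of the Check Image Safety node may be shaped differently, and the `LIKELY` threshold is an arbitrary choice for illustration.

```python
# Likelihood values used by the Vision API, from least to most likely.
LIKELIHOOD_ORDER = ["UNKNOWN", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]
CATEGORIES = ("adult", "spoof", "medical", "violence", "racy")


def flagged_categories(annotation, threshold="LIKELY"):
    """Return the categories whose likelihood is at or above the threshold."""
    limit = LIKELIHOOD_ORDER.index(threshold)
    return [
        category
        for category in CATEGORIES
        if LIKELIHOOD_ORDER.index(annotation.get(category, "UNKNOWN")) >= limit
    ]


sample = {
    "adult": "VERY_UNLIKELY",
    "spoof": "UNLIKELY",
    "medical": "POSSIBLE",
    "violence": "LIKELY",
    "racy": "VERY_LIKELY",
}
print(flagged_categories(sample))              # ['violence', 'racy']
print(flagged_categories(sample, "POSSIBLE"))  # ['medical', 'violence', 'racy']
```

A lower threshold sends more images to manual review; a higher one lets more through unreviewed.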

Image Organization

Auto-tag photos with descriptive labels
Categorize product images
Build searchable image databases
Generate image metadata automatically
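
Auto-tagging usually means keeping only the labels the model is reasonably sure about. The sketch below assumes label results in the shape of the Vision REST `labelAnnotations` list (a `description` and a `score` between 0 and 1 per label); the 0.7 cut-off and the function name are illustrative assumptions.

```python
def tags_from_labels(label_annotations, min_score=0.7):
    """Return lowercase tags for labels scoring at or above min_score."""
    return [
        label["description"].lower()
        for label in label_annotations
        if label.get("score", 0.0) >= min_score
    ]


sample_labels = [
    {"description": "Dog", "score": 0.97},
    {"description": "Grass", "score": 0.81},
    {"description": "Toy", "score": 0.42},
]
print(tags_from_labels(sample_labels))  # ['dog', 'grass']
```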

Data Entry Automation

Extract text from business cards
Process handwritten forms
Digitize paper records
Import data from image-based documents

Requirements

Google Cloud Platform account
Vision API enabled in your GCP project
Service account with appropriate permissions:
- roles/cloudvision.user or roles/cloudvision.admin
For PDF processing: Google Cloud Storage bucket with read/write access

Supported Image Formats

JPEG
PNG
GIF
BMP
TIFF
WebP
RAW
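
A quick extension check before calling a Vision node can save a wasted API call. The sketch below is one possible filter; the extension set is an assumption derived from the list above, and RAW is left out because camera raw files use many vendor-specific extensions.

```python
from pathlib import Path

# Extensions for the formats listed above (RAW omitted: extensions vary by vendor).
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tif", ".tiff", ".webp"}


def has_supported_extension(path):
    """Return True if the file name ends in a known image extension."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


print(has_supported_extension("scans/invoice_001.JPG"))  # True
print(has_supported_extension("notes.txt"))              # False
```

An extension check does not prove the file content is valid, so the flow should still handle errors returned by the node.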

Best Practices

Image Quality

Use high-resolution images for better OCR accuracy
Ensure good lighting and contrast
Avoid blurry or distorted images
Keep text orientation upright when possible

Performance

Use the Connect node to reuse connections across multiple operations
Batch similar operations together
Reduce image file sizes where possible; smaller files upload and process faster
Use Google Cloud Storage for large PDF files
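
Batching can be sketched as splitting a list of image paths into fixed-size groups. The group size of 16 below is an assumption about the per-request image cap of the synchronous Vision endpoint; check the current Vision API quotas before relying on it. The function name is hypothetical.

```python
def chunked(items, size=16):
    """Split a list into consecutive groups of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[start:start + size] for start in range(0, len(items), size)]


paths = [f"img_{n}.png" for n in range(40)]
batches = chunked(paths)
print([len(batch) for batch in batches])  # [16, 16, 8]
```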

Error Handling

Enable "Continue On Error" when processing multiple images so that one failure does not stop the whole run
Validate image paths before processing
Handle "No text found" scenarios gracefully
Monitor API quotas and limits
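
The "no text found" case is worth separating from a real failure. The sketch below assumes a single response object in the shape returned by the Vision REST API, where a per-image `error` object signals failure and a missing `fullTextAnnotation` means no text was detected. The Robomotion nodes may surface these cases differently; the function name is hypothetical.

```python
def extract_text(response):
    """Return the detected text, or an empty string when none was found.

    Raises RuntimeError when the response carries a per-image error.
    """
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "unknown Vision API error"))
    annotation = response.get("fullTextAnnotation")
    return annotation["text"] if annotation else ""


print(repr(extract_text({"fullTextAnnotation": {"text": "INVOICE 1042\n"}})))  # 'INVOICE 1042\n'
print(repr(extract_text({})))  # ''
```

Returning an empty string lets the flow branch on "no text" without treating it as an error.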

Cost Optimization

Process only necessary images
Use appropriate image resolution (higher isn't always better)
Cache results when possible
Monitor API usage in Google Cloud Console
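
Caching can be as simple as keying results by a hash of the image bytes, so the same image is never analyzed twice. The sketch below is an in-memory illustration; the class name is hypothetical, and `analyze` stands in for whatever function actually calls the Vision API.

```python
import hashlib


class ResultCache:
    """In-memory cache keyed by the SHA-256 digest of the image bytes."""

    def __init__(self):
        self._store = {}

    def get_or_compute(self, image_bytes, analyze):
        key = hashlib.sha256(image_bytes).hexdigest()
        if key not in self._store:
            # Only call the (billable) analysis function on a cache miss.
            self._store[key] = analyze(image_bytes)
        return self._store[key]


calls = []


def fake_analyze(data):
    """Stand-in for a real Vision call; records each invocation."""
    calls.append(data)
    return {"length": len(data)}


cache = ResultCache()
first = cache.get_or_compute(b"same image", fake_analyze)
second = cache.get_or_compute(b"same image", fake_analyze)
print(len(calls))  # 1
```

For long-running flows, the dictionary could be replaced by a file or database so that results survive restarts.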

Available Nodes

Connect: Robomotion.GoogleVision.Connect
Extract Image Labels: Robomotion.GoogleVision.ExtractImageLabels
Check Image Safety: Robomotion.GoogleVision.CheckImageSafety
Image To Text: Robomotion.GoogleVision.ImageToText
Pdf To Text: Robomotion.GoogleVision.PdfToText
