Analyze Document
Analyzes documents using Claude AI with Beta Files API support, enabling Claude to read and understand various document formats including PDFs, Word documents, and more.
Common Properties
- Name - The custom name of the node.
- Color - The custom color of the node.
- Delay Before (sec) - Waits in seconds before executing the node.
- Delay After (sec) - Waits in seconds after executing the node.
- Continue On Error - Automation will continue regardless of any error. The default value is false.
If the ContinueOnError property is true, no error is caught when the project is executed, even if a Catch node is used.
Inputs
- Connection Id - The Claude client session identifier from Connect node (optional if API Key is provided directly).
- System Prompt - System instructions to guide Claude's behavior when analyzing documents.
- User Prompt - Question or instruction about what you want to know about the document(s). For example: "Summarize this document" or "Extract all dates mentioned".
- File Paths - Paths to document files (PDF, Word, etc.) you want Claude to analyze. You can add multiple files.
- Custom File Paths - Array of file paths from message scope (e.g., msg.file_paths) for batch processing of documents.
Options
Authentication
- API Key - Claude API key (optional if using Connection ID).
- Use Robomotion AI Credits - Not supported for document analysis. This feature requires direct API mode with your own API key.
Model Selection
- Model - Select which Claude model to use. Options include:
  - Claude Opus 4.5 - Best for complex document analysis
  - Claude Opus 4 - Highly capable for detailed analysis
  - Claude Sonnet 4.5 - Latest balanced model
  - Claude Sonnet 4 - Balanced performance and speed (default)
  - Claude 3.7 Sonnet - Latest 3.x generation Sonnet
  - Claude 3.5 Sonnet - Previous generation Sonnet
  - Claude 3.5 Haiku - Fastest model for simple documents
  - Custom Model - Specify your own model name
- Custom Model - The model name to use when Custom Model is selected in the Model option.
Generation Settings
- Max Tokens - Maximum tokens for the analysis response (default: 4096). Increase for longer, more detailed analyses.
- Temperature - Controls randomness (0.0-1.0). Use lower values (0.1-0.3) for factual document analysis.
- Top P - Nucleus sampling parameter (0.0-1.0).
- Top K - Top-k sampling parameter (1-100).
Extended Thinking
- Thinking Mode - Extended thinking allows Claude to reason through complex documents:
  - Off - No extended thinking (default)
  - Auto (Budget: 10240) - Automatic thinking with default token budget
  - Custom Budget - Specify your own thinking token budget
- Thinking Budget - Custom thinking token budget (1024-128000). Only used when Thinking Mode is Custom.
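The three Thinking Mode choices can be read as a mapping from mode to token budget. The following is a minimal sketch of that mapping, not the node's actual implementation; the mode labels `off`, `auto`, and `custom` and the function name are hypothetical, while the numeric values come from the options listed above.

```python
from typing import Optional

AUTO_THINKING_BUDGET = 10240   # budget listed for the Auto mode
MIN_THINKING_BUDGET = 1024     # documented lower bound for a custom budget
MAX_THINKING_BUDGET = 128000   # documented upper bound for a custom budget


def resolve_thinking_budget(mode: str, custom_budget: Optional[int] = None) -> Optional[int]:
    """Return the thinking token budget for a mode, or None when thinking is off.

    The mode strings are hypothetical labels chosen for this sketch.
    """
    mode = mode.strip().lower()
    if mode == "off":
        return None
    if mode == "auto":
        return AUTO_THINKING_BUDGET
    if mode == "custom":
        if custom_budget is None:
            raise ValueError("Thinking Budget is required when Thinking Mode is Custom")
        if not MIN_THINKING_BUDGET <= custom_budget <= MAX_THINKING_BUDGET:
            raise ValueError(
                f"Thinking Budget must be between {MIN_THINKING_BUDGET} and {MAX_THINKING_BUDGET}"
            )
        return custom_budget
    raise ValueError(f"Unknown thinking mode: {mode!r}")
```

Note that the Thinking Budget value is ignored in this sketch unless the mode is `custom`, which mirrors the "Only used when Thinking Mode is Custom" rule.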
Advanced
- Timeout (seconds) - Request timeout in seconds (default: 300 for file upload). Increase for large documents.
- Include Raw Response - Include full API response in output (default: false).
- Keep File After Analysis - Whether to keep the uploaded file on Claude's servers after analysis (default: false).
Outputs
- Text - Claude's analysis of the document(s).
- Thinking - Extended thinking output when thinking mode is enabled.
- File IDs - Uploaded file IDs that can be reused in subsequent requests (useful for multiple analyses of the same document).
- Raw Response - Complete API response object (when Include Raw Response is enabled).
How It Works
The Analyze Document node uses Claude's Beta Files API to analyze documents. When executed, the node:
- Validates the connection (requires direct API mode, not Robomotion credits)
- Collects all file paths from both individual and array inputs
- For each file:
  - Opens and reads the file
  - Detects the MIME type based on file extension
  - Uploads the file to Claude's Files API
  - Receives a file ID for the uploaded file
- Creates a message with the user prompt and uploaded documents
- Sends the request to Claude's Beta Messages API
- Extracts Claude's analysis from the response
- Returns the analysis, file IDs, and optional thinking output
- Optionally cleans up uploaded files (Note: Claude automatically cleans up files after some time)
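The steps above can be sketched in a few lines. This is an illustrative outline under stated assumptions, not the node's source code: `upload` and `send` are hypothetical callables standing in for the Files API upload and the Messages API request, and the content block shape (a `document` block whose source references a `file_id`) is assumed from Anthropic's Files API documentation.

```python
from typing import Callable, Dict, List, Optional, Sequence


def collect_file_paths(file_paths: Sequence[str],
                       custom_file_paths: Optional[Sequence[str]]) -> List[str]:
    """Merge the File Paths and Custom File Paths inputs, skipping blank entries."""
    merged = list(file_paths) + list(custom_file_paths or [])
    return [p for p in merged if p and p.strip()]


def build_message_content(user_prompt: str, file_ids: Sequence[str]) -> List[Dict]:
    """Build one document block per uploaded file, followed by the user prompt."""
    blocks: List[Dict] = [
        {"type": "document", "source": {"type": "file", "file_id": fid}}
        for fid in file_ids
    ]
    blocks.append({"type": "text", "text": user_prompt})
    return blocks


def analyze_documents(user_prompt: str,
                      file_paths: Sequence[str],
                      custom_file_paths: Optional[Sequence[str]],
                      upload: Callable[[str], str],
                      send: Callable[[List[Dict]], str]) -> Dict:
    """Upload every file, send a single request, and return text plus file IDs.

    `upload` (path -> file ID) and `send` (content blocks -> text) are
    hypothetical stand-ins for the real API calls.
    """
    paths = collect_file_paths(file_paths, custom_file_paths)
    if not paths:
        raise ValueError("At least one file path must be provided")
    file_ids = [upload(path) for path in paths]
    content = build_message_content(user_prompt, file_ids)
    return {"text": send(content), "file_ids": file_ids}
```

Passing the upload and send steps in as callables keeps the outline testable without network access; a real integration would replace them with actual API client calls.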
Requirements
- Direct API mode - Robomotion AI Credits are not supported for document analysis
- Valid Connection Id or direct API Key credentials
- At least one file path must be provided
- Non-empty User Prompt
- Supported document formats (based on file extension)
Supported File Formats
The node automatically detects the MIME type based on file extension. Supported formats include:
Documents:
- PDF (.pdf)
- Text files (.txt, .md, .csv)
- HTML/Web (.html, .css, .js, .ts)
- Code files (.py, .json, .xml)
Images:
- PNG (.png), JPEG (.jpg, .jpeg), WebP (.webp), GIF (.gif)
- HEIC (.heic), HEIF (.heif)
- BMP (.bmp), TIFF (.tiff), SVG (.svg)
Audio:
- WAV (.wav), MP3 (.mp3), AIFF (.aiff), AAC (.aac)
- OGG (.ogg), FLAC (.flac)
Video:
- MP4 (.mp4), MPEG (.mpeg, .mpg), MOV (.mov)
- AVI (.avi), MKV (.mkv), WebM (.webm)
- WMV (.wmv), 3GP (.3gp)
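Extension-based MIME detection of this kind usually amounts to a lookup table. The sketch below shows the idea with a partial table covering a few of the formats listed above; the function name is hypothetical, and the node's own mapping may differ or cover more extensions.

```python
import os
from typing import Optional

# Partial, illustrative table. The node's real mapping is not documented here.
MIME_BY_EXTENSION = {
    ".pdf": "application/pdf",
    ".txt": "text/plain",
    ".md": "text/markdown",
    ".csv": "text/csv",
    ".html": "text/html",
    ".json": "application/json",
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".webp": "image/webp",
    ".gif": "image/gif",
    ".wav": "audio/wav",
    ".mp3": "audio/mpeg",
    ".mp4": "video/mp4",
}


def detect_mime_type(path: str) -> Optional[str]:
    """Return the MIME type for a path based on its extension, or None if unknown."""
    _, extension = os.path.splitext(path)
    return MIME_BY_EXTENSION.get(extension.lower())
```

A lookup like this can also serve as a pre-flight check: a `None` result flags a file whose extension is not in the table before any upload is attempted.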
Error Handling
The node will return specific errors in the following cases:
- Using Robomotion AI Credits mode - "File upload is not supported with Robomotion AI Credits"
- Empty or missing User Prompt
- No file paths provided
- Invalid Connection Id
- Failed to open or read file
- File upload failure
- Empty Custom Model name when Custom Model is selected
- Temperature out of range (must be 0.0-1.0)
- Top P out of range (must be 0.0-1.0)
- Top K less than 1
- Thinking budget out of range (must be 1024-128000)
- API authentication errors (401)
- API rate limit errors (429)
- API service errors (500, 503)
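Several of these errors can be caught before the node runs, for example in an upstream step that prepares the inputs. The following is a rough sketch of such a pre-check; the function name, parameter names, and error wording are hypothetical, and only the ranges are taken from the list above.

```python
from typing import List, Optional, Sequence


def validate_inputs(user_prompt: str,
                    file_paths: Sequence[str],
                    temperature: Optional[float] = None,
                    top_p: Optional[float] = None,
                    top_k: Optional[int] = None,
                    model: str = "Claude Sonnet 4",
                    custom_model: str = "") -> List[str]:
    """Return a list of human-readable problems; an empty list means the inputs look valid."""
    errors: List[str] = []
    if not user_prompt or not user_prompt.strip():
        errors.append("User Prompt cannot be empty")
    if not file_paths:
        errors.append("At least one file path is required")
    if temperature is not None and not 0.0 <= temperature <= 1.0:
        errors.append("Temperature must be between 0.0 and 1.0")
    if top_p is not None and not 0.0 <= top_p <= 1.0:
        errors.append("Top P must be between 0.0 and 1.0")
    if top_k is not None and top_k < 1:
        errors.append("Top K must be at least 1")
    if model == "Custom Model" and not custom_model.strip():
        errors.append("Custom Model name is required when Custom Model is selected")
    return errors
```

Collecting all problems into a list, instead of stopping at the first one, makes it easier to report every misconfiguration in a single pass.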
Usage Notes
File Upload
- Files are uploaded to Claude's secure servers before analysis
- File IDs can be reused for multiple analyses of the same document
- Files are automatically cleaned up by Claude after some time
- Large files may require longer timeout values
Document Analysis
- Be specific in your User Prompt about what you want to extract or analyze
- Claude can handle multiple documents in a single request
- For multi-document analysis, Claude can compare and cross-reference content
Model Selection
- Use Opus models for complex, detailed document analysis
- Use Sonnet for balanced performance on most documents
- Use Haiku for quick analysis of simple documents
Extended Thinking
- Enable for complex document analysis requiring reasoning
- Useful for legal documents, technical papers, or financial reports
- The thinking output shows Claude's analytical process
Examples
Example 1: PDF Invoice Analysis
Inputs:
- Connection Id: (from Connect node)
- User Prompt: "Extract the invoice number, date, total amount, and all line items from this invoice"
- File Paths: ["/path/to/invoice.pdf"]
- System Prompt: "You are a document processing assistant. Extract information accurately and format it clearly."
Configuration:
- Model: Claude Sonnet 4
- Max Tokens: 2000
- Temperature: 0.1 (for accuracy)
Output: Claude will extract all requested information from the PDF invoice in a structured format.
Example 2: Contract Review with Extended Thinking
Inputs:
- Connection Id: (from Connect node)
- User Prompt: "Review this contract and identify any unusual clauses, potential risks, or missing standard terms"
- File Paths: ["/path/to/contract.pdf"]
- System Prompt: "You are a legal document analyst. Be thorough and highlight important details."
Configuration:
- Model: Claude Opus 4.5
- Max Tokens: 4096
- Thinking Mode: Auto
- Temperature: 0.2
Outputs:
- Text: Detailed analysis of the contract
- Thinking: Claude's reasoning process for identifying risks
Example 3: Batch Document Processing
Inputs:
- Connection Id: (from Connect node)
- User Prompt: "Summarize each document and identify the main topics discussed"
- Custom File Paths: msg.document_paths (array containing multiple file paths)
Configuration:
- Model: Claude Sonnet 4
- Max Tokens: 3000
Output: Claude will analyze all documents and provide summaries and topic identification for each.
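When the array of paths is long, it may help to split it into smaller groups and run the node once per group, so that no single request has to upload everything. A minimal sketch of that splitting step follows; the function name and the batch size are hypothetical, and in a real flow the equivalent logic would live in whatever step prepares `msg.document_paths`.

```python
from typing import Iterator, List, Sequence


def chunk_paths(paths: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `batch_size` paths, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(paths), batch_size):
        yield list(paths[start:start + batch_size])


# Example: five documents processed two at a time.
document_paths = ["a.pdf", "b.pdf", "c.pdf", "d.pdf", "e.pdf"]
for batch in chunk_paths(document_paths, batch_size=2):
    print(batch)
```

Each yielded list can then be assigned to the Custom File Paths input for one run of the node.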
Example 4: Resume Screening
Inputs:
- Connection Id: (from Connect node)
- User Prompt: "Extract: candidate name, years of experience, technical skills, education, and previous employers. Rate the candidate's fit for a senior software engineer role."
- File Paths: ["/path/to/resume.pdf"]
Configuration:
- Model: Claude Sonnet 4
- Max Tokens: 1500
- Temperature: 0.3
Output: Structured extraction of resume information with a qualification assessment.
Example 5: Multi-Document Comparison
Inputs:
- Connection Id: (from Connect node)
- User Prompt: "Compare these three versions of the policy document and identify all changes between them"
- File Paths: ["/path/to/policy_v1.pdf", "/path/to/policy_v2.pdf", "/path/to/policy_v3.pdf"]
Configuration:
- Model: Claude Opus 4.5
- Max Tokens: 5000
- Temperature: 0.2
Output: Detailed comparison showing changes across all versions.
Example 6: Financial Report Analysis
Inputs:
- Connection Id: (from Connect node)
- User Prompt: "Analyze this financial report and provide: revenue trends, expense breakdown, profit margins, and key financial ratios. Highlight any concerning patterns."
- File Paths: ["/path/to/financial_report.pdf"]
- System Prompt: "You are a financial analyst. Provide quantitative analysis with specific numbers."
Configuration:
- Model: Claude Opus 4
- Max Tokens: 4000
- Thinking Mode: Auto
- Temperature: 0.2
Output: Comprehensive financial analysis with insights and concerns.
Example 7: Technical Documentation Understanding
Inputs:
- Connection Id: (from Connect node)
- User Prompt: "Create a step-by-step setup guide based on this technical documentation. Simplify complex terms for beginners."
- File Paths: ["/path/to/technical_docs.pdf"]
Configuration:
- Model: Claude Sonnet 4
- Max Tokens: 3000
- Temperature: 0.4
Output: Beginner-friendly setup guide derived from technical documentation.
Example 8: Reusing File IDs
First Analysis:
- File Paths: ["/path/to/large_document.pdf"]
- Keep File After Analysis: true
Output:
- File IDs: ["file-xyz123"]
Subsequent Analyses: You can reference the uploaded file by its ID in future requests without re-uploading, saving time and bandwidth.
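One way to keep track of reusable file IDs is a small cache keyed by file content, so the same document is uploaded only once. The sketch below is bookkeeping that would sit around the node in the surrounding automation, not a feature of the node itself; `FileIdCache` and the `upload` callable are hypothetical, and the cached IDs are only useful while the files are kept on the server (Keep File After Analysis set to true).

```python
import hashlib
from typing import Callable, Dict


def file_digest(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


class FileIdCache:
    """Remember which file ID was returned for which file content."""

    def __init__(self) -> None:
        self._ids: Dict[str, str] = {}

    def get_or_upload(self, path: str, upload: Callable[[str], str]) -> str:
        """Return the cached file ID, calling `upload` only for unseen content.

        `upload` is a hypothetical callable that uploads the file and
        returns its file ID.
        """
        key = file_digest(path)
        if key not in self._ids:
            self._ids[key] = upload(path)
        return self._ids[key]
```

Keying on the content hash instead of the path means a renamed copy of the same document reuses the existing ID, while an edited file at the same path triggers a fresh upload.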
Best Practices
File Management:
- Keep file sizes reasonable to avoid timeout issues
- Use appropriate timeout values for large documents
- Consider whether to keep files for reuse or let them be auto-cleaned
Prompt Engineering:
- Be specific about what information you want extracted
- Provide context about the document type
- Ask for structured output when needed (JSON, tables, lists)
Model Selection:
- Use Opus for complex documents requiring deep understanding
- Use Sonnet for general document analysis
- Use Haiku for simple text extraction
Accuracy:
- Use low temperature (0.1-0.3) for factual extraction
- Use extended thinking for complex analytical tasks
- Provide clear System Prompts for consistent results
Batch Processing:
- Use Custom File Paths for processing multiple documents
- Consider processing files in batches to avoid timeouts
- Handle errors gracefully when processing many files
Performance:
- Increase timeout for large or complex documents
- Monitor token usage with Max Tokens setting
- Reuse file IDs when analyzing the same document multiple times
Error Handling:
- Validate file paths before processing
- Check file formats are supported
- Implement retry logic for transient failures
Security:
- Remember that files are uploaded to Claude's servers
- Don't upload highly sensitive documents unless necessary
- Files are automatically cleaned up after some time
Quality:
- Ensure document quality is good (clear scans, readable PDFs)
- For scanned documents, consider OCR preprocessing if needed
- Test with sample documents before batch processing
Cost Management:
- Be mindful of token usage with large documents
- Use appropriate models for the task complexity
- Consider document size when setting Max Tokens
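The advice to implement retry logic for transient failures can be sketched as exponential backoff around the call. This is a generic illustration, not part of the node: `ApiError` is a hypothetical exception carrying an HTTP status, and the set of retryable statuses (429, 500, 503) is taken from the rate limit and service errors listed under Error Handling.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

RETRYABLE_STATUSES = {429, 500, 503}  # rate limit and service errors


class ApiError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed request."""

    def __init__(self, status: int) -> None:
        super().__init__(f"API request failed with status {status}")
        self.status = status


def call_with_retry(operation: Callable[[], T],
                    max_attempts: int = 4,
                    base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `operation`, retrying transient failures with exponential backoff.

    Delays grow as base_delay, 2 * base_delay, 4 * base_delay, and so on.
    Non-retryable errors (for example 401) are raised immediately.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return operation()
        except ApiError as error:
            if error.status not in RETRYABLE_STATUSES or attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
            attempt += 1
```

The `sleep` parameter is injectable so the backoff schedule can be checked in tests without actually waiting.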