Google Gemini
Google Gemini is Google's most capable and advanced family of large language models (LLMs), offering multimodal capabilities for text generation, image understanding, video generation, audio processing, and advanced reasoning tasks.
Overview
The Robomotion Google Gemini package provides comprehensive integration with Google's Gemini API, enabling you to:
- Generate high-quality text content with advanced reasoning
- Conduct multi-turn chat conversations with context retention
- Process and understand images, documents, and multimedia files
- Generate images with Imagen models (Nano Banana)
- Create videos with Veo models (with native audio support)
- Generate semantic embeddings for similarity search and classification
- Edit images using mask-based inpainting
- Access Google Search and code execution capabilities
Key Features
Text Generation
- Multiple Models: Choose from Gemini 3 Pro, Gemini 2.5 Pro/Flash, and Gemini 2.0 Flash variants
- Thinking Mode: Control reasoning depth with dynamic, budget-based, or level-based thinking
- Multimodal Input: Include images, documents, audio, and video files alongside text prompts
- Structured Output: JSON mode with schema validation for reliable data extraction
- Tools: Enable Google Search grounding and code execution
- Safety Controls: Granular content filtering across multiple categories
Chat Conversations
- Stateful Conversations: Maintain context across multiple messages
- History Management: Full control over conversation history
- File Attachments: Include multimedia files in chat messages
- Streaming Support: Real-time response generation
File Management
- Upload: Upload large files (up to 2GB) for use in prompts
- List & Retrieve: Manage your uploaded files with pagination
- Metadata: Track file state, expiration, and processing status
- Automatic Cleanup: Files expire automatically after 48 hours
Image Generation
- Multiple Models: Nano Banana (2.5 Flash) and Nano Banana Pro (3 Pro Preview)
- Aspect Ratios: Support for 1:1, 16:9, 9:16, 4:3, 3:4, and more
- Reference Images: Use up to 14 reference images for style consistency
- Google Search: Ground image generation with web search results
- Multiple Outputs: Generate up to 4 variations in a single request
Video Generation
- Veo Models: Veo 3.1 (with audio), Veo 3.1 Fast, and Veo 3.0
- Native Audio: Automatic audio generation for Veo 3+ models
- Image-to-Video: Transform still images into dynamic videos
- Frame Control: Specify first and last frames for precise control
- Flexible Duration: 4, 6, or 8 second videos
- HD Quality: 720p or 1080p resolution options
Embeddings
- Multiple Task Types: Optimized for retrieval, classification, clustering, and more
- Batch Processing: Generate embeddings for multiple texts efficiently
- Semantic Comparison: Built-in similarity calculation with multiple metrics
- Dimension Reduction: Custom output dimensionality (256, 512, 768)
- File-Based Workflows: Load/save embeddings for offline processing
Authentication Options
The package supports two authentication methods:
- Direct API Key: Use your own Google AI Studio API key
- Robomotion AI Credits: Pay-per-use billing through Robomotion's managed service (no API key required)
Getting Started
- Connect: Establish a connection using your API key or Robomotion credits
- Generate: Use any generation node (text, image, video) with your prompts
- Process Results: Access generated content through output variables
- Disconnect: Close the connection when done (optional, automatic cleanup on flow end)
Common Use Cases
Document Processing
- Extract structured data from invoices, receipts, and forms
- Summarize long documents and reports
- Translate documents while preserving formatting
- Answer questions about uploaded PDFs and images
Content Creation
- Generate marketing copy and product descriptions
- Create social media posts and captions
- Write code documentation and technical guides
- Produce image assets for presentations and marketing
Data Analysis
- Classify and categorize text data
- Perform sentiment analysis on customer feedback
- Find similar items using semantic embeddings
- Generate insights from business data
Automation
- Build AI-powered chatbots and assistants
- Automate customer support responses
- Process and route incoming communications
- Generate reports and summaries on schedule
Model Selection Guide
Text Generation
- Gemini 3 Pro: Best quality, advanced reasoning, highest cost
- Gemini 2.5 Pro: Balanced performance and cost
- Gemini 2.5 Flash: Fast, cost-effective for most tasks
- Gemini 2.5 Flash Lite: Ultra-fast, lowest cost for simple tasks
- Gemini 2.0 Flash: Legacy model with good performance
Image Generation
- Nano Banana Pro (3 Pro Preview): Best quality, supports up to 14 reference images
- Nano Banana (2.5 Flash): Fast, good quality for most use cases
Video Generation
- Veo 3.1: Best quality with native audio support
- Veo 3.1 Fast: Faster generation, good quality
- Veo 3.0: Legacy model with audio support
Embeddings
- text-embedding-004: Latest model, best performance
- text-embedding-005: Alternative option
- text-multilingual-embedding-002: For multilingual content
Available Nodes
📄️ Batch Embeddings
Robomotion.GoogleGemini.Embeddings.BatchEmbeddings
📄️ Compare Embeddings
Robomotion.GoogleGemini.Embeddings.CompareEmbeddings
📄️ Delete File
Robomotion.GoogleGemini.Files.DeleteFile
📄️ Edit Image
Robomotion.GoogleGemini.Images.EditImage
📄️ Embeddings
Robomotion.GoogleGemini.Embeddings.Embeddings
📄️ Upload File
Robomotion.GoogleGemini.Files.FileUpload
📄️ Generate Content
Robomotion.GoogleGemini.Content.GenerateContent
📄️ Generate Images
Robomotion.GoogleGemini.Images.GenerateImages
📄️ Generate Text
Robomotion.GoogleGemini.Content.GenerateText
📄️ Generate Videos
Robomotion.GoogleGemini.Videos.GenerateVideos
📄️ Get File
Robomotion.GoogleGemini.Files.GetFile
📄️ List Files
Robomotion.GoogleGemini.Files.ListFiles
📄️ List Models
Robomotion.GoogleGemini.Models.ListModels
📄️ Send Chat Message
Robomotion.GoogleGemini.Chat.SendChatMessage
📄️ Connect
Robomotion.GoogleGemini.Connect
📄️ Disconnect
Robomotion.GoogleGemini.Disconnect