Google Vertex AI
Google Vertex AI is Google Cloud's unified platform for building, deploying, and scaling machine learning models and AI applications, offering powerful APIs for text generation, chat, code generation, and embeddings.
Overview
The Robomotion Google Vertex AI package provides comprehensive integration with Google Cloud's Vertex AI platform, enabling you to:
- Generate high-quality text content using PaLM 2 models
- Build conversational AI applications with chat models
- Generate code snippets and solutions with code-specialized models
- Create semantic embeddings for text similarity and search
- Process multimodal content combining text and images
Key Features
Text Generation
- PaLM 2 Models: Access to Text Bison variants for text generation
- Customizable Parameters: Control temperature, top-k, and top-p to tune how random or focused the output is
- Token Control: Set maximum output tokens and stop sequences
- Multiple Candidates: Generate multiple response variations in one request
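To make the sampling parameters above concrete, here is a minimal sketch of how temperature, top-k, and top-p are commonly combined when picking the next token. It is an illustration of the general technique, not the implementation used by the hosted models; the function name `sample_token` and the filter order are assumptions made for this example.

```python
import math
import random


def sample_token(logits, temperature=1.0, top_k=0, top_p=1.0, rng=random):
    """Pick one token from a {token: logit} mapping.

    Illustrative only: the hosted models may order or combine
    these filters differently.
    """
    if temperature <= 0:
        # Treat temperature 0 as greedy decoding: always take the top logit.
        return max(logits, key=logits.get)

    # Temperature-scaled softmax (subtract the max for numerical stability).
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(
        ((tok, weight / total) for tok, weight in weights.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

    # top-k: keep only the k most probable tokens (0 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]

    # top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= top_p:
            break

    tokens = [tok for tok, _ in kept]
    probs = [prob for _, prob in kept]
    # random.choices renormalizes the remaining weights before drawing.
    return rng.choices(tokens, weights=probs, k=1)[0]
```

Lower temperature and smaller top-k or top-p values narrow the set of tokens that can be drawn, which is why they give more repeatable output.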
Chat Conversations
- Context-Aware: Maintain conversation context with system instructions
- Few-Shot Learning: Provide examples to guide model behavior
- Chat Bison Models: Specialized models optimized for conversational AI
- Flexible Parameters: Full control over generation settings
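As a rough picture of how context, few-shot examples, and messages fit together, the sketch below assembles a chat request body. The helper `build_chat_request` is hypothetical, and the field names (`context`, `examples`, `messages`, `maxOutputTokens`) follow the PaLM 2 chat request schema as commonly documented; treat them as assumptions and check the current API reference. How the Generate Chat node builds its request internally is not specified here.

```python
import json


def build_chat_request(context, examples, messages,
                       temperature=0.2, max_output_tokens=256):
    """Assemble a chat request body (hypothetical helper, assumed schema)."""
    instance = {
        # System-style instructions that apply to the whole conversation.
        "context": context,
        # Few-shot pairs showing the model the expected style of answer.
        "examples": [
            {"input": {"content": question}, "output": {"content": answer}}
            for question, answer in examples
        ],
        # Conversation history, oldest message first.
        "messages": [
            {"author": author, "content": text} for author, text in messages
        ],
    }
    return {
        "instances": [instance],
        "parameters": {
            "temperature": temperature,
            "maxOutputTokens": max_output_tokens,
        },
    }


if __name__ == "__main__":
    body = build_chat_request(
        context="You are a concise support assistant.",
        examples=[("How do I reset my password?", "Use the 'Forgot password' link.")],
        messages=[("user", "Where can I download my invoice?")],
    )
    print(json.dumps(body, indent=2))
```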
Code Generation
- Code-Specialized Models: Code Bison models trained on source code
- Multi-Language Support: Generate code in various programming languages
- Smart Completion: Context-aware code completion and generation
- Documentation: Automatic code documentation and explanation
Text Embeddings
- Semantic Vectors: Generate embeddings for text similarity and search
- Gecko Models: Multiple embedding model variants available
- Multilingual Support: Gecko Multilingual for cross-language tasks
- Auto-Truncation: Automatic text truncation to fit token limits
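Once embeddings are returned, similarity and search usually come down to comparing vectors. The sketch below shows cosine similarity and a simple ranking over candidate vectors. The function names and the tiny three-dimensional vectors are illustrative assumptions; real embeddings have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; report no similarity.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Sort {name: vector} candidates from most to least similar to query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up vectors standing in for embeddings of three documents.
    docs = {
        "refund policy": [0.9, 0.1, 0.0],
        "shipping times": [0.1, 0.9, 0.0],
        "api reference": [0.0, 0.1, 0.9],
    }
    print(rank_by_similarity([1.0, 0.0, 0.0], docs))
```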
Multimodal Embeddings
- Text and Image: Generate embeddings for text, images, or both
- Unified Vector Space: Combined embeddings for cross-modal search
- Image Understanding: Process images alongside text for rich context
- Flexible Input: Support for text-only or image-only embeddings
Authentication Options
The package supports two authentication methods:
- Connection ID: Use the Connect node to establish a reusable connection
- Direct Credentials: Provide Google Cloud service account credentials directly
Getting Started
Using Connect Node (Recommended)
- Connect: Use the Connect node with your Google Cloud credentials and Project ID
- Use Connection ID: Pass the connection ID to other nodes
- Execute Operations: Use any generation or embedding node
- Disconnect: Optionally disconnect when done
Using Direct Credentials
- Configure Node: Add credentials and project ID directly to each node
- Execute: Run the node without requiring a Connect node
- Flexibility: Useful for one-off operations or testing
Common Use Cases
Content Generation
- Generate product descriptions and marketing copy
- Create blog posts and articles
- Write email templates and responses
- Produce creative content variations
Conversational AI
- Build customer support chatbots
- Create virtual assistants
- Develop interactive Q&A systems
- Implement multi-turn dialogue systems
Code Automation
- Generate boilerplate code
- Create code documentation
- Build code completion systems
- Generate test cases and examples
Semantic Search
- Build document search systems
- Implement recommendation engines
- Create content similarity matching
- Develop semantic clustering solutions
Document Processing
- Extract information from documents
- Classify and categorize text
- Find similar documents
- Generate document summaries
Model Selection Guide
Text Generation
- text-bison@001: Stable version for production
- text-bison: Unversioned name that tracks the latest release, so behavior may change as the model is updated
- text-bison-32k: Extended context window (32k tokens)
Chat
- chat-bison@001: Stable version for production
- chat-bison: Unversioned name that tracks the latest release, so behavior may change as the model is updated
- chat-bison-32k: Extended context window for longer conversations
Code Generation
- code-bison: Latest code generation model
Embeddings
- textembedding-gecko@001: Stable embedding model
- textembedding-gecko@latest: Latest embedding improvements
- textembedding-gecko-multilingual@latest: Multilingual embeddings
Multimodal
- multimodalembedding@001: Text and image embeddings
Configuration
Regional Endpoints
- Default location: us-central1
- Other regions are available depending on your Google Cloud project setup
- Choose regions closer to your users for lower latency
Model Publisher
- Default publisher: google
- Usually left unchanged unless you are using custom models
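To show where the location and publisher settings end up, the sketch below builds a prediction URL following the public Vertex AI REST pattern for publisher models. The helper name `predict_url` and the project ID `my-project` are hypothetical, and whether the Robomotion nodes construct the URL exactly this way is an assumption.

```python
def predict_url(project_id, model, location="us-central1", publisher="google"):
    """Build a predict URL for a publisher model (hypothetical helper)."""
    # The location appears twice: in the regional host name and in the path.
    host = f"https://{location}-aiplatform.googleapis.com"
    path = (
        f"/v1/projects/{project_id}/locations/{location}"
        f"/publishers/{publisher}/models/{model}:predict"
    )
    return host + path


if __name__ == "__main__":
    print(predict_url("my-project", "text-bison@001"))
    print(predict_url("my-project", "textembedding-gecko@001", location="europe-west4"))
```

Changing the location changes both the host and the resource path, so a model must be available in the region you select.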
Custom Models
All generation nodes support custom model names for:
- Fine-tuned models
- Custom deployed models
- Experimental model versions
Best Practices
Performance
- Reuse connection IDs across multiple operations
- Batch embedding operations when possible
- Choose appropriate regions for your workload
- Cache responses for repeated queries
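Batching and caching can be sketched independently of any particular node. The example below splits a list of texts into fixed-size batches and wraps a lookup function in an in-memory cache. The names `chunked` and `make_cached` are hypothetical, the batch size is a placeholder (check the per-request limit for your model), and `fake_fetch` stands in for a real embedding call.

```python
from functools import lru_cache


def chunked(items, batch_size):
    """Yield consecutive slices of items, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def make_cached(fetch, maxsize=1024):
    """Wrap fetch so repeated calls with the same argument hit the cache.

    Arguments must be hashable, and results should be immutable
    (for example tuples), because the same object is returned each time.
    """
    return lru_cache(maxsize=maxsize)(fetch)


if __name__ == "__main__":
    calls = []

    def fake_fetch(text):
        # Stand-in for a real embedding request; records each actual call.
        calls.append(text)
        return (float(len(text)),)

    cached_fetch = make_cached(fake_fetch)
    for batch in chunked(["a", "bb", "a", "ccc", "bb"], batch_size=2):
        print(batch, [cached_fetch(text) for text in batch])
    print("actual calls:", calls)
```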
Cost Optimization
- Pin stable model versions (@001) so behavior and token usage stay predictable
- Set appropriate max token limits
- Use embeddings for similarity instead of full generation
- Monitor usage through Google Cloud Console
Quality
- Adjust temperature based on use case (0 for near-deterministic output, higher values for more creative output)
- Provide clear, specific prompts
- Use examples in chat for better responses
- Test with candidate count > 1 to compare outputs
Security
- Store credentials in Robomotion vault
- Use service accounts with minimal required permissions
- Rotate API credentials regularly
- Monitor API usage for anomalies
Available Nodes
- Generate Chat: Robomotion.VertexAI.GenerateChat
- Generate Code: Robomotion.VertexAI.GenerateCode
- Generate Text: Robomotion.VertexAI.GenerateText
- Get Multimodal Embeddings: Robomotion.VertexAI.GetMultimodalEmbeddings
- Get Text Embeddings: Robomotion.VertexAI.GetTextEmbeddings
- Connect: Robomotion.VertexAI.Connect
- Disconnect: Robomotion.VertexAI.Disconnect