
Google Vertex AI

Google Vertex AI is Google Cloud's unified platform for building, deploying, and scaling machine learning models and AI applications, offering powerful APIs for text generation, chat, code generation, and embeddings.

Overview

The Robomotion Google Vertex AI package provides comprehensive integration with Google Cloud's Vertex AI platform, enabling you to:

  • Generate high-quality text content using PaLM 2 models
  • Build conversational AI applications with chat models
  • Generate code snippets and solutions with code-specialized models
  • Create semantic embeddings for text similarity and search
  • Process multimodal content combining text and images

Key Features

Text Generation

  • PaLM 2 Models: Access to Text Bison variants for text generation
  • Customizable Parameters: Control temperature, top-k, and top-p to tune output randomness and diversity
  • Token Control: Set maximum output tokens and stop sequences
  • Multiple Candidates: Generate multiple response variations in one request
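
As a rough illustration of how these settings fit together, the sketch below assembles a request body in the shape used by Vertex AI's predict endpoint for PaLM text models. The function name `build_text_request` and its default values are assumptions made for this example; the Robomotion node exposes these settings as node options, and how it builds the request internally is not described here.

```python
import json


def build_text_request(prompt, temperature=0.2, top_k=40, top_p=0.95,
                       max_output_tokens=256, candidate_count=1,
                       stop_sequences=None):
    """Assemble a predict-style request body for a text model (illustrative)."""
    if temperature < 0:
        raise ValueError("temperature must not be negative")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be between 0 and 1")
    if top_k < 1 or max_output_tokens < 1 or candidate_count < 1:
        raise ValueError("top_k, max_output_tokens and candidate_count must be >= 1")

    parameters = {
        "temperature": temperature,
        "topK": top_k,
        "topP": top_p,
        "maxOutputTokens": max_output_tokens,
        "candidateCount": candidate_count,
    }
    # Stop sequences are optional, so only include them when provided.
    if stop_sequences:
        parameters["stopSequences"] = list(stop_sequences)

    return {"instances": [{"prompt": prompt}], "parameters": parameters}


if __name__ == "__main__":
    request = build_text_request(
        "Write a one-line product description for a steel water bottle.",
        temperature=0.7,
        candidate_count=2,
        stop_sequences=["\n\n"],
    )
    print(json.dumps(request, indent=2))
```

Lower temperature and smaller top-k narrow the sampling, while a candidate count above 1 returns several variations to compare.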

Chat Conversations

  • Context-Aware: Maintain conversation context with system instructions
  • Few-Shot Learning: Provide examples to guide model behavior
  • Chat Bison Models: Specialized models optimized for conversational AI
  • Flexible Parameters: Full control over generation settings
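
The sketch below shows one way context, few-shot examples, and conversation history could be combined into a single chat request. The field names follow the Vertex AI chat-bison request format as commonly documented; the helper `build_chat_request` and its arguments are hypothetical and are not part of the Robomotion package.

```python
def build_chat_request(context, examples, history, user_message,
                       temperature=0.2, max_output_tokens=256):
    """Assemble a chat request body (illustrative sketch).

    examples: list of (input_text, output_text) pairs used as few-shot guidance.
    history:  list of (author, content) pairs from earlier turns.
    """
    messages = [{"author": author, "content": content}
                for author, content in history]
    # The newest user turn always goes last.
    messages.append({"author": "user", "content": user_message})

    instance = {
        "context": context,
        "examples": [
            {"input": {"content": question}, "output": {"content": answer}}
            for question, answer in examples
        ],
        "messages": messages,
    }
    parameters = {
        "temperature": temperature,
        "maxOutputTokens": max_output_tokens,
    }
    return {"instances": [instance], "parameters": parameters}


if __name__ == "__main__":
    request = build_chat_request(
        context="You are a concise support assistant for an online store.",
        examples=[("Where is my order?", "Could you share your order number?")],
        history=[("user", "Hello"), ("bot", "Hi! How can I help?")],
        user_message="I want to return an item.",
    )
    print(len(request["instances"][0]["messages"]))
```

Keeping the full message history in each request is what lets the model stay context-aware across turns.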

Code Generation

  • Code-Specialized Models: Code Bison models trained on source code
  • Multi-Language Support: Generate code in various programming languages
  • Smart Completion: Context-aware code completion and generation
  • Documentation: Automatic code documentation and explanation

Text Embeddings

  • Semantic Vectors: Generate embeddings for text similarity and search
  • Gecko Models: Multiple embedding model variants available
  • Multilingual Support: Gecko Multilingual for cross-language tasks
  • Auto-Truncation: Automatic text truncation to fit token limits
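
For orientation, an embedding request with auto-truncation enabled might be shaped like the sketch below. The `autoTruncate` parameter and `content` field follow the Vertex AI text embedding request format as commonly documented; `build_embedding_request` is a hypothetical helper, not a function of the Robomotion package.

```python
def build_embedding_request(texts, auto_truncate=True):
    """Assemble an embedding request body for one or more texts (illustrative)."""
    texts = list(texts)
    if not texts:
        raise ValueError("at least one text is required")
    return {
        "instances": [{"content": text} for text in texts],
        # When True, over-long inputs are truncated instead of rejected.
        "parameters": {"autoTruncate": auto_truncate},
    }


if __name__ == "__main__":
    print(build_embedding_request(["red running shoes", "blue sandals"]))
```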

Multimodal Embeddings

  • Text and Image: Generate embeddings for text, images, or both
  • Unified Vector Space: Combined embeddings for cross-modal search
  • Image Understanding: Process images alongside text for rich context
  • Flexible Input: Support for text-only or image-only embeddings
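
The flexible input described above can be sketched as a request instance that carries text, an image, or both. The `bytesBase64Encoded` field follows the Vertex AI multimodal embedding request format as commonly documented; the helper name `build_multimodal_instance` is an assumption for this example.

```python
import base64


def build_multimodal_instance(text=None, image_bytes=None):
    """Build one multimodal embedding instance from text, image bytes, or both."""
    if text is None and image_bytes is None:
        raise ValueError("provide text, image_bytes, or both")

    instance = {}
    if text is not None:
        instance["text"] = text
    if image_bytes is not None:
        # Images travel as base64-encoded strings inside the JSON body.
        encoded = base64.b64encode(image_bytes).decode("ascii")
        instance["image"] = {"bytesBase64Encoded": encoded}
    return instance


if __name__ == "__main__":
    # Placeholder bytes stand in for real image data read from disk.
    print(build_multimodal_instance(text="a red bicycle", image_bytes=b"abc"))
```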

Authentication Options

The package supports two authentication methods:

  1. Connection ID: Use the Connect node to establish a reusable connection
  2. Direct Credentials: Provide Google Cloud service account credentials directly

Getting Started

Using a Connection ID

  1. Connect: Use the Connect node with your Google Cloud credentials and Project ID
  2. Use Connection ID: Pass the connection ID to other nodes
  3. Execute Operations: Use any generation or embedding node
  4. Disconnect: Optionally disconnect when done

Using Direct Credentials

  1. Configure Node: Add credentials and project ID directly to each node
  2. Execute: Run the node without requiring a Connect node

This approach is useful for one-off operations or testing.
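
Before pasting service account credentials into a node or vault item, it can help to sanity-check the key file. The sketch below assumes the credentials are a standard Google Cloud service account JSON key; the helper `check_service_account_key` is hypothetical, and how the Robomotion nodes consume the credentials is not covered here.

```python
import json

# Fields present in a standard Google Cloud service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(raw_json):
    """Validate a service account key and return its project ID."""
    key = json.loads(raw_json)
    missing = [field for field in REQUIRED_FIELDS if not key.get(field)]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    if key["type"] != "service_account":
        raise ValueError("expected a key of type 'service_account'")
    return key["project_id"]


if __name__ == "__main__":
    # Placeholder values only; a real key file contains an actual private key.
    sample = json.dumps({
        "type": "service_account",
        "project_id": "my-project",
        "private_key": "PLACEHOLDER",
        "client_email": "robot@my-project.iam.gserviceaccount.com",
    })
    print(check_service_account_key(sample))
```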

Common Use Cases

Content Generation

  • Generate product descriptions and marketing copy
  • Create blog posts and articles
  • Write email templates and responses
  • Produce creative content variations

Conversational AI

  • Build customer support chatbots
  • Create virtual assistants
  • Develop interactive Q&A systems
  • Implement multi-turn dialogue systems

Code Automation

  • Generate boilerplate code
  • Create code documentation
  • Build code completion systems
  • Generate test cases and examples

Semantic Search

  • Build document search systems
  • Implement recommendation engines
  • Create content similarity matching
  • Develop semantic clustering solutions
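
Document search and similarity matching typically come down to comparing embedding vectors. The sketch below ranks documents by cosine similarity to a query vector. The tiny two-dimensional vectors are made up for illustration; real embeddings from the embedding nodes have many more dimensions, and the function names are assumptions for this example.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat zero vectors as having no similarity.
    return dot / (norm_a * norm_b)


def rank_documents(query_vector, document_vectors, top_n=3):
    """Return the top_n (doc_id, score) pairs, most similar first."""
    scored = [
        (doc_id, cosine_similarity(query_vector, vector))
        for doc_id, vector in document_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    for doc_id, score in rank_documents([1.0, 0.0], docs, top_n=2):
        print(doc_id, round(score, 3))
```

The same scoring can feed recommendation or clustering logic, since both rely on a notion of distance between vectors.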

Document Processing

  • Extract information from documents
  • Classify and categorize text
  • Find similar documents
  • Generate document summaries

Model Selection Guide

Text Generation

  • text-bison@001: Stable version for production
  • text-bison: Latest version with improvements
  • text-bison-32k: Extended context window (32k tokens)

Chat

  • chat-bison@001: Stable version for production
  • chat-bison: Latest version with improvements
  • chat-bison-32k: Extended context window for longer conversations

Code Generation

  • code-bison: Latest code generation model

Embeddings

  • textembedding-gecko@001: Stable embedding model
  • textembedding-gecko@latest: Latest embedding improvements
  • textembedding-gecko-multilingual@latest: Multilingual embeddings

Multimodal

  • multimodalembedding@001: Text and image embeddings

Configuration

Regional Endpoints

  • Default location: us-central1
  • Other regions available based on your Google Cloud project setup
  • Choose regions closer to your users for lower latency

Model Publisher

  • Default publisher: google
  • Typically remains unchanged unless using custom models
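
To show where the location and publisher settings end up, the sketch below builds the regional predict URL in the form Vertex AI uses for publisher models. The helper `build_predict_url` and the project ID `my-project` are placeholders for this example; the nodes assemble the endpoint for you from their options.

```python
def build_predict_url(project_id, model, location="us-central1", publisher="google"):
    """Build the regional Vertex AI predict URL for a publisher model."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1"
        f"/projects/{project_id}/locations/{location}"
        f"/publishers/{publisher}/models/{model}:predict"
    )


if __name__ == "__main__":
    print(build_predict_url("my-project", "text-bison@001"))
    print(build_predict_url("my-project", "textembedding-gecko@001",
                            location="europe-west4"))
```

Note that the location appears twice, once in the hostname and once in the resource path, so both change together when you switch regions.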

Custom Models

All generation nodes support custom model names for:

  • Fine-tuned models
  • Custom deployed models
  • Experimental model versions

Best Practices

Performance

  • Reuse connection IDs across multiple operations
  • Batch embedding operations when possible
  • Choose appropriate regions for your workload
  • Cache responses for repeated queries
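
Batching and caching can be sketched independently of any particular node. In the example below, `chunked` splits a list of texts into batches and `make_cached` wraps a lookup function so repeated prompts are served from memory. Both helpers, the batch size, and the stand-in `fake_generate` function are assumptions for illustration; check the batch limits of the model you use.

```python
from functools import lru_cache


def chunked(items, batch_size):
    """Split items into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    items = list(items)
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def make_cached(fetch, maxsize=256):
    """Wrap fetch(prompt) so identical prompts are answered from an in-memory cache."""
    return lru_cache(maxsize=maxsize)(fetch)


if __name__ == "__main__":
    call_log = []

    def fake_generate(prompt):
        # Stand-in for a real generation call; records each actual invocation.
        call_log.append(prompt)
        return prompt.upper()

    generate = make_cached(fake_generate)
    generate("hello")
    generate("hello")  # Served from cache, no second invocation.
    print(len(call_log))
    print(chunked(["a", "b", "c", "d", "e"], 2))
```

Caching only makes sense for deterministic settings such as temperature 0, since cached responses hide the variation that higher temperatures would otherwise produce.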

Cost Optimization

  • Pin stable model versions (@001) so behavior and costs stay predictable across runs
  • Set appropriate max token limits
  • Use embeddings for similarity instead of full generation
  • Monitor usage through Google Cloud Console

Quality

  • Adjust temperature based on use case (0 for deterministic, higher for creative)
  • Provide clear, specific prompts
  • Use examples in chat for better responses
  • Test with candidate count > 1 to compare outputs

Security

  • Store credentials in Robomotion vault
  • Use service accounts with minimal required permissions
  • Rotate API credentials regularly
  • Monitor API usage for anomalies

Available Nodes