Get Text Embeddings

Generates semantic vector embeddings for text using Google Vertex AI's Gecko embedding models.

Common Properties

  • Name - The custom name of the node.
  • Color - The custom color of the node.
  • Delay Before (sec) - Waits in seconds before executing the node.
  • Delay After (sec) - Waits in seconds after executing the node.
  • Continue On Error - Automation will continue regardless of any error. The default value is false.
Note: If the Continue On Error property is set to true, no error is caught when the project is executed, even if a Catch node is used.

Inputs

  • Connection Id - Vertex AI client session identifier from Connect node (optional if credentials provided directly).
  • Credentials - Google Cloud service account credentials (optional if using Connection ID).
  • Project Id - Google Cloud Project ID (required if using direct credentials).
  • Text - Text content to generate embeddings for (limit: 5 texts of up to 3072 tokens each).

Options

Model Configuration

  • Auto Truncate - Automatically truncate text that exceeds token limit. Default is true.
  • Model - Vertex AI text embedding model to use:
    • textembedding-gecko@001 - Stable version for production use
    • textembedding-gecko@latest - Latest model with improvements
    • textembedding-gecko-multilingual@latest - Multilingual support
    • Custom Model - Specify your own model name
  • Custom Model - Custom model name when "Custom Model" is selected.

Endpoint Configuration

  • Locations - Google Cloud region for the Vertex AI endpoint. Default is "us-central1".
  • Publishers - Model publisher (typically "google"). Default is "google".

Output

  • Response - Full API response object containing text embedding vectors.

Response structure:

```json
{
  "predictions": [
    {
      "embeddings": {
        "values": [0.123, -0.456, 0.789, ...],
        "statistics": {
          "truncated": false,
          "token_count": 15
        }
      }
    }
  ]
}
```
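A minimal sketch of pulling the vector out of a response shaped like the one above (field names as documented; the helper name is illustrative):

```javascript
// Extract the first embedding vector and its statistics from a
// Vertex AI predict response (shape shown above).
function extractEmbedding(response) {
  const prediction = response.predictions && response.predictions[0];
  if (!prediction) throw new Error("No predictions in response");
  const { values, statistics } = prediction.embeddings;
  return {
    vector: values,                  // e.g. 768 numbers for Gecko models
    truncated: statistics.truncated, // whether auto-truncate fired
    tokenCount: statistics.token_count,
  };
}
```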

How It Works

The Get Text Embeddings node converts text into high-dimensional vector representations. When executed:

  1. Validates connection (either via Connection ID or direct credentials)
  2. Retrieves authentication token and project ID
  3. Validates that text input is not empty
  4. Configures the embedding model and endpoint
  5. Constructs request payload with text and auto-truncate setting
  6. Sends POST request to Vertex AI predict endpoint
  7. Processes response and extracts embedding vectors
  8. Returns complete response object with embeddings and metadata

The embeddings are numerical vectors (typically 768 dimensions for Gecko models) that capture semantic meaning, enabling similarity comparisons and machine learning tasks.
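The steps above amount to one authenticated POST against the predict endpoint. A sketch in JavaScript, assuming a pre-acquired access token (the Connect node's job) and the standard Vertex AI request shape; the node's internal implementation may differ:

```javascript
// Build the request URL and payload for the Vertex AI predict endpoint.
function buildPredictRequest(projectId, texts, {
  location = "us-central1",
  model = "textembedding-gecko@001",
  autoTruncate = true,
} = {}) {
  const url = `https://${location}-aiplatform.googleapis.com/v1/projects/${projectId}` +
    `/locations/${location}/publishers/google/models/${model}:predict`;
  const body = {
    instances: texts.map((content) => ({ content })), // up to 5 texts per request
    parameters: { autoTruncate },
  };
  return { url, body };
}

// Send the request; token acquisition and refresh are out of scope here.
async function getEmbeddings(accessToken, projectId, texts, options) {
  const { url, body } = buildPredictRequest(projectId, texts, options);
  const res = await fetch(url, {
    method: "POST",
    headers: { Authorization: `Bearer ${accessToken}`, "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`Vertex AI predict failed: ${res.status}`);
  return res.json(); // { predictions: [{ embeddings: { values, statistics } }] }
}
```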

Requirements

  • Either:
    • Connection ID from Connect node, OR
    • Direct credentials + Project ID
  • Text input (non-empty, max 3072 tokens)
  • Vertex AI API enabled in Google Cloud project
  • IAM permissions: aiplatform.endpoints.predict

Error Handling

Common errors and solutions:

| Error | Cause | Solution |
|---|---|---|
| ErrInvalidArg | Empty text input | Provide valid text content |
| ErrInvalidArg | Connection ID or credentials missing | Use Connect node or provide credentials |
| ErrInvalidArg | Empty model selection | Select a valid embedding model |
| ErrNotFound | Connection not found | Verify Connection ID from Connect node |
| ErrStatus | API error (quota, permissions) | Check Google Cloud Console for API status |
| Token count exceeded | Text too long without auto-truncate | Enable Auto Truncate or reduce text length |

Example Use Cases

Semantic Document Search

Scenario: Find similar support tickets
1. Connect to Vertex AI
2. For each document:
- Get Text Embeddings
- Store embedding vector in database
3. For search query:
- Get Text Embeddings
- Calculate cosine similarity with stored embeddings
- Return top matches
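The "return top matches" step can be sketched as a linear scan over stored vectors. Since Gecko embeddings are pre-normalized, the dot product equals cosine similarity:

```javascript
// Rank stored embeddings against a query embedding and return the k best.
// Vectors are assumed unit-length, so dot product == cosine similarity.
function topMatches(queryVec, stored, k = 3) {
  return stored
    .map(({ id, vector }) => ({
      id,
      score: vector.reduce((sum, v, i) => sum + v * queryVec[i], 0),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

For large collections, replace the linear scan with a vector database lookup (see Vector Storage Best Practices below).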

Semantic Text Classification

Use Case: Categorize customer feedback
1. Generate embeddings for category examples:
- "Product quality issue" → embedding_1
- "Shipping delay" → embedding_2
- "Billing question" → embedding_3
2. For new feedback:
- Get Text Embeddings
- Find nearest category embedding
- Classify based on highest similarity
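The nearest-category step can be sketched as follows (category labels and vectors are illustrative; vectors assumed unit-normalized):

```javascript
// Classify a feedback embedding by the category with the highest
// cosine similarity (dot product on unit-normalized vectors).
function classify(feedbackVec, categories) {
  let best = { label: null, score: -Infinity };
  for (const { label, vector } of categories) {
    const score = vector.reduce((sum, v, i) => sum + v * feedbackVec[i], 0);
    if (score > best.score) best = { label, score };
  }
  return best;
}
```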

Content Deduplication

Process: Remove duplicate articles
1. Generate embeddings for all articles
2. Calculate pairwise similarity matrix
3. Identify articles with >95% similarity
4. Mark duplicates for review/removal
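Steps 2-3 can be sketched as a pairwise scan. This is quadratic in the number of articles, so it suits small sets; use a vector database with ANN search for large ones:

```javascript
// Flag article pairs whose embedding similarity exceeds a threshold.
// Vectors are assumed unit-normalized (dot product == cosine similarity).
function findDuplicatePairs(articles, threshold = 0.95) {
  const pairs = [];
  for (let i = 0; i < articles.length; i++) {
    for (let j = i + 1; j < articles.length; j++) {
      const sim = articles[i].vector.reduce(
        (sum, v, k) => sum + v * articles[j].vector[k], 0);
      if (sim > threshold) pairs.push([articles[i].id, articles[j].id, sim]);
    }
  }
  return pairs;
}
```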

Multilingual Search

Configuration:
- Model: textembedding-gecko-multilingual@latest
- Text: User query in any language
- Use embeddings to match queries across languages
- Search documents in multiple languages

Batch Embedding Generation

Flow:
1. Connect (once)
2. Loop through items:
- Get Text Embeddings for each item
- Store result with item ID
3. Disconnect
Benefit: Efficient for large datasets
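The loop can be sketched as a cache-backed helper, where `embed` is a stand-in for a single Get Text Embeddings call (hypothetical signature):

```javascript
// Embed a list of items, caching by item ID so repeated items
// skip the API call. `embed(text)` resolves to an embedding vector.
async function embedAll(items, embed, cache = new Map()) {
  const results = [];
  for (const { id, text } of items) {
    if (!cache.has(id)) {
      cache.set(id, await embed(text)); // one request per unique item
    }
    results.push({ id, vector: cache.get(id) });
  }
  return results;
}
```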

Tips

  • Auto Truncate: Enable for variable-length text to avoid errors
  • Model Selection:
    • Use @001 for stable, predictable embeddings
    • Use @latest for best quality
    • Use multilingual for cross-language tasks
  • Token Limits: Monitor token count in response statistics
  • Batch Processing: Reuse connection for multiple embedding requests
  • Similarity Metrics: Use cosine similarity for comparing embeddings
  • Caching: Cache embeddings for frequently accessed content
  • Normalization: Gecko embeddings are pre-normalized for cosine similarity
  • Dimensionality: Gecko models produce 768-dimensional vectors
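Since the similarity math above assumes unit-length vectors (see the normalization tip), a quick sanity check is cheap:

```javascript
// Verify a vector is (approximately) unit-length, so its dot product
// with another unit vector can stand in for cosine similarity.
function isUnitVector(vec, tolerance = 1e-3) {
  const norm = Math.sqrt(vec.reduce((sum, v) => sum + v * v, 0));
  return Math.abs(norm - 1) < tolerance;
}
```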

Common Patterns

Cosine Similarity Calculation

```javascript
// After getting embeddings for two texts
function cosineSimilarity(vec1, vec2) {
  const dotProduct = vec1.reduce((sum, val, i) => sum + val * vec2[i], 0);
  const mag1 = Math.sqrt(vec1.reduce((sum, val) => sum + val * val, 0));
  const mag2 = Math.sqrt(vec2.reduce((sum, val) => sum + val * val, 0));
  return dotProduct / (mag1 * mag2);
}
```

Vector Storage Best Practices

  • Store embeddings in vector databases (Pinecone, Weaviate, Milvus)
  • Use approximate nearest neighbor (ANN) algorithms for large datasets
  • Index embeddings for fast retrieval
  • Store metadata alongside embeddings for filtering

Performance Optimization

  • Connection Reuse: One connection for multiple embedding requests
  • Batch API Calls: Send up to 5 texts per request (not currently supported by this node)
  • Regional Endpoints: Use closest region to reduce latency
  • Caching Strategy: Cache embeddings for static content
  • Async Processing: Process embeddings in parallel when possible

Best Practices

  • Use consistent model versions in production for reproducibility
  • Store model version with embeddings for future reference
  • Validate text length before sending to API
  • Implement retry logic for transient API errors
  • Monitor token usage and costs in Google Cloud Console
  • Test with sample data before processing large datasets
  • Use embedding statistics to optimize text preprocessing
  • Document use cases and similarity thresholds for your application