Pinecone

The Pinecone package lets you integrate Pinecone's vector database into your RPA workflows. Pinecone is a fully managed, developer-friendly vector database designed for AI applications such as semantic search, recommendation systems, and Retrieval-Augmented Generation (RAG).

Features

  • Create and manage vector database indexes
  • Insert and update high-dimensional vector embeddings
  • Perform similarity searches with filtering capabilities
  • Fetch vectors by ID or query by similarity
  • Delete vectors individually, by filter, or in bulk
  • Get detailed statistics about index usage and namespaces
  • Support for metadata filtering and namespaces for data organization

Prerequisites

Before using the Pinecone package, you need to:

  1. Create a Pinecone account at pinecone.io
  2. Generate an API key from the Pinecone console at https://app.pinecone.io/organizations/-/projects/-/keys
  3. Store your API key as a credential in Robomotion
  4. Note your Pinecone environment (e.g., us-east-1-aws, us-west1-gcp)

Common Use Cases

  • Retrieval-Augmented Generation (RAG) - Store document embeddings for AI chatbots and question-answering systems
  • Semantic Search - Build intelligent search engines that understand meaning, not just keywords
  • Recommendation Systems - Find similar products, content, or users based on vector similarity
  • Document Similarity - Compare and cluster documents based on their semantic content
  • Image Search - Store and search image embeddings for visual similarity
  • Anomaly Detection - Identify outliers by finding vectors that are dissimilar to the norm
  • Content Deduplication - Detect duplicate or near-duplicate content using similarity search

Getting Started

The typical workflow involves:

  1. Connect - Establish a connection using your Pinecone API key
  2. Create Index - Set up a new index with appropriate dimensions and metrics
  3. Upsert - Insert vector embeddings with metadata
  4. Query - Search for similar vectors
  5. Fetch - Retrieve specific vectors by ID
  6. Delete - Remove vectors when no longer needed

Alternatively, you can provide the API key directly to each node without using the Connect node.
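
To make the order of operations concrete, here is a minimal in-memory sketch of the Create Index, Upsert, Query, Fetch, and Delete steps. It is not the Pinecone API or the Robomotion nodes: the `ToyIndex` class, its method signatures, and the sample IDs and metadata are all hypothetical, and no connection or API key is involved.

```python
import math


class ToyIndex:
    """In-memory stand-in for a vector index (illustration only)."""

    def __init__(self, dimension):
        self.dimension = dimension
        self.records = {}  # id -> (values, metadata)

    def upsert(self, vec_id, values, metadata=None):
        # Insert a new vector or overwrite the one stored under the same ID.
        if len(values) != self.dimension:
            raise ValueError(f"expected {self.dimension} values, got {len(values)}")
        self.records[vec_id] = (list(values), dict(metadata or {}))

    def query(self, values, top_k=3):
        # Rank every stored vector by cosine similarity to the query vector.
        scored = [(vec_id, _cosine(values, stored))
                  for vec_id, (stored, _) in self.records.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]

    def fetch(self, vec_id):
        # Return (values, metadata) for the ID, or None if it is not stored.
        return self.records.get(vec_id)

    def delete(self, vec_id):
        # Remove the vector if present; deleting a missing ID is a no-op.
        self.records.pop(vec_id, None)


def _cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


index = ToyIndex(dimension=3)                                  # Create Index
index.upsert("doc-1", [1.0, 0.0, 0.0], {"topic": "billing"})   # Upsert
index.upsert("doc-2", [0.0, 1.0, 0.0], {"topic": "shipping"})
matches = index.query([0.9, 0.1, 0.0], top_k=1)                # Query
record = index.fetch("doc-1")                                  # Fetch
index.delete("doc-2")                                          # Delete
```

In this sketch the query vector points almost the same way as `doc-1`, so `doc-1` is the top match. Upserting with an existing ID overwrites the stored vector rather than adding a second copy.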

Understanding Vector Databases

Vector databases like Pinecone store high-dimensional vectors (arrays of numbers) that represent data in a mathematical space. When you use AI models like OpenAI's text-embedding-ada-002 or other embedding models, they convert text, images, or other data into vectors. These vectors capture semantic meaning, allowing you to:

  • Find similar items by calculating distance between vectors
  • Filter results using metadata (e.g., category, date, author)
  • Scale to billions of vectors with fast query performance
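
The first two points can be sketched as a brute-force search over a handful of 2-dimensional vectors. This is only an illustration under stated assumptions: the `nearest` function, the exact-match `where` filter, and the sample data are hypothetical and do not reflect Pinecone's filter syntax. A real vector database also replaces the linear scan with an index structure, which is what makes the third point possible.

```python
import math

# Hypothetical sample data: each item has an ID, a vector, and metadata.
items = [
    {"id": "a", "values": [0.0, 0.0], "metadata": {"category": "news"}},
    {"id": "b", "values": [1.0, 1.0], "metadata": {"category": "blog"}},
    {"id": "c", "values": [0.2, 0.1], "metadata": {"category": "blog"}},
]


def nearest(items, query, top_k, where=None):
    """Return the IDs of the top_k items closest to query, optionally filtered.

    `where` is a hypothetical exact-match filter: every key/value pair in it
    must equal the corresponding entry in the item's metadata.
    """
    where = where or {}
    candidates = [
        item for item in items
        if all(item["metadata"].get(k) == v for k, v in where.items())
    ]
    # Smaller Euclidean distance means more similar.
    candidates.sort(key=lambda item: math.dist(item["values"], query))
    return [item["id"] for item in candidates[:top_k]]


unfiltered = nearest(items, [0.0, 0.0], top_k=2)
blogs_only = nearest(items, [0.0, 0.0], top_k=2, where={"category": "blog"})
```

Without a filter the two closest items are `a` and `c`. Restricting the search to the `blog` category drops `a` and returns `c` and `b` instead.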

Index Configuration

When creating an index, you need to specify:

  • Dimension - Must match your embedding model's output size (e.g., 1536 for OpenAI's text-embedding-ada-002)
  • Metric - Distance function for similarity:
    • cosine - Cosine similarity (most common; compares vector direction and ignores magnitude)
    • euclidean - Euclidean distance (straight-line distance; sensitive to differences in magnitude)
    • dotproduct - Dot product (cheaper to compute; ranks results the same as cosine when vectors are normalized to unit length)
  • Pod Type - Performance tier (p1, p2, s1)
  • Pods & Replicas - Scale and availability settings
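
The difference between the three metrics is easiest to see on a small example. The sketch below implements them in plain Python with hypothetical helper names. It shows the standard mathematical definitions, not Pinecone's internal implementation.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def cosine(a, b):
    # Dot product divided by both lengths: depends on direction only.
    return dot(a, b) / (norm(a) * norm(b))


def euclidean(a, b):
    # Straight-line distance: depends on direction and magnitude.
    return math.dist(a, b)


def normalize(a):
    # Scale a vector to unit length.
    n = norm(a)
    return [x / n for x in a]


a = [3.0, 4.0]
b = [6.0, 8.0]  # same direction as a, twice the length

cos_ab = cosine(a, b)                       # same direction, so close to 1
euc_ab = euclidean(a, b)                    # lengths differ, so distance is 5
dot_raw = dot(a, b)                         # grows with vector length
dot_unit = dot(normalize(a), normalize(b))  # matches cosine after normalizing
```

Because `a` and `b` point in the same direction, cosine treats them as identical while Euclidean distance does not. Once both vectors are scaled to unit length, the dot product gives the same score as cosine, which is why `dotproduct` suits embeddings that are already normalized.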

Namespaces

Namespaces allow you to partition vectors within a single index, useful for:

  • Multi-tenant applications (one namespace per user/organization)
  • Separating development, staging, and production data
  • Organizing vectors by category or data type
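
As a rough mental model, an index with namespaces behaves like a dictionary of separate partitions keyed by namespace name. The sketch below is a toy model with hypothetical function and tenant names, not Pinecone's storage layout or API.

```python
from collections import defaultdict

# namespace -> {vector id -> values}; each namespace is an isolated partition.
index = defaultdict(dict)


def upsert(namespace, vec_id, values):
    index[namespace][vec_id] = values


def ids_in(namespace):
    # Only the named partition is read; other namespaces are never scanned.
    return sorted(index.get(namespace, {}))


upsert("tenant-a", "doc-1", [0.1, 0.2])
upsert("tenant-b", "doc-1", [0.9, 0.8])  # same ID, different namespace
upsert("tenant-b", "doc-2", [0.5, 0.5])
```

In this model, `ids_in("tenant-a")` returns only that tenant's single vector. The two `doc-1` entries do not collide because each lives in its own partition.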