Upsert

Inserts new vectors into a Pinecone index or updates existing ones. "Upsert" combines update and insert: a vector whose ID already exists in the index is overwritten, and any other vector is added as new.

Common Properties

Name - The custom name of the node.
Color - The custom color of the node.
Delay Before (sec) - Waits in seconds before executing the node.
Delay After (sec) - Waits in seconds after executing the node.
Continue On Error - Automation will continue regardless of any error. The default value is false.

Info: If the ContinueOnError property is true, no error is caught when the project is executed, even if a Catch node is used.

Inputs

Connection Id - Connection identifier from Connect node (optional if API Key credential is provided directly).
Host URL - Your index's host URL. Format: https://INDEX_NAME-PROJECT_ID.svc.ENVIRONMENT.pinecone.io. You can find this in your Pinecone console under the index details.
Ids - Array of unique string identifiers for the vectors. Must have the same length as the Values array.
Values - Array of vector arrays (2D array of floats). Each inner array must have the same dimension as the index.

Options

API Key - Pinecone API key credential (optional - use this instead of Connection Id if not using Connect node).
Metadata - Array of metadata objects corresponding to each vector. Each metadata object can contain custom fields for filtering and retrieval. Optional but highly recommended.
Name Space - Namespace to upsert vectors into. Namespaces allow you to partition vectors within a single index. Optional, defaults to the default namespace.

Output

Response - Object containing the upsert response with the count of successfully upserted vectors.
```
{
  "upsertedCount": 10
}
```

How It Works

The Upsert node adds or updates vectors in your Pinecone index. When executed, the node:

Validates required inputs (Host URL, Ids, Values)
Combines Ids, Values, and optional Metadata into vector objects
Constructs the upsert request with namespace if specified
Sends a POST request to the index's /vectors/upsert endpoint
Returns the number of successfully upserted vectors
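
The steps above can be sketched in plain JavaScript. This is an illustrative reconstruction, not the node's actual implementation: the helper name `buildUpsertBody` and its error messages are hypothetical. The body shape (a `vectors` array of `{ id, values, metadata }` objects plus an optional `namespace`) is assumed to match the request format of Pinecone's upsert endpoint.

```javascript
// Hypothetical helper: combines the parallel Ids, Values, and Metadata
// arrays into a single upsert request body.
function buildUpsertBody(ids, values, metadatas, namespace) {
  if (!Array.isArray(ids) || ids.length === 0) {
    throw new Error("Ids cannot be empty");
  }
  if (!Array.isArray(values) || values.length !== ids.length) {
    throw new Error("Ids and Values must have the same length");
  }

  const vectors = ids.map((id, i) => {
    const vector = { id, values: values[i] };
    // Metadata is optional; attach it only when one was given for this vector.
    if (metadatas && metadatas[i] !== undefined) {
      vector.metadata = metadatas[i];
    }
    return vector;
  });

  const body = { vectors };
  // Leave the field out entirely so the default namespace is used.
  if (namespace) {
    body.namespace = namespace;
  }
  return body;
}

const exampleBody = buildUpsertBody(
  ["doc_1", "doc_2"],
  [[0.1, 0.2], [0.3, 0.4]],
  [{ category: "education" }, { category: "advanced" }],
  "documents"
);
console.log(JSON.stringify(exampleBody, null, 2));
```

The two-element vectors are only for readability; real vectors must match the index dimension.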

Requirements

An existing Pinecone index in ready status
Vector embeddings generated from your data (using OpenAI, Cohere, or other embedding models)
IDs must be unique strings
Vector dimensions must match the index dimension exactly

Error Handling

The node will return specific errors in the following cases:

ErrInvalidArg - Host URL is empty
ErrInvalidArg - Ids or Values array is empty or null
ErrInvalidArg - Invalid Connection ID or missing API key
ErrInternal - Response format is not valid
ErrStatus - HTTP error from Pinecone API (dimension mismatch, invalid data, etc.)

Usage Notes

Each ID must be unique within the namespace
If an ID already exists, the vector and metadata are completely replaced
All vectors in the Values array must have the same dimension
The Ids and Values arrays must have the same length
Metadata arrays (if provided) must also match the length of Ids/Values
Pinecone limits the size of each upsert request (typically 100-1000 vectors per request)
For large datasets, split the data into multiple Upsert operations
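
Several of these rules can be checked before the Upsert node runs. The following is a minimal sketch of such a pre-flight check; `findUpsertProblems` is a hypothetical helper, not part of the node, and it only covers the length and uniqueness rules listed above.

```javascript
// Hypothetical pre-flight check: returns a list of human-readable problems,
// or an empty array when the inputs satisfy the length and uniqueness rules.
function findUpsertProblems(ids, values, metadatas) {
  const problems = [];

  if (ids.length !== values.length) {
    problems.push(`Ids (${ids.length}) and Values (${values.length}) differ in length`);
  }
  if (metadatas && metadatas.length !== ids.length) {
    problems.push(`Metadata (${metadatas.length}) and Ids (${ids.length}) differ in length`);
  }

  // Each ID must be a string and must appear only once in the batch.
  const seen = new Set();
  for (const id of ids) {
    if (typeof id !== "string") {
      problems.push(`Id is not a string: ${String(id)}`);
    } else if (seen.has(id)) {
      problems.push(`Duplicate id: ${id}`);
    }
    seen.add(id);
  }
  return problems;
}

const exampleProblems = findUpsertProblems(
  ["a", "b", "a"],
  [[0.1], [0.2]],
  [{}, {}, {}]
);
console.log(exampleProblems);
```

Here the check reports two problems: the Ids and Values lengths differ, and the ID "a" is repeated.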

Best Practices

Always include meaningful metadata for filtering (e.g., category, date, source)
Use descriptive IDs that you can reference later (e.g., doc_123, product_abc)
Batch vectors in groups of 100-200 for optimal performance
Use namespaces to separate different data types or tenants
When using the dotproduct metric, normalize vectors to unit length before upserting if you want scores equivalent to cosine similarity
Store original text or image references in metadata for retrieval
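
For the normalization advice above, a minimal sketch of scaling a vector to unit length (L2 norm of 1) is shown here. The `normalize` helper is illustrative and not part of the node. The dot product of two unit-length vectors equals their cosine similarity, which is why normalization matters for a dotproduct index.

```javascript
// Scales a vector so that its L2 norm is 1.
function normalize(vector) {
  const norm = Math.sqrt(vector.reduce((sum, x) => sum + x * x, 0));
  // A zero vector has no direction, so return an unchanged copy.
  if (norm === 0) {
    return vector.slice();
  }
  return vector.map((x) => x / norm);
}

const unitVector = normalize([3, 4]);
console.log(unitVector); // [0.6, 0.8]
```

Apply the same normalization to query vectors, otherwise the scores are no longer comparable.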

Metadata Best Practices

Metadata is crucial for effective vector search:

Keep metadata fields simple and queryable
Use consistent field names across all vectors
Common metadata fields:
- text - Original text content
- source - Document or source identifier
- category - Classification or type
- date - Timestamp or date
- author - Creator or owner
- url - Link to original content

Example: Upserting Document Embeddings

Input Preparation (JavaScript)

// Assume you have documents and their embeddings
const documents = [
  { id: "doc_1", text: "Introduction to AI", category: "education" },
  { id: "doc_2", text: "Machine Learning Basics", category: "education" },
  { id: "doc_3", text: "Deep Learning Guide", category: "advanced" }
];

// Get embeddings from OpenAI or another model.
// Values are truncated here; each vector has 1536 dimensions for text-embedding-ada-002.
const embeddings = [
  [0.1, 0.2, 0.3, /* ... */],
  [0.4, 0.5, 0.6, /* ... */],
  [0.7, 0.8, 0.9, /* ... */]
];

// Prepare for Upsert node
const ids = documents.map(d => d.id);
const values = embeddings;
const metadatas = documents.map(d => ({
  text: d.text,
  category: d.category,
  indexed_at: new Date().toISOString()
}));

// Set variables for Upsert node
msg.ids = ids;
msg.values = values;
msg.metadatas = metadatas;

Upsert Node Configuration

Connection Id: {{connection_id}}
Host URL: https://my-index-abc123.svc.us-east-1-aws.pinecone.io
Ids: {{ids}}
Values: {{values}}
Metadata: {{metadatas}}
Name Space: documents

Example: Upserting Product Embeddings

Input Preparation

const products = [
  {
    id: "prod_001",
    name: "Wireless Headphones",
    price: 99.99,
    category: "electronics"
  },
  {
    id: "prod_002",
    name: "Running Shoes",
    price: 79.99,
    category: "sports"
  }
];

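// Generate embeddings from product descriptions.
// generateEmbeddings is a placeholder for your own embedding call;
// await must run inside an async function.
const embeddings = await generateEmbeddings(products.map(p => p.name));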

msg.ids = products.map(p => p.id);
msg.values = embeddings;
msg.metadatas = products.map(p => ({
  name: p.name,
  price: p.price,
  category: p.category,
  in_stock: true
}));

Example: Using Namespaces

// Separate customer data by organization
msg.namespace = "org_abc123";
msg.ids = ["cust_1", "cust_2"];
msg.values = [[0.1, 0.2, /* ... */], [0.3, 0.4, /* ... */]];
msg.metadatas = [
  { name: "John Doe", tier: "premium" },
  { name: "Jane Smith", tier: "basic" }
];

Batch Processing Large Datasets

// allVectors is assumed to be an array of { id, embedding, metadata } objects
const BATCH_SIZE = 100;

for (let i = 0; i < allVectors.length; i += BATCH_SIZE) {
  const batch = allVectors.slice(i, i + BATCH_SIZE);

  msg.ids = batch.map(v => v.id);
  msg.values = batch.map(v => v.embedding);
  msg.metadatas = batch.map(v => v.metadata);

  // Process batch with Upsert node
  // Add delay between batches to avoid rate limits
}
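
The loop above overwrites `msg` on every pass, so it only works when the Upsert node runs inside the loop. As an alternative sketch, the batches can be built up front and then iterated by the flow. The helper name `toBatches` and the sample data are hypothetical.

```javascript
// Hypothetical helper: splits vectors into Upsert-ready payloads of at most
// batchSize entries each.
function toBatches(allVectors, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new Error("batchSize must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < allVectors.length; i += batchSize) {
    const batch = allVectors.slice(i, i + batchSize);
    batches.push({
      ids: batch.map((v) => v.id),
      values: batch.map((v) => v.embedding),
      metadatas: batch.map((v) => v.metadata)
    });
  }
  return batches;
}

// 250 placeholder vectors with two dimensions each, for illustration only.
const sampleVectors = Array.from({ length: 250 }, (_, i) => ({
  id: `doc_${i}`,
  embedding: [i, i + 1],
  metadata: { position: i }
}));

const sampleBatches = toBatches(sampleVectors, 100);
console.log(sampleBatches.map((b) => b.ids.length)); // [100, 100, 50]
```

Each element of the result carries the `ids`, `values`, and `metadatas` arrays for one Upsert execution.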

Troubleshooting

Error: "Dimension mismatch"

Verify your embeddings have the correct dimension
Check that all vectors in the Values array have the same length
Ensure the dimension matches your index configuration
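
To locate the offending vectors, a check like the following can run before the Upsert node. This is a minimal sketch: `findDimensionMismatches` is a hypothetical helper, and the expected dimension has to come from your own index configuration.

```javascript
// Hypothetical helper: reports the position and length of every vector whose
// length differs from the index dimension.
function findDimensionMismatches(values, indexDimension) {
  const mismatches = [];
  values.forEach((vector, position) => {
    if (!Array.isArray(vector) || vector.length !== indexDimension) {
      mismatches.push({
        position,
        length: Array.isArray(vector) ? vector.length : null
      });
    }
  });
  return mismatches;
}

const dimensionMismatches = findDimensionMismatches(
  [[0.1, 0.2, 0.3], [0.4, 0.5], [0.7, 0.8, 0.9]],
  3
);
console.log(dimensionMismatches); // [{ position: 1, length: 2 }]
```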

Error: "Ids cannot be empty"

Ensure the Ids array is populated before the Upsert node
Check that the variable reference is correct (e.g., {{ids}})

Error: "Arrays length mismatch"

Ids, Values, and Metadata arrays must all have the same length
Verify array construction in your preparation code

Upsert succeeds but vectors not found in queries

Check that you're querying the correct namespace
Wait a few seconds for the index to update (eventual consistency)
Verify the vectors were actually upserted (check response count)
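
Checking the response count can be as simple as comparing `upsertedCount` from the node's Response output with the number of IDs sent. A minimal sketch, where `allUpserted` is a hypothetical helper:

```javascript
// Hypothetical helper: true when the response reports exactly as many
// upserted vectors as were sent.
function allUpserted(response, sentCount) {
  return Boolean(response) && response.upsertedCount === sentCount;
}

console.log(allUpserted({ upsertedCount: 10 }, 10)); // true
console.log(allUpserted({ upsertedCount: 8 }, 10)); // false
```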

Rate limit errors

Reduce batch size
Add delays between consecutive upserts
Upgrade your Pinecone plan for higher throughput
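
One common way to space out retries after a rate limit error is exponential backoff: the wait doubles after each failed attempt, up to a cap. The sketch below only computes the waits; the helper name `backoffDelays` and the numbers are illustrative assumptions, not Pinecone recommendations.

```javascript
// Returns the wait before each retry in milliseconds: the base delay doubles
// per attempt and never exceeds maxMs.
function backoffDelays(attempts, baseMs, maxMs) {
  const delays = [];
  for (let attempt = 0; attempt < attempts; attempt++) {
    delays.push(Math.min(baseMs * 2 ** attempt, maxMs));
  }
  return delays;
}

console.log(backoffDelays(5, 500, 5000)); // [500, 1000, 2000, 4000, 5000]
```

The results are in milliseconds, so divide by 1000 before using them with the Delay Before (sec) property.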
