Compare Embeddings
Compares embeddings of texts to find similarities using various similarity metrics.
Common Properties
- Name - The custom name of the node.
- Color - The custom color of the node.
- Delay Before (sec) - Number of seconds to wait before executing the node.
- Delay After (sec) - Number of seconds to wait after executing the node.
- Continue On Error - Automation will continue regardless of any error. The default value is false.
Info: If the ContinueOnError property is true, errors are not caught when the project is executed, even if a Catch node is used.
Inputs
- Connection Id - The connection ID obtained from the Connect node.
- Source Text - The primary text that the comparison texts are compared against.
- Comparison Texts - Array of texts to compare with the source text.
- Source Embedding File - Path to a JSON file with a pre-computed source embedding (optional).
- Target Embeddings File - Path to a JSON file with an array of pre-computed target embeddings (optional).
- Comparison Texts File - Path to a text file with comparison texts, one per line (optional).
Options
- Embedding Model - The model to use for generating embeddings. Options include:
  - Text Embedding 004
  - Text Embedding 005
  - Multilingual Embedding 002
  - Custom Model
- Custom Model - The model name to use when "Custom Model" is selected for Embedding Model.
- Similarity Metric - Method for calculating similarity. Options include:
  - Cosine Similarity
  - Dot Product
  - Euclidean Distance
- Similarity Threshold - Minimum similarity score, from 0.0 to 1.0. Results that score below this value are filtered out.
- Max Results - Maximum number of results to return.
- Sort Order - Sort results by similarity score. Options are:
  - Highest First
  - Lowest First
- Timeout (seconds) - Request timeout in seconds (default: 60).
Output
- Similarities - The similarity results as structured data with scores and metadata.
- Results File Path - Path to the file containing detailed results in JSON format.
How It Works
The Compare Embeddings node calculates similarity scores between a source text and multiple comparison texts using Google's Gemini API. When executed, the node:
- Validates the provided connection ID and input texts or files
- Configures the embedding model based on the selected options
- Loads or generates embeddings for the source text and comparison texts
- Calculates similarity scores using the specified metric (cosine, dot product, or Euclidean)
- Filters results based on the similarity threshold if provided
- Sorts results according to the specified sort order
- Limits the number of results if a maximum is specified
- Saves detailed results to a JSON file and returns a summary
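The scoring, filtering, sorting, and limiting steps above can be sketched in plain Python. This is a minimal illustration, not the node's actual implementation: the function names, the metric keys, and the result fields (`index`, `score`) are all assumptions made for the example. In particular, the sketch applies the threshold to the raw score, and how the node treats Euclidean distance (where a lower value means a closer match) is not specified here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def dot_product(a, b):
    """Sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Straight-line distance; smaller values mean closer vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical metric keys, chosen for this example only.
METRICS = {
    "cosine": cosine_similarity,
    "dot": dot_product,
    "euclidean": euclidean_distance,
}


def rank_similarities(source, targets, metric="cosine",
                      threshold=None, max_results=None, highest_first=True):
    """Score each target against the source, then filter, sort, and limit."""
    score = METRICS[metric]
    results = [{"index": i, "score": score(source, t)}
               for i, t in enumerate(targets)]
    if threshold is not None:
        # Assumption: the threshold is compared against the raw score.
        results = [r for r in results if r["score"] >= threshold]
    results.sort(key=lambda r: r["score"], reverse=highest_first)
    if max_results is not None:
        results = results[:max_results]
    return results
```

For example, `rank_similarities([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], threshold=0.5)` keeps the first and third vectors and drops the orthogonal one, whose cosine similarity is 0.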
Requirements
- A valid Google Gemini API key
- Connection ID from a successful Connect node execution
- Either source text or a source embedding file
- Either comparison texts or a target embeddings file
Error Handling
The node will return specific errors in the following cases:
- Empty or invalid Connection ID
- Missing source text or embedding file
- Missing comparison texts or embeddings file
- Invalid similarity threshold value (must be between 0.0 and 1.0)
- Invalid max results value (must be at least 1)
- Invalid custom model name
- File I/O errors when reading embedding files
- API errors from Google's Gemini service
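If you build these values dynamically earlier in a flow, it can help to check them before they reach the node. The sketch below mirrors the connection ID, threshold, and max results rules listed above. The function name `validate_options` and the error messages are hypothetical and do not reproduce the node's own error output.

```python
def validate_options(connection_id, threshold=None, max_results=None):
    """Raise ValueError if a value would violate the documented input rules."""
    if not connection_id or not connection_id.strip():
        raise ValueError("Connection ID is empty")
    if threshold is not None and not (0.0 <= threshold <= 1.0):
        raise ValueError("Similarity threshold must be between 0.0 and 1.0")
    if max_results is not None and max_results < 1:
        raise ValueError("Max results must be at least 1")
```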
Usage Notes
- You can provide texts directly or load pre-computed embeddings from JSON files
- Comparison texts can also be loaded from a text file (one text per line)
- Cosine similarity is the default and most commonly used metric
- Results are automatically saved to a JSON file in a temporary directory
- The node supports timeout configuration for long-running operations
- For large datasets, consider using pre-computed embedding files to improve performance
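To make the file-based inputs concrete, here is a small Python sketch of reading a comparison texts file and a target embeddings file. It rests on assumptions that this page does not confirm: that blank lines in the text file are skipped, and that the embeddings file is a plain JSON array of numeric arrays. The helper names are hypothetical, so check a file produced by your own workflow before relying on this layout.

```python
import json
from pathlib import Path


def load_comparison_texts(path):
    """Read one comparison text per line, skipping blank lines (assumption)."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def load_target_embeddings(path):
    """Read embeddings, assuming a JSON array of numeric arrays."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, list) or not all(isinstance(v, list) for v in data):
        raise ValueError("expected a JSON array of embedding vectors")
    return [[float(x) for x in vector] for vector in data]
```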