Similarity

Calculates the cosine similarity between text embeddings and returns the most similar matches.

Common Properties

Name - The custom name of the node.
Color - The custom color of the node.
Delay Before (sec) - Number of seconds to wait before executing the node.
Delay After (sec) - Number of seconds to wait after executing the node.
Continue On Error - Automation will continue regardless of any error. The default value is false.

Info: If the ContinueOnError property is true, no error is caught when the project is executed, even if a Catch node is used.

Connection Id - The connection ID for the OpenAI service.
Search Embeddings CSV - File path to the CSV containing search embeddings.
Content Embeddings CSV - File path to the CSV containing content embeddings to compare against.

Similarity - The similarity results as a JSON object containing the most similar content items and their similarity scores.

The Similarity node calculates cosine similarity between text embeddings to find the most similar matches:

Validates the provided Connection Id and file paths
Reads the search embedding from the first row of the search embeddings CSV
Reads all content embeddings from the content embeddings CSV
Calculates cosine similarity between the search embedding and each content embedding
Returns the top N matches (based on the Matches option) sorted by similarity score
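
The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the node's actual implementation: the function names, the `text` column, and the zero-vector handling are hypothetical, and Connection Id validation is left out. Only the `embedding` column name and the default of 5 matches come from this page.

```python
import csv
import json
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: a zero vector is treated as not similar to anything
    return dot / (norm_a * norm_b)


def read_rows(path):
    """Read a CSV and decode the JSON stored in each row's "embedding" column."""
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    for row in rows:
        row["embedding"] = json.loads(row["embedding"])
    return rows


def top_matches(search_csv, content_csv, matches=5):
    """Score every content row against the first search row, best match first."""
    search_rows = read_rows(search_csv)
    if not search_rows:
        raise ValueError("search embeddings CSV has no rows")
    query = search_rows[0]["embedding"]  # only the first row is used as the query
    scored = [
        {
            "text": row.get("text", ""),  # hypothetical column holding the content text
            "similarity": cosine_similarity(query, row["embedding"]),
        }
        for row in read_rows(content_csv)
    ]
    scored.sort(key=lambda match: match["similarity"], reverse=True)
    return scored[:matches]
```

Calling `top_matches("search.csv", "content.csv", matches=3)` with two hypothetical file paths would return a list of at most three dictionaries, ordered from most to least similar.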

The node will return specific errors in the following cases:

The node uses cosine similarity to measure similarity between embeddings
Both CSV files should contain an "embedding" column with JSON-formatted embeddings
The search embeddings CSV should contain at least one row; the first row is used as the query embedding
The content embeddings CSV should contain all embeddings to compare against
The default number of matches returned is 5
The output contains both the content text and similarity scores for each match
Higher similarity scores indicate more similar content
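
To illustrate the expected file layout, here is a small sketch that writes and re-reads a CSV whose `embedding` column holds JSON-formatted vectors. The `text` column name, the sample rows, and the three-dimensional vectors are made up for illustration; real embeddings are much longer, and the exact columns the node expects besides `embedding` are not specified here.

```python
import csv
import io
import json

# Hypothetical rows: "text" is an assumed column name, "embedding" is the required one.
rows = [
    {"text": "refund policy", "embedding": [0.12, -0.03, 0.88]},
    {"text": "shipping times", "embedding": [0.40, 0.10, -0.25]},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["text", "embedding"])
writer.writeheader()
for row in rows:
    # Serialize each vector as a JSON string so it fits in a single CSV cell.
    writer.writerow({"text": row["text"], "embedding": json.dumps(row["embedding"])})

csv_text = buffer.getvalue()

# Reading the data back: decode the JSON cell to recover the list of floats.
parsed = [
    {"text": r["text"], "embedding": json.loads(r["embedding"])}
    for r in csv.DictReader(io.StringIO(csv_text))
]
```

Because the JSON string contains commas, the `csv` module quotes the cell automatically, so each embedding stays in a single column.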