Distance Metrics

VectorDB supports three distance metrics. The metric is set per collection at creation time and cannot be changed.

Cosine Similarity (cosine)

Measures the angle between two vectors, ignoring magnitude. Returns a score between -1 and 1, where 1 means identical direction.

Best for: Text embeddings, semantic search, most NLP use cases.

Most embedding models (OpenAI, sentence-transformers) are trained to work with cosine similarity. VectorDB automatically L2-normalizes vectors at upsert time when using cosine, so the inner product of normalized vectors equals the cosine similarity.

col = client.collections.create("articles", dim=384, distance_metric="cosine")
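The claim above can be verified with a short pure-Python sketch (not VectorDB code): after L2-normalizing two vectors, their inner product equals their cosine similarity.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity: dot product divided by the product of magnitudes."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a = [3.0, 4.0]
b = [4.0, 3.0]

# Inner product of the normalized vectors matches cosine similarity.
assert abs(dot(l2_normalize(a), l2_normalize(b)) - cosine(a, b)) < 1e-12
```

This is why normalizing at upsert time lets the engine run a cheap dot product at query time while still returning cosine scores.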

Euclidean Distance (l2)

Measures the straight-line distance between two points in vector space. Lower distance = more similar.

Best for: Image embeddings, coordinates, any embedding where magnitude carries meaning.

col = client.collections.create("images", dim=512, distance_metric="l2")
Note: Search results are returned as similarity scores (higher = more similar), even for L2. VectorDB converts the distance to a score internally.
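The exact conversion formula is internal to VectorDB and not documented here, but a hypothetical sketch shows the idea: any monotonically decreasing mapping of distance (such as 1 / (1 + d), assumed here for illustration) turns "lower distance = more similar" into "higher score = more similar" without changing the ranking.

```python
import math

def l2_distance(a, b):
    """Straight-line (Euclidean) distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def l2_score(a, b):
    # Hypothetical conversion for illustration; VectorDB's internal
    # formula may differ. Distance 0 maps to score 1.0, and larger
    # distances map to smaller scores, preserving the ordering.
    return 1.0 / (1.0 + l2_distance(a, b))

query = [0.0, 0.0]
near = [1.0, 0.0]
far = [5.0, 5.0]

# The nearer point gets the higher score.
assert l2_score(query, near) > l2_score(query, far)
assert l2_score(query, query) == 1.0
```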

Inner Product (ip)

Computes the dot product of two vectors. Higher = more similar.

Best for: Recommendation systems, collaborative filtering embeddings trained with inner product loss (e.g., some matrix factorization models).

col = client.collections.create("recommendations", dim=128, distance_metric="ip")
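A small pure-Python sketch (not VectorDB code) illustrates what distinguishes inner product from cosine: IP is sensitive to magnitude, so an item vector with twice the length scores twice as high against the same user vector even though its direction is unchanged. The `user`/`item` vectors are made-up examples.

```python
def inner_product(a, b):
    """Dot product of two equal-length vectors; higher = more similar."""
    return sum(x * y for x, y in zip(a, b))

user = [0.5, 1.0, -0.25]
item = [1.0, 2.0, 0.5]
boosted = [2.0, 4.0, 1.0]  # same direction as item, twice the magnitude

# Unlike cosine, IP rewards magnitude: the boosted item scores
# exactly twice as high.
assert inner_product(user, boosted) == 2 * inner_product(user, item)
```

This magnitude sensitivity is exactly what matrix-factorization models trained with inner product loss rely on, which is why such embeddings should not be normalized.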

Which Should I Use?

| Use Case                                  | Recommended Metric |
| ----------------------------------------- | ------------------ |
| Semantic text search                      | cosine             |
| OpenAI embeddings                         | cosine             |
| sentence-transformers                     | cosine             |
| Image similarity                          | l2                 |
| Geographic coordinates                    | l2                 |
| Recommendation systems (trained with IP)  | ip                 |
| Not sure                                  | cosine             |

When in doubt, use cosine. It's the default and works well for the vast majority of embedding models.