Distance Metrics

Euclidean, Manhattan, and why the choice matters for retrieval

Similarity measures how alike two vectors are. Distance measures how far apart they are. Both serve the same purpose—ranking candidates—but distance operates in reverse: lower is better.

Understanding the geometry of different distance metrics helps you choose the right one and interpret results correctly.

Euclidean Distance (L2)

Euclidean distance is the straight-line distance between two points—what you would measure with a ruler.

d(\vec{a}, \vec{b}) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2}

In 2D, this is the Pythagorean theorem. In higher dimensions, the same formula applies: sum the squared differences, take the square root.

Interactive: Euclidean distance

For example, from a = (0, 0) to b = (3.0, 2.0):

d = √[(x₂-x₁)² + (y₂-y₁)²]
d = √[(3.0-0)² + (2.0-0)²] = √13 ≈ 3.606
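The formula is a one-liner in NumPy (a minimal sketch, assuming NumPy is installed):

```python
import numpy as np

a = np.array([0.0, 0.0])
b = np.array([3.0, 2.0])

# Sum the squared differences, then take the square root
d = np.sqrt(np.sum((a - b) ** 2))  # equivalent: np.linalg.norm(a - b)
print(round(d, 3))  # 3.606
```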

Properties:

  • Always non-negative
  • Zero only when vectors are identical
  • Symmetric: d(a, b) = d(b, a)
  • Satisfies triangle inequality: d(a, c) ≤ d(a, b) + d(b, c)

For normalized vectors: Euclidean distance relates directly to cosine similarity: d(\vec{a}, \vec{b}) = \sqrt{2 - 2\cos(\theta)}

This means ranking by Euclidean distance on normalized vectors gives the same order as cosine similarity.
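A quick check of this equivalence (a sketch with random unit vectors; the names `query` and `docs` are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
query = rng.normal(size=8)
docs = rng.normal(size=(5, 8))

# Normalize everything to unit length
query /= np.linalg.norm(query)
docs /= np.linalg.norm(docs, axis=1, keepdims=True)

cos_sim = docs @ query                      # higher = more similar
euc = np.linalg.norm(docs - query, axis=1)  # lower = more similar

# Descending similarity order matches ascending distance order
assert np.array_equal(np.argsort(-cos_sim), np.argsort(euc))
```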

Manhattan Distance (L1)

Manhattan distance sums the absolute differences per dimension—the distance if you could only travel along axis-aligned paths (like a taxi in Manhattan).

d(\vec{a}, \vec{b}) = \sum_{i=1}^{n} |a_i - b_i|

Interactive: Manhattan distance

For the same two points, Manhattan (L1) gives |Δx| + |Δy| = 5.00, while Euclidean (L2) gives √(Δx² + Δy²) = 3.61. Purple path: Manhattan distance (along axes). Orange dashed: Euclidean (straight line).

Properties:

  • Also called L1 distance, taxicab distance, or city block distance
  • More robust to outliers than Euclidean
  • Treats all dimensions equally regardless of correlation

Manhattan distance is less commonly used in semantic search but has applications when individual feature differences matter independently.
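The L1 formula in NumPy, using the same example points as above (a minimal sketch):

```python
import numpy as np

a = np.array([0.0, 0.0])
b = np.array([3.0, 2.0])

# Sum of absolute per-dimension differences
l1 = np.sum(np.abs(a - b))
print(l1)  # 5.0
```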

Lp Distances

Euclidean and Manhattan are special cases of the Lp norm:

d_p(\vec{a}, \vec{b}) = \left(\sum_{i=1}^{n} |a_i - b_i|^p\right)^{1/p}

  • L1 (p=1): Manhattan distance
  • L2 (p=2): Euclidean distance
  • L∞ (p=∞): Maximum difference across dimensions

Higher p emphasizes the largest differences. Lower p treats all differences more equally.
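The general formula fits in one function, with p = ∞ handled as the maximum difference (a sketch; the `lp_distance` name is just for illustration):

```python
import numpy as np

def lp_distance(a, b, p):
    """Minkowski (Lp) distance; p = np.inf gives the maximum difference."""
    diff = np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float))
    if np.isinf(p):
        return diff.max()
    return (diff ** p).sum() ** (1.0 / p)

a, b = (0, 0), (3, 2)
print(lp_distance(a, b, 1))       # 5.0 (Manhattan)
print(lp_distance(a, b, 2))       # ≈ 3.606 (Euclidean)
print(lp_distance(a, b, np.inf))  # 3.0 (max difference)
```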

Comparing Metrics

Interactive: Compare distance metrics

  Point   L2     L1     L∞
  A       2.24   3.00   2.00
  B       2.24   3.00   2.00
  C       2.55   3.00   2.50

  L2 rank: A, B, C
  L1 rank: A, B, C
  L∞ rank: A, B, C

Consider two points in 2D:

  • Point A: (0, 0)
  • Point B: (3, 4)

  Metric            Distance
  Euclidean (L2)    5
  Manhattan (L1)    7
  L∞ (Chebyshev)    4

Different metrics give different distances—but more importantly, they can give different rankings when comparing multiple candidates.
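Those three numbers can be verified directly (a minimal NumPy sketch):

```python
import numpy as np

a = np.array([0.0, 0.0])
b = np.array([3.0, 4.0])
diff = np.abs(a - b)

print(np.sqrt((diff ** 2).sum()))  # 5.0 -- Euclidean (L2)
print(diff.sum())                  # 7.0 -- Manhattan (L1)
print(diff.max())                  # 4.0 -- L∞ (Chebyshev)
```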

Squared Euclidean Distance

Computing square roots is expensive. Since we only need rankings, we often use squared Euclidean distance:

d^2(\vec{a}, \vec{b}) = \sum_{i=1}^{n} (a_i - b_i)^2

This preserves ranking order (distances are non-negative, so squaring is monotone: if d(a, q) < d(b, q), then d²(a, q) < d²(b, q)) while avoiding the square root computation.

Most vector databases use squared Euclidean distance internally, even if they report it as "Euclidean" in the API.
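A quick demonstration that dropping the square root leaves rankings unchanged (a sketch with random vectors):

```python
import numpy as np

rng = np.random.default_rng(1)
query = rng.normal(size=16)
docs = rng.normal(size=(100, 16))

d2 = np.sum((docs - query) ** 2, axis=1)  # squared Euclidean: no sqrt
d = np.sqrt(d2)                           # true Euclidean

# sqrt is monotone on non-negative values, so the orderings agree
assert np.array_equal(np.argsort(d2), np.argsort(d))
```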

Distance and Similarity Relationship

Converting between distance and similarity, with cosine similarity 0.800 as the example:

  Similarity (higher = more similar): cosine 0.800, dot product 0.800
  Distance (lower = more similar): cosine distance 0.200, Euclidean (normalized) 0.632

Conversion formulas (normalized vectors):

cosine_dist = 1 - cosine_sim
euclidean = √(2 - 2 × cosine_sim)

For normalized vectors, there are clean conversions:

  Similarity    Distance      Relationship
  Cosine sim    Cosine dist   dist = 1 - sim
  Dot product   -             same as cosine for normalized vectors
  -             Euclidean     dist² = 2(1 - cosine_sim)

When your algorithm requires distances but you want cosine-like behavior, use Euclidean distance on normalized vectors.
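The conversion formulas, checked against the example values above (a plain-Python sketch):

```python
import math

cosine_sim = 0.8
cosine_dist = 1 - cosine_sim
euclidean = math.sqrt(2 - 2 * cosine_sim)

print(round(cosine_dist, 3))  # 0.2
print(round(euclidean, 3))    # 0.632
```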

Which Metric to Choose?

For semantic search: Use cosine similarity (or equivalently, Euclidean on normalized vectors). This is what embedding models are trained for.

For image embeddings: Often L2 distance, but check model documentation.

For sparse vectors: L1 or cosine, depending on the embedding type.

For user/item embeddings: Dot product often works, as magnitude may carry meaning.

The choice depends on how the embeddings were trained. Most text embedding models optimize for cosine similarity, so use that metric.

Metric Spaces and Indexing

The choice of metric affects which indexing algorithms apply:

HNSW works with any metric satisfying the triangle inequality; in practice this covers L1, L2, and cosine (which, on normalized vectors, ranks identically to L2).

IVF (Inverted File) uses any metric for clustering and search.

LSH (Locality Sensitive Hashing) requires metric-specific hash functions. Random projection LSH assumes L2 or cosine.

Vector databases typically support multiple metrics. Choose at index creation time—changing later requires rebuilding.

Key Takeaways

  • Euclidean (L2) distance measures straight-line distance; it is sensitive to large differences
  • Manhattan (L1) distance sums absolute differences; it is more robust to outliers
  • Squared Euclidean distance preserves rankings while avoiding expensive square roots
  • For normalized vectors, Euclidean distance and cosine similarity give the same ranking
  • Choose the metric your embedding model was trained for—typically cosine for text
  • Metric choice affects which indexing algorithms can be used