How to Use This Calculator
Start by selecting your vector dimension—anywhere from 2 to 10 components. Enter each element of vector a and vector b in the corresponding input fields. If your vectors have fewer dimensions than selected, simply pad them with zeros. The calculator instantly displays:
- Cosine similarity (ranges from −1 to 1)
- Cosine distance (1 minus similarity)
- The angle between vectors in degrees
A detailed breakdown of intermediate calculations—dot products, magnitudes, and the final similarity—appears below the results so you can verify each step.
Cosine Similarity Formula
Cosine similarity between two n-dimensional vectors a and b is their dot product divided by the product of their magnitudes. This gives the cosine of the angle between them directly from the components, so you never need to know the angle itself.
S_C = (a · b) / (‖a‖ × ‖b‖)
where:
a · b = a₁b₁ + a₂b₂ + ... + aₙbₙ
‖a‖ = √(a₁² + a₂² + ... + aₙ²)
‖b‖ = √(b₁² + b₂² + ... + bₙ²)
- S_C: Cosine similarity (ranges from −1 to 1)
- a · b: Dot product of vectors a and b
- ‖a‖: Magnitude (Euclidean norm) of vector a
- ‖b‖: Magnitude (Euclidean norm) of vector b
- n: Number of dimensions in both vectors
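The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation; the function name `cosine_similarity` and the sample vectors are assumptions chosen for the example.

```python
import math

def cosine_similarity(a, b):
    """Return (a . b) / (||a|| * ||b||) for two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    # Dot product: a1*b1 + a2*b2 + ... + an*bn
    dot = sum(x * y for x, y in zip(a, b))
    # Magnitudes (Euclidean norms) of each vector
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical sample vectors: dot product 24, both magnitudes 5
similarity = cosine_similarity([3, 4], [4, 3])
print(similarity)  # about 0.96, i.e. 24 / (5 * 5)
```

Working through the sample by hand gives the same intermediate values the breakdown describes: a · b = 3×4 + 4×3 = 24, ‖a‖ = ‖b‖ = 5, so S_C = 24 / 25.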
Understanding Cosine Similarity Values
The cosine similarity metric captures directional alignment without regard to scale. A value of 1 indicates vectors pointing in exactly the same direction. A value of 0 means orthogonal vectors (perpendicular at 90°). A value of −1 means vectors point in completely opposite directions.
This property makes cosine similarity invaluable in text analysis—two documents with identical word distributions but different lengths will have high similarity—and in recommendation engines, where direction in feature space matters more than magnitude.
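As a rough illustration of these three reference values, the sketch below computes the similarity and the corresponding angle for a same-direction pair, an orthogonal pair, and an opposite pair. The helper names and the word-count vectors are hypothetical; the clamp before `math.acos` is an added safeguard against floating-point rounding, not something the formula itself requires.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def angle_degrees(similarity):
    # Clamp to [-1, 1] so rounding error cannot push acos outside its domain
    clamped = max(-1.0, min(1.0, similarity))
    return math.degrees(math.acos(clamped))

# Hypothetical word counts: the long document repeats the short one ten times
doc_short = [2, 1, 3]
doc_long = [20, 10, 30]

same = cosine_similarity(doc_short, doc_long)    # close to 1
orthogonal = cosine_similarity([1, 0], [0, 1])   # 0
opposite = cosine_similarity([1, 2], [-1, -2])   # close to -1

for label, s in [("same", same), ("orthogonal", orthogonal), ("opposite", opposite)]:
    print(label, round(s, 6), round(angle_degrees(s), 3))
```

The two documents differ tenfold in length yet score essentially 1, because only the proportions of the word counts affect the result.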
Cosine Distance Explained
Cosine distance is simply the complement of similarity:
D_C = 1 − S_C
It quantifies dissimilarity on a scale from 0 (same direction) to 2 (opposite directions). However, cosine distance is not a true metric in the mathematical sense because it violates the triangle inequality: the direct distance from vector a to c can exceed the sum of the distances from a to b and from b to c. For clustering or distance-based algorithms, use Euclidean distance if a proper metric is required.
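To see the triangle inequality fail concretely, the sketch below uses three hypothetical 2D vectors spaced 45° apart. The function name `cosine_distance` and the chosen vectors are assumptions made for this example.

```python
import math

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# a and c are perpendicular; b sits halfway between them
a, b, c = [1, 0], [1, 1], [0, 1]

direct = cosine_distance(a, c)                         # 1 - cos(90 deg) = 1
via_b = cosine_distance(a, b) + cosine_distance(b, c)  # 2 * (1 - cos(45 deg)), about 0.586

# A true metric would guarantee direct <= via_b; here the opposite holds
print(direct > via_b)
```

Going through b is "shorter" than going directly from a to c, which a proper metric never allows.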
Key Considerations When Using Cosine Similarity
Avoid common pitfalls when interpreting or computing cosine similarity.
- Ignores magnitude entirely: Vectors [1, 1] and [10, 10] have a cosine similarity of 1 with each other because they point in the same direction. If vector size matters to your analysis, supplement with magnitude checks or use Euclidean distance instead.
- Zero vectors cause division errors: A zero vector (all elements zero) has a magnitude of 0, so the denominator of the formula is zero and cosine similarity is undefined. Always validate that input vectors are non-zero before computation.
- Negative values signal opposing directions: Don't assume negative cosine similarity means a poor match. In domains like sentiment analysis or directional data, negative similarity is meaningful and expected when vectors point in opposite directions.
- Weight term vectors sensibly for text: In NLP, raw term-frequency vectors let high-frequency words dominate the similarity. Because cosine similarity already ignores vector length, length normalization alone does not fix this; re-weight the terms with TF-IDF or sublinear scaling for robust text comparisons.
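The first two pitfalls can be handled defensively in code. The sketch below is one possible approach, with hypothetical helper names: it returns `None` for zero vectors instead of dividing by zero, and pairs the similarity with a Euclidean distance so that differences in size remain visible.

```python
import math

def safe_cosine_similarity(a, b):
    """Return the cosine similarity, or None when either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return None  # undefined: the denominator would be zero
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

small, large = [1, 1], [10, 10]

direction_match = safe_cosine_similarity(small, large)  # close to 1: same direction
size_gap = euclidean_distance(small, large)             # sqrt(162), about 12.73
undefined = safe_cosine_similarity([0, 0], [1, 2])      # None

print(direction_match, size_gap, undefined)
```

Whether to return `None`, raise an exception, or substitute a default value for zero vectors depends on the application; returning `None` here simply forces the caller to handle the case explicitly.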