What is an embedding?
An embedding is a vector of numbers that encodes the meaning of a text. Two sentences with similar meaning yield nearby vectors, even with no words in common — enabling semantic search rather than keyword search.
Measuring closeness
Similarity is usually the cosine between vectors: near 1 means similar meaning, near 0 means unrelated.
Scaling up
Comparing the query to every document is costly at scale. ANN indexes (like HNSW) find near-optimal neighbors in logarithmic time. In production, hybrid search (vectors + lexical BM25) is the safe default.