Embeddings & Vector Databases

Embeddings are the bridge between human language and machine math. They turn words, sentences, and even images into dense numerical vectors that capture semantic meaning.

What Are Embeddings?

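An embedding is a list of floating‑point numbers, typically 384 to 4096 of them (the vector's dimensions), that represents a piece of text. The key property is that semantically similar texts have embeddings that are close together in vector space. The classic word-vector example is that "king" minus "man" plus "woman" lands near "queen": relationships between concepts show up as directions in the space, so simple vector arithmetic can recover them.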
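The analogy arithmetic can be sketched with hand-made two-dimensional vectors. The numbers below are invented for illustration (first coordinate loosely "royalty", second loosely "maleness") and do not come from any real model; `toy_vectors` and `analogy` are hypothetical names.

```python
import math

# Hand-made toy vectors, NOT output from a real embedding model.
toy_vectors = {
    "king":  [0.9, 0.9],
    "queen": [0.9, 0.1],
    "man":   [0.1, 0.9],
    "woman": [0.1, 0.1],
    "apple": [0.0, 0.5],
}

def analogy(a, b, c, vectors):
    """Return the word whose vector is nearest to vectors[a] - vectors[b] + vectors[c]."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    # Exclude the three input words, as is customary for analogy queries.
    candidates = (w for w in vectors if w not in (a, b, c))
    return min(candidates, key=lambda w: math.dist(vectors[w], target))

nearest = analogy("king", "man", "woman", toy_vectors)
```

With real embeddings the result is only approximately equal to the target word's vector, which is why the lookup is a nearest-neighbor search rather than an exact match.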

How Embeddings Are Created

Modern embedding models (like OpenAI's text-embedding-3, Cohere Embed, or open‑source models like BGE and E5) are trained on vast corpora to map text to vectors. The training objective ensures that related concepts end up near each other.
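One way to make "related concepts end up near each other" concrete is a contrastive objective. The sketch below computes an InfoNCE-style loss for a single training example; choosing this loss is an assumption made for illustration, since the text above does not say which objective any particular model uses, and `info_nce_loss` and `temperature` are hypothetical names.

```python
import math

def _cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """Contrastive loss: low when the anchor is closer to the positive than to the negatives."""
    logits = [_cosine(anchor, positive) / temperature]
    logits += [_cosine(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    # Negative log-probability of picking the positive among all candidates.
    return log_denominator - logits[0]

# Toy vectors, invented for illustration.
loss_good = info_nce_loss([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
loss_bad = info_nce_loss([1.0, 0.0], [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
```

Minimizing a loss of this shape during training pulls matching pairs together and pushes non-matching pairs apart, which is the behavior described above.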

Cosine Similarity & Search

To compare embeddings, we measure cosine similarity: the cosine of the angle between two vectors. Values near 1 mean highly similar, values near 0 mean unrelated, and values near -1 mean opposite. This is the foundation of semantic search: embed the query, then return the documents whose embeddings are closest to the query embedding.
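A minimal sketch of cosine similarity and a brute-force semantic search built on it, using only the standard library. The document vectors are toy values standing in for real model output, and `search` and `doc_vecs` are hypothetical names.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def search(query_vec, doc_vecs, top_k=1):
    """Score every document against the query and return the best (score, doc_id) pairs."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return scored[:top_k]

# Toy vectors, invented for illustration.
doc_vecs = {
    "pets":    [0.9, 0.1, 0.0],
    "finance": [0.0, 0.2, 0.9],
}
results = search([0.8, 0.2, 0.1], doc_vecs, top_k=1)
```

Because cosine similarity ignores vector length, a short query and a long document can still score as close matches if they point in the same direction.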

Vector Databases

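Storing and querying millions of embeddings requires purpose‑built systems: Pinecone, Weaviate, Qdrant, Milvus, and pgvector (a PostgreSQL extension). Comparing a query against every stored vector becomes too slow at that scale, so these databases use Approximate Nearest Neighbor (ANN) algorithms like HNSW, which trade a small amount of accuracy for the ability to find similar vectors in milliseconds.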
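To show the shape of the interface such systems expose, here is a tiny in-memory store with the two core operations, insert and query. It is a sketch, not any product's API: `TinyVectorStore`, `upsert`, and `query` are hypothetical names, and the query is an exact linear scan, which is the baseline that ANN indexes such as HNSW are designed to avoid.

```python
import heapq
import math

def _normalize(vector):
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot use a zero vector")
    return [x / norm for x in vector]

class TinyVectorStore:
    """Exact (non-approximate) nearest-neighbor store; cost grows linearly with item count."""

    def __init__(self):
        self._vectors = {}

    def upsert(self, item_id, vector):
        # Store unit-length vectors so a dot product equals cosine similarity.
        self._vectors[item_id] = _normalize(vector)

    def query(self, vector, top_k=3):
        q = _normalize(vector)
        scored = (
            (sum(a * b for a, b in zip(q, v)), item_id)
            for item_id, v in self._vectors.items()
        )
        # Returns (score, item_id) pairs, best first.
        return heapq.nlargest(top_k, scored)

store = TinyVectorStore()
store.upsert("cat", [0.9, 0.1, 0.0])  # toy vectors, invented for illustration
store.upsert("dog", [0.8, 0.2, 0.1])
store.upsert("car", [0.0, 0.1, 0.9])
hits = store.query([1.0, 0.0, 0.0], top_k=2)
```

A production vector database keeps the same insert-and-query contract but replaces the linear scan with an index, and typically adds persistence and metadata filtering.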

RAG: The Killer Application

Retrieval‑Augmented Generation (RAG) combines embeddings with LLMs: (1) embed the user query, (2) find relevant documents via vector search, (3) inject the retrieved text into the LLM prompt as context. This gives LLMs access to private, up‑to‑date knowledge without fine‑tuning.
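The three steps can be sketched end to end as follows. To stay self-contained, `toy_embed` is a bag-of-words stand-in for a real embedding model (an assumption, and much cruder than a dense embedding), the documents are invented, and the final call to an LLM is left out because no specific API is assumed. All function names are hypothetical.

```python
import math
import re
from collections import Counter

def toy_embed(text):
    """Stand-in for a real embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, top_k=2):
    # Steps 1 and 2: embed the query, then rank documents by similarity.
    q = toy_embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, toy_embed(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(query, documents, top_k=2):
    # Step 3: inject the retrieved text into the prompt as context.
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, top_k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

documents = [
    "The refund window is 30 days from the delivery date.",
    "Support is available by email on weekdays.",
    "Shipping to Canada takes five to seven business days.",
]
prompt = build_prompt("How many days is the refund window?", documents, top_k=1)
```

The resulting `prompt` string is what would be sent to the LLM. In a real system the documents would be embedded once ahead of time and stored in a vector database instead of being re-embedded on every query.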
