Embeddings

Created 2 May 2025
Tags: learning, machine-learning, embeddings, vectors, representation

Every time you search by meaning rather than exact keywords, embeddings are doing the work. They’re one of those ideas that sound abstract until you see what they enable. Then they’re everywhere.

An embedding turns something (a word, a sentence, an image, a concept) into a list of numbers — a point in space. Things that are similar end up close together. Things that are different end up far apart. That’s the whole idea.


The Intuition

Imagine you could place every English word on a map. “King” and “queen” would be near each other. “Dog” and “puppy” closer still. “Banana” would be way off in another direction.

Now imagine that map has 768 dimensions instead of two. That’s an embedding space.

"king"   → [0.2, -0.4, 0.7, 0.1, ..., -0.3]
"queen"  → [0.21, -0.38, 0.69, 0.12, ..., -0.28]   ← very close!
"banana" → [-0.5, 0.8, -0.1, 0.9, ..., 0.4]         ← very far

The beautiful thing: nobody designed which dimension means what. The model that produces the vectors learned these representations from patterns in data. Meaning emerges from structure.
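“Close” and “far” are often measured with cosine similarity, the cosine of the angle between two vectors. Here is a minimal sketch, assuming we keep only the four leading components printed above as toy 4-dimensional vectors. The elided values are not reconstructed, and real vectors would come from a trained model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between a and b: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors: only the leading components shown in the text, not real embeddings.
king = [0.2, -0.4, 0.7, 0.1]
queen = [0.21, -0.38, 0.69, 0.12]
banana = [-0.5, 0.8, -0.1, 0.9]

king_queen = cosine_similarity(king, queen)    # close to 1: nearly the same direction
king_banana = cosine_similarity(king, banana)  # negative: pointing away from each other
```

Cosine similarity ignores vector length and compares direction only, which is one reason it is a common default for comparing text embeddings.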


What Makes Them Powerful

Arithmetic with meaning. The famous example:

king - man + woman ≈ queen

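Directions in embedding space encode relationships. The “gender” direction and the “royalty” direction are not designed in; they emerge from training. The analogy is approximate: the result lands near “queen” rather than exactly on it.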
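The arithmetic can be sketched with hand-made vectors. This is a toy illustration, assuming two invented axes ("royalty" and "gender") and made-up values. Real embeddings have no labelled axes. The hypothetical `analogy` helper leaves the three input words out of the search, as analogy evaluations commonly do.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Hypothetical 2-d vectors: axis 0 is "royalty", axis 1 is "gender".
# The values are invented for illustration only.
vectors = {
    "king":   [0.9,  0.8],
    "queen":  [0.9, -0.8],
    "man":    [0.1,  0.8],
    "woman":  [0.1, -0.8],
    "banana": [-0.7, 0.1],
}

def analogy(a, b, c):
    """Return the word nearest to a - b + c, excluding the three inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

result = analogy("king", "man", "woman")  # target is approximately [0.9, -0.8]
```

Subtracting "man" removes the gender component, adding "woman" puts the opposite one back, and the royalty component is left untouched.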

Context changes everything. Modern embeddings (from Transformers) give the same word different vectors depending on context:

  • “bank” in “river bank” → nature, geography
  • “bank” in “bank account” → finance, money

This context sensitivity is a large part of what makes AI feel like it “understands.”


Where You’ll Encounter Them

Use case            How embeddings help
RAG & Retrieval     Find relevant documents by meaning, not just keywords
Semantic search     “Show me articles about job loss” finds “unemployment” too
Recommendations     “Similar items” in any domain
Clustering          Automatically group related content
Inside every LLM    Tokens become embeddings as the first step inside a Transformer

If you’re building anything with AI that involves finding, matching, or comparing, you’re almost certainly using embeddings.
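The “Inside every LLM” entry above amounts to a table lookup: each token has an id, and the id selects one row of a matrix. Here is a minimal sketch, assuming a hypothetical four-word vocabulary and random values standing in for learned weights.

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable

# Hypothetical vocabulary mapping each token to a row index.
vocab = {"the": 0, "river": 1, "bank": 2, "account": 3}
dim = 4  # toy size; real models use far more dimensions

# One row per token id. Random here; in a real model these values are
# learned during training.
embedding_table = [[random.uniform(-1, 1) for _ in range(dim)] for _ in vocab]

def embed_tokens(tokens):
    """Map each token to its id, then look up that row of the table."""
    return [embedding_table[vocab[t]] for t in tokens]

vectors = embed_tokens(["river", "bank"])
```

This first-step lookup is context-free: “bank” gets the same row in “river bank” and “bank account”. The context-dependent vectors described earlier come from the layers that run after the lookup.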


The Practical Pipeline

This is how embeddings work in a real application (like RAG):

1. Take your documents → split into chunks
2. Run each chunk through an embedding model → get vectors
3. Store vectors in a vector database
4. User asks a question → embed the question too
5. Find the nearest vectors (most similar chunks)
6. Feed those chunks to an LLM as context → grounded answer
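The six steps can be sketched end to end in plain Python. This is a sketch under loud assumptions: `toy_embed` is a hypothetical word-count stand-in for a real embedding model (it matches shared words, not meaning), the “vector database” is just a list, and the final LLM call is left out. All function names are invented for illustration.

```python
import math
import re
from collections import Counter

def chunk(text, max_words=12):
    """Step 1: split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def toy_embed(text):
    """Step 2 stand-in: a sparse word-count vector, NOT a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_index(documents):
    """Steps 1-3: chunk, embed, and store (chunk, vector) pairs in a list."""
    return [(c, toy_embed(c)) for doc in documents for c in chunk(doc)]

def retrieve(index, question, k=2):
    """Steps 4-5: embed the question and return the k most similar chunks."""
    q = toy_embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [c for c, _ in ranked[:k]]

docs = [
    "To reset your password open the settings page and choose reset password.",
    "Invoices are emailed on the first day of each month.",
]
index = build_index(docs)
top = retrieve(index, "How do I reset my password?", k=1)

# Step 6: the retrieved chunks become context in the prompt sent to an LLM.
prompt = "Answer using only this context:\n" + "\n".join(top)
```

Swapping `toy_embed` for a real embedding model and the list for a vector database changes the quality and scale of retrieval, but the shape of the pipeline stays the same.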

Embedding Models Worth Knowing

  • OpenAI text-embedding-3 — High quality, paid API
  • BGE / Nomic — Open-source, excellent performance
  • all-MiniLM — Fast, lightweight, good for getting started
  • Cohere Embed — Strong multilingual support

Vector Databases

  • Pinecone — Managed, zero-ops
  • Weaviate — Open-source, hybrid search (vectors + keywords)
  • ChromaDB — Simplest option, great for prototyping
  • pgvector — PostgreSQL extension (use your existing DB)

What I’m Still Learning

  • How to choose the right chunk size (too small = no context, too big = noise)
  • When hybrid search (vectors + BM25 keyword matching) outperforms pure vector search
  • How multimodal embeddings (CLIP, ImageBind) unify text and image in one space

Go Deeper

Best Resources

  • “Illustrated Word2Vec” (Jay Alammar) — Visual intuition for how embeddings form
  • Hugging Face MTEB Leaderboard — Compare embedding models by benchmarks
  • Pinecone Learning Centre — Practical guides for vector search