jthomas.site// notebook · v.4.2026
Machine Learning, Visualized · Vol. XXV

A Geometry of Meaning

Words become vectors. Similar meanings cluster; analogies become arithmetic. The single trick that turns text into something a neural network can multiply.

The concept

An embedding assigns each word (or token, or item) a vector in some high-dimensional space. The geometry encodes meaning.

The famous result from Word2Vec (2013): vector arithmetic captures semantics. king − man + woman ≈ queen. Paris − France + Italy ≈ Rome. The directions in embedding space correspond to abstract relationships — gender, tense, country/capital — that nobody explicitly programmed.
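
A minimal sketch of both ideas in Python (hand-picked 2D vectors standing in for learned embeddings; the words and coordinates are illustrative, not taken from the demo): look up the vectors, do the arithmetic, and return the nearest remaining word by cosine similarity.

  import numpy as np

  # Toy 2D embedding table: each word maps to a hand-picked vector.
  # Real embeddings are learned and have hundreds of dimensions.
  E = {
      "king":  np.array([0.9, 0.8]),
      "queen": np.array([0.9, 0.2]),
      "man":   np.array([0.3, 0.8]),
      "woman": np.array([0.3, 0.2]),
      "paris": np.array([-0.7, 0.9]),
      "rome":  np.array([-0.7, 0.4]),
  }

  def cosine(u, v):
      return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

  def analogy(a, b, c):
      """Solve a - b + c ~ ?, excluding the three input words."""
      target = E[a] - E[b] + E[c]
      candidates = [w for w in E if w not in (a, b, c)]
      return max(candidates, key=lambda w: cosine(E[w], target))

  print(analogy("king", "man", "woman"))   # -> queen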

Modern embeddings (BERT, OpenAI embeddings, Cohere embeddings) live in 768- to 4096-dimensional spaces. The toy below shows a hand-tuned 2D version where you can see analogies as actual arrows.

Why ML cares

Embeddings are how text — discrete, symbolic — becomes geometric, continuous, and differentiable. Every transformer's first layer is an embedding lookup. Every retrieval system (RAG, semantic search) compares query and document embeddings by cosine similarity.
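
Both claims fit in a few lines of PyTorch (a sketch; the vocabulary and model sizes are illustrative): the first layer is literally an indexed table lookup, and retrieval is a cosine comparison between vectors.

  import torch
  import torch.nn.functional as F

  # First layer of a transformer: a lookup table from token ids to vectors.
  vocab_size, d_model = 50_000, 768          # illustrative sizes
  embed = torch.nn.Embedding(vocab_size, d_model)

  token_ids = torch.tensor([[17, 942, 3]])   # a "sentence" of 3 token ids
  x = embed(token_ids)                       # shape (1, 3, 768), differentiable

  # Retrieval: compare a query vector to document vectors by cosine similarity.
  query = torch.randn(d_model)
  docs = torch.randn(5, d_model)             # 5 document embeddings
  scores = F.cosine_similarity(query.unsqueeze(0), docs, dim=-1)
  best = scores.argmax().item()              # index of the closest document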

Beyond text, the same idea generalizes: face embeddings (FaceNet), product embeddings (Amazon's recommender), molecule embeddings (drug discovery), node embeddings (graph ML). Wherever you have items + a notion of "similar," embeddings are the common currency.

Try this
  1. Pick the king − man + woman preset. The faint arrow from man → king is parallel to the bright arrow from woman → queen. That parallelism is the "gender" direction in embedding space — a literal vector you can add to any other word.
  2. Pick the Paris − France + Italy preset. Same parallelogram shape, different content — country→capital is its own direction. Same trick generalizes to verb tense, comparatives, plurals.
  3. Open Build your own · A − B + C. Click three words on the canvas to set A, B, C. The result vector snaps to the closest word — try it on any cluster.
  4. Hover any word to see its 5 nearest neighbors (by cosine similarity). Words with related meanings cluster; the geometric closeness is the semantic similarity (see the sketch after this list).
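
For reference, step 4's hover is a plain k-nearest-neighbor query. A minimal sketch, where E is any word-to-vector dict such as the toy table in the earlier snippet:

  import numpy as np

  def nearest_neighbors(word, E, k=5):
      """Return the k words whose vectors have the highest cosine similarity to `word`."""
      q = E[word] / np.linalg.norm(E[word])
      scored = []
      for w, v in E.items():
          if w == word:
              continue
          scored.append((float(q @ (v / np.linalg.norm(v))), w))
      return [w for _, w in sorted(scored, reverse=True)[:k]]

  # nearest_neighbors("king", E)  ->  the 5 closest words to "king"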
Where you've seen this · 4 examples
Retrieval-augmented generation (RAG)

Every "chat with your PDF" tool embeds chunks of the document, embeds the query, and retrieves the chunks closest to the query in embedding space. The LLM then answers using the retrieved chunks as context.

Semantic search

Google, Bing, and most modern search engines now use embedding-based retrieval alongside keyword matching. A query for "fix my car" matches docs about "auto repair" because the vectors are close.

Face recognition

FaceNet embeds each face into a 128-d vector such that vectors of the same person are close and vectors of different people are far apart. Phone unlock, photo album grouping, and surveillance all use this geometry.
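
The same-person decision reduces to a distance check on those vectors. A minimal sketch; the threshold value here is an illustrative assumption that real systems tune on validation data:

  import numpy as np

  def same_person(face_a: np.ndarray, face_b: np.ndarray, threshold: float = 1.0) -> bool:
      """Compare two 128-d face embeddings by Euclidean distance.
      The threshold is illustrative; real systems tune it on a validation set."""
      return float(np.linalg.norm(face_a - face_b)) < threshold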

Amazon & Spotify recommendations

Product embeddings (item2vec) and song embeddings (sequence-based) live in spaces where "things bought together" are close. Recommendation = nearest-neighbor lookup in embedding space.
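
A sketch of that lookup with scikit-learn's NearestNeighbors, using random vectors as stand-ins for learned item embeddings:

  import numpy as np
  from sklearn.neighbors import NearestNeighbors

  # Stand-in for learned item embeddings (item2vec, song vectors, ...): 10k items, 64 dims.
  rng = np.random.default_rng(0)
  item_vectors = rng.normal(size=(10_000, 64))

  index = NearestNeighbors(n_neighbors=5, metric="cosine").fit(item_vectors)

  # "People who bought item 42 also like": the items closest to item 42's vector.
  _, neighbors = index.kneighbors(item_vectors[42:43])
  print(neighbors[0])   # indices of the 5 nearest items (includes item 42 itself)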

Further reading