An embedding is a list of numbers, called a vector, that captures the meaning of a piece of data such as a sentence, a document, or an image, in a form a computer can compare. An embedding model reads the input and produces this vector so that items with similar meaning end up with similar numbers. That single property (similar meaning means nearby vectors) is what makes embeddings so useful. This explainer covers how they are made, how similarity works, and where they show up in real AI systems.
How an embedding is made
You pass raw input — say the sentence "the cat sat on the mat" — into an embedding model. The model outputs a fixed-length list of numbers, perhaps a few hundred or a few thousand of them. Each number is a coordinate; together they place the sentence at a point in a high-dimensional space.
You cannot read meaning from any single number. The meaning lives in the overall position. What matters is that the model was trained so related inputs land near one another and unrelated inputs land far apart.
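To make the "fixed-length list of numbers" concrete, here is a minimal sketch. The `toy_embed` function is hypothetical and is not a trained model: it only hashes words into buckets, so it does not capture meaning. It is here purely to show that inputs of any length come out as a vector of the same size.

```python
import hashlib

def toy_embed(text, dimensions=8):
    """Toy stand-in for an embedding model (an assumption for illustration).

    Each word is hashed into one of `dimensions` buckets and counted.
    A real model is trained so that position reflects meaning; this one is not.
    """
    vector = [0.0] * dimensions
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        bucket = digest[0] % dimensions  # deterministic bucket for this word
        vector[bucket] += 1.0
    return vector

short_vec = toy_embed("the cat sat on the mat")
long_vec = toy_embed("a much longer sentence about a cat that sat on a mat all day")

# Both inputs yield vectors of the same length, whatever the sentence length.
print(len(short_vec), len(long_vec))
```

With a real embedding model the call would look similar, but the output would have hundreds or thousands of coordinates and related sentences would land near one another.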
Why nearby means similar
| Concept | Plain meaning | Why it matters |
| --- | --- | --- |
| Vector | List of numbers for one item | The item's position in space |
| Distance | How far apart two vectors sit | Smaller distance, closer meaning |
| Similarity | Closeness score, often cosine | Ranks results by relevance |
| Dimension | Length of the vector | More can capture finer nuance |
To find items related to a query, you embed the query, then look for the stored embeddings closest to it. "Car" and "automobile" land near each other even though they share no letters, which is why embeddings beat plain keyword search for meaning.
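The lookup described above can be sketched in a few lines. The three-dimensional vectors below are invented by hand for illustration; a real model would produce them, and they would be far longer. The helper names `cosine_similarity` and `nearest` are also just assumptions for this sketch.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made stand-ins for stored embeddings (not output from a real model).
stored = {
    "car": [0.90, 0.10, 0.05],
    "automobile": [0.85, 0.15, 0.10],
    "banana": [0.05, 0.20, 0.95],
}

def nearest(query_vector, items, k=2):
    """Return the names of the k stored items most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), name)
              for name, vec in items.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]

# Pretend this is the embedding of the query "vehicle".
query = [0.88, 0.12, 0.08]
print(nearest(query, stored))
```

Here "car" and "automobile" come back and "banana" does not, even though the query shares no letters with either match. At scale, a brute-force scan like this is usually replaced by an approximate nearest-neighbor index.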
Where embeddings show up
- Semantic search ranks documents by meaning, not exact words.
- Recommendations find items similar to what you liked.
- Retrieval-augmented generation fetches relevant text to ground a chatbot answer.
- Clustering groups related content, and deduplication flags near-identical items.
Embeddings are the backbone of retrieval-augmented generation: the system embeds your question, retrieves the closest chunks of your data, and feeds them to the model. If you have explored RAG, you have already used embeddings under the hood.
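A rough sketch of that retrieval step follows. Everything here is an assumption for illustration: the chunk texts, their hand-written vectors, and the `retrieve` and `build_prompt` helpers. In a real system the vectors would come from an embedding model and the final prompt would be sent to a language model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# (text, embedding) pairs; the embeddings are invented stand-ins.
chunks = [
    ("Refunds are issued within 14 days of a return.", [0.9, 0.1, 0.0]),
    ("Our office is closed on public holidays.", [0.1, 0.9, 0.1]),
    ("Returned items must be unused and in original packaging.", [0.8, 0.2, 0.1]),
]

def retrieve(question_vector, chunks, k=2):
    """Return the text of the k chunks closest to the question."""
    ranked = sorted(chunks, key=lambda c: cosine(question_vector, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question, context_chunks):
    """Combine retrieved chunks and the question into one prompt string."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

question = "How long do refunds take?"
question_vector = [0.85, 0.15, 0.05]  # stand-in for embedding the question
prompt = build_prompt(question, retrieve(question_vector, chunks))
print(prompt)
```

The two chunks about returns and refunds end up in the prompt, while the unrelated chunk about office hours is left out.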
Common misconceptions
- "Embeddings store the original text." They do not. You usually keep the source text separately and store the vector for comparison.
- "They match keywords." They match meaning. That is the point, and also why an embedding search can miss an exact term, such as a part number, that a keyword search would catch.
- "All embeddings are interchangeable." Vectors from different models are not comparable. Embed everything with the same model.
- "More dimensions are always better." Bigger vectors can capture nuance but cost more to store and compare. The right size depends on the task.
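The storage side of that last trade-off is simple arithmetic. The sketch below assumes each value is stored as a 4-byte float and ignores any index overhead; the dimension counts and the one-million-vector collection are example figures, not recommendations.

```python
def index_size_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Raw storage for the vectors alone, assuming 4-byte floats by default."""
    return num_vectors * dimensions * bytes_per_value

for dims in (384, 1536):
    size_gb = index_size_bytes(1_000_000, dims) / 1e9
    print(f"{dims} dimensions: {size_gb:.2f} GB")
# prints "384 dimensions: 1.54 GB" then "1536 dimensions: 6.14 GB"
```

Quadrupling the dimension count quadruples the storage, and the work of each comparison grows in step.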
Embeddings turn meaning into geometry. If you want the layer beneath them, see how tokens feed language models in the first place.
FAQ
What is an embedding in simple terms?
It is a list of numbers that represents the meaning of some data. Things that mean similar things get similar numbers, so you can compare meaning by comparing the numbers.
What are embeddings used for?
Mainly semantic search, recommendations, clustering, and retrieval-augmented generation. Anywhere you need to compare items by meaning rather than exact text, embeddings help.
How do you measure similarity between embeddings?
By a distance or similarity score in the vector space, most often cosine similarity, which measures the angle between two vectors. Closer vectors mean more similar meaning, which lets you rank results.
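For reference, the standard definition of cosine similarity for two vectors a and b of length n can be written as:

```latex
\[
\cos(\mathbf{a}, \mathbf{b})
  = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
  = \frac{\sum_{i=1}^{n} a_i b_i}
         {\sqrt{\sum_{i=1}^{n} a_i^2} \, \sqrt{\sum_{i=1}^{n} b_i^2}}
\]
% Worked example with a = (1, 0) and b = (1, 1):
\[
\cos(\mathbf{a}, \mathbf{b}) = \frac{1 \cdot 1 + 0 \cdot 1}{\sqrt{1} \, \sqrt{2}} = \frac{1}{\sqrt{2}} \approx 0.707
\]
```

The score is 1 when two vectors point the same way, 0 when they are perpendicular, and -1 when they point in opposite directions. Vector length does not affect it.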
Can you embed images and audio, not just text?
Yes. There are embedding models for images, audio, and other data. Some multimodal models even place text and images in the same space so you can compare across types.
Where to go next
See how RAG uses embeddings to ground answers, learn what a token is, and understand what an AI model actually is.