A vector database stores embeddings — numeric representations of text, images, or audio — and finds the nearest matches to a query. It is the retrieval half of most AI applications, the part that grounds a model in your actual documents. The good news in 2026: you probably need a simpler option than the marketing implies, and the simplest one is already inside the database you use for everything else.
What changed in 2026
- pgvector got fast enough for production. With HNSW indexes and better query planning, Postgres handles millions of vectors comfortably, removing the need for a separate database for most apps.
- Hybrid search became the default. Pure vector search misses exact terms, names, and IDs; combining it with keyword (BM25) search is now standard in serious stacks.
- Managed services competed on cost, not just speed. Per-query and storage pricing dropped as the category matured.
- Metadata filtering matured. Filtering by tenant, date, or document type at query time — once a weak spot — is now solid across the leaders.
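To make the metadata-filtering point concrete, here is a minimal in-memory sketch of filtering by tenant and date before ranking by similarity. The `Chunk` record, the `filtered_search` helper, and the toy two-dimensional vectors are all hypothetical; a real database applies the same idea inside its index rather than over a Python list.

```python
import math
from dataclasses import dataclass
from datetime import date


@dataclass
class Chunk:
    # Hypothetical record shape: one embedded passage plus its metadata.
    id: str
    tenant: str
    created: date
    embedding: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filtered_search(query, chunks, tenant, since, k=3):
    # Apply the metadata filter first so other tenants' rows are never ranked.
    allowed = [c for c in chunks if c.tenant == tenant and c.created >= since]
    ranked = sorted(allowed, key=lambda c: cosine(query, c.embedding), reverse=True)
    return [c.id for c in ranked[:k]]


# Toy data: made-up tenants, dates, and vectors for illustration only.
chunks = [
    Chunk("a", "acme", date(2026, 1, 10), [1.0, 0.0]),
    Chunk("b", "acme", date(2025, 6, 1), [1.0, 0.0]),
    Chunk("c", "globex", date(2026, 2, 1), [1.0, 0.0]),
    Chunk("d", "acme", date(2026, 3, 5), [0.0, 1.0]),
]
result = filtered_search([1.0, 0.1], chunks, tenant="acme", since=date(2026, 1, 1))
print(result)
```

Chunk "b" is dropped for being too old and "c" for belonging to another tenant, so only "a" and "d" are ever scored.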
Ranked options
| Database | Best for | Hosting | Standout |
| --- | --- | --- | --- |
| pgvector (Postgres) | Most apps; teams already on Postgres | Self or managed | No new infra; one database for everything |
| Pinecone | Zero-ops, large scale | Fully managed | Hands-off indexing and scaling |
| Qdrant | Self-hosted with strong filtering | Self or cloud | Great filtering, hybrid search, predictable cost |
| Weaviate | Built-in hybrid + modules | Self or cloud | Hybrid search and integrated vectorisation |
| Milvus | Very large, high-throughput | Self or cloud | Billion-scale workloads |
How to choose
- Already on Postgres and under ~10M vectors? Use pgvector. You add an extension, not a system to operate.
- Want someone else to run it? Choose Pinecone or a managed Qdrant or Weaviate cloud. You pay more, and in return you operate nothing.
- Self-hosting with heavy metadata filtering? Qdrant is the comfortable default.
- Billion-scale or extreme throughput? Milvus is purpose-built for that tier.
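The checklist above can be written down as a small decision helper. This is only a sketch: the function name `suggest_store`, the exact thresholds (10 million and 1 billion vectors), the order in which the rules are checked, and the fallback message are assumptions layered on the guidance, not hard limits of any product.

```python
def suggest_store(
    vector_count: int,
    on_postgres: bool = False,
    wants_managed: bool = False,
    heavy_filtering: bool = False,
) -> str:
    """Map the checklist onto a first suggestion (thresholds are rough assumptions)."""
    # Billion-scale workloads are checked first because they override the rest.
    if vector_count >= 1_000_000_000:
        return "Milvus"
    # Already on Postgres and comfortably under ~10M vectors.
    if on_postgres and vector_count < 10_000_000:
        return "pgvector"
    # Someone else should run it.
    if wants_managed:
        return "Pinecone or managed Qdrant/Weaviate"
    # Self-hosting with heavy metadata filtering.
    if heavy_filtering:
        return "Qdrant"
    # Assumed fallback: none of the checklist rules matched.
    return "no clear default; compare hybrid search and filtering support"


print(suggest_store(2_000_000, on_postgres=True))
print(suggest_store(50_000_000, wants_managed=True))
```

Treat the output as a starting point for evaluation, not a verdict; the ordering of the rules matters and a different team might reasonably put the managed-service question first.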
Why hybrid search matters
Vector search excels at meaning ("documents about late payment") but stumbles on exact tokens (an invoice number, a product SKU, a function name). Keyword search is the opposite. Hybrid search runs both and fuses the results, which is why retrieval quality jumps when you add it. If a database makes hybrid search easy, weight that heavily — it affects answer quality more than raw latency.
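The article does not name a fusion method, so the sketch below assumes reciprocal rank fusion (RRF), one common choice. Each document scores 1 / (k + rank) in every list it appears in, and the scores are summed. The document IDs are invented, and `k=60` is the conventional constant rather than a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranked list."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1, so the top hit in a list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so output is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for a query such as "late payment on invoice 4471".
vector_hits = ["doc_late_fees", "doc_payment_terms", "doc_invoice_4471"]
keyword_hits = ["doc_invoice_4471", "doc_late_fees"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

The invoice document sits last in the vector list but first in the keyword list, so fusion lifts it above the document that only vector search found. RRF needs only ranks, not scores, which avoids having to calibrate BM25 scores against cosine similarities.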
The mistake to avoid
Choosing on a benchmark chart. Query speed differences rarely matter at app scale; what bites you in production is weak metadata filtering, no hybrid search, or operational overhead you did not budget for. Pick for the workflow, not the leaderboard.
FAQ
Do I need a vector database at all?
Not always. If your knowledge base is small, you can embed and search in memory or rely on a long context window. Add a vector database when data outgrows the prompt.
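For a sense of how little is needed at small scale, here is a sketch of brute-force search over embeddings held in a dictionary. The `top_k` helper, the document names, and the three-dimensional vectors are made up; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import heapq
import math


def normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_k(query, corpus, k=2):
    """Return the IDs of the k documents most similar to the query."""
    q = normalize(query)
    # On unit vectors, the dot product equals cosine similarity.
    scored = (
        (sum(a * b for a, b in zip(q, normalize(vec))), doc_id)
        for doc_id, vec in corpus.items()
    )
    return [doc_id for _, doc_id in heapq.nlargest(k, scored)]


# Toy corpus: hypothetical document IDs with hand-written vectors.
corpus = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.0],
    "late-payment": [0.8, 0.2, 0.1],
}
matches = top_k([1.0, 0.0, 0.0], corpus)
print(matches)
```

Every query scans the whole corpus, which is fine for a small knowledge base and is exactly the cost an index such as HNSW exists to avoid at larger sizes.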
Is pgvector really enough for production?
For the majority of apps, yes. Teams move off it mainly for billion-scale data or when they want a fully managed service.
What embedding model should I pair with it?
Size the index to the embedding dimension your model outputs, and use the same model for both documents and queries. Re-embedding everything later is painful, so choose deliberately.
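One cheap safeguard is to validate the dimension before anything is written. The sketch below assumes a hypothetical `EXPECTED_DIM` constant and `check_dimension` helper; set the constant to whatever your chosen model actually outputs.

```python
EXPECTED_DIM = 4  # hypothetical value; use your embedding model's output size


def check_dimension(embedding, expected=EXPECTED_DIM):
    """Return the embedding unchanged, or raise if its length is wrong."""
    if len(embedding) != expected:
        raise ValueError(
            f"expected a {expected}-dimensional embedding, got {len(embedding)}"
        )
    return embedding


# A vector of the right size passes through untouched.
ok = check_dimension([0.1, 0.2, 0.3, 0.4])
print(len(ok))

# A vector from a different model is rejected before it reaches the index.
try:
    check_dimension([0.1, 0.2, 0.3])
except ValueError as err:
    print(err)
```

Failing at write time is far cheaper than discovering mixed dimensions after the index has been populated.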
RAG or fine-tuning?
They solve different problems. See our RAG vs fine-tuning comparison, linked under "Where to go next".
Where to go next
RAG vs fine-tuning in 2026, Best backend for AI apps in 2026, and Best AI API providers in 2026.