Vector Databases
A database built for “what’s most similar?”
“A vector database is just a database that’s really fast at the question: ‘What’s most similar to this?’ Regular databases can’t do that. Vector databases are built for it.”
In the last chapter, you turned your text chunks into embedding vectors: long lists of numbers that capture meaning. Now you need somewhere to put them.
You might be thinking: “I already have a database. Can’t I just use Postgres?” Good instinct, but no. Here’s why.
Why regular SQL databases fail at similarity search
Imagine you have a million rows in a SQL table. Each row has a column called embedding containing 768 numbers. A user asks a question, and you need to find the 5 rows whose embeddings are most similar to the question’s embedding.
SQL has no operator or index for “closest vector”. You’d have to compute the similarity between the question’s embedding and every row’s embedding, then sort the lot. There’s no index that helps. It’s a full table scan, every time. [src: qdrant_docs]
For 100 documents, that’s fine. For 100,000 documents, it takes seconds. For 10 million documents, it’s unusable.
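To make the cost concrete, here is a sketch of what the database is forced to do per query, written in plain NumPy (the corpus size and random vectors are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
docs = rng.normal(size=(100_000, 768)).astype(np.float32)  # one embedding per "row"
query = rng.normal(size=768).astype(np.float32)

# No index helps here: cosine similarity must be computed against EVERY row.
sims = docs @ query / (np.linalg.norm(docs, axis=1) * np.linalg.norm(query))

# Indices of the 5 most similar rows, best first.
top5 = np.argsort(sims)[-5:][::-1]
```

This scan is linear in the corpus size: doubling the documents doubles the latency of every single query, which is exactly the scaling a vector index avoids.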
A vector database solves this with an index built specifically for similarity search.
How vector databases actually work: the HNSW highway
The most popular indexing algorithm in vector databases is HNSW, short for Hierarchical Navigable Small World. [src: malkov2018hnsw]
Think of it like a multi-level highway system.
You’re trying to drive from your house to a specific coffee shop across the country. You don’t check every single street in the country — that would take forever. Instead:
- You start on the interstate (the top level). You zoom across the country in a few hops, getting close to the right region. This level has very few “exits” — it’s fast but approximate.
- You exit onto a state highway (the middle level). Now you’re navigating within the right area, with more options and more precision.
- You take local streets (the bottom level). Now you’re checking individual locations, and you find your coffee shop.
HNSW works the same way with vectors. It builds multiple layers of connections between data points. The top layers have long-range connections for fast, approximate navigation. The bottom layers have short-range connections for precise, local search. The result: instead of checking a million vectors, you check maybe a few hundred — and still find the nearest neighbours with over 95% accuracy. [src: malkov2018hnsw]
The trade-off is clear: HNSW gives you approximate nearest neighbours, not perfect ones. But for RAG, “the 5 most relevant chunks” doesn’t need to be mathematically perfect — it needs to be fast and good enough. And HNSW delivers both.
The free vector database landscape
You don’t need to pay anything to get started. Here are the four most popular free options, each with a different sweet spot.
ChromaDB — the easiest starting point
ChromaDB is an open-source vector database that runs locally on your machine with a single pip install. It stores data in an embedded database (SQLite + DuckDB under the hood), so there’s no server to set up. You write Python code, and it just works. [src: chromadb_docs]
Best for:
- ✅ Learning and prototyping
- ✅ Small projects (under ~100k documents)
Limitation:
- ⚠️ Not ideal for production-scale multi-user deployments
Qdrant — production-grade with a free tier
Qdrant is a purpose-built vector database written in Rust for performance. It offers a generous free cloud tier (1GB storage, which is a lot of vectors), supports hybrid search out of the box, and has excellent filtering capabilities. [src: qdrant_docs]
Best for:
- ✅ Projects likely to grow into production
- ✅ Teams that want managed cloud options
Limitation:
- ⚠️ Slightly more setup than ChromaDB for local-only use
FAISS — Facebook’s speed demon
FAISS (Facebook AI Similarity Search) is a library, not a database. It doesn’t have a server or an API — it’s a set of C++ functions (with Python bindings) that index and search vectors extremely fast. It’s what you use when you need raw speed and you’re comfortable managing storage yourself. [src: faiss_docs]
Best for:
- ✅ Maximum local performance
- ✅ Research and large-scale batch processing
Limitation:
- ⚠️ No built-in server, persistence workflow, or filtering
- ⚠️ Requires more custom implementation
Weaviate — the ecosystem player
Weaviate is a full-featured vector database with a free cloud tier, built-in vectorisation (it can call embedding models for you), and a GraphQL API. It has the richest feature set of the four but also the steepest learning curve. [src: weaviate_docs]
Best for:
- ✅ Teams wanting an all-in-one platform
- ✅ Built-in model integration workflows
Limitation:
- ⚠️ More complexity than needed for learning or small projects
Comparison at a glance
| Feature | ChromaDB | Qdrant | FAISS | Weaviate |
|---|---|---|---|---|
| Setup | pip install | Docker or cloud | pip install | Docker or cloud |
| Runs locally | Yes | Yes | Yes | Yes |
| Free cloud tier | No | Yes (1GB) | No (library only) | Yes (sandbox) |
| Hybrid search | No | Yes | No | Yes |
| Production-ready | Prototype-scale | Yes | Yes (with work) | Yes |
| Best for beginners | Yes | Second choice | No | No |
| Language | Python | Rust | C++/Python | Go |
Persistent vs. in-memory storage
This is a small but important concept. When a vector database is in-memory, all your vectors live in RAM. The moment you stop your program, everything disappears. You’d have to re-embed all your documents next time.
When a vector database is persistent, it saves vectors to disk. You can stop your program, restart your computer, and your data is still there. ChromaDB is persistent by default — it writes to a local directory. FAISS is in-memory by default — you have to explicitly save and load the index file yourself.
For learning, either works. For anything you’d be annoyed to lose, use persistent storage.
Try it: pick your database
Not sure which database fits your project? A quick rule of thumb from the profiles above: ChromaDB for learning and small prototypes, Qdrant for projects likely to grow into production, FAISS for maximum local speed when you’ll manage storage yourself, Weaviate for an all-in-one platform.
Where does the vector database sit in the pipeline?
In the full RAG pipeline, the vector database sits between the embedding step (where chunks become vectors) and the retrieval step (where a query finds the most relevant chunks).
The vector database is the persistent memory of your RAG system. Documents go in once (during ingestion). Queries hit it every time a user asks a question. That’s why speed matters — and why HNSW indexing is worth the complexity.
Project step: store your vectors
Head to the Playground and store the embedded chunks from Chapter 3 in a local ChromaDB instance. Watch the storage indicator confirm your vectors are persisted. Try closing the Playground tab and reopening it — your data should still be there.
Quick check
Why can’t a regular SQL database efficiently handle similarity search over vectors?
In the HNSW algorithm, what does the multi-layer structure achieve?
What You Just Built
In this chapter, you learned to:
- Understand why regular SQL databases can’t do similarity search
- Choose the right vector database for your use case (ChromaDB, Qdrant, FAISS, Weaviate)
- Explain how HNSW indexing makes search fast at scale
- Store your embedded chunks in a persistent vector database
Your chunks are now embedded, stored, and ready to be queried.
Next up: Chapter 5 — Retrieval Strategies. Finding “the most similar chunks” is just the starting point. Hybrid search, re-ranking, and diversity controls are what separate a toy demo from a real product.
Sources
- ChromaDB Documentation — docs.trychroma.com [src: chromadb_docs]
- Qdrant Documentation — qdrant.tech/documentation [src: qdrant_docs]
- FAISS Documentation — github.com/facebookresearch/faiss [src: faiss_docs]
- Weaviate Documentation — weaviate.io/developers/weaviate [src: weaviate_docs]
- Malkov & Yashunin (2018) — “Efficient and Robust Approximate Nearest Neighbor using Hierarchical Navigable Small World Graphs” [src: malkov2018hnsw]