Vector Databases Explained — Pinecone, Weaviate, Qdrant & Chroma for AI Search
Vector databases store and search high-dimensional embeddings for semantic similarity — this guide compares four leading options with practical implementations for AI-powered search and RAG pipelines.
What You'll Learn
You'll learn how vector databases work under the hood, how to set up Pinecone, Weaviate, Qdrant, and Chroma, and how to choose the right one for semantic search and RAG.
Why It Matters
Keyword search in traditional databases matches exact terms, so it misses results that are worded differently but mean the same thing. Vector databases index embeddings with approximate nearest neighbor (ANN) algorithms, which makes similarity search across millions of vectors possible in milliseconds.
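To make the cost of exact search concrete, the sketch below scores every stored vector against a query and keeps the best matches. It is only an illustration: the data is random, and the names `cosine_similarity`, `brute_force_top_k`, and `DIM` are assumptions made for this example rather than part of any database's API.

```python
import math
import random

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def brute_force_top_k(query, vectors, k):
    """Exact search: score every stored vector, then keep the k best."""
    scored = [(cosine_similarity(query, v), i) for i, v in enumerate(vectors)]
    scored.sort(reverse=True)
    return scored[:k]

random.seed(0)  # fixed seed so the toy data is reproducible
DIM = 8         # toy dimension; real embeddings have hundreds or thousands
vectors = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(1000)]

# Query with a scaled copy of vector 42. Cosine similarity ignores magnitude,
# so vector 42 ranks first with a score of (almost exactly) 1.0.
query = [2.0 * x for x in vectors[42]]
top = brute_force_top_k(query, vectors, k=3)
for score, idx in top:
    print(f"index={idx} score={score:.4f}")
```

The scan touches all 1,000 vectors for a single query, so its cost grows linearly with the collection. ANN indexes avoid that full scan by visiting only a small, well-chosen subset of vectors, at the price of occasionally missing a true neighbor.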
Real-World Use
Doda Browser's smart bookmark search uses Chroma to store embedding vectors of all saved bookmarks, letting users search by meaning rather than exact titles — "coding tutorials" finds bookmarks tagged "programming guides."
Vector Database Architecture
flowchart LR
A[Data] --> B[Embedding Model]
B --> C[Vector DB]
C --> D[ANN Index]
E[Query] --> F[Embedding Model]
F --> G[ANN Search]
D --> G
G --> H[Results]
C --> I[Metadata Store]
I --> H
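The flow in the diagram can be sketched in a few lines of plain Python. Everything here is a stand-in: `toy_embed` is a hypothetical bag-of-words function that only captures word overlap (a real pipeline would call an embedding model), `TinyVectorStore` is an invented name, and the search is a linear scan rather than an ANN index.

```python
import math
from collections import Counter

def toy_embed(text):
    """Hypothetical stand-in for an embedding model: a normalized bag of words."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {word: c / norm for word, c in counts.items()}

def dot(a, b):
    """Dot product of two sparse vectors stored as dicts."""
    return sum(weight * b.get(word, 0.0) for word, weight in a.items())

class TinyVectorStore:
    """In-memory store: vectors in one dict, metadata in another."""

    def __init__(self):
        self.vectors = {}
        self.metadata = {}

    def add(self, doc_id, text, meta):
        self.vectors[doc_id] = toy_embed(text)  # Data -> Embedding Model -> Vector DB
        self.metadata[doc_id] = meta            # Vector DB -> Metadata Store

    def query(self, text, k=2):
        q = toy_embed(text)                     # Query -> Embedding Model
        scored = sorted(
            ((dot(q, v), doc_id) for doc_id, v in self.vectors.items()),
            reverse=True,
        )
        # Join each hit with its metadata before returning results.
        return [(doc_id, round(score, 3), self.metadata[doc_id])
                for score, doc_id in scored[:k]]

store = TinyVectorStore()
store.add("a", "python programming tutorial", {"topic": "programming"})
store.add("b", "vector database similarity search", {"topic": "ai"})
store.add("c", "web security basics", {"topic": "security"})

hits = store.query("similarity search with a vector database", k=1)
print(hits)  # → [('b', 0.816, {'topic': 'ai'})]
```

Keeping vectors and metadata in separate structures mirrors the diagram: the index answers "which IDs are closest", and the metadata store turns those IDs into something a user can read.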
Vector Database Comparison
| Feature | Pinecone | Weaviate | Qdrant | Chroma |
|---|---|---|---|---|
| Hosting | Managed only | Self-hosted + Cloud | Self-hosted + Cloud | Embedded + Cloud |
| Open Source | No (proprietary) | Yes (BSD-3) | Yes (Apache 2.0) | Yes (Apache 2.0) |
| Index Types | Proprietary (serverless or pod-based) | HNSW | HNSW + custom | HNSW |
| Filtering | Metadata filter | Rich filter DSL | Payload filters | Metadata filters |
| Ease of Setup | API key only | Docker Compose | Docker or pip install | pip install |
| Best For | Production RAG | Hybrid search | High-performance | Prototyping |
Chroma: Fast Local Prototyping
Chroma runs in-process with zero configuration, which makes it ideal for learning and prototyping.
import chromadb
from chromadb.config import Settings

# In-memory client: data is lost when the process exits
client = chromadb.Client(Settings(
    anonymized_telemetry=False
))

# Use cosine distance for the collection's HNSW index
collection = client.create_collection(
    name="tutorials",
    metadata={"hnsw:space": "cosine"}
)

# Add vectors with metadata (4-dimensional toy embeddings)
collection.add(
    ids=["doc1", "doc2", "doc3"],
    embeddings=[
        [0.12, 0.45, 0.78, 0.33],
        [0.89, 0.21, 0.54, 0.67],
        [0.34, 0.76, 0.12, 0.91]
    ],
    metadatas=[
        {"title": "Python Basics", "topic": "programming"},
        {"title": "Vector Databases", "topic": "ai"},
        {"title": "Web Security", "topic": "security"}
    ],
    documents=[
        "Python is a high-level programming language.",
        "Vector databases store embeddings for similarity search.",
        "Web security protects websites from attacks."
    ]
)

# Query: results come back as one list per query embedding
results = collection.query(
    query_embeddings=[[0.15, 0.42, 0.80, 0.30]],
    n_results=2,
    include=["documents", "distances", "metadatas"]
)

for i, (doc, dist, meta) in enumerate(zip(
    results["documents"][0],
    results["distances"][0],
    results["metadatas"][0]
)):
    print(f"Result {i+1}: {meta['title']} (distance: {dist:.4f})")
    print(f" {doc}")
Expected output (cosine distances; the last digit may vary slightly with floating-point precision):
Result 1: Python Basics (distance: 0.0017)
 Python is a high-level programming language.
Result 2: Vector Databases (distance: 0.2938)
 Vector databases store embeddings for similarity search.
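As a cross-check on whatever the database returns, the distances for this query can be recomputed by hand. The sketch below assumes that the cosine space reports distance as 1 minus cosine similarity; the helper name `cosine_distance` is made up for this example.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity, so 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Same toy embeddings and query as in the Chroma example
stored = {
    "Python Basics": [0.12, 0.45, 0.78, 0.33],
    "Vector Databases": [0.89, 0.21, 0.54, 0.67],
    "Web Security": [0.34, 0.76, 0.12, 0.91],
}
query = [0.15, 0.42, 0.80, 0.30]

ranked = sorted((cosine_distance(query, v), title) for title, v in stored.items())
for dist, title in ranked:
    print(f"{title}: {dist:.4f}")
# → Python Basics: 0.0017
# → Vector Databases: 0.2938
# → Web Security: 0.3811
```

A smaller distance means a closer match, so with `n_results=2` the two nearest documents are "Python Basics" and "Vector Databases".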
Pinecone: Managed Production Search
Pinecone is a fully managed vector database with automatic scaling and high availability.
from pinecone import Pinecone, ServerlessSpec
import os

# Read the API key from the environment rather than hard-coding it
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

# Create a serverless index if it does not exist yet
index_name = "semantic-search"
if index_name not in pc.list_indexes().names():
    pc.create_index(
        name=index_name,
        dimension=1536,
        metric="cosine",
        spec=ServerlessSpec(
            cloud="aws",
            region="us-east-1"
        )
    )

index = pc.Index(index_name)

# Upsert vectors (constant placeholder values; use real embeddings in practice)
vectors = [
    {
        "id": f"vec-{i}",
        "values": [0.1] * 1536,
        "metadata": {"text": f"Document {i}"}
    }
    for i in range(5)
]
index.upsert(vectors=vectors)

# Query. Upserts are eventually consistent, so a query issued immediately
# afterward may not see the new vectors yet.
query_result = index.query(
    vector=[0.1] * 1536,
    top_k=3,
    include_metadata=True
)

for match in query_result.matches:
    print(f"ID: {match.id}, Score: {match.score:.4f}")
    print(f" Metadata: {match.metadata}")
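Expected output (every stored vector is identical to the query, so all scores tie at 1.0 and the order of IDs may vary):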
ID: vec-0, Score: 1.0000
Metadata: {'text': 'Document 0'}
ID: vec-1, Score: 1.0000
Metadata: {'text': 'Document 1'}
ID: vec-2, Score: 1.0000
Metadata: {'text': 'Document 2'}
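Five vectors fit comfortably in one request, but larger datasets are usually sent in batches. The helper below is a generic sketch: `chunked` is a hypothetical name, and the batch size of 100 is an assumption for illustration, so check the current request limits before relying on it.

```python
from itertools import islice

def chunked(records, batch_size=100):
    """Yield successive lists of at most batch_size records."""
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

# Toy records shaped like the upsert payload above (tiny dimension for brevity)
records = [
    {"id": f"vec-{i}", "values": [0.1] * 4, "metadata": {"text": f"Document {i}"}}
    for i in range(250)
]

batch_sizes = [len(batch) for batch in chunked(records)]
print(batch_sizes)  # → [100, 100, 50]

# With a live index, each batch would be sent separately, for example:
# for batch in chunked(records):
#     index.upsert(vectors=batch)
```

Because `chunked` works on any iterable, it also handles records streamed from a file or generator without loading them all into memory first.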
Qdrant: High-Performance Self-Hosted
Qdrant offers fine-grained control over indexing and payload filtering for production workloads.
from qdrant_client import QdrantClient
from qdrant_client.models import (
    VectorParams, Distance, PointStruct, Filter, FieldCondition, MatchValue
)

# Assumes a Qdrant server is running locally on the default port
client = QdrantClient("localhost", port=6333)
collection_name = "products"

# Drop and recreate the collection so the example starts from a clean state
client.recreate_collection(
    collection_name=collection_name,
    vectors_config=VectorParams(
        size=384, distance=Distance.COSINE
    )
)

# Insert points. These are placeholder vectors: every non-zero one points in
# the same direction, so cosine scores tie. Use real embeddings in practice.
points = [
    PointStruct(
        id=i,
        vector=[0.1 * i] * 384,
        payload={
            "name": f"Product {i}",
            "category": "electronics" if i % 2 == 0 else "books",
            "price": 10.0 * i
        }
    )
    for i in range(10)
]
client.upsert(
    collection_name=collection_name,
    points=points
)

# Filtered search: only points whose payload has category == "electronics".
# Newer client versions deprecate search() in favor of query_points().
results = client.search(
    collection_name=collection_name,
    query_vector=[0.5] * 384,
    limit=3,
    query_filter=Filter(
        must=[
            FieldCondition(
                key="category",
                match=MatchValue(value="electronics")
            )
        ]
    )
)

for point in results:
    print(f"ID: {point.id}, Score: {point.score:.4f}")
    print(f" Payload: {point.payload}")
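Expected output (illustrative: the placeholder vectors for IDs 2, 4, 6, and 8 all point in the same direction as the query, so each scores 1.0 and which three are returned is not guaranteed; ID 0 is an all-zero vector, for which cosine similarity is undefined):
ID: 2, Score: 1.0000
Payload: {'name': 'Product 2', 'category': 'electronics', 'price': 20.0}
ID: 4, Score: 1.0000
Payload: {'name': 'Product 4', 'category': 'electronics', 'price': 40.0}
ID: 6, Score: 1.0000
Payload: {'name': 'Product 6', 'category': 'electronics', 'price': 60.0}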
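The effect of a `must` filter can be illustrated without a server. The sketch below shows only the semantics (keep points whose payload matches, then rank the survivors by similarity); it does not reflect how Qdrant implements filtering internally, and the data and the `filtered_search` helper are invented for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

points = [
    {"id": 1, "vector": [1.0, 0.0, 0.0], "payload": {"category": "books"}},
    {"id": 2, "vector": [0.9, 0.1, 0.0], "payload": {"category": "electronics"}},
    {"id": 3, "vector": [0.0, 1.0, 0.0], "payload": {"category": "electronics"}},
    {"id": 4, "vector": [0.7, 0.7, 0.0], "payload": {"category": "electronics"}},
]

def filtered_search(query, must, limit):
    """Keep points whose payload matches every key in `must`, then rank them."""
    candidates = [
        p for p in points
        if all(p["payload"].get(key) == value for key, value in must.items())
    ]
    scored = sorted(
        ((cosine(query, p["vector"]), p["id"]) for p in candidates),
        reverse=True,
    )
    return scored[:limit]

results = filtered_search([1.0, 0.0, 0.0], {"category": "electronics"}, limit=2)
for score, point_id in results:
    print(f"ID: {point_id}, Score: {score:.4f}")
# → ID: 2, Score: 0.9939
# → ID: 4, Score: 0.7071
```

Point 1 is an exact match for the query vector but is excluded because its category is "books", which is exactly the behavior a payload filter is meant to provide.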
Weaviate: Hybrid Search with Built-In Modules
Weaviate combines vector and keyword search with built-in NLP modules.
import weaviate
import weaviate.classes as wvc

# Assumes a local Weaviate instance with the text2vec-transformers module enabled
client = weaviate.connect_to_local()

# Create collection with text2vec module (drop any previous copy first)
if client.collections.exists("Documents"):
    client.collections.delete("Documents")

collection = client.collections.create(
    name="Documents",
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_transformers(),
    properties=[
        wvc.config.Property(name="title", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="content", data_type=wvc.config.DataType.TEXT),
    ]
)

# Insert data; the batch is flushed when the with-block exits
with collection.batch.fixed_size(50) as batch:
    batch.add_object(properties={
        "title": "Vector Database Guide",
        "content": "Vector databases use ANN algorithms for similarity search."
    })
    batch.add_object(properties={
        "title": "Python Tutorial",
        "content": "Python is a versatile programming language."
    })

print(f"Imported {len(collection)} objects")

# Hybrid search: alpha=1 is pure vector search, alpha=0 is pure keyword search.
# The score is only returned when requested through return_metadata.
response = collection.query.hybrid(
    query="ANN similarity search",
    alpha=0.5,
    limit=3,
    return_metadata=wvc.query.MetadataQuery(score=True)
)

for obj in response.objects:
    print(f"Title: {obj.properties['title']}")
    print(f"Score: {obj.metadata.score:.4f}")

client.close()
Expected output (scores are illustrative and depend on the vectorizer model and fusion method):
Imported 2 objects
Title: Vector Database Guide
Score: 0.8921
Title: Python Tutorial
Score: 0.3245
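The role of `alpha` is easier to see with the fusion written out. The sketch below normalizes each score list to the range 0 to 1 and then takes an alpha-weighted sum. It is a simplified illustration, not Weaviate's exact fusion implementation, and the scores and the helper names `min_max` and `hybrid_scores` are made up.

```python
def min_max(scores):
    """Rescale a {doc_id: score} dict to the range 0..1."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}

def hybrid_scores(vector_scores, keyword_scores, alpha=0.5):
    """alpha=1.0 uses only vector scores; alpha=0.0 uses only keyword scores."""
    v = min_max(vector_scores)
    kw = min_max(keyword_scores)
    fused = {
        doc_id: alpha * v.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in set(v) | set(kw)
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

# Made-up scores: cosine similarities and BM25-style keyword scores
vector_scores = {"guide": 0.9, "tutorial": 0.1, "faq": 0.5}
keyword_scores = {"guide": 7.5, "faq": 9.0}

ranked = hybrid_scores(vector_scores, keyword_scores, alpha=0.5)
for doc_id, score in ranked:
    print(f"{doc_id}: {score:.2f}")
# → faq: 0.75
# → guide: 0.50
# → tutorial: 0.00
```

Normalizing first matters because raw keyword scores and cosine similarities live on different scales; without it, the larger numbers would dominate regardless of `alpha`. With `alpha=1.0` the same data ranks "guide" first, since only the vector scores count.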
Common Errors
| Error | Cause | Fix |
|---|---|---|
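| Upsert or query is rejected with a dimension error | Embedding dimension mismatch | Verify vector dimensions match the index configuration |
| All returned vectors have score near 1.0 | Stored vectors are identical or nearly parallel (for example, constant placeholder vectors) | Confirm the embedding model produces a distinct vector for each input |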
| Queries return zero results | Index not populated or namespace mismatch | Confirm vectors exist and use correct namespace |
| High latency on first query | Cold start / index not loaded | Warm up the index with a dummy query on startup |
| Metadata filter returns no results | Filter field type mismatch | Ensure filter field type matches stored payload type |
| Chroma collection not found after restart | In-memory client does not persist data | Create the client with chromadb.PersistentClient(path="./chroma_db") |
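Some of these problems can be caught before any data reaches the database. The check below is a minimal sketch that flags vectors with the wrong dimension and all-zero vectors; `validate_vectors` is a hypothetical helper, not part of any client library.

```python
def validate_vectors(vectors, expected_dim):
    """Return (index, problem) pairs for vectors that should not be upserted."""
    problems = []
    for i, vec in enumerate(vectors):
        if len(vec) != expected_dim:
            problems.append((i, f"dimension {len(vec)} != {expected_dim}"))
        elif not any(vec):
            problems.append((i, "all-zero vector"))
    return problems

batch = [[0.1, 0.2, 0.3], [0.4, 0.5], [0.0, 0.0, 0.0]]
problems = validate_vectors(batch, expected_dim=3)
for index, problem in problems:
    print(f"vector {index}: {problem}")
# → vector 1: dimension 2 != 3
# → vector 2: all-zero vector
```

Running a check like this right after the embedding step keeps malformed vectors out of the index, where they are much harder to track down.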
Practice Questions
What algorithm do most vector databases use for approximate nearest neighbor search? HNSW (Hierarchical Navigable Small World) is the most common ANN algorithm; its search cost grows roughly logarithmically with the number of vectors.
How does hybrid search combine vector and keyword search? Hybrid search fuses vector similarity scores and keyword (BM25/TF-IDF) scores into a single ranking, weighted by an alpha parameter; in Weaviate, alpha=1 is pure vector search and alpha=0 is pure keyword search.
Why is metadata filtering important in vector search? Metadata filtering restricts search to relevant subsets (e.g., only "electronics" category), improving relevance and reducing the search space.
What is the trade-off between recall and latency in ANN search? Higher recall requires scanning more vectors, increasing latency; ANN algorithms trade a small recall loss for orders of magnitude speedup.
Challenge: Build a multi-database benchmark that stores the same documents in Pinecone, Qdrant, and Chroma, runs the same 100 queries against all three, and compares latency, recall, and cost per query.
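Recall is the one metric in that comparison that needs a ground truth. A common approach is to compare each database's top-k IDs against the exact top-k from a brute-force scan. The sketch below assumes both result lists are already available; `recall_at_k` and the document IDs are made up for the example.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k that also appears in the approximate top-k."""
    exact_top = set(exact_ids[:k])
    if not exact_top:
        return 0.0
    return len(set(approx_ids[:k]) & exact_top) / len(exact_top)

# Hypothetical results for one query
exact = ["d3", "d7", "d1", "d9", "d4"]   # from a brute-force scan
approx = ["d3", "d1", "d8", "d9", "d2"]  # from an ANN index

print(recall_at_k(approx, exact, k=5))  # → 0.6
print(recall_at_k(approx, exact, k=2))  # → 0.5
```

Averaging this value over all 100 queries gives one recall figure per database, which can then be set against the measured latency to show the trade-off.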
Mini Project
Build a semantic image search engine. Use a CLIP model to generate embeddings for a dataset of images, store them in Qdrant with metadata (filename, date, tags), build a FastAPI endpoint that accepts a text query, embeds it with the same CLIP model, searches Qdrant, and returns the top 5 matching images with similarity scores.
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.