Vector Databases: Pinecone, Weaviate and Chroma for AI Applications
Vector databases store and search high-dimensional vector embeddings, enabling similarity search at scale for AI applications. The embeddings come from machine learning models that convert data into numerical representations.
What You'll Learn
In this tutorial, you'll learn how vector databases work, how to use Pinecone, Weaviate, and Chroma for storing embeddings and performing similarity search, and how they power semantic search and RAG systems.
Why It Matters
Traditional databases search by exact match or keyword. Vector databases search by semantic meaning. They find items most similar to a query even when no keywords match. This capability is essential for semantic search, recommendation systems, anomaly detection, and RAG pipelines that retrieve context for LLMs.
Real-World Use
Durga Antivirus Pro uses a vector database to store file behavior embeddings. When a new file is scanned, its behavior vector is compared against known malware and benign file embeddings. Files whose vectors are closest to malware clusters are flagged, catching zero-day threats without signature updates.
Understanding Vector Embeddings
A vector embedding is a list of numbers — typically 384 to 1536 floating-point values — that represents the semantic meaning of data. Two embeddings that are close together in vector space represent similar content. The distance between embeddings is calculated using cosine similarity, Euclidean distance, or dot product. An embedding model like all-MiniLM-L6-v2 converts text into vectors, and the same model must produce embeddings for both your stored data and your queries for meaningful comparison.
from sentence_transformers import SentenceTransformer

# Load a small general-purpose model that outputs 384-dimensional vectors
model = SentenceTransformer('all-MiniLM-L6-v2')

sentences = [
    "Vector databases store embeddings for fast similarity search",
    "Pinecone is a managed vector database service",
    "Chroma is an open-source embedding database"
]

# encode() returns a NumPy array with one row per sentence
embeddings = model.encode(sentences)
print(f"Shape: {embeddings.shape}")
print(f"Dimension: {embeddings.shape[1]}")
print(f"First vector (first 5 values): {embeddings[0][:5]}")
Expected output (the vector values shown are illustrative):
Shape: (3, 384)
Dimension: 384
First vector (first 5 values): [-0.0423 0.0581 -0.0217 0.0339 -0.0112]
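To make the three distance measures mentioned above concrete, here is a minimal pure-Python sketch. The three-dimensional vectors `a`, `b`, and `c` are hypothetical stand-ins for real embeddings, which have hundreds of dimensions.

```python
import math

def dot(a, b):
    """Dot product: sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))

def euclidean_distance(a, b):
    """Straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Hypothetical toy "embeddings" chosen so the results are easy to check by hand
a = [1.0, 2.0, 2.0]
b = [2.0, 4.0, 4.0]   # same direction as a, twice the length
c = [2.0, -1.0, 0.0]  # orthogonal to a

print(cosine_similarity(a, b))   # 1.0 (identical direction)
print(euclidean_distance(a, b))  # 3.0 (the lengths still differ)
print(dot(a, c))                 # 0.0 (orthogonal vectors)
```

Note how `a` and `b` count as identical under cosine similarity but not under Euclidean distance. This is why the metric chosen for an index should match how the embedding model was intended to be compared.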
How Vector Databases Work
Vector databases index vectors using approximate nearest neighbor (ANN) algorithms like HNSW (Hierarchical Navigable Small World) or IVF (Inverted File Index). These algorithms trade a small amount of accuracy for large speed gains, searching millions of vectors in milliseconds instead of seconds. The index organizes the vectors, either by partitioning the vector space (IVF) or by linking nearby vectors in a graph (HNSW), so queries only need to examine a fraction of stored vectors. When you insert vectors, the database builds and maintains this index structure automatically.
flowchart TD
    A[Raw Data] --> B[Embedding Model]
    B --> C[Vector Embeddings]
    C --> D[Vector Database Index]
    E[Query] --> F[Embed Query]
    F --> G[ANN Search]
    D --> G
    G --> H[Top-K Results]
    H --> I[RAG / Semantic Search]
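The trade-off between exact and approximate search can be illustrated with a toy IVF-style index in pure Python. This is a simplified sketch, not how any of the databases in this tutorial implement their indexes: the centroids are simply random stored vectors (real IVF indexes usually train them with k-means), and the names `build_ivf`, `ivf_search`, and `n_probe` are chosen for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance (enough for ranking)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def exact_search(vectors, query, k):
    """Brute force: score every stored vector, O(n) per query."""
    ranked = sorted(range(len(vectors)), key=lambda i: sq_dist(vectors[i], query))
    return ranked[:k]

def build_ivf(vectors, n_cells, rng):
    """Pick random stored vectors as centroids, then bucket each vector by nearest centroid."""
    centroids = rng.sample(vectors, n_cells)
    cells = [[] for _ in range(n_cells)]
    for i, v in enumerate(vectors):
        nearest = min(range(n_cells), key=lambda c: sq_dist(centroids[c], v))
        cells[nearest].append(i)
    return centroids, cells

def ivf_search(vectors, centroids, cells, query, k, n_probe):
    """Scan only the n_probe cells whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: sq_dist(centroids[c], query))[:n_probe]
    candidates = [i for c in probe for i in cells[c]]
    candidates.sort(key=lambda i: sq_dist(vectors[i], query))
    return candidates[:k], len(candidates)

rng = random.Random(0)  # fixed seed so runs are repeatable
vectors = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(2000)]
query = [rng.gauss(0, 1) for _ in range(8)]

centroids, cells = build_ivf(vectors, n_cells=20, rng=rng)
exact = exact_search(vectors, query, k=5)
approx, scanned = ivf_search(vectors, centroids, cells, query, k=5, n_probe=4)

print(f"exact top-5:  {exact}")
print(f"approx top-5: {approx} (scanned {scanned} of {len(vectors)} vectors)")
print(f"recall: {len(set(exact) & set(approx)) / 5:.1f}")
```

Raising `n_probe` scans more cells and moves recall toward 1.0 at the cost of speed. Probing every cell reproduces the exact result.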
Working with Chroma
Chroma is an open-source embedding database that can run in-process on your machine, with no separate server to set up. It stores documents alongside their embeddings and metadata. You can create collections, insert documents, and query by similarity in a few lines of code. Chroma handles the embedding step automatically if you pass text directly, or you can provide precomputed embeddings from a custom model.
import chromadb
from chromadb.utils import embedding_functions

# In-memory client: data is lost when the process exits
client = chromadb.Client()

# Embed documents and queries with the same Sentence Transformers model
sentence_transformer_ef = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name='all-MiniLM-L6-v2'
)

collection = client.create_collection(
    name='docs',
    embedding_function=sentence_transformer_ef
)

# Chroma embeds the raw text on insert using the collection's embedding function
collection.add(
    documents=[
        "Vector databases enable semantic search",
        "Similarity search finds nearest neighbors",
        "RAG retrieves context for LLM generation"
    ],
    ids=['doc1', 'doc2', 'doc3']
)

# The query text is embedded the same way, then matched against stored vectors
results = collection.query(
    query_texts=["How do vector databases find similar content?"],
    n_results=2
)

print(f"Query results: {results['documents'][0]}")
print(f"Distances: {results['distances'][0]}")
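Expected output (exact distances depend on the model and Chroma version; results are ordered from smallest distance to largest):
Query results: ['Vector databases enable semantic search', 'Similarity search finds nearest neighbors']
Distances: [0.654, 0.789]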
Working with Pinecone
Pinecone is a fully managed vector database that scales to billions of vectors. You create an index, specify the dimension and similarity metric, and Pinecone handles infrastructure, replication, and indexing. The serverless option scales to zero when unused and automatically provisions resources on demand. Pinecone supports metadata filtering, allowing you to combine vector similarity with structured filters like category or date range.
from pinecone import Pinecone, ServerlessSpec
import os

# Read the API key from the environment instead of hard-coding it
pc = Pinecone(api_key=os.environ['PINECONE_API_KEY'])

# Create the index once; the dimension must match the embedding model (384 here)
if 'example-index' not in pc.list_indexes().names():
    pc.create_index(
        name='example-index',
        dimension=384,
        metric='cosine',
        spec=ServerlessSpec(cloud='aws', region='us-east-1')
    )

index = pc.Index('example-index')

# `embeddings` is the array produced in the Sentence Transformers example above
vectors = [
    {"id": "vec1", "values": embeddings[0].tolist()},
    {"id": "vec2", "values": embeddings[1].tolist()},
    {"id": "vec3", "values": embeddings[2].tolist()}
]
index.upsert(vectors=vectors)

# Freshly upserted vectors can take a moment to become searchable
query_results = index.query(
    vector=embeddings[0].tolist(),
    top_k=2,
    include_values=False
)

print(f"Matches: {query_results['matches']}")
Expected output:
Matches: [{'id': 'vec1', 'score': 1.0}, {'id': 'vec2', 'score': 0.43}]
Working with Weaviate
Weaviate is an open-source vector database with built-in modules for automatic vectorization, question answering, and generative search. You define a schema with classes and properties, and Weaviate auto-vectorizes data using configured modules. Its GraphQL API supports hybrid search combining vector similarity with keyword BM25 ranking. Weaviate also supports multi-tenancy, sharding, and replication for production deployments.
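import weaviate
import weaviate.classes as wvc
from sentence_transformers import SentenceTransformer

# Assumes the v4 Python client and a Weaviate instance running locally
client = weaviate.connect_to_local()
model = SentenceTransformer('all-MiniLM-L6-v2')

# Start from a clean collection so the example can be rerun
if client.collections.exists("Document"):
    client.collections.delete("Document")

collection = client.collections.create(
    name="Document",
    # No server-side vectorizer: vectors are supplied by the client
    vectorizer_config=wvc.config.Configure.Vectorizer.none(),
    properties=[
        wvc.config.Property(name="title", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="content", data_type=wvc.config.DataType.TEXT)
    ]
)

content = "Vector databases store embeddings for AI similarity search"
collection.data.insert(
    properties={"title": "Vector DB Guide", "content": content},
    vector=model.encode(content).tolist()
)

# near_text needs a vectorizer module, so embed the query and use near_vector
response = collection.query.near_vector(
    near_vector=model.encode("storing embeddings").tolist(),
    limit=2,
    return_metadata=wvc.query.MetadataQuery(distance=True)
)

for obj in response.objects:
    # Cosine distance is Weaviate's default metric; 1 - distance gives a similarity score
    print(f"Title: {obj.properties['title']}, Score: {1 - obj.metadata.distance:.3f}")

client.close()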
Expected output:
Title: Vector DB Guide, Score: 0.891
Vector Database Comparison
| Feature | Chroma | Pinecone | Weaviate |
|---|---|---|---|
| Hosting | Local / embedded | Managed cloud | Self-hosted / cloud |
| Setup | Zero config | API key required | Docker / Kubernetes |
| Scaling | Single node | Billions of vectors | Sharded clusters |
| Built-in embedding | Yes (embedding functions) | No (bring your own) | Yes (modules) |
| Metadata filtering | Yes | Yes | Yes |
| Cost | Free (open source) | Pay per usage | Free when self-hosted; free tier on managed cloud |
Common Errors and Mistakes
| Mistake | Why It Happens | How to Fix |
|---|---|---|
| Embedding dimension mismatch | Different models produce different dimensions | Use the same model for indexing and querying |
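| Wrong similarity metric | Cosine and Euclidean distance can produce different rankings | Match the metric to your embedding model, and check each database's default instead of assuming cosine (Chroma, for example, defaults to squared L2) |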
| Not normalizing vectors | Some databases assume unit-length vectors | Normalize embeddings before insertion |
| Index not populated | Querying before vectors finish upserting | Wait for index to show ready status |
| Missing metadata filters | Irrelevant results pollute top-K | Apply filters for category, date, or source |
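The first and third rows of the table can be guarded against with a small check before insertion. The helper below is a hypothetical sketch (the name `prepare_vector` and its parameters are not part of any database client): it rejects vectors with the wrong dimension and scales valid ones to unit length.

```python
import math

def prepare_vector(vec, expected_dim):
    """Validate the dimension, then scale the vector to unit length."""
    if len(vec) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(vec)}")
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

# A 2-dimensional toy vector with length 5, so the result is easy to verify
unit = prepare_vector([3.0, 4.0], expected_dim=2)
print(unit)  # [0.6, 0.8]
```

With Sentence Transformers you can get the same normalization by passing `normalize_embeddings=True` to `model.encode()`. For unit-length vectors, the dot product and cosine similarity give the same value.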
Practice Questions
- What is the difference between exact nearest neighbor search and approximate nearest neighbor search?
Answer: Exact search compares a query against every vector (guarantees accuracy but is O(n)). ANN uses indexing structures like HNSW to examine only a subset of vectors, trading negligible accuracy for orders-of-magnitude speed improvement.
- Why must the same embedding model be used for indexing and querying?
Answer: Different models produce embeddings in different vector spaces with different dimensionalities and semantic distributions. Cosine similarity is only meaningful when vectors are embedded by the same model.
- What is the role of metadata filtering in vector search?
Answer: Metadata filtering restricts results to vectors whose metadata matches a condition, so similarity ranking applies only within that subset. For example, you can filter by date range so that results come only from recent documents. Depending on the database, the filter is applied before or during the similarity search.
- How does HNSW indexing accelerate vector search?
Answer: HNSW builds a multi-layer graph in which each higher layer contains fewer nodes. Search starts at the top layer (fewest nodes), greedily moves toward the query, and descends layer by layer to refine the result. This gives roughly logarithmic search time instead of a linear scan of all vectors.
- What is hybrid search and why is it useful?
Answer: Hybrid search combines vector similarity (semantic) with keyword matching (BM25). It captures both semantic meaning and exact term matches, improving results for queries with specific terminology that vector search might miss.
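The hybrid search idea from the last answer can be sketched by fusing two ranked lists. The example uses reciprocal rank fusion, one common fusion method; it is an illustration only and not necessarily the scoring that any particular database applies. The document IDs and both input rankings are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked ID lists: each list adds 1 / (k + rank) to a document's score."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results from a vector search and a keyword (BM25) search
vector_ranking = ["doc2", "doc1", "doc3"]
keyword_ranking = ["doc3", "doc2", "doc4"]

fused = reciprocal_rank_fusion([vector_ranking, keyword_ranking])
print(fused)  # ['doc2', 'doc3', 'doc1', 'doc4']
```

Documents that appear in both lists (`doc2`, `doc3`) rise above documents found by only one method, which is the behavior hybrid search is meant to provide.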
Challenge
Build a document search system using Chroma. Ingest 50 documents (or text files), create embeddings using Sentence Transformers, store them in Chroma with metadata (category, date), and build a query interface that supports both similarity search and metadata filtering. Compare results with and without hybrid search.
Real-World Task
Design a vector database pipeline for a news article recommendation system. Articles are embedded as they are published and stored in Pinecone with metadata (topic, publish date, source). When a user reads an article, embed it, query the nearest neighbors, and recommend similar articles. Exclude the current article and filter by the user's preferred topics.
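Before wiring this up to Pinecone, the ranking logic can be prototyped in memory. The sketch below is a starting point under stated assumptions: the `articles` dictionary, its two-dimensional vectors, and the `recommend` function are all hypothetical, and a plain dictionary stands in for the vector index.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def recommend(articles, current_id, preferred_topics, top_k=2):
    """Rank other articles in the preferred topics by similarity to the current one."""
    current = articles[current_id]
    candidates = [
        (cosine(current["vector"], art["vector"]), art_id)
        for art_id, art in articles.items()
        # Skip the article being read and anything outside the user's topics
        if art_id != current_id and art["topic"] in preferred_topics
    ]
    candidates.sort(reverse=True)
    return [art_id for _, art_id in candidates[:top_k]]

# Hypothetical articles with toy 2-dimensional embeddings
articles = {
    "a1": {"vector": [1.0, 0.0], "topic": "tech"},
    "a2": {"vector": [0.9, 0.1], "topic": "tech"},
    "a3": {"vector": [0.0, 1.0], "topic": "tech"},
    "a4": {"vector": [1.0, 0.05], "topic": "sports"},
}

print(recommend(articles, "a1", {"tech"}))  # ['a2', 'a3']
```

Here `a4` is very close to `a1` but is excluded because its topic is not preferred. In a Pinecone-backed version, the topic condition would go in the `filter` argument of `index.query`, and the current article's ID would be dropped from the returned matches.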
Next Steps
Apply vector databases in a RAG system by completing the Building RAG Systems tutorial. Explore Docker for self-hosting Weaviate and Kubernetes for scaling vector databases in production.
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.