Vector Databases: Pinecone, Weaviate and Chroma for AI Applications

DodaTech Updated 2026-06-22 7 min read

This tutorial introduces vector databases and three widely used options: Pinecone, Weaviate, and Chroma. It covers the key concepts, practical examples, and best practices you need to apply them in AI applications.

Vector databases store and search high-dimensional vector embeddings, the numerical representations that machine learning models produce from data. This enables similarity search at scale for AI applications.

What You'll Learn

In this tutorial, you'll learn how vector databases work, how to use Pinecone, Weaviate, and Chroma for storing embeddings and performing similarity search, and how they power semantic search and RAG systems.

Why It Matters

Traditional databases search by exact match or keyword. Vector databases search by semantic meaning. They find items most similar to a query even when no keywords match. This capability is essential for semantic search, recommendation systems, anomaly detection, and RAG pipelines that retrieve context for LLMs.

Real-World Use

Durga Antivirus Pro uses a vector database to store file behavior embeddings. When a new file is scanned, its behavior vector is compared against known malware and benign file embeddings. Files whose vectors are closest to malware clusters are flagged, catching zero-day threats without signature updates.

Understanding Vector Embeddings

A vector embedding is a list of numbers — typically 384 to 1536 floating-point values — that represents the semantic meaning of data. Two embeddings that are close together in vector space represent similar content. The distance between embeddings is calculated using cosine similarity, Euclidean distance, or dot product. An embedding model like all-MiniLM-L6-v2 converts text into vectors, and the same model must produce embeddings for both your stored data and your queries for meaningful comparison.

from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer('all-MiniLM-L6-v2')

sentences = [
    "Vector databases store embeddings for fast similarity search",
    "Pinecone is a managed vector database service",
    "Chroma is an open-source embedding database"
]

embeddings = model.encode(sentences)
print(f"Shape: {embeddings.shape}")
print(f"Dimension: {embeddings.shape[1]}")
print(f"First vector (first 5 values): {embeddings[0][:5]}")

Expected output (the vector values shown are illustrative and will differ on your machine):

Shape: (3, 384)
Dimension: 384
First vector (first 5 values): [-0.0423  0.0581 -0.0217  0.0339 -0.0112]
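
The three distance measures named above can be worked through by hand on toy vectors. The following is a minimal sketch in plain Python; the three-dimensional vectors and the helper names are made up for illustration, and real embeddings would come from a model such as all-MiniLM-L6-v2.

```python
import math

def dot(a, b):
    """Dot product: sum of element-wise products (larger = more similar)."""
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    """Straight-line distance between two points (smaller = more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Hypothetical toy "embeddings"
query = [1.0, 0.0, 0.0]
same_direction = [2.0, 0.0, 0.0]   # same direction, larger magnitude
orthogonal = [0.0, 1.0, 0.0]       # unrelated direction

print(cosine_similarity(query, same_direction))  # 1.0
print(cosine_similarity(query, orthogonal))      # 0.0
print(euclidean(query, same_direction))          # 1.0
print(dot(query, same_direction))                # 2.0
```

Cosine similarity ignores vector length, so it treats `query` and `same_direction` as identical, while Euclidean distance and dot product are both affected by magnitude. This is why the choice of metric matters when embeddings are not normalized.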

How Vector Databases Work

Vector databases index vectors using approximate nearest neighbor (ANN) algorithms such as HNSW (Hierarchical Navigable Small World) or IVF (Inverted File Index). These algorithms trade a small amount of recall for large speed gains, searching millions of vectors in milliseconds instead of seconds. IVF partitions the vector space into clusters and HNSW builds a navigable graph, so in both cases a query examines only a fraction of the stored vectors. When you insert vectors, the database builds and maintains this index structure automatically.

flowchart TD
  A[Raw Data] --> B[Embedding Model]
  B --> C[Vector Embeddings]
  C --> D[Vector Database Index]
  E[Query] --> F[Embed Query]
  F --> G[ANN Search]
  D --> G
  G --> H[Top-K Results]
  H --> I[RAG / Semantic Search]
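
To make the trade-off between exact and approximate search concrete, here is a rough sketch of the IVF idea in plain Python. It is a simplification under stated assumptions: centroids are sampled from the data rather than trained with k-means, the data is random, and the names `build_ivf`, `ivf_top_k`, and `n_probe` are illustrative rather than taken from any particular library.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def exact_top_k(vectors, query, k):
    """Brute force: score every stored vector against the query."""
    return sorted(range(len(vectors)), key=lambda i: -dot(vectors[i], query))[:k]

def build_ivf(vectors, n_lists, rng):
    """Assign each vector to its most similar centroid (one inverted list per centroid).

    Assumption: centroids are sampled from the data; real IVF indexes
    usually train them with k-means.
    """
    centroids = [vectors[i] for i in rng.sample(range(len(vectors)), n_lists)]
    lists = [[] for _ in range(n_lists)]
    for i, v in enumerate(vectors):
        best = max(range(n_lists), key=lambda c: dot(v, centroids[c]))
        lists[best].append(i)
    return centroids, lists

def ivf_top_k(vectors, centroids, lists, query, k, n_probe):
    """Search only the n_probe lists whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: -dot(centroids[c], query))[:n_probe]
    candidates = [i for c in probe for i in lists[c]]
    candidates.sort(key=lambda i: -dot(vectors[i], query))
    return candidates[:k], len(candidates)

rng = random.Random(0)
vectors = [normalize([rng.gauss(0, 1) for _ in range(16)]) for _ in range(1000)]
query = normalize([rng.gauss(0, 1) for _ in range(16)])
centroids, lists = build_ivf(vectors, n_lists=10, rng=rng)

exact = exact_top_k(vectors, query, k=5)
approx, scanned = ivf_top_k(vectors, centroids, lists, query, k=5, n_probe=2)
full, scanned_all = ivf_top_k(vectors, centroids, lists, query, k=5, n_probe=10)

print(f"Exact top-5: {exact}")
print(f"IVF top-5:   {approx} (scanned {scanned} of {len(vectors)} vectors)")
print(f"Recall@5:    {len(set(exact) & set(approx)) / 5:.1f}")
```

Probing more lists raises recall at the cost of scanning more vectors; probing all of them reproduces the exact result. Production indexes expose a similar knob to tune the balance between speed and accuracy.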

Working with Chroma

Chroma is an open-source embedding database that can run locally inside your Python process, with no separate server or external service to set up. It stores documents alongside their embeddings and metadata. You can create collections, insert documents, and query by similarity in a few lines of code. Chroma automatically handles the embedding step if you pass text directly, or you can provide precomputed embeddings for custom models.

import chromadb
from chromadb.utils import embedding_functions

client = chromadb.Client()
sentence_transformer_ef = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name='all-MiniLM-L6-v2'
)

collection = client.create_collection(
    name='docs',
    embedding_function=sentence_transformer_ef
)

collection.add(
    documents=[
        "Vector databases enable semantic search",
        "Similarity search finds nearest neighbors",
        "RAG retrieves context for LLM generation"
    ],
    ids=['doc1', 'doc2', 'doc3']
)

results = collection.query(
    query_texts=["How do vector databases find similar content?"],
    n_results=2
)

print(f"Query results: {results['documents'][0]}")
print(f"Distances: {results['distances'][0]}")

Expected output (distance values are illustrative; results are returned nearest first, so distances ascend):

Query results: ['Vector databases enable semantic search', 'Similarity search finds nearest neighbors']
Distances: [0.654, 0.789]

Working with Pinecone

Pinecone is a fully managed vector database that scales to billions of vectors. You create an index, specify the dimension and similarity metric, and Pinecone handles infrastructure, replication, and indexing. The serverless option scales to zero when unused and automatically provisions resources on demand. Pinecone supports metadata filtering, allowing you to combine vector similarity with structured filters like category or date range.

from pinecone import Pinecone, ServerlessSpec
import os

pc = Pinecone(api_key=os.environ['PINECONE_API_KEY'])

if 'example-index' not in pc.list_indexes().names():
    pc.create_index(
        name='example-index',
        dimension=384,
        metric='cosine',
        spec=ServerlessSpec(cloud='aws', region='us-east-1')
    )

index = pc.Index('example-index')

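# `embeddings` is the array produced by model.encode(sentences) in the
# Sentence Transformers example earlier; its dimension (384) must match the index.
vectors = [
    {"id": "vec1", "values": embeddings[0].tolist()},
    {"id": "vec2", "values": embeddings[1].tolist()},
    {"id": "vec3", "values": embeddings[2].tolist()}
]
index.upsert(vectors=vectors)

# Freshly upserted vectors can take a moment to become queryable.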

query_results = index.query(
    vector=embeddings[0].tolist(),
    top_k=2,
    include_values=False
)

print(f"Matches: {query_results['matches']}")

Expected output (abridged; the score of the second match is illustrative):

Matches: [{'id': 'vec1', 'score': 1.0}, {'id': 'vec2', 'score': 0.43}]
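
The combination of vector similarity and structured filters described above can be illustrated without any service. This is a plain-Python sketch of the idea, not Pinecone's API (Pinecone expresses filters through a `filter` argument on `index.query`); the records, the metadata fields, and the `filtered_query` helper are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical records: unit-length toy vectors with metadata attached
records = [
    {"id": "a", "values": [1.0, 0.0], "metadata": {"category": "news", "year": 2024}},
    {"id": "b", "values": [0.8, 0.6], "metadata": {"category": "blog", "year": 2025}},
    {"id": "c", "values": [0.6, 0.8], "metadata": {"category": "news", "year": 2025}},
    {"id": "d", "values": [0.0, 1.0], "metadata": {"category": "news", "year": 2023}},
]

def filtered_query(records, vector, top_k, where):
    """Keep records whose metadata satisfies `where`, then rank them by similarity."""
    candidates = [r for r in records if where(r["metadata"])]
    ranked = sorted(candidates, key=lambda r: dot(r["values"], vector), reverse=True)
    return [(r["id"], round(dot(r["values"], vector), 2)) for r in ranked[:top_k]]

query = [1.0, 0.0]

# No filter: the two most similar records overall
print(filtered_query(records, query, 2, lambda m: True))
# [('a', 1.0), ('b', 0.8)]

# Only news items from 2024 onward: 'b' is excluded even though it is similar
print(filtered_query(records, query, 2,
                     lambda m: m["category"] == "news" and m["year"] >= 2024))
# [('a', 1.0), ('c', 0.6)]
```

Filtering first guarantees that every returned item satisfies the conditions, which is usually what you want when top-K slots are limited.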

Working with Weaviate

Weaviate is an open-source vector database with built-in modules for automatic vectorization, question answering, and generative search. You define a schema of collections (called classes in older versions) and their properties, and Weaviate can vectorize data automatically when a vectorizer module is configured. Its GraphQL API supports hybrid search combining vector similarity with keyword BM25 ranking. Weaviate also supports multi-tenancy, sharding, and replication for production deployments.

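import weaviate
import weaviate.classes as wvc
from sentence_transformers import SentenceTransformer

# Same model as the earlier examples, so stored and query vectors are comparable
model = SentenceTransformer('all-MiniLM-L6-v2')

client = weaviate.connect_to_local()

try:
    # Start from a clean collection so the example can be re-run
    if client.collections.exists("Document"):
        client.collections.delete("Document")

    # Vectorizer.none() tells Weaviate not to embed text itself,
    # so we supply our own vectors on insert and on query
    collection = client.collections.create(
        name="Document",
        vectorizer_config=wvc.config.Configure.Vectorizer.none(),
        properties=[
            wvc.config.Property(name="title", data_type=wvc.config.DataType.TEXT),
            wvc.config.Property(name="content", data_type=wvc.config.DataType.TEXT)
        ]
    )

    content = "Vector databases store embeddings for AI similarity search"
    collection.data.insert(
        properties={"title": "Vector DB Guide", "content": content},
        vector=model.encode(content).tolist()
    )

    # near_text requires a vectorizer module; without one, search by vector
    response = collection.query.near_vector(
        near_vector=model.encode("storing embeddings").tolist(),
        limit=2,
        return_metadata=wvc.query.MetadataQuery(distance=True)
    )

    for obj in response.objects:
        print(f"Title: {obj.properties['title']}, Distance: {obj.metadata.distance:.3f}")
finally:
    client.close()  # release the connection

Expected output (the distance value is shown as a placeholder because it will vary; smaller means more similar):

Title: Vector DB Guide, Distance: 0.xxx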
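
Hybrid search, mentioned above, blends a vector score with a keyword score. The sketch below is a simplified illustration of weighted score fusion in plain Python, not Weaviate's implementation; the scores are made up, and the helper names are hypothetical. Weaviate exposes a comparable weighting through the `alpha` argument of its hybrid query.

```python
def min_max(scores):
    """Rescale scores to [0, 1] so the two score types are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def hybrid_rank(vector_scores, keyword_scores, alpha):
    """alpha=1.0 is pure vector search; alpha=0.0 is pure keyword search."""
    v, kw = min_max(vector_scores), min_max(keyword_scores)
    fused = {
        doc: alpha * v.get(doc, 0.0) + (1 - alpha) * kw.get(doc, 0.0)
        for doc in set(v) | set(kw)
    }
    return [doc for doc, _ in sorted(fused.items(), key=lambda item: item[1], reverse=True)]

# Hypothetical scores for the same query from two retrievers
vector_scores = {"doc1": 1.0, "doc2": 0.75, "doc3": 0.5}    # e.g. cosine similarity
keyword_scores = {"doc2": 6.0, "doc3": 8.0, "doc4": 0.0}    # e.g. BM25

print(hybrid_rank(vector_scores, keyword_scores, alpha=0.75))
# ['doc1', 'doc2', 'doc3', 'doc4']
print(hybrid_rank(vector_scores, keyword_scores, alpha=0.25))
# ['doc3', 'doc2', 'doc1', 'doc4']
```

Shifting `alpha` toward keyword matching promotes `doc3`, which matched the query terms strongly but was semantically weaker. This shows how hybrid search can recover exact-term matches that pure vector search ranks low.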

Vector Database Comparison

Feature	Chroma	Pinecone	Weaviate
Hosting	Local / embedded	Managed cloud	Self-hosted / cloud
Setup	Zero config	API key required	Docker / Kubernetes
Scaling	Single node	Billions of vectors	Sharded clusters
Built-in embedding	Yes (embedding functions)	No (bring your own)	Yes (modules)
Metadata filtering	Yes	Yes	Yes
Cost	Free	Pay per usage	Free tier available

Common Errors and Mistakes

Mistake	Why It Happens	How to Fix
Embedding dimension mismatch	Different models produce different dimensions	Use the same model for indexing and querying
Wrong similarity metric	Cosine, Euclidean, and dot product can rank results differently	Use the metric the embedding model was designed for (often cosine) and set it explicitly, since database defaults differ
Not normalizing vectors	Some databases assume unit-length vectors	Normalize embeddings before insertion
Index not populated	Querying before vectors finish upserting	Wait for index to show ready status
Missing metadata filters	Irrelevant results pollute top-K	Apply filters for category, date, or source
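
Two of the mistakes in the table, dimension mismatch and unnormalized vectors, can be caught with a small guard before insertion. This is a minimal sketch; `prepare_vector` and `EXPECTED_DIM` are hypothetical names, and the dimension of 4 is chosen only to keep the example readable (all-MiniLM-L6-v2 would use 384).

```python
import math

EXPECTED_DIM = 4  # hypothetical; must match the dimension of your index

def prepare_vector(vector, expected_dim=EXPECTED_DIM):
    """Validate the dimension, then scale the vector to unit length."""
    if len(vector) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(vector)}")
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]

unit = prepare_vector([3.0, 0.0, 4.0, 0.0])
print(unit)  # [0.6, 0.0, 0.8, 0.0]

try:
    prepare_vector([1.0, 2.0])
except ValueError as err:
    print(err)  # expected 4 dimensions, got 2
```

Once vectors are unit length, dot product and cosine similarity give the same ranking, which removes one source of surprise when a database defaults to a different metric.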

Practice Questions

What is the difference between exact nearest neighbor search and approximate nearest neighbor search?

Answer: Exact search compares a query against every vector (guarantees accuracy but is O(n)). ANN uses indexing structures like HNSW to examine only a subset of vectors, trading negligible accuracy for orders-of-magnitude speed improvement.

Why must the same embedding model be used for indexing and querying?

Answer: Different models produce embeddings in different vector spaces with different dimensionalities and semantic distributions. Cosine similarity is only meaningful when vectors are embedded by the same model.

What is the role of metadata filtering in vector search?

Answer: Metadata filtering restricts results to records whose metadata matches given conditions, typically narrowing the search space before or during similarity computation. For example, you can filter by date range so that results come only from relevant documents, while vector similarity ranks the items within that subset.

How does HNSW indexing accelerate vector search?

Answer: HNSW builds a multi-layer graph in which each higher layer contains fewer nodes. Search starts at the top layer (fewest nodes), greedily moves toward the query, and then descends layer by layer to refine the result. This gives roughly logarithmic search time instead of a linear scan of all vectors.

What is hybrid search and why is it useful?

Answer: Hybrid search combines vector similarity (semantic) with keyword matching (BM25). It captures both semantic meaning and exact term matches, improving results for queries with specific terminology that vector search might miss.
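
The layered search described in the HNSW answer above can be sketched on a toy example. This is a heavy simplification under stated assumptions: the "vectors" are 64 points on a line, layers are built by taking every 16th and every 4th point, and the search keeps a single candidate. Real HNSW assigns nodes to layers randomly, works in high dimensions, and tracks a list of candidates. All names here are illustrative.

```python
def greedy_search(graph, points, query, entry):
    """Move to whichever neighbor is closer to the query until none is."""
    current, hops = entry, 0
    while True:
        best = min(graph[current], key=lambda n: abs(points[n] - query), default=current)
        if abs(points[best] - query) >= abs(points[current] - query):
            return current, hops
        current, hops = best, hops + 1

def line_graph(node_ids):
    """Link each node to its immediate neighbors in a sorted list of ids."""
    graph = {n: [] for n in node_ids}
    for left, right in zip(node_ids, node_ids[1:]):
        graph[left].append(right)
        graph[right].append(left)
    return graph

def layered_search(layers, points, query, entry=0):
    """Run greedy search on each layer, reusing the result as the next entry point."""
    current, total_hops = entry, 0
    for graph in layers:
        current, hops = greedy_search(graph, points, query, current)
        total_hops += hops
    return current, total_hops

points = [float(i) for i in range(64)]     # hypothetical 1-D "vectors"
layers = [
    line_graph(list(range(0, 64, 16))),    # top layer: 4 nodes
    line_graph(list(range(0, 64, 4))),     # middle layer: 16 nodes
    line_graph(list(range(64))),           # bottom layer: all 64 nodes
]
query = 42.3

print(layered_search(layers, points, query))            # (42, 6)
print(greedy_search(layers[-1], points, query, 0))      # (42, 42)
```

Both searches find node 42, but the layered version needs 6 hops while walking the bottom layer alone needs 42. The sparse upper layers cover long distances cheaply, and the dense bottom layer handles the final refinement.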

Challenge

Build a document search system using Chroma. Ingest 50 documents (or text files), create embeddings using Sentence Transformers, store them in Chroma with metadata (category, date), and build a query interface that supports both similarity search and metadata filtering. Compare results with and without hybrid search.

Real-World Task

Design a vector database pipeline for a news article recommendation system. Articles are embedded as they are published and stored in Pinecone with metadata (topic, publish date, source). When a user reads an article, embed it, query the nearest neighbors, and recommend similar articles. Exclude the current article and filter by the user's preferred topics.

Next Steps

Apply vector databases in a RAG system by completing the Building RAG Systems tutorial. Explore Docker for self-hosting Weaviate and Kubernetes for scaling vector databases in production.

What is a vector embedding in machine learning?

A vector embedding is a numerical representation of data (text, image, audio) as a list of floating-point numbers. Similar items have embeddings that are close together in vector space, enabling similarity search by measuring distance between vectors.

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.
