Skip to content

19 Test Data Management

DodaTech 3 min read

title: "Test Data Management" description: "Manage test data for API testing including factory patterns, database seeding, test isolation, cleanup strategies, and data versioning for consistent and reliable automated API tests." weight: 19 date: 2026-06-28 lastmod: 2026-06-28 tags: [api-development, testing] }

Test data management ensures each test has the data it needs without depending on other tests. Strategies include factories for creating data, database seeding for common scenarios, transaction rollback for cleanup, and data versioning for reproducibility.

What You'll Learn

  • Test data factory patterns
  • Database seeding and migrations
  • Transaction-based test isolation
  • Data cleanup strategies
  • Data versioning for reproducibility

Why It Matters

Shared mutable state is the #1 cause of flaky tests. Proper test data management ensures tests are independent, repeatable, and maintainable. It prevents cascade failures where one test breaks another.

Real-World Use

Rails' FactoryBot pioneered test data factories. Django's test framework uses database transactions for isolation. Spring Boot's @DataJpaTest rolls back transactions automatically.

flowchart TD
    Test[Test Case] --> Factory[Data Factory]
    Factory --> Create[Create Test Data]
    Create --> Execute[Run Test]
    Execute --> Assert[Assertions]
    Execute --> Cleanup[Cleanup]
    Cleanup --> Rollback[Transaction Rollback]
    Cleanup --> Delete[Delete Created Records]

Teacher Mindset

Each test should create its own data and clean up after itself. Use factories for readable data creation. Use transactions for automatic cleanup. Never share mutable test data between tests.

Code Examples

// Example 1: Factory function pattern
const { faker } = require('@faker-js/faker');
const db = require('../db');

async function createUser(overrides = {}) {
  const user = {
    name: faker.person.fullName(),
    email: faker.internet.email(),
    role: 'user',
    ...overrides
  };

  const [id] = await db('users').insert(user).returning('id');
  return { id, ...user };
}

async function createProduct(overrides = {}) {
  return db('products').insert({
    name: faker.commerce.productName(),
    price: faker.commerce.price(),
    category: faker.commerce.department(),
    ...overrides
  }).returning('*');
}

// Test usage
test('user can create order', async () => {
  const user = await createUser();
  const product = await createProduct();

  const res = await request(app)
    .post('/api/orders')
    .set('Authorization', `Bearer ${token}`)
    .send({ userId: user.id, productId: product.id });

  expect(res.status).toBe(201);
});
# Example 2: pytest fixtures with data factories
import pytest
from factory import Factory, Faker, django

class UserFactory(Factory):
    class Meta:
        model = User

    name = Faker('name')
    email = Faker('email')
    role = 'user'

@pytest.fixture
def user():
    return UserFactory()

@pytest.fixture
def admin_user():
    return UserFactory(role='admin')

@pytest.fixture
def user_token(user):
    return create_token(user.id)

@pytest.mark.django_db
def test_user_can_view_profile(client, user, user_token):
    response = client.get(
        '/api/users/me',
        HTTP_AUTHORIZATION=f'Bearer {user_token}'
    )
    assert response.status_code == 200
    assert response.json()['email'] == user.email
// Example 3: Transaction-based cleanup
const { Transaction } = require('knex');

beforeEach(async () => {
  // Start a transaction
  trx = await db.transaction();
});

afterEach(async () => {
  // Rollback the transaction, cleaning all test data
  await trx.rollback();
});

test('creates user in transaction', async () => {
  const [user] = await trx('users').insert({
    name: 'Test User',
    email: 'test@test.com'
  }).returning('*');

  const res = await request(app)
    .get(`/api/users/${user.id}`)
    .expect(200);

  expect(res.body.name).toBe('Test User');
  // Transaction will be rolled back
});

Common Mistakes

  • Sharing test data between tests through module-level variables
  • Depending on specific database IDs that change between runs
  • Not cleaning up test data, causing database bloat
  • Using hardcoded data instead of factories
  • Running tests in parallel without database isolation

Practice

  1. Create a factory function for generating user test data.
  2. Write a test that creates its own data using the factory.
  3. Implement transaction-based cleanup in your test setup.
  4. Add a test that depends on specific data (admin user) using fixtures.
  5. Challenge: Build a data factory system that handles relationships (user with orders).

FAQ

What is the best test data strategy?

Each test creates its own data using factories. Use transactions for automatic cleanup. Never share data between tests.

Should I use database seeds for tests?

Seeds are for development. Tests should create data explicitly, not depend on seeds that may change.

How do I handle related data (foreign keys)?

Factories should create related data automatically. A createOrder factory should create the user and product if not provided.

What is the difference between factory and fixture?

A factory creates data. A fixture is a predefined data set. Factories are more flexible and maintainable.

How do I handle test data for read-only APIs?

Use database seeds or fixtures for reference data. Create test-specific data for dynamic scenarios.

Mini Project

Implement a test data management system for your API tests. Create factory functions for all entities (User, Product, Order). Use transaction-based cleanup. Write tests that demonstrate data isolation. Ensure tests can run in parallel without conflicts.

What's Next

Next, you will build a complete API testing project integrating all concepts.

Built by the developers of DodaTech

Doda Browser, DodaZIP & Durga Antivirus Pro