Database Performance Troubleshooting -- Slow Queries, Connection Pooling & Indexing

DodaTech Updated 2026-06-23 7 min read

Database performance degradation hits applications hard -- slow queries, exhausted connection pools, and replication lag turn fast APIs into unusable services -- this guide shows you how to diagnose, optimize, and prevent these issues in MySQL and PostgreSQL.

What You'll Learn

Why It Matters

A single slow query can block hundreds of others, cascade across services, and trigger downtime alerts at 3 AM. Knowing how to identify the bottleneck using query analysis, system metrics, and tuning parameters is essential for any backend developer or DBA.

Real-World Use

When your API response time jumps from 50ms to 5 seconds, your application reports "too many connections", or your read replicas fall minutes behind the primary during peak traffic, these optimization techniques restore performance.

Common Database Performance Issues Table

Issue	Symptom	Cause	Diagnostic Tool
Slow query	High response time, CPU spike	Missing index, bad join, full table scan	`EXPLAIN ANALYZE`, slow query log
Connection pool exhaustion	"Too many connections" error	Connections not released, pool too small	`SHOW PROCESSLIST`, `pg_stat_activity`
Lock contention	Queries stuck in "Waiting"	Long-running transaction holding locks	`SHOW ENGINE INNODB STATUS`, `pg_locks`
Replication lag	Replica behind by minutes	Large write volume, slow replica CPU	`SHOW SLAVE STATUS`, `pg_stat_Replication`
Table bloat	Table size much larger than data	Dead tuples (PostgreSQL) or fragmented pages (MySQL)	`VACUUM VERBOSE`, `OPTIMIZE TABLE`
Buffer pool miss	High disk I/O, slow reads	InnoDB buffer pool too small	`SHOW ENGINE INNODB STATUS` buffer stats

Step-by-Step Fixes

Fix 1: Find and Fix Slow Queries

-- MySQL: Enable slow query log
SET GLOBAL slow_query_log = ON;
SET GLOBAL long_query_time = 1;

-- Find top queries by total execution time
SELECT * FROM sys.statement_analysis
ORDER BY total_latency DESC
LIMIT 5;

-- Analyze query execution plan
EXPLAIN ANALYZE
SELECT o.id, o.total, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.created_at > '2026-01-01'
ORDER BY o.total DESC;

-- Add a composite index to speed it up
CREATE INDEX idx_orders_created_total
ON orders(created_at, total DESC);

Expected output:

+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key  | rows    | filtered | Extra          |
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------------+
|  1 | SIMPLE      | orders| NULL       | ALL  | NULL          | NULL | 500000  | 33.33    | Using where    |
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------------+

Fix 2: Resolve Connection Pool Exhaustion

# MySQL: Check current connections
mysql -e "SHOW STATUS LIKE 'Threads_connected';"
mysql -e "SHOW VARIABLES LIKE 'max_connections';"

# PostgreSQL: Check current connections
psql -c "SELECT count(*) FROM pg_stat_activity;"
psql -c "SHOW max_connections;"

# Kill idle connections in MySQL
mysql -e "SELECT CONCAT('KILL ', id, ';') FROM information_schema.PROCESSLIST WHERE COMMAND = 'Sleep' AND TIME > 300;" | tail -n +2 | mysql

# Python: Configure connection pooling properly
from sqlalchemy import create_engine

# Good pool settings for a web app
engine = create_engine(
    "mysql://user:pass@host/db",
    pool_size=10,           # Base connections
    max_overflow=10,        # Extra connections allowed
    pool_timeout=30,        # Seconds to wait for a connection
    pool_recycle=1800,      # Recycle connections after 30 min
    pool_pre_ping=True      # Check connection before using
)

Expected output:

+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| Threads_connected | 142  |
| max_connections   | 200  |
+-----------------+-------+

Fix 3: Investigate Lock Contention

-- MySQL: Check current locks
SELECT * FROM performance_schema.data_locks\G

-- Find blocking transactions
SELECT r.trx_id AS waiting_trx,
       r.trx_mysql_thread_id AS waiting_thread,
       b.trx_mysql_thread_id AS blocking_thread
FROM performance_schema.data_lock_waits w
JOIN information_schema.INNODB_TRX r ON w.requesting_trx_id = r.trx_id
JOIN information_schema.INNODB_TRX b ON w.blocking_trx_id = b.trx_id;

-- PostgreSQL: Check blocked queries
SELECT blocked.pid AS blocked_pid,
       blocked.query AS blocked_query,
       blocking.pid AS blocking_pid,
       blocking.query AS blocking_query
FROM pg_stat_activity blocked
JOIN pg_stat_activity blocking ON blocking.pid = ANY(pg_blocking_pids(blocked.pid));

Expected output:

waiting_trx | waiting_thread | blocking_thread
12345       | 45             | 42

Fix 4: Diagnose Replication Lag

-- MySQL: Check replication status
SHOW SLAVE STATUS\G

-- PostgreSQL: Check replication status
SELECT application_name, state, sync_state,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS lag_bytes
FROM pg_stat_replication;

# Monitor replication lag over time
watch -n 5 "mysql -e 'SHOW SLAVE STATUS\G' | grep -E 'Seconds_Behind_Master|Read_Master_Log_Pos'"

# Force sync if lag is critical (MySQL)
STOP SLAVE;
SET GLOBAL sql_slave_skip_counter = 1;
START SLAVE;

Expected output:

             Seconds_Behind_Master: 125
             Master_Log_File: mysql-bin.000042
             Read_Master_Log_Pos: 123456789

Fix 5: Reduce Table and Index Bloat

-- PostgreSQL: Vacuum and analyze
VACUUM VERBOSE ANALYZE orders;

-- Check table bloat percentage
SELECT schemaname, tablename,
       n_dead_tup,
       n_live_tup,
       round(n_dead_tup * 100.0 / GREATEST(n_live_tup + n_dead_tup, 1), 2) AS dead_pct
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY dead_pct DESC;

-- MySQL: Optimize table (rebuilds and defragments)
OPTIMIZE TABLE orders;

-- Check table fragmentation
SELECT table_name, ROUND(data_length / 1024 / 1024, 2) AS data_mb,
       ROUND(index_length / 1024 / 1024, 2) AS index_mb,
       ROUND(data_free / 1024 / 1024, 2) AS free_mb
FROM information_schema.TABLES
WHERE table_schema = 'mydb' AND data_free > 1048576;

Expected output:

INFO:  vacuuming "public.orders"
INFO:  "orders": removed 50000 dead row versions

Database Performance Flowchart

flowchart TD
    A[Database Performance Issue] --> B{Check metric}
    B -->|Slow queries| C[Enable slow query log]
    C --> D[Run EXPLAIN ANALYZE]
    D --> E[Add missing index or rewrite query]
    B -->|Connection errors| F[Check thread count]
    F --> G[Increase max_connections or pool size]
    G --> H[Fix connection leak in app code]
    B -->|High lock waits| I[Find blocking transactions]
    I --> J[Optimize transaction duration]
    J --> K[Kill long-running blocking queries]
    B -->|Replication lag| L[Check slave status]
    L --> M[Upgrade replica or reduce write load]
    B -->|Disk I/O high| N[Check buffer pool size]
    N --> O[Increase innodb_buffer_pool_size]
    E --> P[Performance Restored]
    H --> P
    K --> P
    M --> P
    O --> P

Prevention Tips

Set long_query_time = 1 in production to capture all queries taking over 1 second
Use connection pooling in your application layer with pool_pre_ping=True to avoid stale connections
Schedule VACUUM (PostgreSQL) or OPTIMIZE TABLE during low-traffic windows for write-heavy tables
Monitor replication lag with Prometheus and alert if it exceeds 60 seconds
Set innodb_buffer_pool_size to 70-80% of available RAM for MySQL dedicated servers
Use pt-query-digest (Percona Toolkit) to analyze slow query logs offline

Practice Questions

How do you find which queries are consuming the most database time in MySQL? Answer: Enable the slow query log with SET GLOBAL slow_query_log = ON and long_query_time = 1. Then use pt-query-digest or query mysql.slow_log directly. For real-time analysis, use sys.statement_analysis.
What is the difference between max_connections and connection pooling? Answer: max_connections is the hard limit set on the database server -- no more connections than this can be established. Connection pooling reuses a limited set of database connections across multiple application requests, allowing hundreds of concurrent users without hitting max_connections.
How do you determine whether replication lag is caused by network latency or slow query execution on the replica? Answer: Check Seconds_Behind_Master (MySQL) or lag bytes (PostgreSQL). If the replica's CPU is high during lag, queries are the bottleneck. If CPU is low but lag exists, network bandwidth or disk I/O on the replica may be the issue.

Challenge: Write a SQL query that identifies the top 10 most-blocked queries in PostgreSQL by total wait time. Answer:

SELECT blocked.pid,
       blocked.query,
       blocked.wait_event_type,
       blocked.state,
       COUNT(blocking.pid) AS blockers,
       blocked.query_start
FROM pg_stat_activity blocked
LEFT JOIN LATERAL (
    SELECT unnest(pg_blocking_pids(blocked.pid)) AS blocking_pid
) blockers ON TRUE
WHERE blocked.wait_event_type = 'Lock'
GROUP BY blocked.pid, blocked.query, blocked.wait_event_type, blocked.state, blocked.query_start
ORDER BY COUNT(blocking.pid) DESC
LIMIT 10;

Quick Reference

Issue	Diagnostic	Resolution
Slow query	`EXPLAIN ANALYZE <query>`	Add index, optimize joins
Connection pool full	`SHOW PROCESSLIST`	Increase pool, fix leaks
Lock contention	`SHOW ENGINE INNODB STATUS`	Kill blocker, shorten transactions
Replication lag	`SHOW SLAVE STATUS`	Reduce writes, upgrade replica
Table bloat	`pg_stat_user_tables`	`VACUUM ANALYZE` or `OPTIMIZE TABLE`

FAQ

What is the ideal size for InnoDB buffer pool on a dedicated database server?

Set innodb_buffer_pool_size to 70-80% of total RAM on a dedicated MySQL server, but leave enough memory for OS cache, connections, and internal buffers. For servers with 64GB RAM, start with 48GB. Monitor buffer pool hit rate -- if it drops below 95%, consider increasing the size.

How do you handle connection surges that temporarily exceed max_connections?

Use a connection pooler like PgBouncer (PostgreSQL) or ProxySQL (MySQL) that sits between your application and the database. It maintains a persistent pool of database connections and queues application requests without exceeding the backend limit. Set max_connections 20-30% higher than your average peak to handle bursts.

Why does a query perform well in development but slow down in production?

The most common reasons are production data volume (full table scans on millions of rows vs thousands), production hardware configuration (different buffer pool, disk type, or memory), and concurrent query load (lock contention under real traffic). Always run EXPLAIN ANALYZE on production-like data volumes during development.

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.

← Previous Docker Container Troubleshooting -- Health Checks, Resource Limits & Networking Next → Node.js Application Debugging -- Memory Leaks, Event Loop Blocking & Crash Analysis

Built by the developers of DodaTech

Doda Browser, DodaZIP & Durga Antivirus Pro

Home Browse Troubleshooting