Data Visualization Best Practices

DodaTech 3 min read

This tutorial covers data visualization best practices: how to choose a chart type, core design principles, color and accessibility, common mistakes, and a checklist, with practical examples in Python using matplotlib and seaborn.

What You'll Learn

Create effective data visualizations: choose the right chart type, apply design principles, use color effectively, ensure accessibility, and avoid common mistakes.

Why It Matters

A bad chart can mislead. A good chart reveals insights instantly. Visualization is one of the most powerful tools for communicating data, if done right.

Real-World Use

Typical uses include building a dashboard for executive stakeholders, presenting analysis findings to non-technical teams, and publishing charts in a research paper.

Choosing the Right Chart

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

chart_guide = pd.DataFrame({
    "Goal": [
        "Compare categories",
        "Show trends over time",
        "Show distribution",
        "Show relationship",
        "Show composition",
        "Show ranking",
        "Show correlation",
    ],
    "Best Chart": [
        "Bar chart",
        "Line chart",
        "Histogram / Box plot",
        "Scatter plot",
        "Stacked bar / Pie",
        "Bar chart (sorted)",
        "Heatmap",
    ],
    "Alternative": [
        "Column chart",
        "Area chart",
        "Violin / KDE",
        "Hexbin",
        "Treemap",
        "Lollipop chart",
        "Pairplot",
    ],
})

Design Principles

1. Remove Chart Junk

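# Illustrative sample data (12 months of made-up revenue figures)
x = np.arange(1, 13)
y = np.array([12, 14, 13, 17, 19, 18, 22, 24, 23, 27, 29, 31])

# ❌ Too much decoration
plt.figure(figsize=(10, 6))
plt.plot(x, y, color="blue", linewidth=2)
plt.title("📈 Our Amazing Sales Performance Growth Trajectory!!!")
plt.grid(True, alpha=0.8, linestyle="-")  # heavy grid competes with the data
# ... too much visual noise

# ✅ Clean and minimal
plt.figure(figsize=(10, 6))
plt.plot(x, y, color="#3498db", linewidth=2)
plt.title("Monthly Sales Revenue")
plt.grid(True, alpha=0.3, linestyle="--")  # faint grid stays in the background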

2. Start Axes at Zero (for Bar Charts)

# ❌ Misleading: bar chart not starting at zero
plt.figure(figsize=(8, 4))
plt.bar(["A", "B", "C"], [95, 97, 99])
plt.ylim(90, 100)  # Exaggerates differences
# C's bar looks almost twice as tall as A's (9 vs. 5 visible units)
# even though C is only about 4% larger

# ✅ Start at zero
plt.figure(figsize=(8, 4))
plt.bar(["A", "B", "C"], [95, 97, 99])
plt.ylim(0, 100)
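
To put a number on how much a truncated axis exaggerates, one option is to compare the ratio of the drawn bar heights with the ratio of the underlying values. The sketch below does this for the A and C bars above. The function names are hypothetical helpers, and `lie_factor` follows Edward Tufte's "lie factor" as it is commonly defined (size of the effect shown in the graphic divided by size of the effect in the data).

```python
def visible_ratio(small, large, baseline):
    """Ratio of drawn bar heights when the y-axis starts at `baseline`."""
    return (large - baseline) / (small - baseline)


def true_ratio(small, large):
    """Ratio of the underlying values."""
    return large / small


def lie_factor(small, large, baseline):
    """Relative change shown in the graphic divided by relative change in the data."""
    shown_effect = (large - small) / (small - baseline)
    data_effect = (large - small) / small
    return shown_effect / data_effect


a, c = 95, 99
print(f"{visible_ratio(a, c, baseline=90):.2f}")  # 1.80
print(f"{true_ratio(a, c):.2f}")                  # 1.04
print(f"{lie_factor(a, c, baseline=90):.2f}")     # 19.00
print(f"{lie_factor(a, c, baseline=0):.2f}")      # 1.00
```

With the axis starting at zero the lie factor is 1, meaning the drawn heights are proportional to the values.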

3. Sort Your Data

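# Illustrative sample data
categories = ["North", "South", "East", "West"]
values = [42, 17, 58, 33]

# ❌ Unsorted
plt.figure(figsize=(8, 4))
plt.bar(categories, values)

# ✅ Sorted by value (ascending; use sorted_idx[::-1] for largest first)
plt.figure(figsize=(8, 4))
sorted_idx = np.argsort(values)
plt.bar(np.array(categories)[sorted_idx], np.array(values)[sorted_idx])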
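
If the labels and values live in plain Python lists, the same ordering can be produced without NumPy. A minimal sketch, using made-up category names and values, that sorts largest first so the ranking reads from left to right:

```python
categories = ["North", "South", "East", "West"]  # hypothetical labels
values = [42, 17, 58, 33]                        # hypothetical values

# Pair each label with its value, then sort the pairs by value, largest first.
ranked = sorted(zip(categories, values), key=lambda pair: pair[1], reverse=True)

# Unzip back into two parallel sequences, ready for plt.bar(...) or plt.barh(...).
sorted_categories, sorted_values = zip(*ranked)

print(sorted_categories)  # ('East', 'North', 'West', 'South')
print(sorted_values)      # (58, 42, 33, 17)
```

Sorting the pairs together keeps each label attached to its value, which is the main thing to get right when reordering a bar chart.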

Color Theory

# Color palettes by use case
color_palettes = {
    "categorical": ["#3498db", "#e74c3c", "#2ecc71", "#f39c12", "#9b59b6"],
    "sequential": ["#f7fbff", "#deebf7", "#c6dbef", "#9ecae1", "#6baed6"],
    "diverging": ["#d73027", "#fc8d59", "#fee090", "#e0f3f8", "#4575b4"],
}

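# Built-in seaborn palettes (Set2, Blues, and RdBu come from ColorBrewer;
# viridis is a matplotlib colormap)
sns.color_palette("Set2")         # qualitative, for categories
sns.color_palette("viridis", 10)  # sequential, perceptually uniform
sns.color_palette("Blues", 5)     # sequential, single hue
sns.color_palette("RdBu", 7)      # diverging, centered on a neutral midpoint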
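
The three palette families map onto three kinds of data: categorical for unordered groups, sequential for values that run from low to high, and diverging for values spread around a meaningful midpoint such as zero. The helper below is a hypothetical sketch of that rule of thumb. It is not part of seaborn, and the function name and parameters are assumptions made for illustration.

```python
def pick_palette_kind(is_numeric, has_midpoint=False):
    """Return which palette family suits a variable (rule of thumb only)."""
    if not is_numeric:
        return "categorical"  # unordered groups: distinct hues
    if has_midpoint:
        return "diverging"    # two directions away from a neutral center
    return "sequential"       # one direction: light to dark


examples = {
    "product category": pick_palette_kind(is_numeric=False),
    "population density": pick_palette_kind(is_numeric=True),
    "profit / loss": pick_palette_kind(is_numeric=True, has_midpoint=True),
}

for variable, kind in examples.items():
    print(f"{variable}: {kind}")
```

The returned string matches the keys of the `color_palettes` dictionary above, so it can be used to look up a list of hex colors.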

Accessibility

# Illustrative sample data for two series
x = np.arange(10)
y1 = x ** 2
y2 = 10 * x

# 1. Colorblind-friendly palettes
sns.color_palette("colorblind")
sns.color_palette("viridis")  # Perceptually uniform

# 2. Add markers and line styles (not just color)
plt.plot(x, y1, marker="o", linestyle="-", label="Series A")
plt.plot(x, y2, marker="s", linestyle="--", label="Series B")
plt.legend()  # Without this call the labels above are never shown

# 3. Sufficient contrast
# Light text on light background = bad
# Dark text on light background = good

# 4. Font sizes
plt.title("Chart Title", fontsize=16)
plt.xlabel("X Axis", fontsize=12)
plt.ylabel("Y Axis", fontsize=12)
plt.xticks(fontsize=10)
plt.yticks(fontsize=10)
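
Point 3 can be checked numerically. The sketch below computes the contrast ratio between two hex colors using the relative-luminance formula from the WCAG 2.x guidelines, which recommend at least 4.5:1 for normal-size text. The function names are hypothetical, and the threshold is WCAG's, not something matplotlib or seaborn enforces.

```python
def _linearize(channel):
    """Convert an 8-bit sRGB channel (0-255) to linear light, per WCAG 2.x."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a color given as '#rrggbb'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(color_a, color_b):
    """Contrast ratio between two colors, from 1 (identical) to 21 (black on white)."""
    lum_a = relative_luminance(color_a)
    lum_b = relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


print(round(contrast_ratio("#000000", "#ffffff"), 1))  # 21.0
print(contrast_ratio("#cccccc", "#ffffff") >= 4.5)     # False: light gray on white
print(contrast_ratio("#333333", "#ffffff") >= 4.5)     # True: dark gray on white
```

The same check applies to tick labels, annotations, and legend text, since any of them can end up on a light or colored background.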

Common Mistakes

Mistake	Why It's Bad	Fix
Pie charts with 10+ slices	Impossible to read	Use bar chart or treemap
3D charts	Distorts perception	Stick to 2D
Dual y-axes	Confusing, can mislead	Use separate charts
Cherry-picked time range	Misleading trend	Show full data
Missing context	Can't interpret values	Add labels, annotations
Too many colors	Visual noise	Stick to 5-7 colors
No data labels	Reader must guess	Add labels

Dashboard Design

def create_dashboard(df, date_col, metrics, category_col):
    """Create a simple analytical dashboard layout."""
    fig = plt.figure(figsize=(16, 10))

    # Layout: 2 columns, 3 rows
    gs = fig.add_gridspec(3, 2)

    # Row 1: KPI card drawn with ax.text (gs[0, 1] is left free for a second KPI)
    ax_kpi1 = fig.add_subplot(gs[0, 0])
    ax_kpi1.text(0.5, 0.6, f"${df[metrics[0]].sum():,.0f}",
                 ha="center", fontsize=28, fontweight="bold",
                 transform=ax_kpi1.transAxes)  # position in axes coordinates
    ax_kpi1.text(0.5, 0.3, "Total Revenue",
                 ha="center", fontsize=14, transform=ax_kpi1.transAxes)
    ax_kpi1.axis("off")

    # Row 2: Time series
    ax_trend = fig.add_subplot(gs[1, :])
    df.groupby(date_col)[metrics[0]].sum().plot(ax=ax_trend)
    ax_trend.set_title("Revenue Trend")

    # Row 3: Bar chart by category
    ax_bar = fig.add_subplot(gs[2, 0])
    df.groupby(category_col)[metrics[0]].sum().plot(kind="bar", ax=ax_bar)
    ax_bar.set_title("Revenue by Category")
    ax_bar.tick_params(axis="x", rotation=45)

    # Row 3: Distribution
    ax_hist = fig.add_subplot(gs[2, 1])
    df[metrics[0]].hist(ax=ax_hist, bins=20, edgecolor="black")
    ax_hist.set_title(f"{metrics[0]} Distribution")

    plt.tight_layout()
    return fig

Chart Type Decision Tree

What do you want to show?
├── Comparison between categories → Bar chart
├── Trend over time → Line chart
├── Distribution of values → Histogram / Box plot
├── Relationship between 2 variables → Scatter plot
├── Relationship between 3+ variables → Pairplot / Heatmap
├── Part of a whole → Stacked bar / Treemap
├── Geographic data → Map
├── Ranking → Sorted bar / Lollipop
└── Text data → Word cloud / Bar of top terms
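
The tree is small enough to encode as a lookup table, which can be handy in a notebook or a team style guide. A minimal sketch: the `CHART_FOR_GOAL` keys and the `recommend_chart` function are hypothetical names, and the values are copied from the tree above.

```python
# Keys are made-up short names for each branch of the decision tree.
CHART_FOR_GOAL = {
    "comparison": "Bar chart",
    "trend": "Line chart",
    "distribution": "Histogram / Box plot",
    "relationship_2_vars": "Scatter plot",
    "relationship_3plus_vars": "Pairplot / Heatmap",
    "part_of_whole": "Stacked bar / Treemap",
    "geographic": "Map",
    "ranking": "Sorted bar / Lollipop",
    "text": "Word cloud / Bar of top terms",
}


def recommend_chart(goal):
    """Return the suggested chart for a goal, or raise ValueError if unknown."""
    try:
        return CHART_FOR_GOAL[goal]
    except KeyError:
        options = ", ".join(sorted(CHART_FOR_GOAL))
        raise ValueError(f"Unknown goal {goal!r}; choose one of: {options}") from None


print(recommend_chart("trend"))    # Line chart
print(recommend_chart("ranking"))  # Sorted bar / Lollipop
```

Raising an error for an unknown goal, rather than returning a default chart, keeps a typo from silently producing the wrong recommendation.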

Checklist

Is the chart type appropriate for the data?
Are axes labeled with units?
Is the scale appropriate (no misleading truncation)?
Are colors accessible (colorblind-friendly)?
Is the title clear and descriptive?
Are important points annotated?
Is the chart clean (no chart junk)?
Are sources cited if data is external?
