
Data Visualization Best Practices

DodaTech 3 min read

This tutorial covers data visualization best practices: the key concepts, practical matplotlib and seaborn examples, and guidelines for applying them to your own charts.

What You'll Learn

Create effective data visualizations: choose the right chart type, apply design principles, use color effectively, ensure accessibility, and avoid common mistakes.

Why It Matters

A bad chart can mislead, while a good chart reveals insights at a glance. Done right, visualization is one of the most effective ways to communicate data.

Real-World Use

Typical uses include building a dashboard for executive stakeholders, presenting analysis findings to non-technical teams, and publishing charts in a research paper.

Choosing the Right Chart

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

chart_guide = pd.DataFrame({
    "Goal": [
        "Compare categories",
        "Show trends over time",
        "Show distribution",
        "Show relationship",
        "Show composition",
        "Show ranking",
        "Show correlation",
    ],
    "Best Chart": [
        "Bar chart",
        "Line chart",
        "Histogram / Box plot",
        "Scatter plot",
        "Stacked bar / Pie",
        "Bar chart (sorted)",
        "Heatmap",
    ],
    "Alternative": [
        "Column chart",
        "Area chart",
        "Violin / KDE",
        "Hexbin",
        "Treemap",
        "Lollipop chart",
        "Pairplot",
    ],
})

# Show the guide as a plain-text table
print(chart_guide.to_string(index=False))

Design Principles

1. Remove Chart Junk

# Sample data for illustration (hypothetical values)
x = np.arange(1, 13)   # months 1-12
y = 100 + 5 * x        # steadily growing revenue

# ❌ Too much decoration
plt.figure(figsize=(10, 6))
plt.plot(x, y, color="blue", linewidth=2)
plt.title("📈 Our Amazing Sales Performance Growth Trajectory!!!")
plt.grid(True, alpha=0.8, linestyle="-")
# ... too much visual noise

# ✅ Clean and minimal
plt.figure(figsize=(10, 6))
plt.plot(x, y, color="#3498db", linewidth=2)
plt.title("Monthly Sales Revenue")
plt.grid(True, alpha=0.3, linestyle="--")
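Beyond a lighter grid, one common way to cut visual noise is to hide the top and right axes borders (spines), which carry no data. The sketch below assumes made-up monthly revenue figures and a hypothetical helper name, `plot_clean_line`; it illustrates the idea rather than prescribing a style.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_clean_line(x, y, title):
    """Draw a line chart with a light grid and no top/right spines."""
    fig, ax = plt.subplots(figsize=(10, 6))
    ax.plot(x, y, color="#3498db", linewidth=2)
    ax.set_title(title)
    ax.grid(True, alpha=0.3, linestyle="--")
    # The top and right borders carry no data, so hide them.
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
    return fig, ax


# Hypothetical sample data: 12 months of steadily growing revenue.
months = np.arange(1, 13)
revenue = 100 + 5 * months
fig, ax = plot_clean_line(months, revenue, "Monthly Sales Revenue")
```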

2. Start Axes at Zero (for Bar Charts)

# ❌ Misleading: bar chart not starting at zero
plt.figure(figsize=(8, 4))
plt.bar(["A", "B", "C"], [95, 97, 99])
plt.ylim(90, 100)  # Exaggerates differences
# C looks almost twice as tall as A (9 vs. 5 visible units) when it's only about 4% larger

# ✅ Start at zero
plt.figure(figsize=(8, 4))
plt.bar(["A", "B", "C"], [95, 97, 99])
plt.ylim(0, 100)
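To see how much a truncated axis exaggerates a difference, compare the ratio of the drawn bar heights with the ratio of the underlying values. The helper below, `visual_exaggeration`, is a hypothetical name used only for this sketch.

```python
def visual_exaggeration(small, large, axis_min=0.0):
    """Return (drawn height ratio, true value ratio) for two bars.

    A bar's drawn height is its value minus the bottom of the y-axis.
    """
    if small <= axis_min:
        raise ValueError("the smaller bar must sit above the axis minimum")
    drawn_ratio = (large - axis_min) / (small - axis_min)
    true_ratio = large / small
    return drawn_ratio, true_ratio


# Bars A=95 and C=99 from the example above.
truncated = visual_exaggeration(95, 99, axis_min=90)
zero_based = visual_exaggeration(95, 99, axis_min=0)
print(f"y-axis from 90: drawn {truncated[0]:.2f}x, true {truncated[1]:.2f}x")
print(f"y-axis from 0:  drawn {zero_based[0]:.2f}x, true {zero_based[1]:.2f}x")
```

With the axis starting at zero, the drawn ratio and the true ratio coincide, which is the point of the rule.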

3. Sort Your Data

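# Sample data for illustration (hypothetical values)
categories = ["North", "South", "East", "West"]
values = [42, 17, 58, 31]

# ❌ Unsorted
plt.bar(categories, values)

# ✅ Sorted by value (ascending; reverse sorted_idx with [::-1] to put the largest first)
sorted_idx = np.argsort(values)
plt.bar(np.array(categories)[sorted_idx], np.array(values)[sorted_idx])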
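When the data already lives in pandas, sorting before plotting is a one-liner. This sketch assumes made-up regional sales figures and uses a horizontal bar chart so that category labels stay readable.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data.
sales = pd.Series({"North": 42, "South": 17, "East": 58, "West": 31})

# Ascending order puts the largest bar at the top of a horizontal bar chart,
# because barh draws the first entry at the bottom.
ranked = sales.sort_values()

fig, ax = plt.subplots(figsize=(8, 4))
ax.barh(ranked.index, ranked.values, color="#3498db")
ax.set_xlabel("Sales (units)")
ax.set_title("Sales by Region")
```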

Color Theory

# Color palettes by use case
color_palettes = {
    "categorical": ["#3498db", "#e74c3c", "#2ecc71", "#f39c12", "#9b59b6"],
    "sequential": ["#f7fbff", "#deebf7", "#c6dbef", "#9ecae1", "#6baed6"],
    "diverging": ["#d73027", "#fc8d59", "#fee090", "#e0f3f8", "#4575b4"],
}

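# Built-in palettes: Set2, Blues, and RdBu come from ColorBrewer; viridis ships with matplotlib
sns.color_palette("Set2")         # categorical
sns.color_palette("viridis", 10)  # sequential, perceptually uniform
sns.color_palette("Blues", 5)     # sequential
sns.color_palette("RdBu", 7)      # diverging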
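If a palette has to be handed to a tool that expects hex codes (a style guide or a web front end, for example), colors can be sampled from a matplotlib colormap. This is a sketch; `sample_colormap` is a hypothetical helper name.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex


def sample_colormap(name, n):
    """Return n evenly spaced colors from a named colormap as hex strings."""
    cmap = plt.get_cmap(name)
    return [to_hex(cmap(position)) for position in np.linspace(0, 1, n)]


sequential = sample_colormap("viridis", 5)
print(sequential)
```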

Accessibility

# 1. Colorblind-friendly palettes
sns.color_palette("colorblind")
sns.color_palette("viridis")  # Perceptually uniform

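# 2. Add markers and line styles (not just color)
x = np.arange(10)        # sample data for illustration
y1, y2 = x, x ** 1.5
plt.plot(x, y1, marker="o", linestyle="-", label="Series A")
plt.plot(x, y2, marker="s", linestyle="--", label="Series B")
plt.legend()             # labels only show up once a legend is drawn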

# 3. Sufficient contrast
# Light text on light background = bad
# Dark text on light background = good

# 4. Font sizes
plt.title("Chart Title", fontsize=16)
plt.xlabel("X Axis", fontsize=12)
plt.ylabel("Y Axis", fontsize=12)
plt.xticks(fontsize=10)
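Contrast (point 3 above) can be checked numerically rather than by eye. The sketch below follows the WCAG 2.x definitions of relative luminance and contrast ratio; WCAG level AA asks for at least 4.5:1 for normal-size text. The function names are hypothetical.

```python
def _linearize(channel):
    """Convert an sRGB channel in [0, 1] to linear light (WCAG 2.x formula)."""
    if channel <= 0.03928:
        return channel / 12.92
    return ((channel + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a '#rrggbb' color."""
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground, background):
    """WCAG contrast ratio between two colors, from 1 (none) to 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


print(round(contrast_ratio("#000000", "#ffffff"), 1))  # black text on white
print(round(contrast_ratio("#f1c40f", "#ffffff"), 1))  # yellow text on white
```

Yellow text on a white background falls well short of 4.5:1, which is why light-on-light combinations are flagged as bad.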

Common Mistakes

Mistake                    | Why It's Bad           | Fix
Pie charts with 10+ slices | Impossible to read     | Use bar chart or treemap
3D charts                  | Distorts perception    | Stick to 2D
Dual y-axes                | Confusing, can mislead | Use separate charts
Cherry-picked time range   | Misleading trend       | Show full data
Missing context            | Can't interpret values | Add labels, annotations
Too many colors            | Visual noise           | Stick to 5-7 colors
No data labels             | Reader must guess      | Add labels
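One way to act on the pie-chart and too-many-colors entries is to keep only the largest categories and fold the rest into a single "Other" group before plotting. This is one possible approach rather than a rule taken from the table; the helper name `collapse_small_categories` and the share values are made up for illustration.

```python
import pandas as pd


def collapse_small_categories(values, top_n=5, other_label="Other"):
    """Keep the top_n largest entries of a Series and sum the rest into one."""
    ranked = values.sort_values(ascending=False)
    if len(ranked) <= top_n:
        return ranked
    top = ranked.iloc[:top_n]
    rest = ranked.iloc[top_n:].sum()
    return pd.concat([top, pd.Series({other_label: rest})])


# Hypothetical sample data: ten categories, too many for a pie chart.
shares = pd.Series(
    [30, 22, 15, 10, 8, 5, 4, 3, 2, 1],
    index=list("ABCDEFGHIJ"),
)
compact = collapse_small_categories(shares, top_n=5)
print(compact)
```

The result has six entries (five named categories plus "Other"), which fits within the 5-7 color guideline.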

Dashboard Design

def create_dashboard(df, date_col, metrics, category_col):
    """Create a simple analytical dashboard layout.

    Only the first entry of `metrics` is plotted; it is treated as revenue.
    """
    fig = plt.figure(figsize=(16, 10))

    # Layout: 2 columns, 3 rows
    gs = fig.add_gridspec(3, 2)

    # Row 1: KPI cards (custom, or use ax.text)
    ax_kpi1 = fig.add_subplot(gs[0, 0])
    ax_kpi1.text(0.5, 0.6, f"${df[metrics[0]].sum():,.0f}",
                 ha="center", fontsize=28, fontweight="bold")
    ax_kpi1.text(0.5, 0.3, "Total Revenue",
                 ha="center", fontsize=14)
    ax_kpi1.axis("off")

    # Row 2: Time series
    ax_trend = fig.add_subplot(gs[1, :])
    df.groupby(date_col)[metrics[0]].sum().plot(ax=ax_trend)
    ax_trend.set_title("Revenue Trend")

    # Row 3: Bar chart by category
    ax_bar = fig.add_subplot(gs[2, 0])
    df.groupby(category_col)[metrics[0]].sum().plot(kind="bar", ax=ax_bar)
    ax_bar.set_title("Revenue by Category")
    ax_bar.tick_params(axis="x", rotation=45)

    # Row 3: Distribution
    ax_hist = fig.add_subplot(gs[2, 1])
    df[metrics[0]].hist(ax=ax_hist, bins=20, edgecolor="black")
    ax_hist.set_title(f"{metrics[0]} Distribution")

    plt.tight_layout()
    return fig
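The layout above fills only one of the two cells in the KPI row. A small helper keeps additional cards consistent. This is a sketch: `draw_kpi_card` is a hypothetical name, and the figures passed to it are made up.

```python
import matplotlib.pyplot as plt


def draw_kpi_card(ax, value, label):
    """Render a big number with a caption on an otherwise empty axes."""
    # transform=ax.transAxes places the text in axes coordinates (0 to 1),
    # independent of any data limits.
    ax.text(0.5, 0.6, value, ha="center", fontsize=28,
            fontweight="bold", transform=ax.transAxes)
    ax.text(0.5, 0.3, label, ha="center", fontsize=14,
            transform=ax.transAxes)
    ax.axis("off")


fig = plt.figure(figsize=(16, 4))
gs = fig.add_gridspec(1, 2)
ax_left = fig.add_subplot(gs[0, 0])
ax_right = fig.add_subplot(gs[0, 1])
draw_kpi_card(ax_left, "$1,250,000", "Total Revenue")  # made-up value
draw_kpi_card(ax_right, "8,431", "Orders")             # made-up value
```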

Chart Type Decision Tree

What do you want to show?
├── Comparison between categories → Bar chart
├── Trend over time → Line chart
├── Distribution of values → Histogram / Box plot
├── Relationship between 2 variables → Scatter plot
├── Relationship between 3+ variables → Pairplot / Heatmap
├── Part of a whole → Stacked bar / Treemap
├── Geographic data → Map
├── Ranking → Sorted bar / Lollipop
└── Text data → Word cloud / Bar of top terms
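The same tree can be encoded as a lookup table, which is handy in a plotting utility or a notebook template. The goal keys and the function name `recommend_chart` are hypothetical; the recommendations mirror the tree above.

```python
CHART_BY_GOAL = {
    "comparison": "Bar chart",
    "trend": "Line chart",
    "distribution": "Histogram / Box plot",
    "relationship_2": "Scatter plot",
    "relationship_3plus": "Pairplot / Heatmap",
    "part_of_whole": "Stacked bar / Treemap",
    "geographic": "Map",
    "ranking": "Sorted bar / Lollipop",
    "text": "Word cloud / Bar of top terms",
}


def recommend_chart(goal):
    """Return the suggested chart type for a goal key; raise on unknown goals."""
    try:
        return CHART_BY_GOAL[goal]
    except KeyError:
        valid = ", ".join(sorted(CHART_BY_GOAL))
        raise ValueError(f"Unknown goal {goal!r}; expected one of: {valid}") from None


print(recommend_chart("trend"))  # Line chart
```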

Checklist

  • Is the chart type appropriate for the data?
  • Are axes labeled with units?
  • Is the scale appropriate (no misleading truncation)?
  • Are colors accessible (colorblind-friendly)?
  • Is the title clear and descriptive?
  • Are important points annotated?
  • Is the chart clean (no chart junk)?
  • Are sources cited if data is external?
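Parts of this checklist can be automated for matplotlib figures. The sketch below covers only three items (title, axis labels, and a zero baseline for bar charts); the remaining items still need a human eye. `audit_axes` is a hypothetical helper, and treating every patch on the axes as a vertical bar is a simplifying assumption.

```python
import matplotlib.pyplot as plt


def audit_axes(ax):
    """Return a list of checklist problems found on a matplotlib Axes."""
    problems = []
    if not ax.get_title().strip():
        problems.append("missing title")
    if not ax.get_xlabel().strip():
        problems.append("missing x-axis label")
    if not ax.get_ylabel().strip():
        problems.append("missing y-axis label")
    # Assumption: any patches on the axes are vertical bars.
    if ax.patches:
        y_min, y_max = ax.get_ylim()
        if not (y_min <= 0 <= y_max):
            problems.append("bar chart y-axis does not include zero")
    return problems


fig, ax = plt.subplots()
ax.bar(["A", "B", "C"], [95, 97, 99])
ax.set_ylim(90, 100)  # truncated on purpose
print(audit_axes(ax))
```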
