Data Visualization Best Practices
DodaTech
This tutorial covers data visualization best practices in Python with matplotlib and seaborn: choosing a chart type, applying design principles, using color, designing for accessibility, avoiding common mistakes, and laying out a simple dashboard.
What You'll Learn
Create effective data visualizations: choose the right chart type, apply design principles, use color effectively, ensure accessibility, and avoid common mistakes.
Why It Matters
A bad chart can mislead; a good chart reveals insights at a glance. Visualization is one of the most powerful tools for communicating data, but only when it is done right.
Real-World Use
Creating a dashboard for executive stakeholders, presenting analysis findings to non-technical teams, or publishing charts in a research paper.
Choosing the Right Chart
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
chart_guide = pd.DataFrame({
    "Goal": [
        "Compare categories",
        "Show trends over time",
        "Show distribution",
        "Show relationship",
        "Show composition",
        "Show ranking",
        "Show correlation",
    ],
    "Best Chart": [
        "Bar chart",
        "Line chart",
        "Histogram / Box plot",
        "Scatter plot",
        "Stacked bar / Pie",
        "Bar chart (sorted)",
        "Heatmap",
    ],
    "Alternative": [
        "Column chart",
        "Area chart",
        "Violin / KDE",
        "Hexbin",
        "Treemap",
        "Lollipop chart",
        "Pairplot",
    ],
})
Design Principles
1. Remove Chart Junk
# Sample data so both versions run (illustrative values)
x = np.arange(1, 13)
y = np.array([12, 14, 13, 17, 19, 18, 22, 24, 23, 27, 29, 31])
# ❌ Too much decoration
plt.figure(figsize=(10, 6))
plt.plot(x, y, color="blue", linewidth=2)
plt.title("📈 Our Amazing Sales Performance Growth Trajectory!!!")
plt.grid(True, alpha=0.8, linestyle="-")
# ... too much visual noise
# ✅ Clean and minimal
plt.figure(figsize=(10, 6))
plt.plot(x, y, color="#3498db", linewidth=2)
plt.title("Monthly Sales Revenue")
plt.grid(True, alpha=0.3, linestyle="--")
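The same idea can be wrapped in a small reusable helper. The sketch below is one possible way to do it, not a standard recipe: `declutter` is a hypothetical name, the revenue numbers are made up, and hiding the top and right spines is an extra step beyond what the snippet above does.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def declutter(ax):
    """Reduce non-data ink: hide top/right spines and soften the grid."""
    ax.spines["top"].set_visible(False)
    ax.spines["right"].set_visible(False)
    ax.grid(True, alpha=0.3, linestyle="--")
    ax.set_axisbelow(True)  # draw grid lines behind the data
    return ax


months = list(range(1, 13))
revenue = [12, 14, 13, 17, 19, 18, 22, 24, 23, 27, 29, 31]  # made-up values

fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(months, revenue, color="#3498db", linewidth=2)
ax.set_title("Monthly Sales Revenue")
declutter(ax)
```

Calling the helper on every Axes keeps styling consistent across charts without repeating the same four lines.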
2. Start Axes at Zero (for Bar Charts)
# ❌ Misleading: bar chart not starting at zero
plt.figure(figsize=(8, 4))
plt.bar(["A", "B", "C"], [95, 97, 99])
plt.ylim(90, 100) # Exaggerates differences
# C's bar is drawn 1.8× as tall as A's, but the values differ by only about 4%
# ✅ Start at zero
plt.ylim(0, 100)
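To see how much a truncated axis exaggerates a difference, compare the ratio of the drawn bar heights with the ratio of the underlying values. The helper below is a minimal sketch; `apparent_ratio` is a hypothetical name, and the values are the A and C bars from the example above.

```python
def apparent_ratio(small, large, baseline=0.0):
    """Ratio of drawn bar heights when the y-axis starts at `baseline`."""
    if small <= baseline:
        raise ValueError("baseline must be below the smaller value")
    return (large - baseline) / (small - baseline)


a, c = 95, 99
true_ratio = apparent_ratio(a, c, baseline=0)        # 99 / 95
truncated_ratio = apparent_ratio(a, c, baseline=90)  # 9 / 5

print(f"axis from 0:  C is drawn {true_ratio:.2f}x as tall as A")
print(f"axis from 90: C is drawn {truncated_ratio:.2f}x as tall as A")
```

With the axis starting at 90, a difference of about 4% is drawn as an 80% difference in bar height.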
3. Sort Your Data
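# Sample data (illustrative values)
categories = ["A", "B", "C", "D"]
values = [23, 45, 12, 38]
# ❌ Unsorted
plt.bar(categories, values)
# ✅ Sorted by value (ascending; use sorted_idx[::-1] for largest first)
sorted_idx = np.argsort(values)
plt.bar(np.array(categories)[sorted_idx], np.array(values)[sorted_idx])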
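For ranking charts it is common to put the largest bar first. The following is a small sketch of that without NumPy; `sort_for_bar` is a hypothetical helper and the region data is invented for illustration.

```python
def sort_for_bar(categories, values, descending=True):
    """Return categories and values reordered by value, keeping pairs aligned."""
    if len(categories) != len(values):
        raise ValueError("categories and values must have the same length")
    pairs = sorted(zip(categories, values), key=lambda pair: pair[1],
                   reverse=descending)
    if not pairs:
        return [], []
    cats, vals = zip(*pairs)
    return list(cats), list(vals)


regions = ["North", "South", "East", "West"]  # made-up data
sales = [120, 340, 90, 210]

sorted_regions, sorted_sales = sort_for_bar(regions, sales)
```

The two returned lists can be passed directly to `plt.bar`. Sorting by value suits nominal categories; categories with a natural order (months, age bands) are usually better left in that order.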
Color Theory
# Color palettes by use case
color_palettes = {
    "categorical": ["#3498db", "#e74c3c", "#2ecc71", "#f39c12", "#9b59b6"],
    "sequential": ["#f7fbff", "#deebf7", "#c6dbef", "#9ecae1", "#6baed6"],
    "diverging": ["#d73027", "#fc8d59", "#fee090", "#e0f3f8", "#4575b4"],
}
# Named palettes in seaborn: Set2, Blues, and RdBu come from ColorBrewer; viridis is a matplotlib colormap
sns.color_palette("Set2")
sns.color_palette("viridis", 10)
sns.color_palette("Blues", 5)
sns.color_palette("RdBu", 7)
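Which of the three palette families fits depends on the data: unordered labels call for categorical colors, ordered magnitudes for sequential ones, and values that spread around a meaningful midpoint (such as zero) for diverging ones. The function below is a rough heuristic sketch of that rule; `palette_kind` and its `midpoint` parameter are assumptions made for illustration, not part of seaborn.

```python
from numbers import Number


def palette_kind(values, midpoint=None):
    """Suggest a palette family for `values`.

    - any non-numeric value              -> "categorical"
    - numeric data straddling `midpoint` -> "diverging"
    - other numeric data                 -> "sequential"
    """
    values = list(values)
    numeric = all(isinstance(v, Number) and not isinstance(v, bool)
                  for v in values)
    if not numeric:
        return "categorical"
    if midpoint is not None and values and min(values) < midpoint < max(values):
        return "diverging"
    return "sequential"


print(palette_kind(["red team", "blue team"]))      # categorical
print(palette_kind([3, 8, 21]))                     # sequential
print(palette_kind([-2.5, 0.4, 3.1], midpoint=0))   # diverging
```

The returned string matches the keys of the `color_palettes` dictionary above, so it can be used to look up a palette.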
Accessibility
# 1. Colorblind-friendly palettes
sns.color_palette("colorblind")
sns.color_palette("viridis") # Perceptually uniform
# 2. Vary markers and line styles, not just color (x, y1, y2 are your data arrays)
plt.plot(x, y1, marker="o", linestyle="-", label="Series A")
plt.plot(x, y2, marker="s", linestyle="--", label="Series B")
# 3. Sufficient contrast
# Light text on light background = bad
# Dark text on light background = good
# 4. Font sizes
plt.title("Chart Title", fontsize=16)
plt.xlabel("X Axis", fontsize=12)
plt.ylabel("Y Axis", fontsize=12)
plt.xticks(fontsize=10)
plt.yticks(fontsize=10)
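"Sufficient contrast" can be checked numerically. The sketch below computes the contrast ratio as defined in the WCAG guidelines (relative luminance of the lighter color plus 0.05, divided by that of the darker color plus 0.05). WCAG targets web content, so treating its 4.5:1 threshold for normal text as a target for chart labels is an assumption here; the function names are hypothetical.

```python
def _linearize(channel_8bit):
    """Convert an 8-bit sRGB channel to linear light."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a color given as '#rrggbb'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground, background):
    """Contrast ratio between two colors, from 1 (identical) to 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


dark_on_light = contrast_ratio("#000000", "#ffffff")
light_on_light = contrast_ratio("#f7fbff", "#ffffff")
print(f"black on white: {dark_on_light:.1f}:1")
print(f"pale blue on white passes 4.5:1? {light_on_light >= 4.5}")
```

Running text and background colors through a check like this before publishing catches low-contrast labels that look fine on the author's own monitor.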
Common Mistakes
| Mistake | Why It's Bad | Fix |
|---|---|---|
| Pie charts with 10+ slices | Impossible to read | Use bar chart or treemap |
| 3D charts | Distorts perception | Stick to 2D |
| Dual y-axes | Confusing, can mislead | Use separate charts |
| Cherry-picked time range | Misleading trend | Show full data |
| Missing context | Can't interpret values | Add labels, annotations |
| Too many colors | Visual noise | Stick to 5-7 colors |
| No data labels | Reader must guess | Add labels |
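When a chart has too many slices or colors, one complementary tactic (not listed in the table, so treat it as a suggestion) is to keep the largest categories and merge the rest into a single "Other" group before plotting. The helper name, its parameters, and the data below are illustrative assumptions.

```python
def collapse_small_categories(counts, max_groups=5, other_label="Other"):
    """Keep the largest `max_groups - 1` categories; sum the rest into one group."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) <= max_groups:
        return dict(ranked)
    head, tail = ranked[:max_groups - 1], ranked[max_groups - 1:]
    collapsed = dict(head)
    collapsed[other_label] = sum(value for _, value in tail)
    return collapsed


market_share = {"A": 40, "B": 25, "C": 15, "D": 8, "E": 5, "F": 4, "G": 3}  # made-up
collapsed = collapse_small_categories(market_share, max_groups=5)
print(collapsed)
```

The total is preserved, so the collapsed data still sums to the whole. The trade-off is that detail about the merged categories is lost, which is worth stating in a caption.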
Dashboard Design
def create_dashboard(df, date_col, metrics, category_col):
    """Create a simple analytical dashboard layout."""
    fig = plt.figure(figsize=(16, 10))
    # Layout: 2 columns, 3 rows
    gs = fig.add_gridspec(3, 2)
    # Row 1: KPI cards (custom, or use ax.text)
    ax_kpi1 = fig.add_subplot(gs[0, 0])
    ax_kpi1.text(0.5, 0.6, f"${df[metrics[0]].sum():,.0f}",
                 ha="center", fontsize=28, fontweight="bold")
    ax_kpi1.text(0.5, 0.3, "Total Revenue",
                 ha="center", fontsize=14)
    ax_kpi1.axis("off")
    # Row 2: Time series
    ax_trend = fig.add_subplot(gs[1, :])
    df.groupby(date_col)[metrics[0]].sum().plot(ax=ax_trend)
    ax_trend.set_title("Revenue Trend")
    # Row 3: Bar chart by category
    ax_bar = fig.add_subplot(gs[2, 0])
    df.groupby(category_col)[metrics[0]].sum().plot(kind="bar", ax=ax_bar)
    ax_bar.set_title("Revenue by Category")
    ax_bar.tick_params(axis="x", rotation=45)
    # Row 3: Distribution
    ax_hist = fig.add_subplot(gs[2, 1])
    df[metrics[0]].hist(ax=ax_hist, bins=20, edgecolor="black")
    ax_hist.set_title(f"{metrics[0]} Distribution")
    plt.tight_layout()
    return fig
Chart Type Decision Tree
What do you want to show?
├── Comparison between categories → Bar chart
├── Trend over time → Line chart
├── Distribution of values → Histogram / Box plot
├── Relationship between 2 variables → Scatter plot
├── Relationship between 3+ variables → Pairplot / Heatmap
├── Part of a whole → Stacked bar / Treemap
├── Geographic data → Map
├── Ranking → Sorted bar / Lollipop
└── Text data → Word cloud / Bar of top terms
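The tree above is effectively a lookup table, so it can be encoded as one. This is a minimal sketch: the dictionary keys are hypothetical short names chosen for illustration, while the chart names are copied from the tree.

```python
# Keys are hypothetical short names for the goals in the decision tree.
CHART_FOR_GOAL = {
    "comparison": "Bar chart",
    "trend": "Line chart",
    "distribution": "Histogram / Box plot",
    "relationship_2_vars": "Scatter plot",
    "relationship_3plus_vars": "Pairplot / Heatmap",
    "part_of_whole": "Stacked bar / Treemap",
    "geographic": "Map",
    "ranking": "Sorted bar / Lollipop",
    "text": "Word cloud / Bar of top terms",
}


def suggest_chart(goal):
    """Return the suggested chart for a goal, or raise with the valid options."""
    try:
        return CHART_FOR_GOAL[goal]
    except KeyError:
        options = ", ".join(sorted(CHART_FOR_GOAL))
        raise ValueError(f"unknown goal {goal!r}; expected one of: {options}") from None


print(suggest_chart("trend"))
```

A lookup like this is only a starting point; the data's size and the audience still decide the final choice.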
Checklist
- Is the chart type appropriate for the data?
- Are axes labeled with units?
- Is the scale appropriate (no misleading truncation)?
- Are colors accessible (colorblind-friendly)?
- Is the title clear and descriptive?
- Are important points annotated?
- Is the chart clean (no chart junk)?
- Are sources cited if data is external?
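Some of these checks can be automated for matplotlib figures. The sketch below covers only what is mechanically detectable (title, axis labels, and a vertical bar chart whose y-axis starts above zero); `audit_axes` is a hypothetical helper, and judgments such as chart type or clarity still need a human reviewer.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.container import BarContainer


def audit_axes(ax):
    """Return a list of checklist problems detectable on a matplotlib Axes."""
    problems = []
    if not ax.get_title().strip():
        problems.append("missing title")
    if not ax.get_xlabel().strip():
        problems.append("missing x-axis label")
    if not ax.get_ylabel().strip():
        problems.append("missing y-axis label")
    # ax.bar registers a BarContainer; assumes vertical bars
    has_bars = any(isinstance(c, BarContainer) for c in ax.containers)
    if has_bars and ax.get_ylim()[0] > 0:
        problems.append("bar chart y-axis does not start at zero")
    return problems


fig, ax = plt.subplots()
ax.bar(["A", "B", "C"], [95, 97, 99])
ax.set_ylim(90, 100)
before = audit_axes(ax)  # all four problems are reported

ax.set_ylim(0, 100)
ax.set_title("Score by Group")
ax.set_xlabel("Group")
ax.set_ylabel("Score (points)")
after = audit_axes(ax)   # no problems remain
```

A check like this fits well in a test suite or a pre-publication script for recurring reports.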