Seaborn Tutorial — Statistical Data Visualization

DodaTech

This tutorial introduces Seaborn, a Python library for statistical data visualization built on top of Matplotlib. It walks through the main plot families with short examples, then covers styling options and a quick reference.

What You'll Learn

Create publication-quality statistical visualizations with Seaborn — distribution plots, categorical plots, relational plots, regression plots, and heatmaps.

Why It Matters

Seaborn can produce many common statistical plots with a single function call. Its default styles are more polished than Matplotlib's defaults, and its plotting functions accept pandas DataFrames directly, with columns referenced by name.

Real-World Use

Visualizing correlations in a dataset, comparing distributions across categories, creating a pairplot for exploratory analysis, or making a heatmap of a confusion matrix.

Setup

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Set Seaborn style
sns.set_theme(style="whitegrid")
# Other styles: "darkgrid", "ticks", "dark", "white"

Built-in Datasets

# Load example datasets for practice
tips = sns.load_dataset("tips")
iris = sns.load_dataset("iris")
titanic = sns.load_dataset("titanic")

# tips: restaurant tips data
print(tips.head())
#    total_bill   tip     sex smoker  day    time  size
# 0       16.99  1.01  Female     No  Sun  Dinner     2
# 1       10.34  1.66    Male     No  Sun  Dinner     3

Distribution Plots

Histogram with KDE

plt.figure(figsize=(10, 6))
sns.histplot(tips["total_bill"], bins=30, kde=True)
plt.title("Distribution of Total Bills")
plt.show()

KDE Plot

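plt.figure(figsize=(10, 6))
# Label each curve directly so the legend stays in sync with the plots
sns.kdeplot(tips["total_bill"], fill=True, alpha=0.3, label="Total Bill")
sns.kdeplot(tips["tip"], fill=True, alpha=0.3, label="Tip")
plt.xlabel("Amount")  # both variables share the x-axis
plt.legend()
plt.show()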

Categorical Plots

Box Plot

plt.figure(figsize=(10, 6))
sns.boxplot(x="day", y="total_bill", data=tips)
plt.title("Total Bill Distribution by Day")
plt.show()

Violin Plot

plt.figure(figsize=(10, 6))
sns.violinplot(x="day", y="total_bill", data=tips, inner="quartile")
plt.title("Total Bill Distribution by Day (Violin)")
plt.show()

Count Plot

plt.figure(figsize=(8, 5))
sns.countplot(x="day", data=tips, hue="sex")
plt.title("Number of Transactions by Day")
plt.show()

Relational Plots

Scatter Plot

plt.figure(figsize=(10, 6))
sns.scatterplot(x="total_bill", y="tip", data=tips, hue="time", size="size")
plt.title("Tip vs Total Bill")
plt.show()

Line Plot

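# Aggregate by day: mean total bill for each day of the week
# observed=True keeps only the categories present in the data
daily = tips.groupby("day", observed=True)["total_bill"].mean().reset_index()
plt.figure(figsize=(10, 6))
sns.lineplot(x="day", y="total_bill", data=daily, marker="o")
plt.title("Average Total Bill by Day")
plt.show()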

Regression Plot

plt.figure(figsize=(10, 6))
sns.regplot(x="total_bill", y="tip", data=tips, ci=95)
plt.title("Tip vs Total Bill with Regression Line")
plt.show()
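
regplot() draws the fitted line but does not return its coefficients. If the slope and intercept are needed as numbers, one option is to compute the same degree-1 least-squares fit with np.polyfit. The sketch below uses a made-up stand-in for the tips data (the `demo` DataFrame and its values are assumptions for illustration, not the real dataset) so that it runs offline.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up stand-in for the tips data (hypothetical values, not the real dataset)
rng = np.random.default_rng(0)
total_bill = rng.uniform(5, 50, size=60)
tip = 1.0 + 0.15 * total_bill + rng.normal(0, 0.8, size=60)
demo = pd.DataFrame({"total_bill": total_bill, "tip": tip})

# Degree-1 least-squares fit; np.polyfit returns [slope, intercept]
slope, intercept = np.polyfit(demo["total_bill"], demo["tip"], deg=1)
print(f"tip = {slope:.3f} * total_bill + {intercept:.3f}")

# regplot draws the scatter, the fitted line, and a 95% confidence band
ax = sns.regplot(x="total_bill", y="tip", data=demo, ci=95)
ax.set_title("Tip vs Total Bill (synthetic data)")
plt.show()
```

With the real data, replacing `demo` with `tips` should give the coefficients of the line drawn in the plot above.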

Heatmap (Correlation Matrix)

# Compute correlation matrix
corr = tips.select_dtypes(include=[np.number]).corr()

plt.figure(figsize=(8, 6))
sns.heatmap(corr,
            annot=True,
            cmap="coolwarm",
            vmin=-1, vmax=1,
            linewidths=0.5,
            fmt=".2f")
plt.title("Correlation Matrix of Tips Dataset")
plt.show()
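
The same heatmap() call also covers the confusion-matrix use case listed under Real-World Use. Below is a minimal sketch, assuming integer class labels; the `y_true` and `y_pred` arrays are made up for illustration, and the matrix is counted with NumPy rather than a machine-learning library.

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Hypothetical true and predicted labels for a 3-class problem
y_true = np.array([0, 0, 1, 1, 2, 2, 2, 1, 0, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0, 2, 1, 0, 2])

# Count (true, predicted) pairs: rows are true classes, columns are predictions
n_classes = 3
cm = np.zeros((n_classes, n_classes), dtype=int)
np.add.at(cm, (y_true, y_pred), 1)

fig, ax = plt.subplots(figsize=(5, 4))
# fmt="d" prints the counts as integers instead of floats
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues", cbar=False, ax=ax)
ax.set_xlabel("Predicted class")
ax.set_ylabel("True class")
ax.set_title("Confusion Matrix (made-up labels)")
plt.show()
```

A sequential colormap such as "Blues" suits counts, whereas the diverging "coolwarm" above suits correlations centered on zero.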

Pairplot

# Scatter matrix — great for EDA
sns.pairplot(iris, hue="species", height=2.5)
plt.suptitle("Iris Dataset Pairplot", y=1.02)
plt.show()

FacetGrid — Multiple Subplots by Category

g = sns.FacetGrid(tips, col="time", row="sex", height=4)
g.map(sns.histplot, "total_bill", bins=20)
# g.figure is the preferred name; g.fig is a deprecated alias
g.figure.suptitle("Total Bill Distribution by Time and Gender", y=1.02)
plt.show()

Customizing Seaborn

# Set the context to scale fonts and line widths
# (pick one; each call overrides the previous one)
sns.set_context("paper")      # Small fonts, for papers
sns.set_context("notebook")   # Default
sns.set_context("talk")       # Larger, for presentations
sns.set_context("poster")     # Largest, for posters

# Built-in color palettes (pick one; each call replaces the previous palette)
sns.set_palette("husl")
sns.set_palette("Set2")
sns.set_palette("colorblind")

# Create your own palette
custom_palette = ["#e74c3c", "#3498db", "#2ecc71"]
sns.set_palette(custom_palette)

# Remove top and right spines from the current figure (call after plotting)
sns.despine()

Quick Reference

| Plot | Function | Use Case |
| --- | --- | --- |
| Histogram | histplot() | Distribution of one variable |
| KDE | kdeplot() | Smoothed distribution |
| Box plot | boxplot() | Distribution by category |
| Violin plot | violinplot() | Distribution + density by category |
| Scatter | scatterplot() | Two numeric variables |
| Line | lineplot() | Trend over time |
| Regression | regplot() | Scatter + regression line |
| Heatmap | heatmap() | Correlation matrix |
| Pairplot | pairplot() | All pairwise relationships |
| Count | countplot() | Counts by category |
