Seaborn Tutorial — Statistical Data Visualization
This tutorial introduces Seaborn, a Python statistical visualization library built on top of Matplotlib. It covers the key concepts, practical examples, and best practices you need to apply it effectively.
What You'll Learn
Create publication-quality statistical visualizations with Seaborn — distribution plots, categorical plots, relational plots, regression plots, and heatmaps.
Why It Matters
Seaborn produces many common statistical plots with a single function call. It ships with polished default themes, and it accepts pandas DataFrames directly, so you can refer to columns by name instead of passing arrays around.
Real-World Use
Visualizing correlations in a dataset, comparing distributions across categories, creating a pairplot for exploratory analysis, or making a heatmap of a confusion matrix.
Setup
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# Set Seaborn style
sns.set_theme(style="whitegrid")
# Other styles: "darkgrid", "ticks", "dark", "white"
Built-in Datasets
# Load example datasets for practice
tips = sns.load_dataset("tips")
iris = sns.load_dataset("iris")
titanic = sns.load_dataset("titanic")
# tips: restaurant tips data
print(tips.head())
#    total_bill   tip     sex smoker  day    time  size
# 0       16.99  1.01  Female     No  Sun  Dinner     2
# 1       10.34  1.66    Male     No  Sun  Dinner     3
Distribution Plots
Histogram with KDE
plt.figure(figsize=(10, 6))
sns.histplot(tips["total_bill"], bins=30, kde=True)
plt.title("Distribution of Total Bills")
plt.show()
KDE Plot
plt.figure(figsize=(10, 6))
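# Label each curve directly so the legend entries cannot get out of order
sns.kdeplot(tips["total_bill"], fill=True, alpha=0.3, label="Total Bill")
sns.kdeplot(tips["tip"], fill=True, alpha=0.3, label="Tip")
plt.xlabel("Amount")
plt.legend()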
plt.show()
Categorical Plots
Box Plot
plt.figure(figsize=(10, 6))
sns.boxplot(x="day", y="total_bill", data=tips)
plt.title("Total Bill Distribution by Day")
plt.show()
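It can help to see the numbers a box plot encodes. The sketch below computes them with pandas on a tiny hand-made sample (hypothetical values, not the tips dataset): the box spans the first to third quartile, the inner line is the median, and with the default whisker setting, points more than 1.5 times the interquartile range beyond the box are drawn as outliers.

```python
import pandas as pd

# Tiny hand-made sample (hypothetical values, not the tips dataset)
df = pd.DataFrame({
    "day": ["Thur"] * 5 + ["Fri"] * 5,
    "total_bill": [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0, 26.0, 60.0],
})

# Quartiles per category: box edges (q1, q3) and the median line
q = df.groupby("day")["total_bill"].quantile([0.25, 0.5, 0.75]).unstack()
q.columns = ["q1", "median", "q3"]
q["iqr"] = q["q3"] - q["q1"]

# Values above this fence are plotted as individual outlier points
q["upper_fence"] = q["q3"] + 1.5 * q["iqr"]
print(q)

# Flag the rows that would appear as outliers above the box
fence = df["day"].map(q["upper_fence"])
outliers = df[df["total_bill"] > fence]
print(outliers)
```

Here the 60.0 bill on Fri lies above that day's fence, so a box plot of this sample would show it as a separate point.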
Violin Plot
plt.figure(figsize=(10, 6))
sns.violinplot(x="day", y="total_bill", data=tips, inner="quartile")
plt.title("Total Bill Distribution by Day (Violin)")
plt.show()
Count Plot
plt.figure(figsize=(8, 5))
sns.countplot(x="day", data=tips, hue="sex")
plt.title("Number of Transactions by Day")
plt.show()
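A count plot with hue shows the number of rows for each combination of the two columns. If you want those numbers as a table, pd.crosstab gives the same tally. A minimal sketch on a hand-made sample (hypothetical rows, not the tips dataset):

```python
import pandas as pd

# Hand-made sample (hypothetical rows, not the tips dataset)
df = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Sat", "Sat", "Sat"],
    "sex": ["Male", "Female", "Male", "Male", "Male", "Female"],
})

# Rows = x-axis categories, columns = hue levels, cells = bar heights
counts = pd.crosstab(df["day"], df["sex"])
print(counts)
```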
Relational Plots
Scatter Plot
plt.figure(figsize=(10, 6))
sns.scatterplot(x="total_bill", y="tip", data=tips, hue="time", size="size")
plt.title("Tip vs Total Bill")
plt.show()
Line Plot
# Average total bill per day
daily = tips.groupby("day", observed=True)["total_bill"].mean().reset_index()
plt.figure(figsize=(10, 6))
sns.lineplot(x="day", y="total_bill", data=daily, marker="o")
plt.title("Average Total Bill by Day")
plt.show()
Regression Plot
plt.figure(figsize=(10, 6))
sns.regplot(x="total_bill", y="tip", data=tips, ci=95)
plt.title("Tip vs Total Bill with Regression Line")
plt.show()
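regplot() draws the fitted line but does not return its coefficients. If you need the slope and intercept, one option is to fit the same kind of ordinary least-squares line yourself with NumPy. The sketch below uses synthetic data generated from a known line (an assumption, so the recovered values can be sanity-checked); the variable names simply mirror the tips columns.

```python
import numpy as np

# Synthetic data from a known relationship: tip = 1.0 + 0.1 * total_bill + noise
rng = np.random.default_rng(42)
total_bill = rng.uniform(5, 50, size=200)
tip = 1.0 + 0.1 * total_bill + rng.normal(0, 0.5, size=200)

# Degree-1 polynomial fit = ordinary least-squares straight line
slope, intercept = np.polyfit(total_bill, tip, deg=1)

# Pearson correlation between the two variables
r = np.corrcoef(total_bill, tip)[0, 1]

print(f"tip = {intercept:.2f} + {slope:.3f} * total_bill (r = {r:.2f})")
```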
Heatmap (Correlation Matrix)
# Compute correlation matrix
corr = tips.select_dtypes(include=[np.number]).corr()
plt.figure(figsize=(8, 6))
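sns.heatmap(
    corr,
    annot=True,        # write the value in each cell
    fmt=".2f",         # two decimal places
    cmap="coolwarm",
    vmin=-1, vmax=1,   # fix the color scale to the full correlation range
    linewidths=0.5,
)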
plt.title("Correlation Matrix of Tips Dataset")
plt.show()
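A correlation matrix is symmetric, so half of the cells repeat the other half. A common refinement is to hide the upper triangle with the mask argument of heatmap(). The sketch below is one way to do that; it uses random synthetic columns (an assumption, not the tips data) so that it runs on its own.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic numeric data (assumption: column names are placeholders)
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.normal(size=(100, 4)), columns=["a", "b", "c", "d"])
df["b"] += 0.8 * df["a"]  # make two columns correlated

corr = df.corr()

# True marks cells to hide: everything strictly above the diagonal
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)

fig, ax = plt.subplots(figsize=(6, 5))
sns.heatmap(
    corr, mask=mask,
    annot=True, fmt=".2f",
    cmap="coolwarm", vmin=-1, vmax=1,
    linewidths=0.5, square=True,
    ax=ax,
)
ax.set_title("Lower-Triangle Correlation Matrix (synthetic data)")
plt.show()
```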
Pairplot
# Scatter matrix — great for EDA
sns.pairplot(iris, hue="species", height=2.5)
plt.suptitle("Iris Dataset Pairplot", y=1.02)
plt.show()
FacetGrid — Multiple Subplots by Category
g = sns.FacetGrid(tips, col="time", row="sex", height=4)
g.map(sns.histplot, "total_bill", bins=20)
g.figure.suptitle("Total Bill Distribution by Time and Gender", y=1.02)
plt.show()
Customizing Seaborn
# Set the context to scale fonts and line widths for the target medium.
# Pick one: each call overrides the previous one.
sns.set_context("paper") # Small fonts, for papers
sns.set_context("notebook") # Default
sns.set_context("talk") # Larger, for presentations
sns.set_context("poster") # Largest, for posters
# Built-in color palettes (again, the most recent call wins)
sns.set_palette("husl")
sns.set_palette("Set2")
sns.set_palette("colorblind")
# Create your own palette
custom_palette = ["#e74c3c", "#3498db", "#2ecc71"]
sns.set_palette(custom_palette)
# Remove the top and right spines (call this after drawing a plot)
sns.despine()
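The set_* functions above change settings globally. When only one figure should look different, axes_style() and plotting_context() can be used as context managers, and color_palette() returns a palette as a list of colors without changing any defaults. A short sketch with made-up data:

```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up data for illustration
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = x + rng.normal(scale=0.5, size=50)

# Fetch three colors without changing the global palette
palette = sns.color_palette("colorblind", n_colors=3)

# Style and context apply only to figures created inside the with block
with sns.axes_style("ticks"), sns.plotting_context("talk"):
    fig, ax = plt.subplots(figsize=(8, 5))
    sns.scatterplot(x=x, y=y, color=palette[0], ax=ax)
    sns.despine(ax=ax)  # hides the top and right spines of this axes
    ax.set_title("Temporary Style and Context")

plt.show()
```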
Quick Reference
| Plot | Function | Use Case |
|---|---|---|
| Histogram | histplot() | Distribution of one variable |
| KDE | kdeplot() | Smoothed distribution |
| Box plot | boxplot() | Distribution by category |
| Violin plot | violinplot() | Distribution + density by category |
| Scatter | scatterplot() | Two numeric variables |
| Line | lineplot() | Trend over time |
| Regression | regplot() | Scatter + regression line |
| Heatmap | heatmap() | Correlation matrix |
| Pairplot | pairplot() | All pairwise relationships |
| Count | countplot() | Counts by category |