Data Science & Analytics

Data science tutorials — pandas, NumPy, Matplotlib, Seaborn, data cleaning, exploratory data analysis, visualization, and building data pipelines from scratch

82 Published

This section collects tutorials on data science. They cover key concepts, practical examples, and best practices to help you master the topic.

Comprehensive data science tutorials covering everything from Python setup and pandas fundamentals to machine learning, deep learning, and model deployment.

Fundamentals

Data Science Introduction: What It Is and Why It Matters
Data Science Lifecycle: CRISP-DM and Cross-Industry Process
Python for Data Science: Setup and Core Libraries Overview
Data Types and Structures for Data Science Applications
Exploratory Data Analysis: Initial Data Inspection Techniques
Data Science Ethics: Bias, Privacy, and Responsible AI
Data Science Project Types: Descriptive, Diagnostic, Predictive, Prescriptive

Career & Learning

Data Science Roadmap: Skills, Tools, and Learning Path for Beginners
Data Scientist Portfolio: Building Projects and Effective GitHub Presence
Data Science Interviews: Technical Questions, Case Studies, and System Design
Data Science Certifications: Coursera, AWS, Google, and Microsoft Options
Data Science Communities: Conferences, Forums, and Networking Events
MLOps Career Path: From Data Scientist to Machine Learning Engineer

Additional Classic Tutorials

Correlation Analysis with Pandas and Seaborn
Building a Data Analysis Pipeline with Python
Data Cleaning Techniques and Best Practices with Python
Data Normalization and Standardization Techniques
Data Science Projects for Beginners -- Build Your Portfolio
Data Storytelling and Presentation -- Communicate Insights Effectively
Data Visualization Best Practices
Data Visualization with Matplotlib and Seaborn -- Complete Guide
Data Wrangling with Pandas -- Reshape, Pivot, and Stack
Feature Engineering for Machine Learning
Matplotlib Tutorial -- Basic Plotting with Python
Advanced NumPy Operations -- Broadcasting, Vectorization, and Performance
NumPy Broadcasting Explained -- Vectorized Operations
NumPy Tutorial -- Arrays and Operations
Outlier Detection and Treatment in Python
Working with CSV, Excel, and SQL in Pandas
Pandas Data Cleaning -- Handling Missing Data and Duplicates
Pandas Data Manipulation Guide -- Filter, Group, and Transform
Pandas GroupBy and Aggregation -- Analyze Data by Categories
Pandas Merge, Join, and Concatenate -- Combining DataFrames
Pandas Tutorial -- DataFrames and Series Explained
Python for Data Science -- Complete Setup Guide
Seaborn Tutorial -- Statistical Data Visualization
Statistical Hypothesis Testing Guide with Python
Time Series Analysis with Python -- Complete Guide
Time Series Analysis with Pandas
Web Scraping for Data Science with Python

Published Topics

Python for Data Science — Complete Setup Guide

Python data science setup guide — install Anaconda, set up Jupyter Notebook, install pandas, NumPy, Matplotlib, and create your first data science project

✓ Live

Pandas Tutorial — DataFrames and Series Explained

Pandas tutorial for beginners — create and manipulate DataFrames and Series, load data from CSV, filter rows, select columns, and basic operations

✓ Live

Pandas Data Cleaning — Handling Missing Data and Duplicates

Pandas data cleaning tutorial — detect and handle missing values, remove duplicates, fix data types, handle outliers, and prepare data for analysis

✓ Live

Pandas GroupBy and Aggregation — Analyze Data by Categories

Pandas GroupBy tutorial — group data by categories, calculate aggregate statistics, use multiple aggregation functions, and apply custom transformations

✓ Live

Pandas Merge, Join, and Concatenate — Combining DataFrames

Pandas merge, join, and concat tutorial — combine DataFrames with merge (SQL-like joins), join on index, concatenate rows or columns, and handle overlapping data

✓ Live

NumPy Tutorial — Arrays and Operations

NumPy tutorial for beginners — create arrays, reshape, slice, broadcast, perform math operations, and understand NumPy's speed advantage over Python lists

✓ Live

NumPy Broadcasting Explained — Vectorized Operations

NumPy broadcasting explained — how NumPy automatically expands arrays of different shapes, broadcasting rules, common patterns, and performance benefits

✓ Live

Matplotlib Tutorial — Basic Plotting with Python

Matplotlib tutorial for beginners — line plots, bar charts, scatter plots, histograms, customizing figures, saving plots, and creating publication-ready charts

✓ Live

Seaborn Tutorial — Statistical Data Visualization

Seaborn tutorial — create beautiful statistical plots with Seaborn, use built-in themes, plot distributions, relationships, categories, and customize visualizations

✓ Live

Data Wrangling with Pandas — Reshape, Pivot, and Stack

Pandas data wrangling tutorial — pivot tables, melt (unpivot), stack and unstack, cross-tabulations, and reshaping data for analysis

✓ Live

Time Series Analysis with Pandas

Time series analysis with pandas — datetime indexing, resampling, rolling windows, shifting, date ranges, and forecasting with time series data

✓ Live

Working with CSV, Excel, and SQL in Pandas

Pandas I/O tutorial — read and write CSV files, Excel spreadsheets, SQL databases, JSON, Parquet, and handle large datasets efficiently

✓ Live

Feature Engineering for Machine Learning

Feature engineering guide — create numeric features, encode categorical variables, handle dates, create interaction features, and select the best features for ML models

✓ Live

Outlier Detection and Treatment in Python

Outlier detection methods in Python — IQR method, Z-score, isolation forest, DBSCAN, visualization techniques, and how to treat or remove outliers

✓ Live

Data Normalization and Standardization Techniques

Data normalization and standardization explained — Min-Max scaling, Z-score standardization, robust scaling, and when to use each technique for machine learning

✓ Live

Correlation Analysis with Pandas and Seaborn

Correlation analysis in Python — Pearson, Spearman, and Kendall correlation, correlation matrices, heatmaps, pairplots, and interpreting correlation results

✓ Live

Data Visualization Best Practices

Data visualization best practices — choose the right chart type, design principles, color theory, accessibility, and creating effective dashboards

✓ Live

Building a Data Analysis Pipeline with Python

Data analysis pipeline tutorial — build a complete ETL pipeline in Python with pandas, automated data cleaning, feature engineering, and reporting

✓ Live

Data Science Projects for Beginners — Build Your Portfolio

Data science project ideas for beginners — build a portfolio with real datasets, from exploratory analysis to predictive models and visualization dashboards

✓ Live

Data Visualization with Matplotlib and Seaborn — Complete Guide

Data visualization with Python using Matplotlib and Seaborn — create publication-quality plots, customize themes, build subplots, and choose the right chart for your data

✓ Live

Pandas Data Manipulation Guide — Filter, Group, and Transform

Pandas data manipulation guide — filter rows, select columns, group by categories, aggregate with multiple functions, apply custom transformations, and reshape DataFrames

✓ Live

Time Series Analysis with Python — Complete Guide

Time series analysis with Python using pandas and statsmodels — datetime indexing, resampling, rolling windows, decomposition, stationarity tests, and forecasting fundamentals

✓ Live

Statistical Hypothesis Testing Guide with Python

Statistical hypothesis testing guide with Python — t-tests, ANOVA, chi-square, p-values explained, assumptions, and when to use each test with real datasets

✓ Live

Data Cleaning Techniques and Best Practices with Python

Data cleaning techniques and best practices with Python using pandas — handle missing values, remove duplicates, standardize formats, detect outliers, and build cleaning pipelines

✓ Live

Advanced NumPy Operations — Broadcasting, Vectorization, and Performance

Advanced NumPy operations guide — broadcasting rules, vectorized computations, universal functions, structured arrays, linear algebra, and performance optimization techniques

✓ Live

Web Scraping for Data Science with Python

Web scraping for data science with Python using Beautiful Soup, Requests, and Selenium — extract data from HTML, handle JavaScript-rendered pages, and build ethical scrapers

✓ Live

Data Storytelling and Presentation — Communicate Insights Effectively

Data storytelling and presentation guide with Python — structure narratives, design impactful visuals, avoid misleading charts, and communicate data insights to stakeholders

✓ Live

Data Science Introduction: What It Is and Why It Matters

Learn the fundamentals of data science including its definition, core disciplines, real-world applications, and how it drives decision-making across industries.

✓ Live

Data Science Lifecycle: CRISP-DM and Cross-Industry Process

Learn the CRISP-DM data science lifecycle framework covering business understanding, data preparation, modeling, evaluation, deployment and iteration stages.

✓ Live

Python for Data Science: Setup and Core Libraries Overview

Learn how to set up Python for data science with NumPy, Pandas, Matplotlib, and Scikit-Learn including environment configuration and package management.

✓ Live

Data Types and Structures for Data Science Applications

Learn the essential data types and structures used in data science including arrays, DataFrames, Series, and tensors for efficient data analysis and modeling.

✓ Live

Exploratory Data Analysis: Initial Data Inspection Techniques

Learn exploratory data analysis techniques including summary statistics, distribution analysis, correlation matrices, and visual inspection of data.

✓ Live

Data Science Ethics: Bias, Privacy, and Responsible AI

Learn data science ethics including algorithmic bias, data privacy regulations, fairness metrics, and responsible AI development practices for analysts.

✓ Live

Data Science Project Types: Descriptive, Diagnostic, Predictive, Prescriptive

Learn the four types of data science projects: descriptive, diagnostic, predictive, and prescriptive analytics, with examples and appropriate techniques for each.

✓ Live

Descriptive Statistics: Mean, Median, Mode, Variance, and Standard Deviation

Learn descriptive statistics including measures of central tendency (mean, median, and mode) and measures of dispersion (variance, standard deviation, and range).

✓ Live

Probability Basics: Distributions, Bayes Theorem, and Random Variables

Learn probability fundamentals including probability distributions, Bayes theorem, conditional probability, random variables, and expected value for data science.

✓ Live

Hypothesis Testing: T-Tests, Chi-Square, P-Values, and Significance Levels

Learn hypothesis testing including null and alternative hypotheses, t-tests, chi-square tests, p-values, and statistical significance interpretation for data science.

✓ Live

Correlation and Causation: Covariance, Pearson, and Spearman Rank

Learn the difference between correlation and causation including Pearson and Spearman correlation coefficients, covariance matrices, and spurious correlations.

✓ Live

Inferential Statistics: Confidence Intervals and Sampling Methods

Learn inferential statistics including confidence intervals, margin of error, sampling distributions, and central limit theorem for parameter estimation.

✓ Live

Bayesian Statistics: Prior, Likelihood, Posterior, and Conjugate Priors

Learn Bayesian statistics from prior beliefs to posterior distributions including conjugate priors and Bayesian inference with practical examples for data science.

✓ Live

Probability Distributions: Normal, Binomial, Poisson, and Exponential Types

Learn common probability distributions including normal, binomial, Poisson, and exponential with parameters, properties, and real-world data applications.

✓ Live

Data Cleaning with Pandas: Handling Missing Values and Duplicates

Learn data cleaning techniques using Pandas including handling missing values, removing duplicates, fixing data types, and standardizing text data for analysis.

✓ Live

Data Transformation: Reshaping, Aggregation, and Merging DataFrames

Learn data transformation with Pandas including pivot tables, group by aggregations, merging, joining, concatenating and reshaping DataFrames for analysis.

✓ Live

Working with SQL: Querying Databases for Data Analysis

Learn SQL for data analysis including SELECT, WHERE, JOIN, GROUP BY, HAVING, window functions, and subqueries to extract insights from relational databases.

✓ Live

Data Import and Export: CSV, JSON, Excel, and Database Formats

Learn how to import and export data in multiple formats including CSV, JSON, Excel, Parquet, and direct database connections using Pandas and SQLAlchemy.

✓ Live

Dealing with Outliers: Detection, Analysis, and Treatment Methods

Learn outlier detection methods including Z-score, IQR, isolation forests, and DBSCAN plus treatment strategies like capping, transformation, and imputation.

✓ Live

Feature Scaling: Normalization, Standardization, and Robust Methods

Learn feature scaling techniques including min-max normalization, z-score standardization, robust scaling, and when to apply each method for ML models.

✓ Live

Text Data Processing: Tokenization, Stop Words, and Regex Patterns

Learn text data processing techniques including tokenization, stop word removal, stemming, lemmatization, regular expressions, and bag-of-words representations.

✓ Live

Supervised Learning: Classification and Regression Problem Types

Learn supervised learning fundamentals including classification and regression problems, labeled data, loss functions, and model selection strategies.

✓ Live

Linear Regression: Simple and Multiple Regression Techniques

Learn linear regression from simple to multiple predictors including ordinary least squares, assumptions, coefficient interpretation, and model diagnostics.

✓ Live

Logistic Regression: Binary Classification and Probability Estimation

Learn logistic regression for binary classification including sigmoid function, decision boundaries, odds ratios, and maximum likelihood estimation techniques.

✓ Live

Decision Trees and Random Forests: Ensemble Learning for Classification

Learn decision trees and random forests including tree construction, Gini impurity, information gain, bagging, feature importance, and ensemble methods.

✓ Live

Support Vector Machines: Kernel Tricks and Margin Maximization

Learn support vector machines including maximum margin hyperplanes, kernel tricks for non-linear data, soft margins, and SVM parameter tuning techniques.

✓ Live

K-Nearest Neighbors: Distance-Based Classification and Regression

Learn K-nearest neighbors for classification and regression including distance metrics, K value selection, curse of dimensionality, and weighted voting.

✓ Live

Model Evaluation: Accuracy, Precision, Recall, F1, ROC, and AUC

Learn model evaluation metrics including accuracy, precision, recall, F1 score, ROC curves, AUC, confusion matrices, and cross-validation strategies for ML.

✓ Live

Neural Networks Basics: Perceptrons, Activation Functions, and Backpropagation

Learn neural network fundamentals including perceptrons, activation functions like ReLU and sigmoid, backpropagation, and gradient descent optimization.

✓ Live

Convolutional Neural Networks: Image Classification and CNN Architectures

Learn convolutional neural networks for image classification including convolution layers, pooling, filters, feature maps, and common CNN architectures.

✓ Live

Recurrent Neural Networks: Sequence Models and LSTMs Explained

Learn recurrent neural networks for sequence data including LSTMs, GRUs, vanishing gradients, and sequence-to-sequence models for time series analysis.

✓ Live

Transfer Learning: Pre-Trained Models and Fine-Tuning Strategies

Learn transfer learning with pre-trained model selection, fine-tuning strategies, feature extraction, domain adaptation, and practical Keras implementation.

✓ Live

Natural Language Processing: Tokenization, Embeddings, and Transformers

Learn natural language processing fundamentals including tokenization, word embeddings, transformers, attention mechanisms, BERT, and GPT for text analysis.

✓ Live

Deep Learning Frameworks: TensorFlow vs PyTorch for Model Building

Learn the differences between TensorFlow and PyTorch frameworks including computation graphs, eager execution, model deployment, and ecosystem comparisons.

✓ Live

Generative Models: GANs, VAEs, and Diffusion Model Architectures

Learn generative models including generative adversarial networks, variational autoencoders, and diffusion models for creating synthetic data and images.

✓ Live

Matplotlib Basics: Line Plots, Bar Charts, Histograms, and Scatter Plots

Learn Matplotlib fundamentals for creating line plots, bar charts, histograms, scatter plots, and customizing figures with labels, legends, and color schemes.

✓ Live

Seaborn Statistical Plots: Heatmaps, Pairplots, Box Plots, and Violin Plots

Learn Seaborn for statistical visualization including heatmaps, pairplots, box plots, and violin plots with built-in statistical aggregations and styling.

✓ Live

Interactive Visualizations: Plotly and Bokeh for Web Dashboards

Learn interactive data visualization with Plotly and Bokeh including zoom, pan, hover tooltips, animations, and embedding visualizations in web applications.

✓ Live

Storytelling with Data: Effective Dashboard Design Principles

Learn data storytelling principles including narrative structure, dashboard design, visual hierarchy, and annotation techniques for effective data presentation.

✓ Live

Time Series Visualization: Trends, Seasonality, and Forecasting Plots

Learn time series visualization including trend lines, seasonal decomposition plots, rolling averages, and autocorrelation plots for forecasting analysis.

✓ Live

Geospatial Data Visualization: Maps and Choropleth Techniques

Learn geospatial data visualization including choropleth maps, point maps, heat maps, shapefiles, GeoJSON, and interactive mapping with Folium and Plotly.

✓ Live

Dashboard Frameworks: Building with Streamlit and Dash Applications

Learn dashboard development with Streamlit and Dash including widgets, callbacks, layout design, and deployment for building interactive data applications.

✓ Live

Jupyter Notebooks: Interactive Computing for Data Science Workflows

Learn Jupyter Notebooks for data science including markdown cells, code execution, magic commands, kernel management, and sharing published notebooks.

✓ Live

Version Control for Data Science: Git and DVC for Reproducibility

Learn version control for data science projects using Git for code and DVC for data and model versioning to ensure reproducibility and team collaboration.

✓ Live

SQL for Data Science: Advanced Queries and Window Functions

Learn advanced SQL for data science including window functions, CTEs, pivot tables, date truncation, percentile calculations, and query optimization techniques.

✓ Live

Data Pipelines and ETL: Automating Data Workflows End-to-End

Learn ETL pipeline design including data extraction from APIs and databases, transformation logic, warehouse loading, and scheduling with Apache Airflow.

✓ Live

Cloud Platforms for Data Science: AWS, GCP, and Azure Services

Learn cloud platforms for data science including AWS SageMaker, GCP Vertex AI, Azure ML Studio, cloud storage, serverless computing, and cost management.

✓ Live

Big Data Tools: Apache Spark, Hadoop, and Distributed Computing

Learn big data processing with Apache Spark for distributed computing, Hadoop HDFS and MapReduce, cluster management, and data partitioning strategies.

✓ Live

Model Deployment: REST APIs, Docker, and MLOps Pipeline Basics

Learn ML model deployment including REST API serving with Flask or FastAPI, Docker containerization, CI/CD pipelines, model monitoring, and MLOps fundamentals.

✓ Live

Data Science Roadmap: Skills, Tools, and Learning Path for Beginners

Learn the complete data science roadmap including essential skills, recommended tools, learning resources, and project ideas for career advancement.

✓ Live

Data Scientist Portfolio: Building Projects and Effective GitHub Presence

Learn to build a data scientist portfolio with impactful projects, polished GitHub repositories, data storytelling, and presentation for job applications.

✓ Live

Data Science Interviews: Technical Questions, Case Studies, and System Design

Learn how to ace data science interviews including technical coding questions, statistics and ML theory, case study frameworks, and system design preparation.

✓ Live

Data Science Certifications: Coursera, AWS, Google, and Microsoft Options

Learn about top data science certifications including Coursera specializations, AWS Certified Data Analytics, Google Professional Data Engineer, and Microsoft Azure data certifications.

✓ Live

Data Science Communities: Conferences, Forums, and Networking Events

Learn about data science communities including Kaggle competitions, Stack Overflow, Reddit forums, conferences like NeurIPS and KDD, and local meetup groups.

✓ Live

MLOps Career Path: From Data Scientist to Machine Learning Engineer

Learn about the MLOps career path including skills needed, roles and responsibilities, CI/CD for ML, model monitoring, infrastructure tools, and progression.

✓ Live

All 82 topics in Data Science — Complete Guide are published.