3D Transformations & Matrices — Translation, Rotation, Scaling and Projection

DodaTech Updated 2026-06-23 8 min read

This tutorial covers 3D transformations and matrices: the key concepts, practical examples in C++ and Python, and the mistakes to avoid.

3D transformation matrices are 4x4 matrices that encode translation, rotation, scaling, and projection, so that objects can be positioned, oriented, and projected through a sequence of coordinate spaces in the graphics pipeline.

What You'll Learn & Why It Matters

In this tutorial, you will learn how 3D transformations work mathematically. You will build translation, rotation, and scaling matrices, combine them into model-view-projection (MVP) matrices, and understand coordinate spaces from local to screen space.

Real-world use: Every 3D object in a game or film is positioned and animated through matrix transformations. Engines like Unity and Unreal perform large numbers of matrix multiplications every frame for character animation, camera control, and physics simulation.

Prerequisites

Linear algebra fundamentals
C++ or Python basics
Rasterization Pipeline (previous)

Learning Path

flowchart LR
  A[Rasterization Pipeline] --> B[3D Transformations & Matrices]
  B --> C[Shader Programming]
  B --> D[Animation & Rigging]
  B --> E[OpenGL Guide]
  C --> F[Lighting Models]
  B:::current

  classDef current fill:#f90,color:#fff,stroke:#333,stroke-width:2px

The 4x4 Transformation Matrix

In homogeneous coordinates, a 3D point (x, y, z) becomes (x, y, z, w), with w = 1 for positions and w = 0 for directions. A 4x4 matrix can then represent translation, rotation, and scaling, or any combination of them, as a single matrix multiplication:

#include <array>
#include <cmath>
#include <iostream>

struct Mat4 {
    std::array<float, 16> m = {0};

    Mat4() { m[0] = m[5] = m[10] = m[15] = 1.0f; }  // Identity

    static Mat4 identity() { return Mat4(); }

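    static Mat4 translate(float tx, float ty, float tz) {
        // Row-major storage with column vectors (matches operator* and the
        // rotate functions): the translation sits in the last column, indices 3, 7, 11.
        Mat4 mat;
        mat.m[3] = tx;
        mat.m[7] = ty;
        mat.m[11] = tz;
        return mat;
    }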

    static Mat4 scale(float sx, float sy, float sz) {
        Mat4 mat;
        mat.m[0] = sx;
        mat.m[5] = sy;
        mat.m[10] = sz;
        return mat;
    }

    static Mat4 rotateX(float angleDeg) {
        float rad = angleDeg * M_PI / 180.0f;
        float c = cos(rad), s = sin(rad);
        Mat4 mat;
        mat.m[5] = c;  mat.m[6] = -s;
        mat.m[9] = s;  mat.m[10] = c;
        return mat;
    }

    static Mat4 rotateY(float angleDeg) {
        float rad = angleDeg * M_PI / 180.0f;
        float c = cos(rad), s = sin(rad);
        Mat4 mat;
        mat.m[0] = c;  mat.m[2] = s;
        mat.m[8] = -s; mat.m[10] = c;
        return mat;
    }

    static Mat4 rotateZ(float angleDeg) {
        float rad = angleDeg * M_PI / 180.0f;
        float c = cos(rad), s = sin(rad);
        Mat4 mat;
        mat.m[0] = c;  mat.m[1] = -s;
        mat.m[4] = s;  mat.m[5] = c;
        return mat;
    }

    Mat4 operator*(const Mat4& rhs) const {
        Mat4 result;
        for (int row = 0; row < 4; row++) {
            for (int col = 0; col < 4; col++) {
                float sum = 0;
                for (int k = 0; k < 4; k++)
                    sum += m[row * 4 + k] * rhs.m[k * 4 + col];
                result.m[row * 4 + col] = sum;
            }
        }
        return result;
    }
};

int main() {
    Mat4 T = Mat4::translate(2.0f, 0.0f, 0.0f);
    Mat4 R = Mat4::rotateY(45.0f);
    Mat4 S = Mat4::scale(0.5f, 0.5f, 0.5f);

    Mat4 transform = T * R * S;
    std::cout << "TRS matrix created (translate x=2, rotate y=45, scale 0.5)" << std::endl;
    for (int i = 0; i < 4; i++) {
        for (int j = 0; j < 4; j++)
            std::cout << transform.m[i * 4 + j] << " ";
        std::cout << std::endl;
    }
    return 0;
}
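
To see what the combined matrix does to actual coordinates, the following minimal sketch rebuilds the same T * R * S product in plain Python and applies it to a position (w = 1) and a direction (w = 0). The helper names (`mat_mul`, `mat_vec`, `translate`, `scale`, `rotate_y`) are illustrative and not part of the C++ example; the sketch assumes column vectors and matrices stored as lists of rows.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices given as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def mat_vec(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[r][k] * v[k] for k in range(4)) for r in range(4)]

def translate(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

def scale(sx, sy, sz):
    return [[sx, 0, 0, 0], [0, sy, 0, 0], [0, 0, sz, 0], [0, 0, 0, 1]]

def rotate_y(angle_deg):
    rad = math.radians(angle_deg)
    c, s = math.cos(rad), math.sin(rad)
    return [[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]]

# Same parameters as the C++ example: translate x=2, rotate y=45, scale 0.5.
trs = mat_mul(translate(2, 0, 0), mat_mul(rotate_y(45), scale(0.5, 0.5, 0.5)))

point = [1.0, 0.0, 0.0, 1.0]      # w = 1: a position, so translation applies
direction = [1.0, 0.0, 0.0, 0.0]  # w = 0: a direction, so translation is ignored

moved_point = mat_vec(trs, point)          # approximately [2.354, 0, -0.354, 1]
moved_direction = mat_vec(trs, direction)  # approximately [0.354, 0, -0.354, 0]
print(moved_point)
print(moved_direction)
```

Both vectors are scaled and rotated identically; only the position picks up the offset of 2 along x, because the translation column is multiplied by w.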

The MVP Matrix Pipeline

Every vertex passes through four coordinate spaces before the perspective divide: local, world, view, and clip. The model, view, and projection matrices perform the three transitions between them:

import numpy as np

def create_perspective(fov_deg, aspect, near, far):
    fov_rad = np.radians(fov_deg)
    f = 1.0 / np.tan(fov_rad / 2.0)
    return np.array([
        [f / aspect, 0, 0, 0],
        [0, f, 0, 0],
        [0, 0, (far + near) / (near - far), (2 * far * near) / (near - far)],
        [0, 0, -1, 0]
    ], dtype=np.float32)

def create_look_at(eye, target, up):
    fwd = (target - eye) / np.linalg.norm(target - eye)
    right = np.cross(fwd, up)
    right = right / np.linalg.norm(right)
    up = np.cross(right, fwd)

    view = np.eye(4, dtype=np.float32)
    view[0, :3] = right
    view[1, :3] = up
    view[2, :3] = -fwd
    view[:3, 3] = -right.dot(eye), -up.dot(eye), fwd.dot(eye)
    return view

eye = np.array([3.0, 2.0, 5.0])
target = np.array([0.0, 0.0, 0.0])
up = np.array([0.0, 1.0, 0.0])

view = create_look_at(eye, target, up)
proj = create_perspective(45.0, 16.0/9.0, 0.1, 100.0)

model = np.eye(4, dtype=np.float32)
model[:3, :3] = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
model[:3, 3] = [0.0, 0.0, -2.0]

mvp = proj @ view @ model

point_local = np.array([0.5, 0.5, 0.5, 1.0])
point_clip = mvp @ point_local
point_ndc = point_clip[:3] / point_clip[3]
print(f"Local point: {point_local[:3]}")
print(f"Clip space: {point_clip[:3]}")
print(f"NDC: {point_ndc}")

Expected output:

Local point: [0.5 0.5 0.5]
Clip space: [ ... values ... ]
NDC: [ ... normalized values between -1 and 1 ... ]
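
Whether the NDC values actually land between -1 and 1 depends on whether the point is inside the view volume. A minimal sketch of that check, assuming the OpenGL convention in which a visible clip-space point satisfies -w <= x, y, z <= w. The helper names `inside_clip_volume` and `to_ndc` are hypothetical, and the sample coordinates are made up for illustration rather than taken from the MVP example.

```python
def inside_clip_volume(clip):
    """Return True if a clip-space point (x, y, z, w) is inside the view volume.

    Assumes the OpenGL convention: -w <= x, y, z <= w before the divide.
    """
    x, y, z, w = clip
    return w > 0 and all(-w <= c <= w for c in (x, y, z))

def to_ndc(clip):
    """Perspective divide: clip space to normalized device coordinates."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)

visible = (0.5, -0.25, 1.0, 2.0)     # illustrative values
off_screen = (3.0, 0.0, 1.0, 2.0)    # x exceeds w, so it lies right of the frustum

print(inside_clip_volume(visible), to_ndc(visible))        # True (0.25, -0.125, 0.5)
print(inside_clip_volume(off_screen), to_ndc(off_screen))  # False (1.5, 0.0, 0.5)
```

A point that fails the test still divides to a finite NDC value, but at least one component falls outside [-1, 1], which is why clipping is done on the undivided coordinates.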

Quaternions for Rotation

Quaternions avoid gimbal lock and provide smooth interpolation (SLERP) for animation:

#include <cmath>
#include <iostream>

struct Quaternion {
    float w, x, y, z;

    Quaternion(float w, float x, float y, float z) : w(w), x(x), y(y), z(z) {}

    // The axis (ax, ay, az) is assumed to be unit length.
    static Quaternion fromAxisAngle(float ax, float ay, float az, float angleDeg) {
        float rad = angleDeg * M_PI / 180.0f;
        float s = sin(rad / 2.0f);
        return Quaternion(cos(rad / 2.0f), ax * s, ay * s, az * s);
    }

    Quaternion operator*(const Quaternion& q) const {
        return Quaternion(
            w * q.w - x * q.x - y * q.y - z * q.z,
            w * q.x + x * q.w + y * q.z - z * q.y,
            w * q.y - x * q.z + y * q.w + z * q.x,
            w * q.z + x * q.y - y * q.x + z * q.w
        );
    }

    float norm() const { return sqrt(w*w + x*x + y*y + z*z); }
    Quaternion normalized() const {
        float n = norm();
        return Quaternion(w/n, x/n, y/n, z/n);
    }

    static Quaternion slerp(const Quaternion& a, const Quaternion& b, float t) {
        float dot = a.w*b.w + a.x*b.x + a.y*b.y + a.z*b.z;
        // q and -q encode the same rotation; flip b's sign to take the shorter arc.
        float sign = 1.0f;
        if (dot < 0) { dot = -dot; sign = -1.0f; }
        float wa, wb;
        if (dot > 0.9995f) {
            // Nearly parallel: sin(theta) is close to zero, so use linear weights.
            wa = 1.0f - t;
            wb = t;
        } else {
            float theta = acos(dot);
            float sinTheta = sin(theta);
            wa = sin((1-t) * theta) / sinTheta;
            wb = sin(t * theta) / sinTheta;
        }
        wb *= sign;
        return Quaternion(wa*a.w + wb*b.w, wa*a.x + wb*b.x, wa*a.y + wb*b.y, wa*a.z + wb*b.z).normalized();
    }
};

int main() {
    Quaternion q1 = Quaternion::fromAxisAngle(0, 1, 0, 0);
    Quaternion q2 = Quaternion::fromAxisAngle(0, 1, 0, 90);
    Quaternion interpolated = Quaternion::slerp(q1, q2, 0.5f);
    std::cout << "SLERP(0deg, 90deg, t=0.5): w=" << interpolated.w
              << " (angle = " << acos(interpolated.w) * 2 * 180 / M_PI << " deg)" << std::endl;
    return 0;
}

flowchart TD
  A[Local Space] -->|Model Matrix| B[World Space]
  B -->|View Matrix| C[View/Camera Space]
  C -->|Projection Matrix| D[Clip Space]
  D -->|Perspective Divide| E[NDC]
  E -->|Viewport Transform| F[Screen Space]
  
  style A fill:#4ae,color:#fff
  style F fill:#f90,color:#fff

Common Errors & Mistakes

1. Matrix Multiplication Order

Mistake: Applying transformations in the wrong order (e.g., translate then rotate instead of rotate then translate), causing the object to orbit instead of rotate in place.

Fix: Use transform = T * R * S (scale first, then rotate, then translate). With the column-vector convention (transformed = M * v), the product reads right-to-left: the rightmost matrix is applied to the vertex first, in local space.
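
The difference between the two orders is easy to check numerically. The sketch below, with illustrative helper names and the column-vector convention assumed, applies T * R and R * T to an object's pivot at the local origin, using a 90 degree rotation about Z and a translation of 2 along x.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices given as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def mat_vec(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[r][k] * v[k] for k in range(4)) for r in range(4)]

def translate(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

def rotate_z(angle_deg):
    rad = math.radians(angle_deg)
    c, s = math.cos(rad), math.sin(rad)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

T = translate(2, 0, 0)
R = rotate_z(90)
pivot = [0.0, 0.0, 0.0, 1.0]  # the object's own origin in local space

in_place = mat_vec(mat_mul(T, R), pivot)  # rotate first, then translate
orbiting = mat_vec(mat_mul(R, T), pivot)  # translate first, then rotate

print(in_place)  # pivot ends up at x = 2: the object spins where it was placed
print(orbiting)  # pivot ends up near (0, 2, 0): the object swung around the world origin
```

With T * R the pivot stays at the translated position regardless of the angle; with R * T the pivot itself is carried around the world origin.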

2. Perspective Divide in Wrong Space

Mistake: Performing the perspective divide (division by w) manually in the vertex shader instead of letting the GPU handle it. Clipping and perspective-correct interpolation both need the undivided w.

Fix: The GPU automatically performs the perspective divide after the vertex shader. Output homogeneous clip coordinates from the vertex shader and let the fixed-function stages that follow handle clipping and the divide.

3. Degenerate Matrices

Mistake: Creating a projection matrix with the near plane at zero or with near equal to far, or a view matrix whose look direction has zero length or is parallel to the up vector, producing NaN or degenerate results.

Fix: Validate near/far values (near > 0, far > near). Ensure the look-at direction has non-zero length and is not parallel to up.
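
These checks can be run before building the matrices. A minimal sketch follows; the function names `perspective_problems` and `look_at_problems` are hypothetical. The parallel-to-up test goes one step beyond the fix above: it is included because a look-at routine that normalizes cross(forward, up) divides by zero in that case.

```python
import math

def perspective_problems(near, far):
    """Return a list of reasons the near/far pair gives a degenerate projection."""
    problems = []
    if not near > 0:
        problems.append("near must be greater than 0")
    if not far > near:
        problems.append("far must be greater than near")
    return problems

def look_at_problems(eye, target, up, eps=1e-8):
    """Return a list of reasons the look-at inputs would produce NaNs."""
    fwd = [t - e for t, e in zip(target, eye)]
    if math.sqrt(sum(c * c for c in fwd)) < eps:
        return ["eye and target coincide, so the look direction has zero length"]
    # The right vector is cross(fwd, up); it vanishes when fwd is parallel to up.
    right = [fwd[1] * up[2] - fwd[2] * up[1],
             fwd[2] * up[0] - fwd[0] * up[2],
             fwd[0] * up[1] - fwd[1] * up[0]]
    if math.sqrt(sum(c * c for c in right)) < eps:
        return ["look direction is parallel to up, so the right vector is undefined"]
    return []

print(perspective_problems(0.1, 100.0))                   # valid pair: []
print(perspective_problems(0.0, 100.0))                   # near plane at zero
print(look_at_problems([3, 2, 5], [0, 0, 0], [0, 1, 0]))  # camera from the MVP example: []
print(look_at_problems([0, 5, 0], [0, 0, 0], [0, 1, 0]))  # looking straight down
```

Returning a list of messages rather than raising keeps the sketch easy to call from a tool or an editor; raising an exception would be an equally reasonable design.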

4. Gimbal Lock

Mistake: Using Euler angles for hierarchical rotation. When two of the three rotation axes line up, one degree of freedom is lost.

Fix: Use quaternions for interpolation and hierarchical rotations. Convert to a matrix only at the end for the shader.
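
The final conversion step can be sketched as follows, using the standard formula for a unit quaternion (w, x, y, z) and the column-vector convention. The function names `quat_from_axis_angle` and `quat_to_matrix` are illustrative, and the axis is assumed to be unit length.

```python
import math

def quat_from_axis_angle(axis, angle_deg):
    """Build a unit quaternion (w, x, y, z) from a unit axis and an angle in degrees."""
    half = math.radians(angle_deg) / 2.0
    s = math.sin(half)
    return (math.cos(half), axis[0] * s, axis[1] * s, axis[2] * s)

def quat_to_matrix(q):
    """Convert a unit quaternion to a 3x3 rotation matrix (list of rows)."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ]

q = quat_from_axis_angle((0.0, 1.0, 0.0), 90.0)
rotation = quat_to_matrix(q)
for row in rotation:
    print([round(v, 6) for v in row])
```

For a 90 degree rotation about Y the result matches, up to rounding, the matrix a direct Y rotation would give: rows (0, 0, 1), (0, 1, 0), and (-1, 0, 0).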

Practice Questions

Question 1

Why are 4x4 matrices used instead of 3x3 for 3D transformations?

Answer: A 4x4 matrix in homogeneous coordinates lets translation, which is affine but not linear in 3D, be expressed as a matrix multiplication. With 3x3 matrices, translation would need a separate vector addition, which breaks the composability of matrix chains.

Question 2

What is the difference between model, view, and projection matrices?

Answer: The model matrix transforms from local object space to world space. The view matrix transforms from world space to camera space. The projection matrix transforms from camera space to clip space, applying perspective or orthographic projection.

Question 3

How does a quaternion avoid gimbal lock?

Answer: Gimbal lock occurs when using Euler angles (rotations around X, Y, Z axes sequentially) and two axes align, losing a degree of freedom. Quaternions represent rotation as a single rotation around an arbitrary axis, avoiding sequential axis dependencies.
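
The lost degree of freedom can be observed numerically. The sketch below composes Euler rotations as Rz(yaw) * Ry(pitch) * Rx(roll); that order and the yaw/pitch/roll names are assumptions made for illustration. At a pitch of 90 degrees, two different yaw/roll pairs with the same difference produce the same matrix.

```python
import math

def rot_x(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_z(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def mat_mul3(a, b):
    """Multiply two 3x3 matrices given as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def euler_zyx(yaw, pitch, roll):
    """Compose Rz(yaw) * Ry(pitch) * Rx(roll) (assumed order for this sketch)."""
    return mat_mul3(rot_z(yaw), mat_mul3(rot_y(pitch), rot_x(roll)))

def max_diff(a, b):
    return max(abs(a[r][c] - b[r][c]) for r in range(3) for c in range(3))

# At pitch = 90 only (roll - yaw) matters: both pairs below differ by 20 degrees.
locked = max_diff(euler_zyx(30, 90, 50), euler_zyx(10, 90, 30))
# At pitch = 0 the same two pairs give clearly different rotations.
free = max_diff(euler_zyx(30, 0, 50), euler_zyx(10, 0, 30))

print(f"pitch 90: max difference {locked:.2e}")
print(f"pitch 0:  max difference {free:.2e}")
```

The first difference is zero up to floating-point rounding, which means yaw and roll have collapsed into a single parameter; the second is not.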

Challenge

Implement a hierarchical transformation system for a robot arm with shoulder, elbow, and wrist joints. Use matrix stacks for parent-child transformations and animate each joint independently with interpolated rotations.

FAQ

What is the difference between row-major and column-major matrices?

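Row-major stores elements row by row; column-major stores them column by column. By convention, OpenGL code uses column-major storage with column vectors, while DirectXMath uses row-major storage with row vectors. Storage order and vector convention are separate choices: storage order determines how you index the 16 floats in memory, and the vector convention determines the order in which you multiply.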
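
A short sketch of the two layouts, flattening the same translation matrix both ways. The matrix is written with the column-vector convention (translation in the last column), and the accessor names are illustrative.

```python
tx, ty, tz = 2.0, 3.0, 4.0

# The matrix as it appears on paper, one inner list per row.
matrix = [
    [1.0, 0.0, 0.0, tx],
    [0.0, 1.0, 0.0, ty],
    [0.0, 0.0, 1.0, tz],
    [0.0, 0.0, 0.0, 1.0],
]

row_major = [matrix[r][c] for r in range(4) for c in range(4)]
column_major = [matrix[r][c] for c in range(4) for r in range(4)]

def at_row_major(flat, row, col):
    return flat[row * 4 + col]

def at_column_major(flat, row, col):
    return flat[col * 4 + row]

# The same logical element (row 0, column 3) lives at different flat indices.
print(row_major.index(tx), column_major.index(tx))  # 3 12
print(at_row_major(row_major, 0, 3), at_column_major(column_major, 0, 3))  # 2.0 2.0
```

Writing tx to index 12 is therefore correct only for a column-major array; in a row-major array the same element sits at index 3.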

What is a normal matrix?

The normal matrix is the inverse transpose of the upper-left 3x3 of the model matrix. It transforms surface normals correctly when non-uniform scaling is applied. Use mat3(transpose(inverse(model))) in GLSL.
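
To see why the inverse transpose is needed, here is a minimal sketch with a non-uniform scale of (2, 1, 1). The helper `inverse_transpose3` is an illustrative name; it relies on the fact that the cofactor matrix divided by the determinant equals the inverse transpose. The tangent and normal are example values chosen to be perpendicular.

```python
def inverse_transpose3(m):
    """Inverse transpose of a 3x3 matrix (list of rows): cofactor matrix / determinant."""
    (a, b, c), (d, e, f), (g, h, i) = m
    cof = [
        [e * i - f * h, f * g - d * i, d * h - e * g],
        [c * h - b * i, a * i - c * g, b * g - a * h],
        [b * f - c * e, c * d - a * f, a * e - b * d],
    ]
    det = a * cof[0][0] + b * cof[0][1] + c * cof[0][2]
    return [[x / det for x in row] for row in cof]

def mat_vec3(m, v):
    return [sum(m[r][k] * v[k] for k in range(3)) for r in range(3)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

model3 = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # scale x by 2
tangent = [1.0, 1.0, 0.0]   # a direction lying in the surface
normal = [-1.0, 1.0, 0.0]   # perpendicular to the tangent

new_tangent = mat_vec3(model3, tangent)
wrong_normal = mat_vec3(model3, normal)
right_normal = mat_vec3(inverse_transpose3(model3), normal)

print(dot(new_tangent, wrong_normal))  # -3.0: no longer perpendicular
print(dot(new_tangent, right_normal))  # 0.0: still perpendicular
```

The corrected normal is not unit length after the transform, so it still needs to be normalized before use in lighting.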

What is the viewport transform?

The viewport transform maps NDC coordinates (-1 to 1) to pixel coordinates on screen. It scales and translates based on the viewport dimensions set with glViewport(x, y, width, height).
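
The mapping can be written out directly. This sketch assumes OpenGL conventions: the window origin is at the bottom-left, and the default depth range of [0, 1] is in effect. The function name `viewport_transform` is hypothetical; its parameters mirror those of glViewport.

```python
def viewport_transform(ndc, x, y, width, height):
    """Map an NDC point to window coordinates for a viewport (x, y, width, height).

    Assumes the OpenGL convention (origin at the bottom-left) and the
    default depth range [0, 1].
    """
    px = x + (ndc[0] + 1.0) * 0.5 * width
    py = y + (ndc[1] + 1.0) * 0.5 * height
    depth = (ndc[2] + 1.0) * 0.5
    return (px, py, depth)

print(viewport_transform((0.0, 0.0, 0.0), 0, 0, 1920, 1080))     # (960.0, 540.0, 0.5)
print(viewport_transform((-1.0, -1.0, -1.0), 0, 0, 1920, 1080))  # (0.0, 0.0, 0.0)
print(viewport_transform((1.0, 1.0, 1.0), 0, 0, 1920, 1080))     # (1920.0, 1080.0, 1.0)
```

The center of NDC lands in the middle of the viewport, and the corners of the NDC cube map to the viewport edges and the ends of the depth range.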

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.

Author: DodaTech | Last updated: June 23, 2026
