Camera Systems
Perspective and camera controls
The Camera Abstraction
A camera in 3D graphics is not a physical object—it is a mathematical construct that defines how the world maps onto a 2D image. The camera produces two matrices: the view matrix, which positions the world relative to the camera, and the projection matrix, which maps 3D space onto the 2D viewport.
Together, these matrices answer two questions: where is the camera looking, and how does it see? The view matrix handles the first, the projection matrix handles the second.
Building the View Matrix with lookAt
The most common way to construct a view matrix is the lookAt function. It takes three inputs:
Eye: Where the camera is positioned in world space.
Target: What point the camera is looking at.
Up: Which direction is "up" for the camera (usually the world's Y axis).
From these three vectors, lookAt constructs an orthonormal basis—three perpendicular unit vectors that define the camera's local coordinate system:
fn lookAt(eye: vec3f, target: vec3f, up: vec3f) -> mat4x4f {
    // Forward: direction from eye to target
    let forward = normalize(target - eye);
    // Right: perpendicular to forward and up
    let right = normalize(cross(forward, up));
    // Camera's actual up: perpendicular to forward and right
    let cameraUp = cross(right, forward);
    // Build the rotation part (transpose of the basis vectors)
    // plus a translation that moves the world so the camera sits at the origin
    return mat4x4f(
        vec4f(right.x, cameraUp.x, -forward.x, 0.0),
        vec4f(right.y, cameraUp.y, -forward.y, 0.0),
        vec4f(right.z, cameraUp.z, -forward.z, 0.0),
        vec4f(-dot(right, eye), -dot(cameraUp, eye), dot(forward, eye), 1.0)
    );
}
The view matrix effectively transforms the world so that the camera sits at the origin, looking down the negative Z axis. This standard orientation simplifies the projection step.
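That property is easy to verify numerically. The sketch below (plain JavaScript, with small hypothetical vector helpers) mirrors the WGSL function above and confirms that the view matrix maps the eye to the origin and the target onto the negative Z axis:

```javascript
// JavaScript port of the lookAt function above, using column-major
// flat arrays to match the WGSL mat4x4f constructor. Helper names
// (sub, norm, cross, dot, transform) are illustrative.
function lookAt(eye, target, up) {
  const sub = (a, b) => a.map((v, i) => v - b[i]);
  const norm = (a) => { const l = Math.hypot(...a); return a.map((v) => v / l); };
  const cross = (a, b) => [
    a[1] * b[2] - a[2] * b[1],
    a[2] * b[0] - a[0] * b[2],
    a[0] * b[1] - a[1] * b[0],
  ];
  const dot = (a, b) => a.reduce((s, v, i) => s + v * b[i], 0);

  const forward = norm(sub(target, eye));
  const right = norm(cross(forward, up));
  const cameraUp = cross(right, forward);

  return [
    right[0], cameraUp[0], -forward[0], 0,
    right[1], cameraUp[1], -forward[1], 0,
    right[2], cameraUp[2], -forward[2], 0,
    -dot(right, eye), -dot(cameraUp, eye), dot(forward, eye), 1,
  ];
}

// Multiply a point by a column-major 4x4 matrix.
function transform(m, p) {
  return [0, 1, 2, 3].map(
    (r) => m[r] * p[0] + m[4 + r] * p[1] + m[8 + r] * p[2] + m[12 + r] * p[3]
  );
}

const view = lookAt([3, 4, 5], [0, 0, 0], [0, 1, 0]);
const eyeInCamera = transform(view, [3, 4, 5, 1]);    // the eye itself
const targetInCamera = transform(view, [0, 0, 0, 1]); // the target
// eyeInCamera lands at the origin; targetInCamera sits straight
// ahead on the negative Z axis, at distance |eye - target|
```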
Interactive: LookAt Construction
Move the eye, target, and up vectors to see how the view matrix changes. The camera frustum shows what the camera sees. Notice how the right vector is always perpendicular to both forward and up, ensuring the camera does not roll unintentionally.
Perspective Projection
Perspective projection mimics how we see the real world. Distant objects appear smaller. Parallel lines converge toward vanishing points on the horizon. This foreshortening effect creates depth perception on a flat screen.
Four parameters define a perspective projection:
Field of View (FOV): The vertical angle the camera sees, typically 45° to 90°. Wider FOV shows more of the scene but introduces distortion at the edges. Narrower FOV compresses depth, making distant objects appear closer.
Aspect Ratio: Width divided by height of the viewport. Must match your canvas dimensions to avoid stretching.
Near Plane: The closest distance the camera can see. Objects closer than this are clipped. Must be greater than zero.
Far Plane: The farthest distance the camera can see. Objects beyond this are clipped.
fn perspective(fov: f32, aspect: f32, near: f32, far: f32) -> mat4x4f {
    let f = 1.0 / tan(fov * 0.5);
    let rangeInv = 1.0 / (near - far);
    return mat4x4f(
        vec4f(f / aspect, 0.0, 0.0, 0.0),
        vec4f(0.0, f, 0.0, 0.0),
        vec4f(0.0, 0.0, far * rangeInv, -1.0),
        vec4f(0.0, 0.0, near * far * rangeInv, 0.0)
    );
}
The -1.0 in the third column is what makes this a perspective projection. It copies the negative Z coordinate into the W component of the output. When the GPU performs perspective division (dividing by W), points farther from the camera are divided by larger values, making them appear smaller.
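The effect of that divide can be checked on the CPU. This sketch (a JavaScript port of the function above, with an illustrative `project` helper standing in for what the GPU does after the vertex shader) shows the same 1-unit lateral offset shrinking with distance:

```javascript
// Same perspective matrix as the WGSL version, column-major flat array.
function perspective(fov, aspect, near, far) {
  const f = 1 / Math.tan(fov * 0.5);
  const rangeInv = 1 / (near - far);
  return [
    f / aspect, 0, 0, 0,
    0, f, 0, 0,
    0, 0, far * rangeInv, -1,
    0, 0, near * far * rangeInv, 0,
  ];
}

// Multiply by the matrix, then perform the perspective division
// (the GPU does this divide automatically after the vertex shader).
function project(m, p) {
  const out = [0, 1, 2, 3].map(
    (r) => m[r] * p[0] + m[4 + r] * p[1] + m[8 + r] * p[2] + m[12 + r] * p[3]
  );
  return out.map((v) => v / out[3]);
}

const proj = perspective(Math.PI / 2, 1, 0.1, 100); // 90° FOV, square viewport
const nearPoint = project(proj, [1, 0, -2, 1]);  // 1 unit right, 2 units away
const farPoint = project(proj, [1, 0, -20, 1]);  // same offset, 10x farther
// nearPoint[0] = 0.5 but farPoint[0] = 0.05: the identical world-space
// offset shrinks by the distance ratio after dividing by w = -z
```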
Interactive: Field of View
Adjust the field of view and notice how the scene changes. A narrow FOV (20-30°) creates a telephoto effect, compressing depth and magnifying distant objects. A wide FOV (90-120°) shows more of the scene but can feel distorted, like looking through a fish-eye lens.
Orthographic Projection
Orthographic projection ignores distance. Objects stay the same size whether they are near or far. Parallel lines remain parallel. There is no foreshortening.
This is useful for 2D games and UI rendering, CAD applications where true proportions matter, isometric views, and shadow mapping from a directional light's perspective.
fn orthographic(left: f32, right: f32, bottom: f32, top: f32, near: f32, far: f32) -> mat4x4f {
    let width = right - left;
    let height = top - bottom;
    let depth = far - near;
    return mat4x4f(
        vec4f(2.0 / width, 0.0, 0.0, 0.0),
        vec4f(0.0, 2.0 / height, 0.0, 0.0),
        vec4f(0.0, 0.0, -1.0 / depth, 0.0),
        vec4f(-(right + left) / width, -(top + bottom) / height, -near / depth, 1.0)
    );
}
The W component stays 1, so perspective division has no effect. The matrix simply scales and translates coordinates into the normalized range.
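The contrast with perspective is easy to demonstrate. In this JavaScript sketch of the function above, two points with the same X and Y but very different depths land at the same screen position, and W stays exactly 1:

```javascript
// JavaScript port of the orthographic matrix above, column-major flat array.
function orthographic(left, right, bottom, top, near, far) {
  const width = right - left;
  const height = top - bottom;
  const depth = far - near;
  return [
    2 / width, 0, 0, 0,
    0, 2 / height, 0, 0,
    0, 0, -1 / depth, 0,
    -(right + left) / width, -(top + bottom) / height, -near / depth, 1,
  ];
}

// Multiply a point by a column-major 4x4 matrix.
function transform(m, p) {
  return [0, 1, 2, 3].map(
    (r) => m[r] * p[0] + m[4 + r] * p[1] + m[8 + r] * p[2] + m[12 + r] * p[3]
  );
}

const ortho = orthographic(-10, 10, -10, 10, 0.1, 100);
const nearPoint = transform(ortho, [5, 5, -1, 1]);
const farPoint = transform(ortho, [5, 5, -50, 1]);
// Both land at x = 0.5, y = 0.5 with w = 1: no foreshortening,
// only the depth (z) component differs
```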
Camera Control Patterns
Interactive applications need ways for users to control the camera. Different patterns suit different use cases.
Orbit Camera: Rotates around a fixed target point. The user drags to orbit, scrolls to zoom. The eye position moves on a sphere centered at the target. Orbit cameras work well for inspecting 3D models—the object stays centered while the viewpoint changes.
fn orbitCamera(
    target: vec3f,
    distance: f32,
    azimuth: f32,  // horizontal angle
    elevation: f32 // vertical angle
) -> vec3f {
    let x = distance * cos(elevation) * sin(azimuth);
    let y = distance * sin(elevation);
    let z = distance * cos(elevation) * cos(azimuth);
    return target + vec3f(x, y, z);
}
Interactive: Orbit Camera
Drag to orbit • Scroll to zoom
Drag to orbit around the scene. Scroll to zoom in and out. The target (shown as a small sphere) stays fixed while the camera position changes. This is the default camera for most 3D viewers and editors.
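Wiring this up on the CPU is straightforward. The sketch below (JavaScript; the sensitivity constants and clamp limits are illustrative choices, not fixed values) shows how drag and scroll input typically update the orbit state:

```javascript
// JavaScript version of the orbitCamera function above.
function orbitEye(target, distance, azimuth, elevation) {
  const x = distance * Math.cos(elevation) * Math.sin(azimuth);
  const y = distance * Math.sin(elevation);
  const z = distance * Math.cos(elevation) * Math.cos(azimuth);
  return [target[0] + x, target[1] + y, target[2] + z];
}

const state = { azimuth: 0, elevation: 0.4, distance: 10 };

// Mouse drag updates the angles; elevation is clamped so the
// camera never flips over the top of the orbit sphere.
function onDrag(dx, dy) {
  state.azimuth -= dx * 0.01;
  state.elevation = Math.min(Math.max(state.elevation + dy * 0.01, -1.5), 1.5);
}

// Scroll zooms by scaling the distance, with a floor so the
// camera cannot pass through the target.
function onScroll(delta) {
  state.distance = Math.max(state.distance * (1 + delta * 0.001), 0.5);
}

onDrag(50, 0);    // drag 50 px to the right
onScroll(-120);   // one scroll notch in: zoom closer
const eye = orbitEye([0, 0, 0], state.distance, state.azimuth, state.elevation);
// whatever the input, the eye stays on a sphere of radius
// state.distance centered on the target
```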
Fly Camera: Moves freely through space like a drone. WASD controls movement, mouse controls direction. The camera has a position and a look direction, both updated by input.
First-Person Camera: Similar to fly camera but constrained. The up vector is always world-up (no rolling). Pitch is often limited to avoid looking directly up or down. Games use this pattern for player perspectives.
Track Camera: Follows a target object, maintaining a fixed offset or smoothly interpolating toward it. Useful for third-person games.
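The "smoothly interpolating" part of a track camera is usually an exponential ease toward the desired offset position. A minimal sketch (JavaScript; the smoothing constant and offset are illustrative assumptions):

```javascript
// Linear interpolation between two 3D points.
function lerp(a, b, t) {
  return a.map((v, i) => v + (b[i] - v) * t);
}

// Move the camera a fraction of the way toward target + offset each frame.
// Using 1 - exp(-smoothing * dt) makes the easing frame-rate independent.
function updateTrackCamera(cameraPos, targetPos, offset, smoothing, dt) {
  const desired = targetPos.map((v, i) => v + offset[i]);
  const t = 1 - Math.exp(-smoothing * dt);
  return lerp(cameraPos, desired, t);
}

let camera = [0, 0, 0];
const target = [10, 0, 0];
const offset = [0, 3, 8]; // above and behind the target

// Simulate one second at 60 fps: the camera converges on [10, 3, 8].
for (let i = 0; i < 60; i++) {
  camera = updateTrackCamera(camera, target, offset, 5.0, 1 / 60);
}
```

The exponential form means a stationary target is approached quickly at first and gently at the end, which reads as natural camera motion in third-person games.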
Aspect Ratio Handling
The aspect ratio must match your canvas dimensions, or the scene will appear stretched. If your canvas is 800×600 pixels, the aspect ratio is 800/600 ≈ 1.33.
Interactive: Aspect Ratio
Canvas is 4:3, projection uses 4:3. Circle is circular, square is square.
Toggle the aspect ratio mismatch to see the effect. When the projection's aspect ratio does not match the viewport, circles become ellipses and squares become rectangles. The scene squashes or stretches to fill the available space.
Always recalculate the projection matrix when the canvas resizes:
function onResize() {
  const aspect = canvas.width / canvas.height;
  projectionMatrix = perspective(fov, aspect, near, far);
  // Upload new matrix to GPU
}
window.addEventListener("resize", onResize);
Depth Buffer Precision
The near and far planes affect depth buffer precision. The depth buffer stores values between 0 and 1, mapping the near plane to 0 and the far plane to 1. But this mapping is not linear for perspective projection—more precision is allocated near the camera.
If far/near is too large (say, 0.01 to 100000), distant objects will have nearly identical depth values, causing z-fighting: polygons flicker as they compete for the same depth buffer value.
Best practices:
- Keep far/near ratio below 10000 when possible
- Push the near plane as far as you can tolerate
- Consider reversed depth (mapping near to 1, far to 0) for better precision distribution
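The precision argument can be made concrete. The sketch below (JavaScript; the reversed variant is one common construction, obtained by swapping the roles of near and far in the Z row) computes the post-divide depth for a point at view-space distance d under both conventions:

```javascript
// Post-divide depth under the standard mapping (near -> 0, far -> 1),
// derived from the perspective matrix earlier in this section.
function standardDepth(d, near, far) {
  const rangeInv = 1 / (near - far);
  // z' = far*rangeInv*z + near*far*rangeInv, with z = -d and w = d
  return (far * rangeInv * -d + near * far * rangeInv) / d;
}

// Post-divide depth under reversed-Z (near -> 1, far -> 0).
function reversedDepth(d, near, far) {
  const rangeInv = 1 / (far - near);
  return (near * rangeInv * -d + near * far * rangeInv) / d;
}

const near = 0.1, far = 100000; // an aggressive far/near ratio of 10^6
const a = standardDepth(90000, near, far);
const b = standardDepth(90001, near, far);
// a and b differ only around the 11th decimal place: two surfaces a full
// unit apart can round to the same stored depth value and z-fight.
const ra = reversedDepth(90000, near, far);
const rb = reversedDepth(90001, near, far);
// The reversed values are tiny floats near 0, where a float32 depth
// buffer has many more representable values, so they stay distinguishable.
```

Note that a reversed-depth pipeline must also flip the depth comparison from "less" to "greater", since smaller stored values now mean farther away.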
Camera in WGSL
Here is a complete vertex shader using camera matrices:
struct CameraUniforms {
    view: mat4x4f,
    projection: mat4x4f,
    viewProjection: mat4x4f, // precomputed P * V
    eye: vec3f,              // useful for lighting
    _pad: f32,
}

@group(0) @binding(0) var<uniform> camera: CameraUniforms;

struct ModelUniforms {
    model: mat4x4f,
}

@group(1) @binding(0) var<uniform> model: ModelUniforms;

struct VertexOutput {
    @builtin(position) clipPosition: vec4f,
    @location(0) worldPosition: vec3f,
}

@vertex
fn main(@location(0) position: vec3f) -> VertexOutput {
    var output: VertexOutput;
    let worldPos = model.model * vec4f(position, 1.0);
    output.worldPosition = worldPos.xyz;
    output.clipPosition = camera.viewProjection * worldPos;
    return output;
}
The viewProjection matrix (projection × view) is precomputed on the CPU once per frame. Each object then only needs one matrix multiply with its model matrix, rather than two.
The camera's eye position is passed separately because it is needed for lighting calculations—specifically, for computing the view direction from each surface point back to the camera.
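The matching CPU-side work can be sketched as follows (JavaScript; function names are illustrative). The struct above occupies 208 bytes: three mat4x4f at 64 bytes each, then the vec3f eye plus one float of padding at offset 192:

```javascript
// Column-major 4x4 multiply: out = a * b, done once per frame on the CPU.
function mat4Multiply(a, b) {
  const out = new Float32Array(16);
  for (let c = 0; c < 4; c++) {
    for (let r = 0; r < 4; r++) {
      let s = 0;
      for (let k = 0; k < 4; k++) s += a[k * 4 + r] * b[c * 4 + k];
      out[c * 4 + r] = s;
    }
  }
  return out;
}

// Pack all camera uniforms into one typed array matching the WGSL
// CameraUniforms layout: 52 floats = 208 bytes.
function packCameraUniforms(view, projection, eye) {
  const viewProjection = mat4Multiply(projection, view); // P * V, once per frame
  const data = new Float32Array(52);
  data.set(view, 0);            // byte offset 0:   view
  data.set(projection, 16);     // byte offset 64:  projection
  data.set(viewProjection, 32); // byte offset 128: viewProjection
  data.set(eye, 48);            // byte offset 192: eye; data[51] is _pad
  return data;
}

const identity = new Float32Array(16);
identity[0] = identity[5] = identity[10] = identity[15] = 1;
const uniforms = packCameraUniforms(identity, identity, [0, 5, 10]);
// uniforms is ready for device.queue.writeBuffer(cameraBuffer, 0, uniforms)
```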
Key Takeaways
- The view matrix transforms world space to camera space using lookAt(eye, target, up)
- Perspective projection creates depth perception through foreshortening
- FOV controls the visible angle; aspect ratio must match the viewport
- Orthographic projection preserves parallel lines and ignores distance
- Orbit cameras rotate around a target; fly cameras move freely through space
- Aspect ratio mismatch causes stretching; recalculate projection on resize
- Keep far/near ratio reasonable to avoid depth buffer precision issues