Particle Systems
Compute-driven particles
Why Particles
Fire, smoke, sparks, rain, explosions, magic effects—so much of what makes visuals feel alive comes from particle systems. Each particle is simple: a point with position, velocity, and a limited lifespan. But thousands of them, updated and rendered in parallel, create complex emergent behavior.
Particle systems are a natural fit for the GPU. Every particle is independent. Its update logic depends only on its own state and some global forces. This is an embarrassingly parallel workload—exactly what compute shaders excel at.
Interactive: Particle System
Each particle stores position, velocity, lifetime, and appearance. Every frame, we update physics and cull dead particles.
Each slider controls a different aspect of the simulation. Notice how changing parameters affects the visual character: high gravity creates waterfalls, negative gravity makes rising smoke, wide spread angles create explosions.
Particles as Data
A particle is just a struct—a bundle of numbers that define its current state:
struct Particle {
    position: vec2<f32>,
    velocity: vec2<f32>,
    lifetime: f32,
    max_lifetime: f32,
    size: f32,
    color: vec4<f32>,
}

On the GPU, particles live in a storage buffer. Each particle occupies a fixed number of bytes, laid out contiguously. Thread 0 processes particle 0, thread 1 processes particle 1, and so on.
The choice of what data to store depends on the effect you want. A simple fire might need only position, velocity, and lifetime. A complex magic effect might track rotation, scale animation curves, texture coordinates, and custom parameters.
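To make the struct usable from a shader, it needs a binding. A minimal sketch of the bindings an update pass might use—the group/binding indices and the `SimParams` name are illustrative, not from the original:

```wgsl
// Particle storage plus per-frame simulation parameters (bindings illustrative)
@group(0) @binding(0) var<storage, read_write> particles: array<Particle>;

struct SimParams {
    gravity: vec2<f32>,
    dt: f32,
    particle_count: u32,
}
@group(0) @binding(1) var<uniform> params: SimParams;
```

Note that layout rules matter here: `vec4<f32>` requires 16-byte alignment, so the `Particle` struct above occupies 48 bytes per particle (with 4 bytes of padding before `color`), not the 40 bytes its fields alone would suggest.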
Interactive: Particle Lifecycle
Each particle carries its own state through its lifecycle: spawn → active → fading → removed.
The lifecycle is straightforward: spawn with initial values, update each frame until lifetime reaches zero, then remove or recycle. The alpha value often derives from remaining lifetime—particles fade as they die.
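Recycling is often done directly in the update shader: when lifetime expires, the particle is re-initialized at the emitter instead of being removed. A sketch, where `emitter_pos` is an assumed uniform and `spawn_velocity` is a hypothetical helper that hashes the index into a randomized direction:

```wgsl
// Recycle an expired particle in place instead of removing it.
fn respawn_if_dead(p: ptr<function, Particle>, index: u32) {
    if ((*p).lifetime <= 0.0) {
        (*p).position = emitter_pos;             // assumed uniform
        (*p).velocity = spawn_velocity(index);   // hypothetical helper
        (*p).lifetime = (*p).max_lifetime;
    }
}
```

Recycling keeps the buffer fully occupied and avoids any compaction or CPU round-trip, at the cost of a fixed particle budget.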
The Compute Update
Each frame, a compute shader updates every particle. The pattern is simple but powerful:
@compute @workgroup_size(256)
fn update(@builtin(global_invocation_id) id: vec3<u32>) {
    let index = id.x;
    if (index >= particle_count) { return; }

    var p = particles[index];
    // Skip dead particles
    if (p.lifetime <= 0.0) { return; }

    // Physics integration
    p.velocity += gravity * dt;
    p.position += p.velocity * dt;

    // Age the particle
    p.lifetime -= dt;

    // Write back
    particles[index] = p;
}

Interactive: Compute Update Steps
In a compute shader, thousands of threads execute this logic in parallel—one thread per particle.
The key insight is that each particle update is independent. No particle reads another particle's data. This means thousands of updates happen simultaneously with no synchronization needed.
For more complex physics—collisions with geometry, inter-particle forces, fluid-like behavior—the compute shader grows but the pattern remains: read particle state, compute new state, write back.
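As one example of that growth, the physics section of the update might gain drag and a floor collision while the surrounding read/write structure stays untouched. A sketch, where `drag`, `floor_y`, and `bounce` are assumed uniforms:

```wgsl
// Extended physics inside the same update: drag plus a horizontal floor
p.velocity += gravity * dt;
p.velocity *= 1.0 - drag * dt;        // simple linear drag (assumed uniform)
p.position += p.velocity * dt;
if (p.position.y < floor_y) {         // bounce off an assumed floor plane
    p.position.y = floor_y;
    p.velocity.y = -p.velocity.y * bounce;
}
```

Everything else in the shader—the bounds check, the dead-particle skip, the write-back—remains exactly as before.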
Spawning Particles
Emitters control where and how particles are born. A point emitter spawns all particles at a single location. A line emitter spreads them along a segment. A mesh emitter spawns from surface points.
Interactive: Emitter Configuration
Click and drag to emit particles. The emitter shape determines spawn positions, velocity mode determines initial directions.
Spawn logic runs either on the CPU (uploading new particles each frame) or on the GPU (using atomic counters to claim slots in the particle buffer). GPU-side spawning is faster for high spawn rates, but CPU-side is simpler and sufficient for many effects.
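A sketch of the GPU-side approach: a spawning thread claims a slot with an atomic counter, then writes the new particle into it. The binding index and `max_particles` uniform are illustrative:

```wgsl
// GPU-side spawning: each spawner thread claims a unique slot
@group(0) @binding(2) var<storage, read_write> spawn_counter: atomic<u32>;

fn try_spawn(new_particle: Particle) {
    // atomicAdd returns the old value, so each caller gets a distinct slot
    let slot = atomicAdd(&spawn_counter, 1u);
    if (slot < max_particles) {       // assumed capacity uniform
        particles[slot] = new_particle;
    }
}
```

The bounds check matters: the counter keeps incrementing even when the buffer is full, so overflowing claims must be discarded (and the counter reset by the host each frame).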
Initial velocity is critical to the particle's behavior. Directional velocity creates streams. Randomized velocity creates bursts. Outward velocity from the spawn point creates explosions. The emitter defines not just where particles appear but how they begin their journey.
Rendering Particles
Once updated, particles need to reach the screen. The simplest approach is point sprites—each particle is a single vertex that the rasterizer expands to a screen-aligned square. Fast, but limited: you cannot rotate the sprite or control its shape.
Billboard quads give more control. Each particle becomes four vertices forming a camera-facing rectangle. You can rotate the quad, apply texture coordinates, and vary the shape. The vertex shader positions the corners; the fragment shader applies the texture.
Interactive: Render Mode Comparison
Different rendering approaches trade off quality against performance. Point sprites are simplest, textured billboards look best.
A soft circular texture with transparency creates the characteristic particle glow. Additive blending layers particles without hard edges—bright areas become brighter as particles overlap.
The rendering challenge is overdraw. A dense particle cloud might have dozens of particles per pixel, each requiring a texture sample and blend operation. Sorting particles back-to-front ensures correct transparency but costs performance. Often, additive blending lets you skip sorting entirely—the visual result is order-independent.
The Full Pipeline
A complete particle system runs two passes per frame:
The update pass is a compute dispatch. It reads the particle buffer, updates physics and lifetimes, and writes results. Dead particles are marked but not removed—removal would require compaction, which is expensive.
The render pass draws all particles. For point sprites, one draw call suffices. For billboards, you might use instancing: the instance ID indexes into the particle buffer to fetch position and appearance.
// Vertex shader for billboards
@vertex
fn vs(@builtin(instance_index) instance: u32,
      @builtin(vertex_index) vertex: u32) -> VertexOutput {
    let p = particles[instance];
    // Dead particles: emit a position outside the clip volume so they are culled
    if (p.lifetime <= 0.0) {
        return VertexOutput(vec4(-10.0, -10.0, -10.0, 1.0), vec2(0.0), 0.0);
    }
    // Billboard corner offsets (four vertices per instance, drawn as a strip)
    let corners = array<vec2<f32>, 4>(
        vec2(-1.0, -1.0), vec2(1.0, -1.0),
        vec2(-1.0, 1.0), vec2(1.0, 1.0)
    );
    let offset = corners[vertex] * p.size;
    // In 2D the offset is already screen-aligned; in 3D you would offset
    // along the camera's right and up vectors instead
    let world_pos = vec3(p.position + offset, 0.0);
    let clip_pos = view_proj * vec4(world_pos, 1.0);
    let alpha = p.lifetime / p.max_lifetime;
    return VertexOutput(clip_pos, (corners[vertex] + 1.0) * 0.5, alpha);
}

The fragment shader samples the particle texture and applies the computed alpha. Additive blending combines overlapping particles into the final image.
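A matching fragment shader sketch—the texture and sampler bindings are illustrative, and `VertexOutput` is assumed to carry `uv` and `alpha` fields:

```wgsl
@group(1) @binding(0) var particle_tex: texture_2d<f32>;
@group(1) @binding(1) var particle_samp: sampler;

@fragment
fn fs(in: VertexOutput) -> @location(0) vec4<f32> {
    let tex = textureSample(particle_tex, particle_samp, in.uv);
    // Scale by per-particle alpha; with additive blending (src + dst),
    // dimming the color is what makes a particle fade out
    return tex * in.alpha;
}
```

With additive blending configured in the render pipeline, the output color is simply added to the framebuffer, so there is no alpha-compositing order to get wrong.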
Double buffering is essential when the compute pass both reads and writes particles. Use two buffers and swap them each frame: read from buffer A, write to buffer B, then flip. This avoids race conditions where a thread reads a particle that another thread just modified.
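In shader terms, double buffering means binding two particle buffers with different access modes; the host swaps which resource sits at each binding every frame. A sketch of how the update entry point changes:

```wgsl
// Read last frame's state, write this frame's (host swaps bindings each frame)
@group(0) @binding(0) var<storage, read> particles_in: array<Particle>;
@group(0) @binding(1) var<storage, read_write> particles_out: array<Particle>;

@compute @workgroup_size(256)
fn update(@builtin(global_invocation_id) id: vec3<u32>) {
    let index = id.x;
    if (index >= particle_count) { return; }
    var p = particles_in[index];    // read from last frame's buffer
    // ... physics and aging as before ...
    particles_out[index] = p;       // write to this frame's buffer
}
```

The render pass then reads from whichever buffer was most recently written.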
Key Takeaways
- Particles are independent data points—position, velocity, lifetime, appearance
- Compute shaders update thousands of particles in parallel each frame
- Emitters control spawn location, rate, and initial velocity
- Point sprites are fast; billboard quads offer more control; textured billboards look best
- The pipeline: compute update → render with instancing → additive blending
- Double buffering prevents read/write conflicts in the compute pass