Buffers and Data

Creating and uploading GPU data

Everything a GPU operates on lives in buffers. Vertex positions, transformation matrices, compute results—all of it resides in GPU-accessible memory. Understanding how to create, fill, and read buffers is fundamental to WebGPU programming.

GPU Memory Is Different

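GPU memory and CPU memory should be treated as separate. Your JavaScript arrays live in system RAM, accessible to the CPU. GPU buffers live in GPU-accessible memory, which on discrete GPUs is dedicated video RAM (VRAM). Integrated GPUs may share physical memory with the CPU, but WebGPU exposes the same model either way: data must be explicitly copied between the two.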

This separation exists for performance. GPU memory is optimized for the massively parallel access patterns that graphics and compute workloads demand. It offers much higher bandwidth when accessed in bulk. But the cost is that transferring data between CPU and GPU is slow compared to memory access within either processor.

The practical implication: minimize data transfers. Upload your mesh data once, not every frame. Keep uniform buffers small. Design your data layout to reduce round-trips between CPU and GPU.
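As a rough sketch of this advice, the snippet below separates one-time setup from per-frame work. The helper names `createSceneBuffers` and `updatePerFrame` and the single 4x4 matrix uniform are illustrative assumptions, not part of the WebGPU API.

```javascript
// Hypothetical helpers illustrating "upload once, update little".
// `device` is assumed to be a GPUDevice obtained elsewhere, and
// `vertices` is assumed to be a Float32Array.

// One-time setup: upload the (large) mesh data a single time.
function createSceneBuffers(device, vertices) {
  const vertexBuffer = device.createBuffer({
    size: vertices.byteLength,
    usage: GPUBufferUsage.VERTEX | GPUBufferUsage.COPY_DST,
  });
  device.queue.writeBuffer(vertexBuffer, 0, vertices);

  // A small uniform buffer holding one 4x4 matrix (16 floats = 64 bytes).
  const uniformBuffer = device.createBuffer({
    size: 64,
    usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST,
  });
  return { vertexBuffer, uniformBuffer };
}

// Per-frame work: only the 64-byte matrix crosses the CPU/GPU boundary.
function updatePerFrame(device, buffers, matrix) {
  device.queue.writeBuffer(buffers.uniformBuffer, 0, matrix);
}
```

Calling `createSceneBuffers` once at startup and `updatePerFrame` inside the render loop keeps the per-frame transfer at 64 bytes regardless of mesh size.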

Buffer Usage Flags

When you create a buffer, you must declare how it will be used. WebGPU enforces these declarations strictly—a buffer created without VERTEX usage cannot be bound as a vertex buffer, even if the data inside would work perfectly.

The usage flags are:

VERTEX marks a buffer as containing vertex data. Its contents are fetched as vertex attributes and passed to the vertex shader.

INDEX marks a buffer as containing index data for indexed drawing. Each element references a vertex by index rather than duplicating vertex data.

UNIFORM marks a buffer as bindable to a uniform binding point. Uniform buffers are read-only in shaders and have a limited binding size (the default maxUniformBufferBindingSize limit is 64 KiB), but access is highly optimized.

STORAGE marks a buffer as bindable to a storage binding point. Storage buffers can be read and written from shaders, have much larger size limits (the default maxStorageBufferBindingSize limit is 128 MiB), and are the workhorse of compute shaders.

COPY_SRC and COPY_DST control whether a buffer can be a source or destination in copy operations. Most buffers need COPY_DST to receive data from writeBuffer().

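MAP_READ and MAP_WRITE allow a buffer to be memory-mapped for CPU access. These are primarily for staging buffers: MAP_WRITE for uploads and MAP_READ for readback. WebGPU only allows MAP_READ to be combined with COPY_DST, and MAP_WRITE with COPY_SRC.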
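To make the bitwise nature of these flags concrete, here is a small sketch that decodes a combined usage value into flag names. The numeric values mirror the constants the WebGPU specification defines for GPUBufferUsage. The `USAGE_BITS` table and the `describeUsage` helper are illustrative names, defined locally only so the snippet runs outside a browser.

```javascript
// Numeric values of the flags discussed above, as defined for GPUBufferUsage
// in the WebGPU specification. In a browser you would use
// GPUBufferUsage.VERTEX etc. directly; this local table (an illustrative
// name) keeps the sketch runnable anywhere. The specification defines
// further flags that are not covered here.
const USAGE_BITS = {
  MAP_READ: 0x0001,
  MAP_WRITE: 0x0002,
  COPY_SRC: 0x0004,
  COPY_DST: 0x0008,
  INDEX: 0x0010,
  VERTEX: 0x0020,
  UNIFORM: 0x0040,
  STORAGE: 0x0080,
};

// Hypothetical helper: list the names of all flags set in `usage`.
function describeUsage(usage) {
  return Object.keys(USAGE_BITS).filter(
    (name) => (usage & USAGE_BITS[name]) !== 0
  );
}

const vertexUsage = USAGE_BITS.VERTEX | USAGE_BITS.COPY_DST;
console.log("0x" + vertexUsage.toString(16).padStart(4, "0")); // 0x0028
console.log(describeUsage(vertexUsage)); // [ 'COPY_DST', 'VERTEX' ]
```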

Creating Buffers

The device.createBuffer() method allocates a buffer with a specific size and usage:

const buffer = device.createBuffer({
  size: 1024,
  usage: GPUBufferUsage.VERTEX
       | GPUBufferUsage.COPY_DST
});

This configuration suits mesh vertices. Use writeBuffer() to upload data after creation.

Three parameters matter:

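size is the buffer size in bytes. It must be a multiple of 4 when mappedAtCreation is true, and writeBuffer() and buffer copies also require sizes that are multiples of 4, so rounding up to 4 bytes is a safe default. Round up further when alignment rules require it: uniform data is typically laid out in 16-byte steps, and offsets used when binding part of a uniform buffer must be a multiple of minUniformBufferOffsetAlignment (256 bytes by default).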

usage is a bitwise combination of usage flags. Combine them with |:

usage: GPUBufferUsage.VERTEX | GPUBufferUsage.COPY_DST

mappedAtCreation optionally creates the buffer in a mapped state. This allows immediate CPU access without an async mapping call, and it works with any usage flags. The buffer starts mapped; you write your data, then unmap it to make it GPU-accessible.

const buffer = device.createBuffer({
  size: 1024,
  usage: GPUBufferUsage.VERTEX | GPUBufferUsage.COPY_DST,
  mappedAtCreation: true
});
 
const array = new Float32Array(buffer.getMappedRange());
array.set([0, 0.5, 0, -0.5, -0.5, 0, 0.5, -0.5, 0]);
buffer.unmap();

Once unmapped, the buffer is ready for GPU use. You cannot re-map a buffer created with mappedAtCreation unless it also has MAP_READ or MAP_WRITE usage.
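The creation steps above can be folded into one helper. This is a minimal sketch, assuming `device` is a GPUDevice and `data` is any typed array; `createBufferWithData` is a hypothetical name, not a WebGPU method.

```javascript
// Hypothetical helper: create a buffer pre-filled with the bytes of `data`
// (any typed array). `device` is assumed to be a GPUDevice.
function createBufferWithData(device, data, usage) {
  // Buffers created with mappedAtCreation need a size that is a multiple of 4.
  const size = Math.ceil(data.byteLength / 4) * 4;
  const buffer = device.createBuffer({ size, usage, mappedAtCreation: true });

  // Copy raw bytes so the helper works for any element type.
  const src = new Uint8Array(data.buffer, data.byteOffset, data.byteLength);
  new Uint8Array(buffer.getMappedRange()).set(src);

  buffer.unmap(); // hand the buffer over to the GPU
  return buffer;
}

// Example call (in a browser with a real device):
//   createBufferWithData(device, new Uint16Array([0, 1, 2]), GPUBufferUsage.INDEX);
// The 6 bytes of index data are stored in an 8-byte buffer.
```

Rounding the size up matters for index data in particular, where an odd number of 16-bit indices gives a byte length that is not a multiple of 4.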

Uploading Data with writeBuffer

For most cases, queue.writeBuffer() is the simplest way to upload data:

const data = new Float32Array([1.0, 2.0, 3.0, 4.0]);
device.queue.writeBuffer(buffer, 0, data);

The second argument is the byte offset in the buffer. The third is the data, which can be any ArrayBuffer or typed array. The WebGPU implementation handles the transfer internally, potentially using staging buffers behind the scenes.
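writeBuffer() also accepts an optional source offset and size, which makes partial updates possible. For typed arrays these two values are counted in elements rather than bytes. The sketch below uploads a single 4x4 matrix out of a larger array; `writeMatrixAt` and the 16-floats-per-matrix layout are assumptions made for illustration.

```javascript
const FLOATS_PER_MATRIX = 16;
const BYTES_PER_MATRIX = FLOATS_PER_MATRIX * Float32Array.BYTES_PER_ELEMENT; // 64

// Hypothetical helper: upload only matrix `index` from `allMatrices`
// (a Float32Array holding many matrices back to back).
// `queue` is assumed to be device.queue.
function writeMatrixAt(queue, buffer, allMatrices, index) {
  queue.writeBuffer(
    buffer,
    index * BYTES_PER_MATRIX,  // destination offset in the buffer, in bytes
    allMatrices,
    index * FLOATS_PER_MATRIX, // source offset, in elements for typed arrays
    FLOATS_PER_MATRIX          // amount to write, in elements for typed arrays
  );
}
```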

writeBuffer is convenient, but each call has overhead because the data is copied at the time of the call. For large, infrequent uploads it works well. For frequent small updates, consider other approaches such as ring buffers or mapping.

Staging Buffers

Sometimes you need more control over the transfer process. A staging buffer is a CPU-accessible buffer used as an intermediate step:

// Create a staging buffer
const staging = device.createBuffer({
  size: dataSize,
  usage: GPUBufferUsage.MAP_WRITE | GPUBufferUsage.COPY_SRC,
  mappedAtCreation: true
});
 
// Write data while mapped
new Float32Array(staging.getMappedRange()).set(myData);
staging.unmap();
 
// Copy to GPU buffer
const encoder = device.createCommandEncoder();
encoder.copyBufferToBuffer(staging, 0, gpuBuffer, 0, dataSize);
device.queue.submit([encoder.finish()]);

This pattern gives you explicit control over when copies happen and can be more efficient for streaming scenarios where you continuously upload new data.

Reading Data Back

Reading data from the GPU to the CPU requires explicit synchronization. The GPU operates asynchronously—you cannot simply read buffer contents while the GPU might still be writing to them.

The process involves creating a buffer with MAP_READ usage, copying data into it, and then mapping it asynchronously:

// Create a mappable buffer for readback
const readBuffer = device.createBuffer({
  size: dataSize,
  usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST
});
 
// Copy from GPU buffer
const encoder = device.createCommandEncoder();
encoder.copyBufferToBuffer(gpuBuffer, 0, readBuffer, 0, dataSize);
device.queue.submit([encoder.finish()]);
 
// Wait for GPU work to complete
await readBuffer.mapAsync(GPUMapMode.READ);
 
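// Copy the data out: the mapped ArrayBuffer is detached by unmap(),
// so a view created directly on it would be empty afterwards
const result = new Float32Array(readBuffer.getMappedRange().slice(0));
console.log(result);
 
// Release the mapping
readBuffer.unmap();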

mapAsync returns a Promise that resolves when the GPU has finished all pending work involving this buffer. Only then is it safe to read. This asynchronous design prevents the CPU from stalling while waiting for the GPU.
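The readback steps can be wrapped in a reusable async function. The following is a sketch under stated assumptions: `readBufferBytes` is a hypothetical name, `source` is assumed to have COPY_SRC usage, and `size` is assumed to be a multiple of 4.

```javascript
// Hypothetical helper: copy `size` bytes out of `source` and return them as
// an ArrayBuffer owned by JavaScript. `device` is assumed to be a GPUDevice.
async function readBufferBytes(device, source, size) {
  const staging = device.createBuffer({
    size,
    usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST,
  });

  const encoder = device.createCommandEncoder();
  encoder.copyBufferToBuffer(source, 0, staging, 0, size);
  device.queue.submit([encoder.finish()]);

  // Resolves once the copy submitted above has completed.
  await staging.mapAsync(GPUMapMode.READ);

  // slice() copies the bytes; the mapped range itself is detached by unmap().
  const bytes = staging.getMappedRange().slice(0);
  staging.unmap();
  staging.destroy(); // release the temporary buffer
  return bytes;
}
```

A caller would typically wrap the result in a typed array, for example `new Float32Array(await readBufferBytes(device, gpuBuffer, dataSize))`.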

Buffer Layout and Alignment

WGSL and WebGPU have strict alignment requirements. Uniform buffers especially must follow specific rules:

Scalars (f32, i32, u32) require 4-byte alignment. Vec2 types require 8-byte alignment. Vec3 and vec4 types require 16-byte alignment. Mat4x4 requires 16-byte alignment and is stored as four vec4 columns.
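These rules can be applied mechanically. The sketch below computes member offsets for a flat struct, assuming the sizes WGSL defines for these types (a vec3f occupies 12 bytes even though it aligns to 16). The `TYPE_LAYOUT` table, the `computeStructLayout` helper, and the example members are illustrative and not taken from any real shader.

```javascript
// Alignment and size in bytes for the types mentioned above
// (32-bit element types), following the WGSL layout rules.
const TYPE_LAYOUT = {
  f32: { align: 4, size: 4 },
  i32: { align: 4, size: 4 },
  u32: { align: 4, size: 4 },
  vec2f: { align: 8, size: 8 },
  vec3f: { align: 16, size: 12 },
  vec4f: { align: 16, size: 16 },
  mat4x4f: { align: 16, size: 64 },
};

const roundUp = (value, multiple) => Math.ceil(value / multiple) * multiple;

// Hypothetical helper: compute member offsets and total size for a flat
// struct described as [name, type] pairs. Nested structs and arrays are
// not handled.
function computeStructLayout(members) {
  let offset = 0;
  let structAlign = 1;
  const offsets = {};
  for (const [name, type] of members) {
    const { align, size } = TYPE_LAYOUT[type];
    offset = roundUp(offset, align); // place member at next aligned offset
    offsets[name] = offset;
    offset += size;
    structAlign = Math.max(structAlign, align);
  }
  // The struct's size is rounded up to its largest member alignment.
  return { offsets, size: roundUp(offset, structAlign) };
}

// Illustrative members only.
const layout = computeStructLayout([
  ["intensity", "f32"],
  ["direction", "vec3f"],
  ["range", "f32"],
  ["color", "vec4f"],
]);
console.log(layout.offsets); // { intensity: 0, direction: 16, range: 28, color: 32 }
console.log(layout.size); // 48
```

Note how `range` lands at offset 28: it fits into the 4 bytes left over after the 12-byte vec3f, so member order affects how much padding a struct needs.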

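Each struct member is placed at the next offset that is a multiple of its alignment, and the struct's total size is rounded up to the largest alignment among its members. In uniform buffers, nested structs must additionally be 16-byte aligned, and array elements must be spaced at multiples of 16 bytes. These rules can create padding:

struct MyUniforms {
  time: f32,         // offset 0, size 4
  // 4 bytes padding (vec2f needs 8-byte alignment)
  resolution: vec2f, // offset 8, size 8
  matrix: mat4x4f,   // offset 16, size 64
}

The total size is 80 bytes, not 76, due to alignment padding. Use tools like the wgsl_offset npm package or manual calculation to ensure your JavaScript data matches the expected layout.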
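On the JavaScript side, the data has to be written at the same byte offsets the shader expects. Here is a minimal sketch for MyUniforms, assuming the WGSL layout rules place time at byte 0, resolution at byte 8, and matrix at byte 16, for 80 bytes in total. `packMyUniforms` and the constant names are illustrative.

```javascript
// Byte offsets of the MyUniforms members under the WGSL layout rules
// (assumed here: f32 aligns to 4, vec2f to 8, mat4x4f to 16).
const OFFSET_TIME = 0;
const OFFSET_RESOLUTION = 8;
const OFFSET_MATRIX = 16;
const MY_UNIFORMS_SIZE = 80;

// Hypothetical helper: pack values into an ArrayBuffer matching MyUniforms.
// `matrix` is assumed to hold 16 floats in column-major order.
function packMyUniforms(time, width, height, matrix) {
  const data = new ArrayBuffer(MY_UNIFORMS_SIZE);
  const floats = new Float32Array(data);
  floats[OFFSET_TIME / 4] = time;
  // floats[1] stays zero: it is the padding before the vec2f.
  floats[OFFSET_RESOLUTION / 4] = width;
  floats[OFFSET_RESOLUTION / 4 + 1] = height;
  floats.set(matrix, OFFSET_MATRIX / 4);
  return data;
}
```

The returned ArrayBuffer can be passed straight to device.queue.writeBuffer() as the data argument for a uniform buffer of at least 80 bytes.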

Key Takeaways

  • GPU buffers are separate from CPU memory; data must be explicitly transferred
  • Usage flags declare how a buffer can be used; WebGPU enforces these at runtime
  • writeBuffer() is the simplest upload method for most cases
  • mappedAtCreation allows immediate CPU access during buffer creation
  • Reading data back requires mapAsync() because GPU operations are asynchronous
  • Alignment requirements affect buffer layout—especially for uniform buffers