Storage Textures
Writing to textures from compute shaders
Beyond Reading
So far, textures have been read-only resources. We sample from them, but we cannot write to them. For many GPU workloads—image processing, procedural generation, simulation—we need to create texture data on the GPU itself.
Storage textures unlock this capability. They allow compute shaders to write pixel values directly, treating the texture as a 2D or 3D array of writable memory. Combined with compute shaders, storage textures enable entirely GPU-side image generation and transformation pipelines.
Declaring Storage Textures
Creating a storage texture requires a specific usage flag. When you want a texture that compute shaders can write to, include STORAGE_BINDING in the usage:
const storageTexture = device.createTexture({
size: [256, 256],
format: "rgba8unorm",
usage:
GPUTextureUsage.STORAGE_BINDING | // Can be used as storage in compute
GPUTextureUsage.TEXTURE_BINDING | // Can be sampled as regular texture
GPUTextureUsage.COPY_SRC, // Can copy from it
});The combination of STORAGE_BINDING and TEXTURE_BINDING is common: you write to the texture in a compute pass, then sample from it in a render pass. This two-stage pattern enables compute-to-render workflows.
Storage Textures in WGSL
In your shader, storage textures use a specialized type declaration:
@group(0) @binding(0) var outputTexture: texture_storage_2d<rgba8unorm, write>;
@compute @workgroup_size(8, 8)
fn main(@builtin(global_invocation_id) id: vec3u) {
let color = vec4f(
f32(id.x) / 256.0,
f32(id.y) / 256.0,
0.5,
1.0
);
textureStore(outputTexture, id.xy, color);
}The type texture_storage_2d<format, access> specifies both the format and access mode. The format must match exactly what you created on the JavaScript side. The access mode is either write or read_write (though read_write support varies by format and device).
Interactive: Compute shader writes gradient to texture
Each pattern is generated entirely on the GPU. The compute shader writes directly to a storage texture using textureStore, then that same texture is sampled in the render pass.
Each compute thread writes exactly one pixel. The global_invocation_id provides the pixel coordinates directly—thread (0, 0) writes pixel (0, 0), thread (1, 0) writes pixel (1, 0), and so on. This one-to-one mapping is the natural pattern for image generation.
The textureStore Function
Writing to a storage texture uses textureStore rather than assignment:
textureStore(texture, coordinates, value);The coordinates are integer pixel positions, not normalized UV coordinates. For a 256×256 texture, valid coordinates range from (0, 0) to (255, 255). Writing outside the texture bounds is undefined behavior—typically a silent no-op, but not guaranteed.
The value must match the texture format's channel count. For rgba8unorm, you provide a vec4f. For rg32float, a vec2f. The GPU automatically converts and clamps values according to the format—values outside [0, 1] for normalized formats get clamped, floating-point formats store values directly.
Format Restrictions
Not every texture format supports storage binding. The WebGPU spec defines a limited set of storage-compatible formats:
Interactive: Storage texture format support
| Format | Storage | Channels | Bits/Ch | Notes |
|---|---|---|---|---|
| rgba8unorm | ✓ | 4 | 8 | Standard color, most common |
| rgba8snorm | ✓ | 4 | 8 | Signed normalized [-1, 1] |
| rgba8uint | ✓ | 4 | 8 | Unsigned integers [0, 255] |
| rgba8sint | ✓ | 4 | 8 | Signed integers [-128, 127] |
| rgba16uint | ✓ | 4 | 16 | Larger integer range |
| rgba16sint | ✓ | 4 | 16 | Signed 16-bit integers |
| rgba16float | ✓ | 4 | 16 | HDR color, compute data |
| rgba32uint | ✓ | 4 | 32 | Full integer precision |
| rgba32sint | ✓ | 4 | 32 | Signed 32-bit integers |
| rgba32float | ✓ | 4 | 32 | Scientific compute |
| r32float | ✓ | 1 | 32 | Heightmaps, single values |
| r32uint | ✓ | 1 | 32 | Indices, counters |
| r32sint | ✓ | 1 | 32 | Signed single channel |
| rg32float | ✓ | 2 | 32 | 2D vectors, gradients |
| rg32uint | ✓ | 2 | 32 | 2D integer data |
| rg32sint | ✓ | 2 | 32 | Signed 2D data |
| bgra8unorm | ✗ | 4 | 8 | Display format only |
| depth32float | ✗ | 1 | 32 | Depth buffer only |
| rgb10a2unorm | ✗ | 4 | 10 | HDR display, no storage |
Only formats with storage support can be used as texture_storage_2d in compute shaders
The restrictions exist because storage textures bypass texture caches and samplers—they require direct memory addressing. Some formats have hardware paths that only support filtered sampling, not direct writes.
When choosing a format, consider your data needs. For general RGBA images, rgba8unorm works well. For HDR or computational data, rgba16float or rgba32float provide more precision. For single-channel data like heightmaps, r32float is efficient.
Read-Only vs Storage Textures
The distinction between regular textures and storage textures matters for both performance and capability:
Interactive: Read-only vs storage texture comparison
WGSL Declaration
@group(0) @binding(0) var tex: texture_2d<f32>;Access Method
let color = textureSample(tex, sampler, uv);Regular textures go through samplers for filtered, mipmap-aware reads
Regular textures (declared as texture_2d) go through the texture cache and sampler hardware. They support filtering, mipmaps, and various address modes. But they are read-only in shaders—you can sample, but not modify.
Storage textures bypass the sampler. They provide direct memory access at integer coordinates. No filtering, no mipmaps, but you can write. This makes them ideal for compute outputs but unsuitable for filtered sampling within the same shader.
A common pattern: write to a storage texture in a compute pass, then bind the same texture as a regular texture_2d for sampling in a later render pass. The texture object is the same; only the binding type differs.
Procedural Generation
Storage textures enable fully GPU-side procedural content. Instead of generating a noise texture on the CPU and uploading it, the compute shader creates the data directly:
@group(0) @binding(0) var output: texture_storage_2d<rgba8unorm, write>;
@group(0) @binding(1) var<uniform> params: Params;
fn hash(p: vec2f) -> f32 {
return fract(sin(dot(p, vec2f(127.1, 311.7))) * 43758.5453);
}
fn noise(p: vec2f) -> f32 {
let i = floor(p);
let f = fract(p);
let u = f * f * (3.0 - 2.0 * f);
return mix(
mix(hash(i), hash(i + vec2f(1.0, 0.0)), u.x),
mix(hash(i + vec2f(0.0, 1.0)), hash(i + vec2f(1.0, 1.0)), u.x),
u.y
);
}
@compute @workgroup_size(8, 8)
fn main(@builtin(global_invocation_id) id: vec3u) {
let uv = vec2f(id.xy) / vec2f(textureDimensions(output));
let n = noise(uv * params.scale + params.offset);
textureStore(output, id.xy, vec4f(n, n, n, 1.0));
}Interactive: Procedural noise generation
Fractal Brownian Motion (fBm) noise generated entirely on the GPU. More octaves add finer detail layers. The compute shader writes directly to a storage texture—no CPU-GPU data transfer needed.
The benefits are significant: no CPU-GPU transfer latency, no memory copies, and the GPU's parallelism generates large textures nearly instantly. For 4K procedural textures, the difference between CPU generation with upload versus pure GPU generation can be orders of magnitude.
Image Processing Workflows
Storage textures are central to compute-based image processing. A typical pipeline:
- Load source image into a regular texture
- Bind source as
texture_2dfor sampling - Bind output as
texture_storage_2dfor writing - Dispatch compute shader that reads from source, writes to output
- Optionally chain multiple passes using ping-pong textures
@group(0) @binding(0) var inputTex: texture_2d<f32>;
@group(0) @binding(1) var inputSampler: sampler;
@group(0) @binding(2) var outputTex: texture_storage_2d<rgba8unorm, write>;
@compute @workgroup_size(8, 8)
fn blur(@builtin(global_invocation_id) id: vec3u) {
let dims = vec2f(textureDimensions(inputTex));
let uv = (vec2f(id.xy) + 0.5) / dims;
let texelSize = 1.0 / dims;
var sum = vec4f(0.0);
for (var dy = -2; dy <= 2; dy++) {
for (var dx = -2; dx <= 2; dx++) {
sum += textureSampleLevel(inputTex, inputSampler, uv + vec2f(f32(dx), f32(dy)) * texelSize, 0.0);
}
}
textureStore(outputTex, id.xy, sum / 25.0);
}This pattern—sample from one texture, write to another—keeps the source and destination separate, avoiding read-after-write hazards within a single dispatch.
Synchronization Considerations
When a compute pass writes to a storage texture, subsequent passes must wait for those writes to complete before reading. WebGPU handles this automatically between command encoder submissions, but within a single encoder you need to be aware of the ordering.
Each dispatchWorkgroups call completes before the next begins. If you dispatch a write, then dispatch a read from the same texture, the read will see the written data. However, within a single dispatch, threads cannot reliably read what other threads have written—workgroup-local synchronization exists, but global synchronization does not.
For multi-pass effects, the pattern is simple: dispatch pass A, then dispatch pass B. The GPU ensures A completes before B begins. No explicit barriers needed in WebGPU's command model.
Key Takeaways
- Storage textures allow compute shaders to write pixel data directly using
textureStore - Create with
GPUTextureUsage.STORAGE_BINDINGflag; declare astexture_storage_2d<format, access>in WGSL - Only certain formats support storage binding—consult the format table for your use case
- Storage textures bypass samplers and caches—no filtering, but direct read/write access
- Common pattern: write as storage texture in compute, sample as regular texture in render
- Procedural generation and image processing benefit enormously from GPU-side texture creation
- Dispatches execute in order; no explicit synchronization needed between passes