Convolutions and Filters

Blur, sharpen, and edge detection

When you blur an image, sharpen it, or detect edges, you are performing a convolution. This mathematical operation slides a small grid of numbers (called a kernel) across the image, combining neighboring pixels according to the kernel's weights.

Convolutions are the foundation of image processing. They power everything from Instagram filters to computer vision systems that identify objects in photographs.

What Is a Convolution?

A convolution replaces each pixel with a weighted sum of itself and its neighbors. The kernel (also called a filter or mask) defines those weights.

For a 3x3 kernel, we look at each pixel and its 8 neighbors. We multiply each by the corresponding kernel value and sum the results:

output=i=11j=11kernel[i][j]×pixel[x+i][y+j]\text{output} = \sum_{i=-1}^{1} \sum_{j=-1}^{1} \text{kernel}[i][j] \times \text{pixel}[x+i][y+j]

Convolution Explained

Click different pixels to see how the kernel samples neighbors

The kernel slides across every pixel in the image. At each position, it produces one output value. The result is a new image of the same size, transformed according to the kernel's pattern.

Box Blur

The simplest blur uses equal weights for all neighbors:

Box Blur=19[111111111]\text{Box Blur} = \frac{1}{9} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}

Each pixel becomes the average of itself and its 8 neighbors. The result smooths out noise and softens edges.

vec4 boxBlur(sampler2D tex, vec2 uv, vec2 texelSize) {
  vec4 sum = vec4(0.0);
  for (int y = -1; y <= 1; y++) {
    for (int x = -1; x <= 1; x++) {
      sum += texture2D(tex, uv + vec2(x, y) * texelSize);
    }
  }
  return sum / 9.0;
}
glsl

The texelSize is the size of one pixel in UV coordinates: 1.0 / resolution.

Box Blur

Box blur: equal weight for all neighbors

Increasing the kernel size or applying multiple passes creates stronger blur. A 5x5 kernel samples 25 pixels; a 7x7 samples 49. Each larger kernel requires more texture lookups, so performance decreases.

Gaussian Blur

Box blur treats all neighbors equally, which can produce boxy artifacts. Gaussian blur weights pixels by their distance from the center, following a bell curve distribution:

Gaussian 3x3116[121242121]\text{Gaussian 3x3} \approx \frac{1}{16} \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}

The center pixel has the highest weight (4), immediate neighbors have medium weight (2), and corners have the lowest (1). This produces smoother, more natural-looking blur.

For larger Gaussian blurs, a powerful optimization exists: separable convolution. A 2D Gaussian can be decomposed into two 1D passes (horizontal then vertical), reducing a 9x9 kernel from 81 samples to just 18.

Sharpen

Sharpening enhances edges by emphasizing the difference between a pixel and its neighbors:

Sharpen=[010151010]\text{Sharpen} = \begin{bmatrix} 0 & -1 & 0 \\ -1 & 5 & -1 \\ 0 & -1 & 0 \end{bmatrix}

The center weight (5) is greater than 1, amplifying the original pixel. The negative neighbor weights subtract the local average. Edges, where neighboring pixels differ significantly, become more pronounced.

Sharpen Filter

Sharpening amplifies differences between pixels

Too much sharpening creates halos around edges and amplifies noise. The effect works best in moderation.

Edge Detection

Edge detection kernels find boundaries where pixel values change rapidly. The Sobel operators detect edges in specific directions:

Sobel X=[101202101]Sobel Y=[121000121]\text{Sobel X} = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} \quad \text{Sobel Y} = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}

Sobel X detects vertical edges (horizontal changes). Sobel Y detects horizontal edges (vertical changes). Combining them gives the edge magnitude:

float edgeMagnitude = sqrt(sobelX * sobelX + sobelY * sobelY);
glsl

Edge Detection

Combined Sobel: detects edges in all directions

Edge detection is fundamental to computer vision. It reduces an image to its essential structure: where things are, not what color they are.

Other Useful Kernels

Emboss creates a 3D relief effect:

[210111012]\begin{bmatrix} -2 & -1 & 0 \\ -1 & 1 & 1 \\ 0 & 1 & 2 \end{bmatrix}

Outline highlights all edges (similar to Laplacian):

[111181111]\begin{bmatrix} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{bmatrix}

Identity (does nothing, useful for testing):

[000010000]\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}

Custom Kernel Editor

Kernel sum: 1

Edit kernel values or choose a preset

Experiment with different values. The sum of all kernel weights determines brightness: sum to 1 for normal brightness, greater than 1 for brightening, less than 1 for darkening.

Performance Considerations

Each pixel in a convolution requires multiple texture samples. For a 3x3 kernel, that is 9 samples per pixel. For a 1080p image, that is over 18 million texture reads.

Optimization strategies:

  • Separable kernels: Decompose 2D kernels into 1D passes when possible
  • Downsampling: Blur at lower resolution, then upscale
  • Compute shaders: Process multiple pixels in parallel with shared memory
  • Precomputation: For static images, compute once and cache the result

Real-time blur effects often use separable Gaussian blur with multiple passes at progressively lower resolutions (a technique called Kawase blur or similar).

Key Takeaways

  • Convolution replaces each pixel with a weighted sum of its neighbors
  • The kernel defines the weights; its pattern determines the effect
  • Box blur averages all neighbors equally; Gaussian blur weights by distance
  • Sharpening amplifies the center relative to neighbors
  • Sobel operators detect edges in specific directions
  • Kernel weights should sum to 1 for brightness preservation
  • Separable kernels can dramatically improve performance for Gaussian blur
  • Edge detection reduces images to structural information