Three things I did to speed up my WebGL code

I made this visualisation to illustrate what happens over the course of the day of the September equinox. On this day the sun rises almost directly to the east and sets almost directly to the west. But the thing I am most proud of is that this isn’t an FFmpeg compilation of screenshots but an actual recording of WebGL code executing in the browser on a retina screen.

How I got here

It’s been a learning journey to remove jitter/jank/lag from ShadeMap. The first version used +/-5 minute increment buttons because using a slider quickly caused the app to freeze. I’ve been able to improve the rendering performance by a factor of 20 over the last 6 months. Today, I’m even offering a High Quality retina option, which means crunching 4 times as many pixels as before.

I was a complete beginner at WebGL, and at first I used the gpu.js library to write my rendering code in plain JavaScript and transpile it to WebGL. I still think this was a good choice because I had a working MVP online in the first month, but it took a long time to understand the generated WebGL and why it was so slow. These are three major performance mistakes I missed because I did not understand how WebGL worked.

Keep a reference to your uniforms

WebGL provides a bridge between JavaScript and the OpenGL Shading Language (GLSL). It’s similar to how React Native allows you to call native iOS libraries via JavaScript. In order to set a value in GLSL, you need to look up the variable (called a uniform) by name and then assign it a value. Here is an inefficient way to do it:

for (let j = 0; j < 100; j++) {
  // get a reference to GLSL variable
  const glWidth = gl.getUniformLocation(program, "width");
  // set the value
  gl.uniform1f(glWidth, 600);
}

Looking up the GLSL variable using gl.getUniformLocation is an expensive operation. You should only do this lookup once for every variable and then reuse that reference over the course of your program:

// only get the uniform location once and keep reusing glWidth
const glWidth = gl.getUniformLocation(program, "width");
for (let j = 0; j < 100; j++) {
  gl.uniform1f(glWidth, 600);
}
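
With more than a handful of uniforms, one way to keep those references around is a small lookup cache. This is only a sketch: createUniformCache and getLocation are hypothetical names, not part of WebGL or ShadeMap. The only WebGL call it relies on is gl.getUniformLocation.

```javascript
// Hypothetical helper: remembers uniform locations for one program so that
// gl.getUniformLocation is called at most once per uniform name.
function createUniformCache(gl, program) {
  const locations = new Map();
  return function getLocation(name) {
    if (!locations.has(name)) {
      // First request for this name: do the expensive lookup and store it.
      // A null result (unknown uniform) is cached too.
      locations.set(name, gl.getUniformLocation(program, name));
    }
    return locations.get(name);
  };
}

// Usage, assuming `gl` and `program` already exist:
//   const getLocation = createUniformCache(gl, program);
//   gl.uniform1f(getLocation("width"), 600);
```

The cache belongs to a single program. Uniform locations are assigned when a program is linked, so a cache built for one program should not be reused for another or kept after relinking.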

A break statement in WebGL acts like a continue

CORRECTION: a user on Reddit has pointed out that break does work as expected in WebGL. I am also having trouble reproducing the behaviour I observed. I will update this section with more information.

For each pixel on the ShadeMap, I draw a line towards the sun. If the line hits something before it reaches the sun, the pixel is in the shade. If it doesn’t hit anything, the pixel is in the sun. One optimisation is to check whether the line you’re drawing is already above the highest point of the map. If it is, you can stop checking for collisions. Here’s some pseudocode:

const int LOOP_MAX = 1000;
for (int j = 0; j < LOOP_MAX; j++) {

  // OMITTED: expensive calculations of earth curvature and texture 
  // value lookups

  if (z > highestPoint) {
    break;
  }
}

How many times will this loop run if z > highestPoint on the first iteration? The answer I arrived at: 1000 times. If you’re surprised by this, I was too, and for months I was doing a lot of unnecessary and expensive computation on the GPU. My understanding was that, because of how GLSL compiles the instructions for the hardware, the loop cannot exit early: the break keyword stops the current iteration and goes back to the beginning of the loop, but it does not end the loop itself, much like continue works in most other languages. As the correction above notes, this explanation is disputed and I have not been able to reproduce the behaviour.

What I did instead was move the break statement to the top of my loop. This way, the break happens before any expensive calculations execute:

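const int LOOP_MAX = 1000;
// starts at LOOP_MAX so the check at the top of the loop never fires
// until the ray has risen above the highest point
int highestPointExceededFlag = LOOP_MAX;
for (int j = 0; j < LOOP_MAX; j++) {
  if (j > highestPointExceededFlag) {
    break;
  }

  // OMITTED: expensive calculations of earth curvature and texture
  // value lookups

  if (z > highestPoint) {
    // every later iteration has j > 0, so it breaks before the expensive work
    highestPointExceededFlag = 0;
  }
}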
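
For readers who want to experiment with the shade test itself, here is a rough CPU-side sketch in JavaScript of the idea described at the start of this section. It is not the ShadeMap shader: isInShade, the one-dimensional heights array and the rise parameter are all assumptions made for illustration, and earth curvature is ignored. It also says nothing about how a GPU executes the loop. It only shows the logic of marching towards the sun and stopping once the ray is above the highest point.

```javascript
// heights: elevation samples along a straight line pointing towards the sun
//          (hypothetical 1D slice of the heightmap)
// start: index of the pixel being tested
// rise: how much the ray climbs between two neighbouring samples
// highestPoint: maximum elevation anywhere on the map
function isInShade(heights, start, rise, highestPoint) {
  let z = heights[start];
  for (let j = start + 1; j < heights.length; j++) {
    z += rise;
    if (z > highestPoint) {
      // The ray is above everything on the map, so nothing can block it.
      return false;
    }
    if (heights[j] > z) {
      // Terrain is higher than the ray at this sample: the pixel is shaded.
      return true;
    }
  }
  // Reached the edge of the map without hitting anything.
  return false;
}
```

With a low sun (small rise) a hill two samples away blocks the ray. With a steep ray the loop stops as soon as z passes highestPoint, without looking at the remaining samples.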

Only render what your user sees

[Image: example of PNG image tiles. Credit: Tiled web map, Stevage]

ShadeMap elevation data is downloaded in 256x256 PNG image tiles. When the map first loads, it calculates which tiles it needs, downloads them and stitches them together to create one large texture. My laptop resolution is 1280x800, and because 800 is not a multiple of 256, the stitched tiles will be 1280x1024, with part of the tiles getting cut off at the top and bottom of the screen.
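
The arithmetic behind those numbers can be sketched as follows. stitchedTextureSize is a hypothetical helper, and it assumes the viewport starts exactly on a tile boundary. A real map can be panned to any offset, in which case an extra row or column of tiles may be needed.

```javascript
// Hypothetical helper: how many whole tiles are needed to cover a viewport,
// and how large the stitched texture ends up being.
// Assumes the viewport's top-left corner sits on a tile boundary.
function stitchedTextureSize(viewportWidth, viewportHeight, tileSize = 256) {
  const tilesX = Math.ceil(viewportWidth / tileSize);
  const tilesY = Math.ceil(viewportHeight / tileSize);
  return {
    tilesX,
    tilesY,
    width: tilesX * tileSize,
    height: tilesY * tileSize,
  };
}

// For a 1280x800 viewport: 5 x 4 tiles, i.e. a 1280x1024 texture.
const size = stitchedTextureSize(1280, 800);
```

The 224 extra rows (1024 minus 800) are the pixels that never appear on screen.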

Because I had to do value lookups on the 1280x1024 stitched tile texture, I decided that I would render the shade to a canvas that was the same dimensions as the stitched tiles. It made the vertex shader very simple, but I was also calculating shade for pixels that were not on the screen and wasting a lot of GPU cycles.

attribute vec2 aPos;
varying vec2 vTexCoord;

void main(void) {
  gl_Position = vec4(aPos, 0, 1);
  // one-to-one mapping between vertex shader and fragment shader
  vTexCoord = vec2(gl_Position * 0.5 + 0.5);
}

Eventually I decided to do only the work required to render what is on the screen. I made the canvas exactly the size of the viewport and transformed the values of the varying so it would work with an elevation tile texture that was larger than the viewport. This code was more complex, but the speedup was worth it.

attribute vec2 aPos;
varying vec2 vTexCoord;

uniform float user_xStart;
uniform float user_yStart;
uniform float user_xEnd;
uniform float user_yEnd;

void main(void) {
    gl_Position = vec4(aPos, 0, 1);

    // do not calculate shade for heightmap pixels that are outside the viewport
    // heightmap dimensions are bigger than viewport dimensions
    vec4 textureSpace = gl_Position * 0.5 + 0.5;
    vTexCoord = vec2((user_xEnd - user_xStart) * textureSpace.x + user_xStart, (user_yEnd - user_yStart) * (1.0 - textureSpace.y) + user_yStart);
}
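
On the JavaScript side, the four uniforms have to describe where the viewport sits inside the stitched texture, in texture coordinates from 0 to 1. The sketch below is one way to compute them. viewportTextureBounds and toTexCoord are hypothetical helpers, and the pixel offset of the viewport inside the texture is an assumed input, not something taken from ShadeMap. toTexCoord mirrors the shader's mapping so the numbers can be checked without a GPU.

```javascript
// Hypothetical helper: texture-space bounds of the viewport inside the
// stitched heightmap. `offset` is the viewport's top-left corner, in pixels,
// measured from the top-left corner of the stitched texture (an assumption).
function viewportTextureBounds(texture, viewport, offset) {
  return {
    xStart: offset.x / texture.width,
    yStart: offset.y / texture.height,
    xEnd: (offset.x + viewport.width) / texture.width,
    yEnd: (offset.y + viewport.height) / texture.height,
  };
}

// Same mapping as the vertex shader above, for a clip-space position
// where x and y both run from -1 to 1.
function toTexCoord(bounds, clipX, clipY) {
  const tx = clipX * 0.5 + 0.5;
  const ty = clipY * 0.5 + 0.5;
  return {
    x: (bounds.xEnd - bounds.xStart) * tx + bounds.xStart,
    y: (bounds.yEnd - bounds.yStart) * (1.0 - ty) + bounds.yStart,
  };
}

// Example: an 1280x800 viewport vertically centred in a 1280x1024 texture,
// so 112 rows are cut off at the top and 112 at the bottom.
const bounds = viewportTextureBounds(
  { width: 1280, height: 1024 },
  { width: 1280, height: 800 },
  { x: 0, y: 112 }
);
```

Each value in bounds would then be passed to the shader with gl.uniform1f, using a uniform location that was looked up once, as in the first section.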

Thanks for reading. If I missed anything or made mistakes please let me know.

 