4. WebGPU Storage Buffers

This article is about storage buffers, and we pick up where the previous article left off.

Storage buffers are similar to uniform buffers in many ways. The example from the previous page would work fine if all we did was change UNIFORM to STORAGE in the JavaScript and var<uniform> to var<storage, read> in the WGSL.

In fact, those are the only changes needed here; we haven't even renamed the variables, though more fitting names wouldn't hurt.

    const staticUniformBuffer = device.createBuffer({
      label: `static uniforms for obj: ${i}`,
      size: staticUniformBufferSize,
      // usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST,
      usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST,
    });
 
 
...
 
    const uniformBuffer = device.createBuffer({
      label: `changing uniforms for obj: ${i}`,
      size: uniformBufferSize,
      // usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST,
      usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST,
    });

In our WGSL:

      @group(0) @binding(0) var<storage, read> ourStruct: OurStruct;
      @group(0) @binding(1) var<storage, read> otherStruct: OtherStruct;

It works fine with no other changes, just like before.

Differences between uniform buffers and storage buffers

The main differences between uniform buffers and storage buffers are:

  1. Uniform buffers can be faster for their typical use case
    It really depends on the use case. A typical application needs to draw many different things. For a 3D game, the application might draw cars, buildings, rocks, bushes, people, etc., each of which needs orientation and material properties passed in, similar to our example above. In that case, uniform buffers are the recommended solution.

  2. Storage buffers can be much larger than uniform buffers.

    The minimum maximum size of a uniform buffer is 64KiB.
    The minimum maximum size of a storage buffer is 128MiB.

    By minimum maximum we mean there is a maximum size a buffer of a certain type can be. For uniform buffers, that maximum size is at least 64KiB. For storage buffers, it's at least 128MiB. We'll cover limits in another article.
  3. Storage buffers can be read/written; uniform buffers are read-only.
    We saw an example of writing to a storage buffer in the compute shader example in the first article.
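These size limits can be checked at runtime. Below is a minimal sketch; the helper and its name are hypothetical, and the constants hardcode the spec-guaranteed minimums (on a real device you would read device.limits.maxUniformBufferBindingSize and device.limits.maxStorageBufferBindingSize instead):

```javascript
// Spec-guaranteed minimums ("minimum maximums") for the two buffer types.
// Real devices may allow more; query device.limits for the actual values.
const kMinMaxUniformBufferBindingSize = 64 * 1024;         // 64KiB
const kMinMaxStorageBufferBindingSize = 128 * 1024 * 1024; // 128MiB

// Hypothetical helper: pick a buffer kind based on how much data we need.
function chooseBufferUsage(byteSize) {
  if (byteSize <= kMinMaxUniformBufferBindingSize) {
    return 'uniform';   // small, per-draw data: uniform buffers may be faster
  }
  if (byteSize <= kMinMaxStorageBufferBindingSize) {
    return 'storage';   // large arrays of per-object data
  }
  throw new Error(`buffer of ${byteSize} bytes may exceed device limits`);
}
```

This only encodes the two differences above; a real application would also weigh point 3 (whether shaders need to write the buffer).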

Given the first two points above, let's take the last example and change it to draw all 100 triangles in one draw call. This is a use case that might fit storage buffers. I say might because, like other programming environments, WebGPU offers many ways to achieve the same thing: array.forEach vs. for (const elem of array) vs. for (let i = 0; i < array.length; ++i). Each has its uses. The same is true of WebGPU. Everything we try to do has multiple ways of doing it. When it comes to drawing triangles, all WebGPU cares about is that we return a value for @builtin(position) from the vertex shader and a color/value for location(0) from the fragment shader. 【Note 1】

The first thing we do is change the storage declarations to runtime-sized arrays.

// @group(0) @binding(0) var<storage, read> ourStruct: OurStruct;
// @group(0) @binding(1) var<storage, read> otherStruct: OtherStruct;
@group(0) @binding(0) var<storage, read> ourStructs: array<OurStruct>;
@group(0) @binding(1) var<storage, read> otherStructs: array<OtherStruct>;

Then we'll change the shader to use these values

@vertex fn vs(
  @builtin(vertex_index) vertexIndex : u32,
  @builtin(instance_index) instanceIndex: u32
) -> @builtin(position) vec4f {
  var pos = array<vec2f, 3>(
    vec2f( 0.0,  0.5),  // top center
    vec2f(-0.5, -0.5),  // bottom left
    vec2f( 0.5, -0.5)   // bottom right
  );
 
  let otherStruct = otherStructs[instanceIndex];
  let ourStruct = ourStructs[instanceIndex];
 
   return vec4f(
     pos[vertexIndex] * otherStruct.scale + ourStruct.offset, 0.0, 1.0);
}

We added a new parameter called instanceIndex to the vertex shader and gave it the @builtin(instance_index) attribute, which means it gets its value from WebGPU for each "instance" drawn. When we call draw, we can pass a second argument for the number of instances, and for each instance drawn, the index of the instance currently being processed is passed to our function.
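To make the invocation pattern concrete, here's a plain-JavaScript sketch (no GPU involved; the function name is made up for illustration) of which builtin values the vertex shader receives for a hypothetical draw(3, 4) call:

```javascript
// Simulate the (@builtin(vertex_index), @builtin(instance_index)) pairs the
// vertex shader is invoked with for pass.draw(vertexCount, instanceCount).
function enumerateVertexInvocations(vertexCount, instanceCount) {
  const invocations = [];
  for (let instanceIndex = 0; instanceIndex < instanceCount; ++instanceIndex) {
    for (let vertexIndex = 0; vertexIndex < vertexCount; ++vertexIndex) {
      invocations.push({ vertexIndex, instanceIndex });
    }
  }
  return invocations;
}

// draw(3, 4): the vertex shader runs 3 times for each of 4 instances,
// so 12 invocations total
const calls = enumerateVertexInvocations(3, 4);
```

Each instance sees vertex_index counting 0, 1, 2 again while instance_index stays fixed, which is exactly why indexing a storage buffer with instance_index gives per-object data.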

We use instanceIndex to get the corresponding struct elements from our arrays of structs.

We also need to get the color from the correct array element and use it in our fragment shader. The fragment shader doesn't have access to @builtin(instance_index) because that would make no sense. We could pass it as an inter-stage variable, but it's more common to look up the color in the vertex shader and just pass the color.

For this we'll use another struct, like we did in our article on inter-stage variables.

struct VSOutput {
  @builtin(position) position: vec4f,
  @location(0) color: vec4f,
}
 
@vertex fn vs(
  @builtin(vertex_index) vertexIndex : u32,
  @builtin(instance_index) instanceIndex: u32
// ) -> @builtin(position) vec4f {
) -> VSOutput {
  var pos = array<vec2f, 3>(
    vec2f( 0.0,  0.5),  // top center
    vec2f(-0.5, -0.5),  // bottom left
    vec2f( 0.5, -0.5)   // bottom right
  );
 
  let otherStruct = otherStructs[instanceIndex];
  let ourStruct = ourStructs[instanceIndex];
 
  // return vec4f(
  //  pos[vertexIndex] * otherStruct.scale + ourStruct.offset, 0.0, 1.0);
  var vsOut: VSOutput;
  vsOut.position = vec4f(
      pos[vertexIndex] * otherStruct.scale + ourStruct.offset, 0.0, 1.0);
  vsOut.color = ourStruct.color;
  return vsOut;
}
 
// @fragment fn fs() -> @location(0) vec4f {
//   return ourStruct.color;
// }
@fragment fn fs(vsOut: VSOutput) -> @location(0) vec4f {
  return vsOut.color;
}
 

Now that we have modified the WGSL shader, let's update the JavaScript.


  const kNumObjects = 100;
  const objectInfos = [];
 
  // create 2 storage buffers
  const staticUnitSize =
    4 * 4 + // color is 4 32bit floats (4bytes each)
    2 * 4 + // offset is 2 32bit floats (4bytes each)
    2 * 4;  // padding
  const changingUnitSize =
    2 * 4;  // scale is 2 32bit floats (4bytes each)
  const staticStorageBufferSize = staticUnitSize * kNumObjects;
  const changingStorageBufferSize = changingUnitSize * kNumObjects;
 
  const staticStorageBuffer = device.createBuffer({
    label: 'static storage for objects',
    size: staticStorageBufferSize,
    usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST,
  });

  const changingStorageBuffer = device.createBuffer({
    label: 'changing storage for objects',
    size: changingStorageBufferSize,
    usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST,
  });
 
  // offsets to the various uniform values in float32 indices
  const kColorOffset = 0;
  const kOffsetOffset = 4;
 
  const kScaleOffset = 0;
 
  {
    const staticStorageValues = new Float32Array(staticStorageBufferSize / 4);
    for (let i = 0; i < kNumObjects; ++i) {
      const staticOffset = i * (staticUnitSize / 4);

      // These are only set once so set them now
      staticStorageValues.set([rand(), rand(), rand(), 1], staticOffset + kColorOffset);        // set the color
      staticStorageValues.set([rand(-0.9, 0.9), rand(-0.9, 0.9)], staticOffset + kOffsetOffset);      // set the offset

      objectInfos.push({
        scale: rand(0.2, 0.5),
      });
    }
    device.queue.writeBuffer(staticStorageBuffer, 0, staticStorageValues);
  }
 
  // a typed array we can use to update the changingStorageBuffer
  const storageValues = new Float32Array(changingStorageBufferSize / 4);
 
  const bindGroup = device.createBindGroup({
    label: 'bind group for objects',
    layout: pipeline.getBindGroupLayout(0),
    entries: [
      { binding: 0, resource: { buffer: staticStorageBuffer }},
      { binding: 1, resource: { buffer: changingStorageBuffer }},
    ],
  });

Above we created 2 storage buffers. One for the OurStruct array and one for the OtherStruct array.

We then fill the values of the OurStruct array with offsets and colors, and upload that data to the staticStorageBuffer.

Finally, we create a bind group that references the two buffers.
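The byte layout packed above can be captured in a small sketch. The helper names are hypothetical; the math mirrors the snippet's, where 8 bytes of padding after offset keep each element 16-byte aligned (the alignment of vec4f):

```javascript
// Byte layout of one OurStruct element in the static storage buffer:
//   color:  vec4f -> 4 floats (16 bytes) at float index 0
//   offset: vec2f -> 2 floats ( 8 bytes) at float index 4
//   pad:             2 floats ( 8 bytes) so each element stays 16-byte aligned
const staticUnitSize = 4 * 4 + 2 * 4 + 2 * 4;  // 32 bytes per object

// Float32 index where object i's color / offset fields start
function colorFloatOffset(i)  { return i * (staticUnitSize / 4) + 0; }
function offsetFloatOffset(i) { return i * (staticUnitSize / 4) + 4; }

// Pack one object's values to check the math
const kNumObjects = 100;
const values = new Float32Array(staticUnitSize * kNumObjects / 4);
values.set([1, 0, 0, 1], colorFloatOffset(2));   // object 2: opaque red
values.set([0.25, -0.5], offsetFloatOffset(2));  // object 2: offset
```

All 100 objects live side by side in one buffer, which is what lets the shader index them with instance_index.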

The new rendering code is

  function render() {
    // Get the current texture from the canvas context and
    // set it as the texture to render to.
    renderPassDescriptor.colorAttachments[0].view =
        context.getCurrentTexture().createView();
 
    const encoder = device.createCommandEncoder();
    const pass = encoder.beginRenderPass(renderPassDescriptor);
    pass.setPipeline(pipeline);
 
    // Set the uniform values in our JavaScript side Float32Array
    const aspect = canvas.width / canvas.height;
 
    // for (const {scale, bindGroup, uniformBuffer, uniformValues} of objectInfos) {
    //   uniformValues.set([scale / aspect, scale], kScaleOffset); // set the scale
    //   device.queue.writeBuffer(uniformBuffer, 0, uniformValues);
    //
    //   pass.setBindGroup(0, bindGroup);
    //   pass.draw(3);  // call our vertex shader 3 times
    // }
 
    // set the scales for each object
    objectInfos.forEach(({scale}, ndx) => {
      const offset = ndx * (changingUnitSize / 4);
      storageValues.set([scale / aspect, scale], offset + kScaleOffset); // set the scale
    });
    // upload all scales at once
    device.queue.writeBuffer(changingStorageBuffer, 0, storageValues);
 
    pass.setBindGroup(0, bindGroup);
    pass.draw(3, kNumObjects);  // call our vertex shader 3 times for each instance
 
 
    pass.end();
 
    const commandBuffer = encoder.finish();
    device.queue.submit([commandBuffer]);
  }

The code above will draw kNumObjects instances. For each instance, WebGPU will call the vertex shader 3 times with vertex_index set to 0, 1, 2 and instance_index set from 0 to kNumObjects - 1

We draw 100 triangles, each with a different scale, color and offset. For cases where you want to draw a large number of instances of the same object, this is one way to do it.
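The per-frame packing loop from the render function can be isolated as plain JavaScript. A sketch, assuming a 400×300 canvas for the aspect ratio and two sample objects:

```javascript
const changingUnitSize = 2 * 4;  // scale: 2 floats (8 bytes) per object
const kScaleOffset = 0;
const objectInfos = [{ scale: 0.2 }, { scale: 0.5 }];  // sample scales

const storageValues = new Float32Array(changingUnitSize * objectInfos.length / 4);
const aspect = 400 / 300;  // canvas.width / canvas.height (assumed canvas size)

objectInfos.forEach(({ scale }, ndx) => {
  const offset = ndx * (changingUnitSize / 4);
  // divide x by the aspect ratio so triangles aren't stretched horizontally
  storageValues.set([scale / aspect, scale], offset + kScaleOffset);
});
// storageValues now holds all objects' scales back to back,
// ready to upload with a single device.queue.writeBuffer call
```

Packing every object's values into one typed array and uploading once per frame replaces the 100 separate writeBuffer calls of the uniform-buffer version.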

Using storage buffers for vertex data

So far we've been hardcoding triangles directly in the shader. One use case for storage buffers is to store vertex data. Just like we indexed the current storage buffer by instance_index in the above example, we can use vertex_index to index another storage buffer to get vertex data.

Let's start!

struct OurStruct {
  color: vec4f,
  offset: vec2f,
};

struct OtherStruct {
  scale: vec2f,
};

struct Vertex {
  position: vec2f,
};

struct VSOutput {
  @builtin(position) position: vec4f,
  @location(0) color: vec4f,
};
 
@group(0) @binding(0) var<storage, read> ourStructs: array<OurStruct>;
@group(0) @binding(1) var<storage, read> otherStructs: array<OtherStruct>;
@group(0) @binding(2) var<storage, read> pos: array<Vertex>;
 
@vertex fn vs(
  @builtin(vertex_index) vertexIndex : u32,
  @builtin(instance_index) instanceIndex: u32
) -> VSOutput {
  //var pos = array<vec2f, 3>(
  //  vec2f( 0.0,  0.5),  // top center
  //  vec2f(-0.5, -0.5),  // bottom left
  //  vec2f( 0.5, -0.5)   // bottom right
  //);
 
  let otherStruct = otherStructs[instanceIndex];
  let ourStruct = ourStructs[instanceIndex];
 
  var vsOut: VSOutput;
  vsOut.position = vec4f(
      pos[vertexIndex].position * otherStruct.scale + ourStruct.offset, 0.0, 1.0);
  vsOut.color = ourStruct.color;
  return vsOut;
}
 
@fragment fn fs(vsOut: VSOutput) -> @location(0) vec4f {
  return vsOut.color;
}

Now we need to set up another storage buffer for some vertex data. First, let's create a function to generate vertex data for a circle.

function createCircleVertices({
  radius = 1,
  numSubdivisions = 24,
  innerRadius = 0,
  startAngle = 0,
  endAngle = Math.PI * 2,
} = {}) {
  // 2 triangles per subdivision, 3 verts per tri, 2 values (xy) each.
  const numVertices = numSubdivisions * 3 * 2;
  const vertexData = new Float32Array(numSubdivisions * 2 * 3 * 2);
 
  let offset = 0;
  const addVertex = (x, y) => {
    vertexData[offset++] = x;
    vertexData[offset++] = y;
  };
 
  // 2 vertices per subdivision
  //
  // 0--1 4
  // | / /|
  // |/ / |
  // 2 3--5
  for (let i = 0; i < numSubdivisions; ++i) {
    const angle1 = startAngle + (i + 0) * (endAngle - startAngle) / numSubdivisions;
    const angle2 = startAngle + (i + 1) * (endAngle - startAngle) / numSubdivisions;
 
    const c1 = Math.cos(angle1);
    const s1 = Math.sin(angle1);
    const c2 = Math.cos(angle2);
    const s2 = Math.sin(angle2);
 
    // first triangle
    addVertex(c1 * radius, s1 * radius);
    addVertex(c2 * radius, s2 * radius);
    addVertex(c1 * innerRadius, s1 * innerRadius);
 
    // second triangle
    addVertex(c1 * innerRadius, s1 * innerRadius);
    addVertex(c2 * radius, s2 * radius);
    addVertex(c2 * innerRadius, s2 * innerRadius);
  }
 
  return {
    vertexData,
    numVertices,
  };
}

The above code builds a circle from triangles, two per subdivision.

We can use it to fill a storage buffer with the vertices of the circle.

  // setup a storage buffer with vertex data
  const { vertexData, numVertices } = createCircleVertices({
    radius: 0.5,
    innerRadius: 0.25,
  });
  const vertexStorageBuffer = device.createBuffer({
    label: 'storage buffer vertices',
    size: vertexData.byteLength,
    usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST,
  });
  device.queue.writeBuffer(vertexStorageBuffer, 0, vertexData);
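As a quick sanity check of the vertex-count and byte-size math, here's a condensed copy of the function above run on the same arguments (the math is identical to the article's version; the condensed form exists only so this snippet is self-contained):

```javascript
// Condensed copy of createCircleVertices, same math as above
function createCircleVertices({ radius = 1, numSubdivisions = 24,
    innerRadius = 0, startAngle = 0, endAngle = Math.PI * 2 } = {}) {
  const numVertices = numSubdivisions * 3 * 2;          // 2 tris * 3 verts each
  const vertexData = new Float32Array(numVertices * 2); // xy per vertex
  let offset = 0;
  const addVertex = (x, y) => { vertexData[offset++] = x; vertexData[offset++] = y; };
  for (let i = 0; i < numSubdivisions; ++i) {
    const angle1 = startAngle + (i + 0) * (endAngle - startAngle) / numSubdivisions;
    const angle2 = startAngle + (i + 1) * (endAngle - startAngle) / numSubdivisions;
    const c1 = Math.cos(angle1), s1 = Math.sin(angle1);
    const c2 = Math.cos(angle2), s2 = Math.sin(angle2);
    addVertex(c1 * radius, s1 * radius);            // first triangle
    addVertex(c2 * radius, s2 * radius);
    addVertex(c1 * innerRadius, s1 * innerRadius);
    addVertex(c1 * innerRadius, s1 * innerRadius);  // second triangle
    addVertex(c2 * radius, s2 * radius);
    addVertex(c2 * innerRadius, s2 * innerRadius);
  }
  return { vertexData, numVertices };
}

// 24 subdivisions -> 24 * 6 = 144 vertices, 288 floats, 1152 bytes
const { vertexData, numVertices } = createCircleVertices({ radius: 0.5, innerRadius: 0.25 });
```

That byteLength (numVertices * 2 floats * 4 bytes) is exactly the size we passed to createBuffer above.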
And then we need to add it to our bind group.

  const bindGroup = device.createBindGroup({
    label: 'bind group for objects',
    layout: pipeline.getBindGroupLayout(0),
    entries: [
      { binding: 0, resource: { buffer: staticStorageBuffer }},
      { binding: 1, resource: { buffer: changingStorageBuffer }},
      { binding: 2, resource: { buffer: vertexStorageBuffer }},
    ],
  });

Finally when rendering, we need to ask for all vertices in the circle to be rendered.

    // pass.draw(3, kNumObjects);  // call our vertex shader 3 times for each instance
    pass.draw(numVertices, kNumObjects);

Above we used a struct for the vertex data:

struct Vertex {
  position: vec2f,
};

@group(0) @binding(2) var<storage, read> pos: array<Vertex>;

We could have used a vec2f directly instead of a struct.

@group(0) @binding(2) var<storage, read> pos: array<vec2f>;

But wouldn't it be easier to add per-vertex data later if we make it a struct?

Passing vertices via storage buffers is growing in popularity. I'm told, though, that some older devices are slower with it than the classic approach, which we'll cover in an upcoming article on vertex buffers.


【Note 1】 We can have multiple color attachments, in which case we'd need to return more colors/values for location(1), location(2), etc. ↩︎


Origin blog.csdn.net/xuejianxinokok/article/details/130841567