[Repost] KlayGE game engine: Order Independent Transparency (OIT)

Reposted from the KlayGE game engine blog. Permanent link to this article: http://www.klayge.org/?p=2233

http://dogasshole.iteye.com/blog/1429665

http://www.gdcvault.com/

 

In 2009, when AMD released the HD 5800, it also published an Order Independent Transparency (OIT) demo, but it only introduced the idea and offered little that could be used as a reference. It was not until the GDC 2010 talk "OIT and GI using DX11 Linked Lists" that complete algorithm details were given. Although many new OIT algorithms have appeared in recent years, Per-Pixel Linked Lists remains the benchmark OIT algorithm, so it is worth implementing in the KlayGE development version for comparison.

Algorithm

As the name suggests, Per-Pixel Linked Lists means that each pixel has its own linked list, storing all the fragments belonging to that pixel. Such a non-uniform data structure is very unfriendly to the GPU.

Per-Pixel Linked Lists requires two additional buffers for the linked lists. One is the fragments buffer, N times the screen size, responsible for storing all the fragments; the other is the start offset buffer, the same size as the screen, storing for each pixel the head of its list. Once this data structure is built, the algorithm itself becomes very simple, with only two steps:

  1. After the PS computes the shading color, increment the fragments buffer's built-in counter to allocate a slot, store the resulting color and depth there, and update the start offset buffer at the corresponding pixel position.
  2. In a post process, the PS reads the list head from the start offset buffer, walks the pixel's entire list through it, sorts the fragments by depth, and alpha-blends them in order.
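The two steps above can be sketched on the CPU. This is a minimal illustration of the data structure, not the KlayGE implementation: on the GPU the counter increment would be an atomic operation on a UAV, and the list head swap would be an InterlockedExchange, while here plain Python stands in for both.

```python
NULL = -1  # sentinel for "end of list" in the start offset buffer

class PerPixelLinkedLists:
    def __init__(self, width, height):
        self.start_offset = [NULL] * (width * height)  # list head per pixel
        self.fragments = []  # nodes of (color_rgba, depth, next_index)
        self.width = width

    def add_fragment(self, x, y, color, depth):
        """Step 1: allocate a node via the counter, push it as the new head."""
        pixel = y * self.width + x
        node = len(self.fragments)  # "counter++" allocates a slot
        # The new node points at the old head; the head now points at it.
        self.fragments.append((color, depth, self.start_offset[pixel]))
        self.start_offset[pixel] = node

    def resolve(self, x, y, background):
        """Step 2: walk the list, sort by depth, blend back-to-front."""
        frags = []
        node = self.start_offset[y * self.width + x]
        while node != NULL:
            color, depth, node = self.fragments[node]
            frags.append((depth, color))
        frags.sort(reverse=True)  # farthest fragment first
        result = background
        for _, (r, g, b, a) in frags:  # standard alpha blending
            result = tuple(a * c + (1 - a) * rc
                           for c, rc in zip((r, g, b), result))
        return result
```

For example, adding a half-transparent blue fragment behind a half-transparent red one at the same pixel and resolving against a white background blends them in correct depth order, no matter in which order they were rasterized.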

Thus, the algorithm only needs a few extra lines in the original PS of the pipeline, plus one full-screen post process. Every fragment goes through the PS only once, with no waste. Compared with Depth Peeling, the previously popular OIT method, Per-Pixel Linked Lists produces identical results for the same number of layers, involves no approximation, and its theoretical performance is much higher. If Depth Peeling wants to peel N layers, all the fragments are generated N times, and each pass discards most of them, keeping only the fragments of the layer being peeled.
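A back-of-envelope comparison makes the cost difference concrete. The fragment and resolution counts below are my own illustrative numbers, not measurements from the article:

```python
def depth_peeling_work(scene_fragments, layers):
    # Every peeling pass re-rasterizes the scene, regenerating all
    # fragments and keeping only the one layer being peeled.
    return scene_fragments * layers

def ppll_work(scene_fragments, screen_pixels):
    # One geometry pass stores every fragment once, plus one
    # full-screen post process to resolve the lists.
    return scene_fragments + screen_pixels

frags = 4_000_000           # assumed total transparent fragments
pixels = 1280 * 720         # assumed resolution
print(depth_peeling_work(frags, layers=4))  # 16000000 shaded fragments
print(ppll_work(frags, pixels))             # 4921600 shaded fragments
```

Under these assumptions, four peeled layers already cost over three times the fragment work of the linked-list approach, which matches the direction of the FPS numbers below.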

Actual tests confirmed this analysis: rendering the same scene on an NVS 4200M, Per-Pixel Linked Lists reaches 62.47 FPS, while Depth Peeling only reaches 46.05 FPS.

Per-Pixel Linked Lists

Limitations

Of course, Per-Pixel Linked Lists requires at least D3D11-level hardware. Earlier hardware supports neither writing to UAVs from the PS nor buffers with attached atomic counters. So unless one implements software rasterization through GPGPU, these restrictions cannot be circumvented.

Another obvious limitation is the space it occupies. Because the list lengths cannot be known in advance, the fragments buffer can only be allocated with a relatively large size, which may waste a lot of memory or may still overflow. And because fragments are added in arbitrary order, it cannot be capped at a fixed number of layers the way Depth Peeling can. The space consumed by this method is therefore not controllable.
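A quick sizing sketch shows why. The node layout here (RGBA8 color + float depth + uint next pointer, 12 bytes) is an assumption of mine, as is the average overdraw figure; the point is only that capacity must be guessed up front:

```python
def fragments_buffer_bytes(width, height, avg_overdraw,
                           bytes_per_node=12):
    # Capacity is screen size times an *assumed* average depth
    # complexity N. Pixels shallower than N waste their slots;
    # pixels deeper than N eat into the shared slack and can
    # overflow the buffer entirely.
    return width * height * avg_overdraw * bytes_per_node

# 1080p with an assumed average of 8 transparent layers:
size = fragments_buffer_bytes(1920, 1080, 8)
print(size / (1024 * 1024))  # roughly 190 MB for transparency alone
```

Whether that allocation is too big or too small depends entirely on scene content, which is exactly the uncontrollability described above.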

What else can be done besides OIT

In theory, any non-approximate OIT method can also be used for voxelization. Last year's blog post "Does the future belong to SVO?" mentioned how to combine Per-Pixel Linked Lists with conservative rasterization to turn a mesh directly into a voxel representation in a single pass.
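A toy sketch of that idea: once a pixel's list stores the depth of every surface crossing its view ray, binning those depths along the ray yields a voxel column. The function and parameter names are mine, not from the cited post, and a real implementation would pair this with conservative rasterization so thin triangles are not missed:

```python
def voxelize_column(depths, num_voxels, near=0.0, far=1.0):
    """Mark each voxel along one pixel's view ray that a stored
    fragment depth falls into."""
    column = [False] * num_voxels
    for d in depths:
        # map depth in [near, far) to a voxel index along the ray
        i = int((d - near) / (far - near) * num_voxels)
        column[min(i, num_voxels - 1)] = True
    return column

# Fragment depths from one pixel's linked list -> occupied voxels:
print(voxelize_column([0.05, 0.52, 0.55, 0.9], 10))
```

Running each pixel's list through this binning converts the whole screen-space fragment soup into a solid voxel grid in a single sweep.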

Since all the fragments of the scene are stored, one could even do ray tracing directly on them. But it is obviously more efficient to do that with an SVO framework.


http://dogasshole.iteye.com/blog/1429665

http://www.gdcvault.com/ (the GDC talk can be found here)

 

A per-pixel linked list makes order independent translucency rendering possible.

This had been bothering me for a long time; I finally managed to get it working. Screenshot under DX11:


Getting tangled up in the API details is not very meaningful; what matters far more is the fact that a linked list can be created for each pixel:

 

This is great: besides order independent translucency, many cool algorithms become possible, such as deferred lighting for translucent objects and so on.

 

The performance cost will most likely be worth it.

Reproduced from: https://www.cnblogs.com/kylegui/p/3828411.html


Origin: blog.csdn.net/weixin_33895604/article/details/94036746