UE4 To Support Framebuffer Fetch For OpenGL ES3.1

1. Introduction

As we all know, most mobile devices are based on a tile-based GPU architecture. On these GPUs, reading the framebuffer directly in the pixel shader costs little or nothing extra, because the tile being shaded is held in on-chip memory. Accordingly, OpenGL ES provides the extension 'EXT_shader_framebuffer_fetch'. Apple gives a detailed description and sample code that introduce this feature:

https://developer.apple.com/library/content/documentation/3DDrawing/Conceptual/OpenGLES_ProgrammingGuide/BestPracticesforShaders/BestPracticesforShaders.html

    #version 300 es
    #extension GL_EXT_shader_framebuffer_fetch : require
    layout(location = 0) inout lowp vec4 destColor;
    void main()
    {
        lowp float luminance = dot(vec3(0.3, 0.59, 0.11), destColor.rgb);
        destColor.rgb = vec3(luminance);
    }

2. UE4 Shader Compiler

UE4 keeps almost all of its shader-related documentation under 'Graphics Programming'.

1) Shader, Material And Shader Cache

Generally, UE4 has two types of shader: global shaders and materials. Global shaders are defined at a low level and used directly by the engine, for example for shadow filtering and post processing. Only one shader of any given global shader type exists in memory.

Materials are defined by a set of states that control how the material is rendered (blend mode, two sided, etc.) and a set of material inputs that control how the material interacts with the various rendering passes (BaseColor, Roughness, Normal, etc.). More details are in Shader Development.

As we all know, OpenGL/OpenGL ES use GLSL to program the GPU. A GLSL shader must be compiled at runtime before it can be used, whereas HLSL shaders are shipped as precompiled binaries. For this reason, UE4 provides a shader cache mechanism that reduces shader hitching in-game.

https://docs.unrealengine.com/latest/INT/Programming/Rendering/FShaderCache/index.html

2) HLSL Cross Compiler

UE4 supports most mainstream graphics APIs: DirectX, OpenGL/OpenGL ES, Metal and Vulkan. However, all shader source code is written in HLSL.
Before a shader can run, the HLSL source is translated into the shader language of the target platform (GLSL, for example). The generated platform-specific source is then compiled into the final shader with the corresponding tools.

https://docs.unrealengine.com/latest/INT/Programming/Rendering/ShaderDevelopment/HLSLCrossCompiler/index.html

The library is largely based on the GLSL compiler from Mesa. The frontend has been heavily rewritten to parse HLSL and generate Mesa IR from the HLSL Abstract Syntax Tree (AST). The library leverages Mesa's IR optimization to simplify the code and finally generates GLSL source code from the Mesa IR. The GLSL generation is based on the work in glsl-optimizer.

https://github.com/aras-p/glsl-optimizer

3) Shader Compiler Debug

UE4 has a separate tool (ShaderCompileWorker.exe) that compiles HLSL into the shader language of each platform. It is spawned as a standalone process, so some configuration is needed before the shader compiler can be debugged. The following article gives detailed steps under the title 'Debugging the Shader Compiling Process':

https://www.unrealengine.com/en-US/blog/debugging-the-shader-compiling-process

Additionally, parts of the official documentation cover this topic:

https://docs.unrealengine.com/latest/INT/Programming/Rendering/ShaderDevelopment/index.html

3. Extend UE4 to Support Framebuffer Fetch

To support the framebuffer fetch extension in UE4, we only need to customize the OpenGL shader compiler. In OpenGLShaderCompiler.cpp, inside the FOpenGLFrontend::CompileShader() function:

    FHlslCrossCompilerContext CrossCompilerContext(CCFlags, Frequency, HlslCompilerTarget);
    if (CrossCompilerContext.Init(TCHAR_TO_ANSI(*Input.SourceFilename), LanguageSpec))
    {
        Result = CrossCompilerContext.Run(
            TCHAR_TO_ANSI(*PreprocessedShader),
            TCHAR_TO_ANSI(*Input.EntryPointName),
            BackEnd,
            &GlslShaderSource,
            &ErrorLog
            ) ? 1 : 0;
    }

The cross compiler produces optimized GLSL source code that is ready for final compilation.
At this point we can hijack the generated shader source code and substitute our own shader source code.

    // Runs only when the default UE4 shader compiler compiled the shader file correctly.
    if (GlslShaderSource && Result != 0)
    {
        if (HSF_PixelShader == Frequency && HCT_FeatureLevelES3_1 == HlslCompilerTarget && GLSL_ES3_1_ANDROID == Version)
        {
            FString HijackShaderPath = FPaths::EngineDir() + TEXT("Shaders/Hijack/GLSL_ES3_1_ANDROID/") + FPaths::GetBaseFilename(Input.SourceFilename) + TEXT("_ps.glsl");
            TUniquePtr<FArchive> Reader(IFileManager::Get().CreateFileReader(HijackShaderPath.GetCharArray().GetData()));
            if (Reader)
            {
                // Output the original UE4 code for later usage.
                int32 GlslSourceLen = GlslShaderSource ? FCStringAnsi::Strlen(GlslShaderSource) : 0;
                FArchive* FileWriter = IFileManager::Get().CreateFileWriter(*(FPaths::EngineDir() + TEXT("Shaders/Hijack/GLSL_ES3_1_ANDROID/") + FPaths::GetBaseFilename(Input.SourceFilename) + TEXT("_ps_ue4.glsl")));
                if (FileWriter)
                {
                    FileWriter->Serialize(GlslShaderSource, GlslSourceLen + 1);
                    FileWriter->Close();
                    delete FileWriter;
                }

                // Read the hijacked shader code.
                int32 Size = Reader->TotalSize();
                if (Size > 0)
                {
                    // Replace the generated source with the contents of the hijack file.
                    free(GlslShaderSource);
                    GlslShaderSource = (char*)malloc(Size + 1);
                    Reader->Serialize(GlslShaderSource, Size);
                    GlslShaderSource[Size] = '\0';
                    bool Success = Reader->Close();
                    Reader = nullptr;
                }
            }
        }
    }

In the code above, if the shader is an OpenGL ES 3.1 fragment shader, the compiler checks the %EnginePath%Shaders/Hijack/GLSL_ES3_1_ANDROID folder for a fragment shader file with the same base name. If one exists, that file replaces the auto-generated shader code. To make it easier to see what UE4 generated, the auto-generated shader code is first written to a file with the _ps_ue4.glsl suffix. Graphics programmers can then modify that file instead of writing all the code from scratch. Normally, therefore, the shader compiler needs to run in two passes. As written, the dump only happens when a hijack file is already present, and an empty hijack file is ignored as a replacement because of the Size > 0 check, so an empty placeholder file is enough to trigger the dump in the first pass.

BTW, we also need to add the extension declaration in GLSLToDeviceCompatibleGLSL() in the OpenGLShaders.cpp file:

    AppendCString(GlslCode, "#extension GL_EXT_shader_framebuffer_fetch : require\n");

4. An Example: Tone-mapping Post-Process

The following hijacked pixel shader reads the current framebuffer color through the inout variable out_Target0 and writes the color-graded result back to it.

    // Compiled by HLSLCC 0.66
    // @Samplers: ColorGradingDirectLUT(0:1[ColorGradingDirectLUTSampler])
    #version 310 es
    #ifdef GL_EXT_gpu_shader5
    #extension GL_EXT_gpu_shader5 : enable
    #endif
    #ifdef GL_EXT_texture_buffer
    #extension GL_EXT_texture_buffer : enable
    #endif
    #ifdef GL_EXT_texture_cube_map_array
    #extension GL_EXT_texture_cube_map_array : enable
    #endif
    #ifdef GL_EXT_shader_io_blocks
    #extension GL_EXT_shader_io_blocks : enable
    #endif
    precision mediump float;
    precision mediump int;
    #ifndef DONTEMITSAMPLERDEFAULTPRECISION
    precision mediump sampler2D;
    precision mediump samplerCube;
    #endif
    #ifdef TEXCOORDPRECISIONWORKAROUND
    vec4 texture2DTexCoordPrecisionWorkaround(sampler2D p, vec2 tcoord)
    {
        return texture2D(p, tcoord);
    }
    #define texture2D texture2DTexCoordPrecisionWorkaround
    #endif
    precision highp float;
    precision highp int;
    void compiler_internal_AdjustInputSemantic(inout vec4 TempVariable)
    {
    #if HLSLCC_DX11ClipSpace
        TempVariable.y = -TempVariable.y;
        TempVariable.z = ( TempVariable.z + TempVariable.w ) / 2.0;
    #endif
    }
    void compiler_internal_AdjustOutputSemantic(inout vec4 Src)
    {
    #if HLSLCC_DX11ClipSpace
        Src.y = -Src.y;
        Src.z = ( 2.0 * Src.z ) - Src.w;
    #endif
    }
    bool compiler_internal_AdjustIsFrontFacing(bool isFrontFacing)
    {
    #if HLSLCC_DX11ClipSpace
        return !isFrontFacing;
    #else
        return isFrontFacing;
    #endif
    }
    uniform highp sampler2D ps0;
    INTERFACE_LOCATION(0) inout vec4 out_Target0;
    void main()
    {
        vec4 v0 = out_Target0.xyzw;
        vec4 v1;
        v1.xyzw = v0;
        highp vec3 v2;
        highp vec3 v3;
        v3.xyz = ((v0.xyz*vec3(9.375000e-01,9.375000e-01,9.375000e-01))+vec3(3.125000e-02,3.125000e-02,3.125000e-02));
        v2.xyz = v3;
        highp float f4;
        f4 = floor(((v2.z*1.600000e+01)+-5.000000e-01));
        highp float f5;
        f5 = ((v2.x+f4)/1.600000e+01);
        highp vec2 v6;
        v6.x = f5;
        v6.y = v2.y;
        highp vec2 v7;
        v7.x = (f5+6.250000e-02);
        v7.y = v2.y;
        float h8;
        h8 = (((v2.z*1.600000e+01)+-5.000000e-01)+(-f4));
        v1.xyz = mix(texture(ps0,v6),texture(ps0,v7),vec4(h8)).xyz;
        out_Target0.xyzw = v1;
    }

5. Conclusion

The current hijack method works, but it isn't very elegant.
A better method is to change the HLSL cross compiler itself to support framebuffer fetch. With that approach, we can even add this feature for materials. Many high-performance features then become available, such as opacity-masked materials in place of traditional transparent mask materials.
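
As a small illustration of the extension-declaration step from section 3, here is a minimal standalone sketch of placing an #extension line into GLSL source text. GLSL requires #version to come first and #extension directives to appear before any non-preprocessor tokens, so the sketch inserts the directive on the line right after #version. The helper name InsertExtensionAfterVersion is hypothetical and uses only the C++ standard library; it is not UE4 code and does not reproduce how GLSLToDeviceCompatibleGLSL() actually builds its output.

```cpp
#include <string>

// Hypothetical helper (not part of UE4): inserts an "#extension" directive
// on the line right after the "#version" directive of a GLSL source string.
// If no "#version" line is found, the directive is prepended instead.
// Assumption: the first "#version" occurrence is the real directive and
// not text inside a comment.
std::string InsertExtensionAfterVersion(const std::string& Source,
                                        const std::string& ExtensionName)
{
    const std::string Directive = "#extension " + ExtensionName + " : require\n";

    const std::size_t VersionPos = Source.find("#version");
    if (VersionPos == std::string::npos)
    {
        return Directive + Source;
    }

    const std::size_t LineEnd = Source.find('\n', VersionPos);
    if (LineEnd == std::string::npos)
    {
        // "#version" is on the last line and has no trailing newline.
        return Source + "\n" + Directive;
    }

    std::string Result = Source;
    Result.insert(LineEnd + 1, Directive);
    return Result;
}
```

Calling InsertExtensionAfterVersion(Source, "GL_EXT_shader_framebuffer_fetch") on the generated source would leave any leading comments and the #version line untouched and add the declaration directly below them. A change inside the cross compiler, as suggested above, would make this kind of text patching unnecessary.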

Reposted from blog.csdn.net/cnjet/article/details/78294709