Background. When backporting an Unreal Engine 4.27-style TAA (Temporal Anti-Aliasing) into an older 4.24 mobile rendering pipeline, the TAA pass runs on Vulkan or OpenGL ES via the mobile path. In that setup a bug shows up as follows: after the program has run for some number of frames, the int values in one uniform buffer become corrupted, the shader reads wrong data, and the rendered image suddenly breaks. For example, a boolean flag or pixel coordinates may appear as large integers that, when reinterpreted as IEEE 754 floats, match the viewport size or its reciprocals. This article summarizes the cause and a practical workaround.

SM5 vs Vulkan/GLES mobile: two code paths

SM5 (D3D11/12):
The HLSL shader is compiled by FXC. All cbuffer parameters (float and int) are packed into a single constant buffer according to the C++ struct layout. The runtime copies the struct bytes directly; there is no type-based splitting.

Vulkan / GLES (mobile preview):
Shaders are built via HLSLCC (HLSL → GLSL → SPIR-V). HLSLCC splits loose global parameters by type into two separate uniform blocks:

layout(binding=0, std140) uniform HLSLCC_CB  { vec4 cu_f[N]; };  // all float parameters
layout(binding=1, std140) uniform HLSLCC_CBi { ivec4 cu_i[3]; }; // all int parameters

The Vulkan RHI builds an EmulatedUBsCopyInfo table at shader compile time: for each parameter it records “copy Y bytes from C++ struct at offset X into the packed float/int buffer at offset Z.” Before each dispatch it calls something like SetEmulatedUniformBufferIntoPacked to perform these copies.

So on mobile Vulkan/GLES, the same logical “cbuffer” is split into a float UBO and an int UBO, and the mapping from C++ struct offsets to those UBO slots is generated by the HLSLCC path.
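To make the copy step described above concrete, here is a minimal C++ sketch of a table-driven copy from a parameter struct into two packed buffers. The names `CopyEntry` and `applyCopyTable` and the field layout are hypothetical and chosen for illustration; they do not mirror the engine's actual `EmulatedUBsCopyInfo` structure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for one copy-table entry (not the engine's real type).
struct CopyEntry {
    std::size_t srcOffset;  // byte offset into the C++ parameter struct
    std::size_t dstOffset;  // byte offset into the packed destination buffer
    std::size_t size;       // number of bytes to copy
    bool toIntBuffer;       // true: int block (cu_i), false: float block (cu_f)
};

// Applies every entry as a raw byte copy. Nothing here checks types, so a
// wrong srcOffset silently copies float bytes into the int buffer.
inline void applyCopyTable(const std::vector<CopyEntry>& table,
                           const std::uint8_t* src, std::size_t srcSize,
                           std::vector<std::uint8_t>& packedFloat,
                           std::vector<std::uint8_t>& packedInt) {
    for (const CopyEntry& e : table) {
        std::vector<std::uint8_t>& dst = e.toIntBuffer ? packedInt : packedFloat;
        assert(e.srcOffset + e.size <= srcSize);     // stay inside the struct
        assert(e.dstOffset + e.size <= dst.size());  // stay inside the buffer
        std::memcpy(dst.data() + e.dstOffset, src + e.srcOffset, e.size);
    }
}
```

Because the copy is purely offset-driven, the correctness of the int block depends entirely on the source offsets recorded at shader compile time.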

What goes wrong: int uniforms read float data

The wrong values you see in the int uniform block (e.g. in a debugger or by exporting cu_i) often match, when reinterpreted as IEEE754 floats, the viewport size and its inverses:

| Raw cu_i value (as int) | Reinterpreted as float |
|---|---|
| 1126170624 | 160.0 |
| 1119092736 | 90.0 |
| 1003277517 | 1/160 |
| 1010174817 | 1/90 |

These values correspond to the viewport size vector (W, H, 1/W, 1/H), here for a 160×90 viewport. So the source byte offset recorded in EmulatedUBsCopyInfo for the int parameters points into the part of the C++ struct that holds the viewport data, not the part that holds the actual int parameters. The bug is a wrong offset calculation for the int block, not overwritten memory.
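The reinterpretation in the table can be checked with a few lines of C++. This is a small sketch; the helper names `bitsToFloat` and `floatToBits` are made up for this example, and it assumes a platform where `float` is a 32-bit IEEE 754 type.

```cpp
#include <cstdint>
#include <cstring>

// Reinterprets the 32 bits of an integer as a single-precision float.
// std::memcpy avoids the undefined behaviour of pointer-cast type punning.
inline float bitsToFloat(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32-bit");
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Inverse direction: returns the raw bit pattern of a float.
inline std::uint32_t floatToBits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32-bit");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Feeding a suspicious value from the int block through `bitsToFloat` is a quick way to tell whether it is really float data: a result that lands on a round number such as a viewport dimension is a strong hint.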

Why the offset is wrong: std140 vs C++ packing

HLSLCC computes the “source” byte offset for each variable based on the HLSL cbuffer layout. In the generated GLSL, std140 rules apply: each element of a float array is aligned to 16 bytes (vec4). So in the HLSLCC view, a float array of 9 elements occupies 9 × 16 = 144 bytes. In the C++ struct, the same array is typically 9 × 4 = 36 bytes. So HLSLCC’s idea of where the next parameter starts (e.g. the first int parameter after that array) is shifted by 144 − 36 = 108 bytes relative to the actual C++ layout. That misalignment makes the copy table assign the int block a source region that overlaps the viewport-size floats (or nearby data), so the GPU ends up reading float bytes into the int uniforms.
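The arithmetic above can be written down as compile-time checks. The sketch below models only the two array sizes discussed in this section (16-byte stride under std140, 4 bytes per float in the C++ struct); the function names are illustrative, and it is not a full std140 layout calculator.

```cpp
#include <cstddef>

// Bytes occupied by a float array in the tightly packed C++ struct.
constexpr std::size_t cppFloatArrayBytes(std::size_t count) {
    return count * 4;   // 4 bytes per float
}

// Bytes occupied by the same array under std140, where each array
// element is padded to a 16-byte (vec4) stride.
constexpr std::size_t std140FloatArrayBytes(std::size_t count) {
    return count * 16;
}

// How far the offset of the parameter following the array drifts
// between the two layouts.
constexpr std::size_t offsetDrift(std::size_t count) {
    return std140FloatArrayBytes(count) - cppFloatArrayBytes(count);
}

static_assert(std140FloatArrayBytes(9) == 144, "std140: 9 x 16 bytes");
static_assert(cppFloatArrayBytes(9) == 36, "C++: 9 x 4 bytes");
static_assert(offsetDrift(9) == 108, "drift for a 9-element float array");
```

The drift grows by 12 bytes per array element, so longer float arrays ahead of the int parameters push the computed source offset further from the real one.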

Workaround: pass as float, convert in shader

A robust workaround is to avoid integer parameters in that cbuffer for the mobile Vulkan/GLES path. Declare the affected parameters as floats (e.g. a boolean as 0.0 or 1.0, pixel coordinates as float2), and convert them back to the intended types at shader entry (e.g. compare with 0.5 for the flag, cast to int2 for the coordinates). With no int parameters left for HLSLCC to pack, it does not emit the cu_i / HLSLCC_CBi block and EmulatedUBsCopyInfo has no int entries, so the wrong-offset path is never taken. This sidesteps the bug by removing integer uniforms from the affected cbuffer; it does not fix HLSLCC's offset computation itself.
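The real conversion lives in the HLSL shader, but the encode/decode logic can be mirrored in C++ to show that it round-trips. This is a sketch under assumptions: the type and function names (`Float2`, `Int2`, `encodeFlag`, `decodePixel`, and so on) are hypothetical, and it relies on the fact that a 32-bit float represents every integer up to 2^24 exactly, which comfortably covers pixel coordinates.

```cpp
#include <cstdint>

struct Float2 { float x, y; };
struct Int2 { std::int32_t x, y; };

// CPU side: store former int/bool parameters as floats.
inline float encodeFlag(bool flag) { return flag ? 1.0f : 0.0f; }
inline Float2 encodePixel(Int2 p) {
    return { static_cast<float>(p.x), static_cast<float>(p.y) };
}

// Shader-entry side, mirrored in C++: recover the intended types.
// Comparing against 0.5 tolerates small float noise around 0.0 and 1.0.
inline bool decodeFlag(float v) { return v > 0.5f; }

// The cast truncates toward zero, which is exact for whole-number inputs.
inline Int2 decodePixel(Float2 v) {
    return { static_cast<std::int32_t>(v.x), static_cast<std::int32_t>(v.y) };
}
```

For example, `decodePixel(encodePixel({159, 89}))` yields the original coordinates, and `decodeFlag(encodeFlag(true))` is `true`.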

Summary

| Aspect | Description |
|---|---|
| Symptom | Int TAA parameters (e.g. a boolean or pixel coords) contain wrong values on mobile Vulkan/GLES; they often match viewport float data when reinterpreted. |
| Cause | HLSLCC uses std140 layout (float array = 16 bytes per element) to compute source offsets, while the C++ struct uses 4 bytes per float → int parameters get the wrong source offset and read float data. |
| Fix | Pass those values as floats (and convert to int/bool in the shader) so that the int uniform block is not used and the wrong-offset path is never taken. |

The content of this blog post is original to the author. Please indicate the source and include the original link when reproducing it. Thank you for your support and understanding.
