The different layout rules all solve the same problem: mapping data structures that are complex (relative to scalars such as u32 or f32) onto memory (a byte array), each rule set with its own space/time tradeoffs.
Accessing data in memory requires knowing its byte offset (relative to the start of the memory).
The most important properties of a data structure are alignment and size.
The alignment of a data structure is a number that must evenly divide every byte offset at which that data structure can reside (i.e. offset % alignment = 0).
Alignment is always a power of 2 and, for performance reasons, is often greater than 1 (an alignment of 1 is usually referred to as unaligned access) because of how CPUs and GPUs perform data accesses at the hardware level.
The `SS` constant denotes the inherent size of the (inner) scalar.
The `roundUp` function (returns n rounded up to a multiple of k) is defined for positive integers k and n as:
- roundUp(k, n) = ⌈n ÷ k⌉ × k
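As a minimal sketch, the definition above can be written with integer arithmetic only. The name `round_up` and the `u64` type are choices made for this illustration; the sketch also accepts n = 0, which is convenient when rounding a starting offset of 0.

```rust
/// Rounds `n` up to the nearest multiple of `k`.
/// Integer form of roundUp(k, n) = ⌈n ÷ k⌉ × k.
pub fn round_up(k: u64, n: u64) -> u64 {
    assert!(k > 0, "k must be positive");
    // (n + k - 1) / k is ⌈n ÷ k⌉ for non-negative integers.
    (n + k - 1) / k * k
}
```

For example, `round_up(16, 12)` yields 16 and `round_up(4, 8)` leaves 8 unchanged.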
The `po2` function (returns n rounded up to a power of 2) is defined for positive integer n as:
- po2(n) = 2^⌈log₂(n)⌉
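A small sketch of this function, assuming `u64` inputs: Rust's standard `next_power_of_two` returns the smallest power of two greater than or equal to its argument, which matches the definition for positive n. The wrapper name `po2` simply mirrors the notation used here.

```rust
/// Rounds `n` up to the nearest power of two (n must be positive).
pub fn po2(n: u64) -> u64 {
    assert!(n > 0, "n must be positive");
    // Smallest power of two that is >= n.
    n.next_power_of_two()
}
```

A three-component vector of 4-byte scalars therefore gets `po2(12)`, which is 16, while values that are already powers of two are returned unchanged.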
`alignOf(E)` is the alignment of E as specified in 1.3 or as overridden through shading language modifiers.
`sizeOf(E)` is the size of E as specified in 1.3 or as overridden through shading language modifiers.
`alignedSizeOf(E)` is the aligned size of E computed as `roundUp(alignOf(E), sizeOf(E))` (also called stride for matrices and arrays and can be overridden through shading language modifiers).
`offsetOf(M)` is the offset of member/field M as computed in 1.4/1.5 or as overridden through shading language modifiers.
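To make the relationship between size and aligned size concrete, here is a hedged sketch that takes an alignment and a size as plain numbers. The function names are illustrative, and overrides through shading language modifiers are not modelled.

```rust
/// Rounds `n` up to the nearest multiple of `k`.
fn round_up(k: u64, n: u64) -> u64 {
    (n + k - 1) / k * k
}

/// alignedSizeOf(E) = roundUp(alignOf(E), sizeOf(E)),
/// given alignOf(E) and sizeOf(E) in bytes.
pub fn aligned_size_of(align: u64, size: u64) -> u64 {
    round_up(align, size)
}
```

With an alignment of 16 and a size of 12 the aligned size is 16; with an alignment of 4 and the same size it stays 12.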
type | alignment in scalar | alignment in std430 | alignment in std140 |
---|---|---|---|
scalar S | SS | SS | SS |
vecN<S> | SS | po2(SS * N) | po2(SS * N) |
matCxR<S> | SS | po2(SS * R) | roundUp(16, SS * R) |
array<E, N> | alignOf(E) | alignOf(E) | roundUp(16, alignOf(E)) |
struct with members M1...MN | max(alignOf(M1)...alignOf(MN)) | max(alignOf(M1)...alignOf(MN)) | max(16, alignOf(M1)...alignOf(MN)) |
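The vector, matrix and array rows of this table can be sketched as follows. The `Layout` enum and the function names are made up for this illustration; `ss` stands for SS in bytes, and struct alignment is left out because it only needs a maximum over the member alignments.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Layout {
    Scalar,
    Std430,
    Std140,
}

/// Rounds `n` up to the nearest multiple of `k`.
fn round_up(k: u64, n: u64) -> u64 {
    (n + k - 1) / k * k
}

/// Alignment of vecN<S>, where `ss` is the scalar size in bytes.
pub fn align_of_vec(layout: Layout, ss: u64, n: u64) -> u64 {
    match layout {
        Layout::Scalar => ss,
        Layout::Std430 | Layout::Std140 => (ss * n).next_power_of_two(),
    }
}

/// Alignment of matCxR<S>; only the row count R matters.
pub fn align_of_mat(layout: Layout, ss: u64, rows: u64) -> u64 {
    match layout {
        Layout::Scalar => ss,
        Layout::Std430 => (ss * rows).next_power_of_two(),
        Layout::Std140 => round_up(16, ss * rows),
    }
}

/// Alignment of array<E, N>, given alignOf(E).
pub fn align_of_array(layout: Layout, elem_align: u64) -> u64 {
    match layout {
        Layout::Scalar | Layout::Std430 => elem_align,
        Layout::Std140 => round_up(16, elem_align),
    }
}
```

For a vector of three 4-byte scalars this gives 4 in the scalar layout and 16 in std430 and std140; an array of 4-byte scalars is aligned to 4 in std430 but to 16 in std140.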
type | size in scalar / std430 / std140 |
---|---|
scalar S | SS |
vecN<S> | SS * N |
matCxR<S> | alignedSizeOf(vecR<S>) * (C-1) + sizeOf(vecR<S>) |
array<E, N> | alignedSizeOf(E) * (N-1) + sizeOf(E) |
struct with members M1...MN | offsetOf(MN) + sizeOf(MN) |
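The matrix and array rows share one shape: every element except the last occupies its aligned size, and the last occupies only its plain size. A rough sketch under that reading, with an illustrative helper name and with the element alignment and size passed in as numbers:

```rust
/// Rounds `n` up to the nearest multiple of `k`.
fn round_up(k: u64, n: u64) -> u64 {
    (n + k - 1) / k * k
}

/// Size of `count` repeated elements:
/// alignedSizeOf(E) * (count - 1) + sizeOf(E).
/// For a matrix, E is the column vector and `count` is C;
/// for an array, E is the element type and `count` is N.
pub fn size_of_repeated(elem_align: u64, elem_size: u64, count: u64) -> u64 {
    assert!(count > 0, "count must be positive");
    let stride = round_up(elem_align, elem_size);
    stride * (count - 1) + elem_size
}
```

Taking a 3x3 matrix of 4-byte scalars in std430 as an example, the column vector has alignment 16 and size 12, so the matrix size comes out as 16 * 2 + 12 = 44 bytes rather than 48.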
Members of structs are laid out according to the following algorithm.
```
// Byte offset from the start of the struct
let current_offset = 0;
for member in struct.members {
    // Align offset for member
    current_offset = roundUp(alignOf(member), current_offset);

    // Offset at which the member resides
    // This is the return value of offsetOf(member)
    member.offset = current_offset;

    if is_scalar_layout || member.ty == scalar || member.ty == vector {
        current_offset += sizeOf(member);
    } else {
        // `alignedSizeOf` is only used for matrices, arrays and structs if the layout is std430/std140
        current_offset += alignedSizeOf(member);
    }
}
```
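The algorithm above can be turned into a small runnable sketch. The `Kind` and `Member` types and the `member_offsets` function are hypothetical names introduced for this illustration; each member carries its already-computed alignment and size for the layout in question.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Kind {
    /// Scalars and vectors always advance the offset by sizeOf.
    ScalarOrVector,
    /// Matrices, arrays and structs advance by alignedSizeOf in std430/std140.
    Composite,
}

#[derive(Clone, Copy, Debug)]
pub struct Member {
    pub kind: Kind,
    pub align: u64,
    pub size: u64,
}

/// Rounds `n` up to the nearest multiple of `k`.
fn round_up(k: u64, n: u64) -> u64 {
    (n + k - 1) / k * k
}

/// Returns offsetOf(member) for every member, in declaration order.
pub fn member_offsets(members: &[Member], is_scalar_layout: bool) -> Vec<u64> {
    let mut current_offset = 0;
    let mut offsets = Vec::with_capacity(members.len());
    for member in members {
        // Align the running offset for this member.
        current_offset = round_up(member.align, current_offset);
        offsets.push(current_offset);
        if is_scalar_layout || member.kind == Kind::ScalarOrVector {
            current_offset += member.size;
        } else {
            current_offset += round_up(member.align, member.size);
        }
    }
    offsets
}
```

As a usage example, consider a hypothetical inner struct holding an 8-byte scalar followed by a 4-byte scalar (alignment 8, size 12), followed in the outer struct by a 4-byte scalar. With `is_scalar_layout` set to false the trailing scalar lands at offset 16, and with it set to true it lands at offset 12, because only the first case advances by the aligned size.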
The vector-relaxed layouts follow the same rules as std140/std430, with the only change being that struct members of vector type may have scalar alignment. Note that this also affects the computed alignment of the struct itself. See the updated algorithm below.
```
// Byte offset from the start of the struct
let current_offset = 0;
for member in struct.members {
    // Align offset for member (using scalar alignment for vectors)
    current_offset = roundUp(alignOf(member), current_offset);

    if member.ty == vector {
        let align_to_std_alignment = if sizeOf(member.ty) < 16 {
            // Offset of the last byte occupied by the vector
            let last_byte_offset = current_offset + sizeOf(member.ty) - 1;
            // first and last byte need to lie in the same 16 byte block
            floor(current_offset / 16) != floor(last_byte_offset / 16)
        } else {
            // start offset needs to be aligned to 16 bytes
            current_offset % 16 != 0
        };
        if align_to_std_alignment {
            // Align offset, now using std140/std430 alignment
            current_offset = roundUp(alignOf(member), current_offset);
        }
    }

    // Offset at which the member resides
    // This is the return value of offsetOf(member)
    member.offset = current_offset;

    if member.ty == scalar || member.ty == vector {
        current_offset += sizeOf(member); // Note: HLSL Constant Buffers always use sizeOf.
    } else {
        // `alignedSizeOf` is only used for matrices, arrays and structs
        current_offset += alignedSizeOf(member);
    }
}
```
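The vector-specific part of this algorithm can be isolated into a small sketch. It assumes that the straddling test compares the blocks of the first and the last occupied byte (offset + size - 1), which is how the Vulkan specification phrases the rule. The function names and the explicit `scalar_align` / `std_align` parameters are inventions of this example.

```rust
/// Rounds `n` up to the nearest multiple of `k`.
fn round_up(k: u64, n: u64) -> u64 {
    (n + k - 1) / k * k
}

/// True if a vector of `size` bytes placed at `offset` crosses a
/// 16 byte boundary in a way the relaxed layout does not allow.
pub fn improperly_straddles(offset: u64, size: u64) -> bool {
    assert!(size > 0, "size must be positive");
    if size <= 16 {
        // First and last byte must fall into the same 16 byte block.
        let last_byte = offset + size - 1;
        offset / 16 != last_byte / 16
    } else {
        // Larger vectors must start on a 16 byte boundary.
        offset % 16 != 0
    }
}

/// Offset of a vector member: try scalar alignment first and fall back
/// to the std140/std430 alignment only if the vector would straddle.
pub fn relaxed_vector_offset(
    current_offset: u64,
    scalar_align: u64,
    std_align: u64,
    size: u64,
) -> u64 {
    let offset = round_up(scalar_align, current_offset);
    if improperly_straddles(offset, size) {
        round_up(std_align, offset)
    } else {
        offset
    }
}
```

Under these assumptions a 12-byte vector of 4-byte scalars can sit at offset 4 (bytes 4 to 15 share one block), but starting from offset 8 it is pushed to offset 16.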
The default layout is std430. The extra requirements for the uniform address space have to be explicitly met.
- std430; with the caveat that bindings need `alignedSizeOf(T)` bytes rather than just `sizeOf(T)` bytes
- std140; with the caveat that bindings need `alignedSizeOf(T)` bytes rather than just `sizeOf(T)` bytes and matrices of the form `matCx2` have an alignment of 8 instead of 16
- matrices are column-major
- `align` and `size` attributes can be used to change the alignment and size of struct members
- std430
- std140
- scalar; via `GL_EXT_scalar_block_layout`
SSBOs require OpenGL 4.3 / OpenGL 4.0 + ARB_shader_storage_buffer_object
- std140
- std430 / scalar; via `GL_EXT_scalar_block_layout`
- matrices are column-major (can be overridden to be row-major in buffers via the `row_major` layout qualifier; added in GLSL 1.4)
- `offset` and `align` layout qualifiers can be used to change the offset and alignment of struct members (added in GLSL 4.4 / GLSL 1.4 + `ARB_enhanced_layouts`)
4.1. StorageBuffer Storage Class / PushConstant Storage Class / Uniform Storage Class with BufferBlock Decoration
- std140
- std430; default
- scalar; via `scalarBlockLayout` in Vulkan v1.2 or `VK_EXT_scalar_block_layout`
- vector-relaxed std140 / std430; since Vulkan v1.1 or via `VK_KHR_relaxed_block_layout`
- std140; default
- std430; via `uniformBufferStandardLayout` in Vulkan v1.2 or `VK_KHR_uniform_buffer_standard_layout`
- scalar; via `scalarBlockLayout` in Vulkan v1.2 or `VK_EXT_scalar_block_layout`
- vector-relaxed std140 / std430; since Vulkan v1.1 or via `VK_KHR_relaxed_block_layout`
- `Offset` decoration is required on struct members
- `ArrayStride` decoration is required on array types
- `MatrixStride` and either `ColMajor` or `RowMajor` decorations are required for matrices
- Even if scalar alignment is supported, it is generally more performant to use the base alignment.
Vulkan Shader Memory Layout Guide
SPIR-V Specification (Decorations)
SPIR-V Specification (Shader Validation)
- scalar
- vector-relaxed std140; with the caveat that `sizeOf` is always used when computing struct member offsets (see 1.5)
- scalar; via `-no-legacy-cbuf-layout` DXC flag
- matrices are column-major in buffers by default (can be overridden via the `row_major` modifier), but are row-major in shaders: notation (e.g. `float4x3` is a matrix with 4 rows and 3 columns), construction and access are all row-major
HLSL Constant Buffer Packing Rules
DXC HLSL to SPIR-V Feature Mapping
- std430; with the caveat that vector 3's size is 16 instead of 12 (however a packed vector 3 with the alignas specifier = 16 can be used instead)
- provides extra packed vectors (scalar layout)
- matrices are column-major
- `alignas` specifier can be used to change the alignment (can be applied to structs or struct members)
I believe the size of structs is not required to be a multiple of their alignment.
For the std140 and std430 layouts, the OpenGL spec states:
For the scalar layout, the `GL_EXT_scalar_block_layout` extension spec states:
Given this example:
If B is used in a std430 context, it will have this layout:
Here, the next member after the struct is aligned to (at least) the alignment of the struct.
If B is used in a scalar context, it will have this layout:
Here, the alignment requirement for the next member is gone. The size of the struct A has not changed, yet the offset of `m3` is different.