@teoxoy
Last active August 5, 2025 12:49
General information and differences between all major shader buffer memory layouts

Shader Buffer Memory Layout Info

1. General

1.1. Background

The different layout rules all solve the same problem: mapping data structures that are complex relative to scalars (e.g. u32, f32) to memory (a byte array). Each set of rules makes its own space/time tradeoffs.

Data accessed from memory requires knowledge of a byte offset (relative to the start of the memory).

The most important properties of a data structure are alignment and size.

The alignment of a data structure divides every byte offset at which that data structure may reside (i.e. offset % alignment == 0).

Alignment is always a power of 2. For performance reasons it is often greater than 1 (an alignment of 1 is usually referred to as unaligned access), because of how CPUs and GPUs perform memory accesses at the hardware level.

1.2. Notation

The SS constant denotes the inherent size of the (inner) scalar.

The roundUp function (returns n rounded up to a multiple of k) is defined for positive integers k and n as:

  • roundUp(k, n) = ⌈n ÷ k⌉ × k

The po2 function (returns n rounded up to a power of 2) is defined for positive integer n as:

  • po2(n) = 2^⌈log2(n)⌉

alignOf(E) is the alignment of E as specified in 1.3 or as overridden through shading language modifiers.

sizeOf(E) is the size of E as specified in 1.3 or as overridden through shading language modifiers.

alignedSizeOf(E) is the aligned size of E computed as roundUp(alignOf(E), sizeOf(E)) (also called stride for matrices and arrays and can be overridden through shading language modifiers).

offsetOf(M) is the offset of member/field M as computed in 1.4/1.5 or as overridden through shading language modifiers.
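
The two helper functions from this section can be sketched in Rust as follows. This is only an illustration of the notation: the names `round_up` and `po2` and the choice of `u32` are assumptions, not part of any shading language or API.

```rust
/// Rounds `n` up to the next multiple of `k` (both assumed positive).
/// Equivalent to roundUp(k, n) = ceil(n / k) * k.
fn round_up(k: u32, n: u32) -> u32 {
    (n + k - 1) / k * k
}

/// Rounds `n` up to the next power of two (n assumed positive).
/// Equivalent to po2(n) = 2^ceil(log2(n)).
fn po2(n: u32) -> u32 {
    n.next_power_of_two()
}

/// alignedSizeOf(E) expressed through the two properties of E.
fn aligned_size_of(align: u32, size: u32) -> u32 {
    round_up(align, size)
}
```

For example, a 12-byte vec3<f32> with an alignment of 16 has an aligned size of 16, while `po2(12)` yields the 16-byte alignment itself.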

1.3. Scalar, std430, std140 layouts

Alignment

| ty | scalar | std430 | std140 |
|---|---|---|---|
| scalar S | SS | SS | SS |
| vecN<S> | SS | po2(SS * N) | po2(SS * N) |
| matCxR<S> | SS | po2(SS * R) | roundUp(16, SS * R) |
| array<E, N> | alignOf(E) | alignOf(E) | roundUp(16, alignOf(E)) |
| struct with members M1...MN | max(alignOf(M1)...alignOf(MN)) | max(alignOf(M1)...alignOf(MN)) | max(16, alignOf(M1)...alignOf(MN)) |
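
As a rough sketch, the alignment rules for vectors, matrices and arrays in the table above could be written in Rust like this. The `Layout` enum and the function names are hypothetical and exist only for this illustration; `ss` is the scalar size SS in bytes.

```rust
/// The three layouts compared in the table (hypothetical type for illustration).
#[derive(Clone, Copy)]
enum Layout {
    Scalar,
    Std430,
    Std140,
}

fn round_up(k: u32, n: u32) -> u32 {
    (n + k - 1) / k * k
}

/// Alignment of vecN<S>, where `ss` is the scalar size in bytes.
fn align_of_vec(layout: Layout, ss: u32, n: u32) -> u32 {
    match layout {
        Layout::Scalar => ss,
        Layout::Std430 | Layout::Std140 => (ss * n).next_power_of_two(),
    }
}

/// Alignment of matCxR<S>; only the row count R matters.
fn align_of_mat(layout: Layout, ss: u32, rows: u32) -> u32 {
    match layout {
        Layout::Scalar => ss,
        Layout::Std430 => (ss * rows).next_power_of_two(),
        Layout::Std140 => round_up(16, ss * rows),
    }
}

/// Alignment of array<E, N>, given the alignment of its element type E.
fn align_of_array(layout: Layout, elem_align: u32) -> u32 {
    match layout {
        Layout::Scalar | Layout::Std430 => elem_align,
        Layout::Std140 => round_up(16, elem_align),
    }
}
```

With these definitions a vec3<f32> has an alignment of 4 in the scalar layout and 16 in std430/std140, and an array of f32 has an alignment of 4 in std430 but 16 in std140.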

Size

| ty | scalar / std430 / std140 |
|---|---|
| scalar S | SS |
| vecN<S> | SS * N |
| matCxR<S> | alignedSizeOf(vecR<S>) * (C-1) + sizeOf(vecR<S>) |
| array<E, N> | alignedSizeOf(E) * (N-1) + sizeOf(E) |
| struct with members M1...MN | offsetOf(MN) + sizeOf(MN) |
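
The matrix and array rows share one shape: all elements but the last occupy a full stride, and the last contributes only its own size. A minimal Rust sketch of that formula follows. The function name is an assumption, and the stride is passed in explicitly rather than derived. One reason for passing it in is that the OpenGL specification rounds std140 array strides up to the alignment of a vec4.

```rust
fn round_up(k: u32, n: u32) -> u32 {
    (n + k - 1) / k * k
}

/// Size of `count` elements placed `stride` bytes apart,
/// with no padding after the last element.
fn size_of_strided(stride: u32, elem_size: u32, count: u32) -> u32 {
    stride * (count - 1) + elem_size
}

/// mat3x3<f32> in std430: three vec3<f32> columns (size 12, alignment 16).
fn mat3x3_f32_std430_size() -> u32 {
    let column_stride = round_up(16, 12); // alignedSizeOf(vec3<f32>) = 16
    size_of_strided(column_stride, 12, 3)
}

/// mat3x3<f32> in the scalar layout: columns have alignment 4.
fn mat3x3_f32_scalar_size() -> u32 {
    let column_stride = round_up(4, 12); // alignedSizeOf(vec3<f32>) = 12
    size_of_strided(column_stride, 12, 3)
}
```

Under these rules a mat3x3<f32> takes 44 bytes in std430 (16 + 16 + 12) and 36 bytes in the scalar layout.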

1.4. Detailed struct layout info

Members of structs are laid out according to the following algorithm.

// Byte offset from the start of the struct
let current_offset = 0;

for member in struct.members {
    // Align offset for member
    current_offset = roundUp(alignOf(member), current_offset);

    // Offset at which the member resides
    // This is the return value of offsetOf(member)
    member.offset = current_offset;

    if is_scalar_layout || member.ty == scalar || member.ty == vector {
        current_offset += sizeOf(member);
    } else {
        // `alignedSizeOf` is only used for matrices, arrays and structs if the layout is std430/std140
        current_offset += alignedSizeOf(member);
    }
}
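
A runnable Rust version of the algorithm above might look like the following sketch. The `Member` type and the `member_offsets` function are hypothetical. Each member is described only by its alignment, its size and whether it is a scalar or vector, which is all the algorithm needs.

```rust
/// Description of one struct member (hypothetical type for illustration).
#[derive(Clone, Copy)]
struct Member {
    align: u32,
    size: u32,
    /// true for scalars and vectors, false for matrices, arrays and structs
    is_scalar_or_vector: bool,
}

fn round_up(k: u32, n: u32) -> u32 {
    (n + k - 1) / k * k
}

/// Returns offsetOf(member) for every member, in declaration order.
fn member_offsets(members: &[Member], is_scalar_layout: bool) -> Vec<u32> {
    let mut current_offset = 0;
    let mut offsets = Vec::with_capacity(members.len());
    for m in members {
        // Align the offset for this member and record it.
        current_offset = round_up(m.align, current_offset);
        offsets.push(current_offset);
        // Advance by the plain size, or by the aligned size for
        // matrices, arrays and structs in std430/std140.
        current_offset += if is_scalar_layout || m.is_scalar_or_vector {
            m.size
        } else {
            round_up(m.align, m.size)
        };
    }
    offsets
}

/// struct { a: f32, b: vec3<f32>, c: f32 } described for std430.
fn example_std430() -> Vec<u32> {
    let f = Member { align: 4, size: 4, is_scalar_or_vector: true };
    let v3 = Member { align: 16, size: 12, is_scalar_or_vector: true };
    member_offsets(&[f, v3, f], false)
}
```

For this example struct the std430 offsets are 0, 16 and 28: the trailing f32 is packed directly after the 12-byte vec3 because vectors advance the offset by their size, not their aligned size.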

1.5. Vector-relaxed std140 / std430 layouts

These are the same std140/std430 layout rules, with one change: struct members of vector type may have scalar alignment. Note that this also affects the alignment of the struct itself. See the updated algorithm below.

// Byte offset from the start of the struct
let current_offset = 0;

for member in struct.members {
    // Align offset for member (using scalar alignment for vectors)
    current_offset = roundUp(alignOf(member), current_offset);

    if member.ty == vector {
        let align_to_std_alignment = if sizeOf(member.ty) < 16 {
            // offset of the last byte occupied by the vector
            let last_byte_offset = current_offset + sizeOf(member.ty) - 1;
            // first and last byte need to lie in the same 16 byte block
            floor(current_offset / 16) != floor(last_byte_offset / 16)
        } else {
            // start offset needs to be aligned to 16 bytes
            current_offset % 16 != 0
        }
        if align_to_std_alignment {
            // Align offset, now using std140/std430 alignment
            current_offset = roundUp(alignOf(member), current_offset);
        }
    }

    // Offset at which the member resides
    // This is the return value of offsetOf(member)
    member.offset = current_offset;

    if member.ty == scalar || member.ty == vector {
        current_offset += sizeOf(member); // Note: HLSL Constant Buffers always use sizeOf.
    } else {
        // `alignedSizeOf` is only used for matrices, arrays and structs
        current_offset += alignedSizeOf(member);
    }
}
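
The vector-specific part of the relaxed algorithm can be isolated into a small Rust sketch. The function name and parameters are assumptions made for this example. Following the Vulkan specification's wording on straddling, the check compares the 16-byte blocks of the vector's first and last byte (offset + size - 1).

```rust
fn round_up(k: u32, n: u32) -> u32 {
    (n + k - 1) / k * k
}

/// Offset of a vecN<S> member in a vector-relaxed layout, given the
/// running offset before the member and the scalar size in bytes.
fn relaxed_vector_offset(current_offset: u32, scalar_size: u32, n: u32) -> u32 {
    let size = scalar_size * n;
    // First try the scalar alignment.
    let offset = round_up(scalar_size, current_offset);
    let needs_std_alignment = if size < 16 {
        // First and last byte must lie in the same 16-byte block.
        offset / 16 != (offset + size - 1) / 16
    } else {
        // Vectors of 16 bytes or more must start on a 16-byte boundary.
        offset % 16 != 0
    };
    if needs_std_alignment {
        // Fall back to the std140/std430 vector alignment po2(SS * N).
        round_up(size.next_power_of_two(), offset)
    } else {
        offset
    }
}
```

A vec3<f32> following a single f32 stays at offset 4 (bytes 4 to 15 fit in one block), whereas the same vector following two f32 values would span bytes 8 to 19 and is moved to offset 16.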

2. WGSL

The default layout is std430. The extra requirements for the uniform address space have to be explicitly met.

2.1. Storage Address Space

  • std430; with the caveat that bindings need alignedSizeOf(T) bytes rather than just sizeOf(T) bytes

2.2. Uniform Address Space

  • std140; with the caveat that bindings need alignedSizeOf(T) bytes rather than just sizeOf(T) bytes and matrices of the form matCx2 have an alignment of 8 instead of 16
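
To illustrate the binding size caveat, consider a WGSL struct with a vec4<f32> followed by an f32: in this document's notation its sizeOf is 20 and its alignment is 16, so the binding needs 32 bytes. The Rust sketch below mirrors that struct on the host side. The `Uniforms` type and its field names are made up for this example, and `#[repr(C, align(16))]` is just one possible way to get the trailing padding.

```rust
fn round_up(k: u32, n: u32) -> u32 {
    (n + k - 1) / k * k
}

/// Bytes a binding must provide: alignedSizeOf(T) = roundUp(alignOf(T), sizeOf(T)).
fn required_binding_size(align: u32, size: u32) -> u32 {
    round_up(align, size)
}

/// Hypothetical host-side mirror of `struct { color: vec4<f32>, intensity: f32 }`.
/// `align(16)` makes the compiler pad the struct to a multiple of 16 bytes.
#[repr(C, align(16))]
struct Uniforms {
    color: [f32; 4],
    intensity: f32,
}

/// Size of the host-side mirror in bytes, including trailing padding.
fn host_size() -> usize {
    std::mem::size_of::<Uniforms>()
}
```

Both `required_binding_size(16, 20)` and `host_size()` come out as 32, so uploading the whole host struct satisfies the requirement.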

2.3. Notes

  • matrices are column-major
  • align and size attributes can be used to change the alignment and size of struct members

2.4. References

WGSL Specification

3. GLSL

3.1. Shader Storage Buffer Object

  • std430
  • std140
  • scalar; via GL_EXT_scalar_block_layout

SSBOs require OpenGL 4.3 / OpenGL 4.0 + ARB_shader_storage_buffer_object

3.2. Uniform Buffer Object

  • std140
  • std430 / scalar; via GL_EXT_scalar_block_layout

3.3. Notes

  • matrices are column-major (can be overridden to be row-major in buffers via the row_major layout qualifier; added in GLSL 1.4)
  • offset and align layout qualifiers can be used to change the offset and alignment of struct members (added in GLSL 4.4 / GLSL 1.4 + ARB_enhanced_layouts)

3.4. References

OpenGL Specification

GLSL Specification

4. SPIR-V for Vulkan

4.1. StorageBuffer Storage Class / PushConstant Storage Class / Uniform Storage Class with BufferBlock Decoration

  • std140
  • std430; default
  • scalar; via scalarBlockLayout in Vulkan v1.2 or VK_EXT_scalar_block_layout
  • vector-relaxed std140 / std430; since Vulkan v1.1 or via VK_KHR_relaxed_block_layout

4.2. Uniform Storage Class with Block Decoration

  • std140; default
  • std430; via uniformBufferStandardLayout in Vulkan v1.2 or VK_KHR_uniform_buffer_standard_layout
  • scalar; via scalarBlockLayout in Vulkan v1.2 or VK_EXT_scalar_block_layout
  • vector-relaxed std140 / std430; since Vulkan v1.1 or via VK_KHR_relaxed_block_layout

4.3. Notes

  • Offset decoration is required on struct members
  • ArrayStride decoration is required on array types
  • MatrixStride and either ColMajor or RowMajor decorations are required for matrices
  • Even if scalar alignment is supported, it is generally more performant to use the base alignment.

4.4. References

Vulkan Specification

Vulkan Shader Memory Layout Guide

SPIR-V Specification (Decorations)

SPIR-V Specification (Shader Validation)

5. HLSL

5.1. Structured Buffer

  • scalar

5.2. Constant Buffer

  • vector-relaxed std140; with the caveat that sizeOf is always used when computing struct member offsets (see 1.5)
  • scalar; via -no-legacy-cbuf-layout DXC flag
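
The sizeOf caveat for legacy constant buffers is easiest to see with an array followed by a scalar, e.g. `float a[2]; float b;`. The Rust sketch below compares where `b` lands under std140 and under the HLSL constant buffer rules. The function names are assumptions, and the 16-byte stride for f32 array elements is hard-coded as an assumption shared by both layouts.

```rust
fn round_up(k: u32, n: u32) -> u32 {
    (n + k - 1) / k * k
}

/// Assumed stride of f32 array elements in std140 and HLSL constant buffers.
const ARRAY_STRIDE: u32 = 16;

/// sizeOf(array): the last element is not padded to the stride.
fn array_size(elem_size: u32, count: u32) -> u32 {
    ARRAY_STRIDE * (count - 1) + elem_size
}

/// Offset of a member that follows an array placed at offset 0.
/// `hlsl_cbuffer` selects sizeOf instead of alignedSizeOf for the advance.
fn offset_after_array(arr_size: u32, arr_align: u32, next_align: u32, hlsl_cbuffer: bool) -> u32 {
    let advance = if hlsl_cbuffer {
        arr_size
    } else {
        round_up(arr_align, arr_size)
    };
    round_up(next_align, advance)
}
```

Here `array_size(4, 2)` is 20, so `b` is placed at offset 20 in an HLSL constant buffer but at offset 32 under std140.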

5.3. Notes

  • matrices are column-major in buffers by default (can be overridden via the row_major modifier) but are row-major in shaders: notation (e.g. float4x3 is a matrix with 4 rows and 3 columns), construction and access are all row-major

5.4. References

DXC Buffer Packing Wiki

HLSL Constant Buffer Packing Rules

DXC HLSL to SPIR-V Feature Mapping

6. MSL

6.1. Device / Constant Address Space

  • std430; with the caveat that vector 3's size is 16 instead of 12 (however a packed vector 3 with the alignas specifier = 16 can be used instead)

6.2. Notes

  • provides extra packed vectors (scalar layout)
  • matrices are column-major
  • alignas specifier can be used to change the alignment (can be applied to structs or struct members)

6.3. References

MSL Specification

@maxime-modulopi

maxime-modulopi commented Mar 14, 2025

I believe the size of structs is not required to be a multiple of their alignment.

For the std140 and std430 layouts, the OpenGL spec states:

If the member is a structure, the base alignment of the structure is N, where N is the largest base alignment value of any of its members, and rounded up to the base alignment of a vec4. The individual members of this substructure are then assigned offsets by applying this set of rules recursively, where the base offset of the first member of the sub-structure is equal to the aligned offset of the structure. The structure may have padding at the end; the base offset of the member following the sub-structure is rounded up to the next multiple of the base alignment of the structure.

For the scalar layout, the GL_EXT_scalar_block_layout extension spec states:

If the member is a structure, the base alignment of the structure is N, where N is the largest base alignment value of any of its members. The individual members of this substructure are then assigned offsets by applying this set of rules recursively, where the base offset of the first member of the sub-structure is equal to the aligned offset of the structure.

Given this example:

struct A
{
    uint64_t m1;
    uint32_t m2;
};

struct B
{
    A a;
    uint32_t m3;
};

If B is used in a std430 context, it will have this layout:

struct
{
    struct
    {
        uint64_t m1;
        uint32_t m2;
    }
    // 4-byte padding
    uint32_t m3;
}

Here, the next member after the struct is aligned to (at least) the alignment of the struct.

If B is used in a scalar context, it will have this layout:

struct
{
    struct
    {
        uint64_t m1;
        uint32_t m2;
    }
    uint32_t m3;
}

Here, the alignment requirement for the next member is gone. The size of the struct A has not changed, yet the offset of m3 is different.
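
The two offsets of m3 can be reproduced with a few lines of Rust. This is only a sketch of the arithmetic: the function name is made up, struct A is taken to have alignment 8 and size 12 (offsetOf(m2) + sizeOf(m2)), and it is assumed to sit at offset 0 of B.

```rust
fn round_up(k: u32, n: u32) -> u32 {
    (n + k - 1) / k * k
}

/// Offset of the member following a nested struct placed at offset 0.
fn offset_after_struct(struct_align: u32, struct_size: u32, next_align: u32, scalar_layout: bool) -> u32 {
    // std430 advances by the aligned size, scalar layout by the plain size.
    let advance = if scalar_layout {
        struct_size
    } else {
        round_up(struct_align, struct_size)
    };
    round_up(next_align, advance)
}

/// m3 in std430: A occupies 16 bytes including padding.
fn m3_offset_std430() -> u32 {
    offset_after_struct(8, 12, 4, false)
}

/// m3 in the scalar layout: A occupies only its 12 bytes.
fn m3_offset_scalar() -> u32 {
    offset_after_struct(8, 12, 4, true)
}
```

This gives offset 16 for m3 in std430 and offset 12 in the scalar layout.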

@teoxoy
Author

teoxoy commented Mar 21, 2025

I see what you mean, I think in practice the only difference is that bound buffers can be slightly smaller in cases where the offset of the last element + its size is not a multiple of the struct's alignment.

It seems GL, Vulkan, Metal and D3D12 don't require bound ranges of buffers to be at least as big as the struct declaration in the shader, only WebGPU has this requirement and I guess that's why all the native APIs only talk about alignment and offsets.

I will see how to update the md to best reflect this.

@maxime-modulopi

There is a bigger difference: scalar layout does not actually match C/C++ layout.

In C/C++, the layout would be:

struct
{
    struct
    {
        uint64_t m1;
        uint32_t m2;
        // 4-byte padding so that sizeof(A) % alignof(A) == 0
    }
    uint32_t m3;
}

In GLSL with scalar layout:

struct
{
    struct
    {
        uint64_t m1;
        uint32_t m2;
    }
    uint32_t m3;
}

m3 does not have the same offset in these cases.
(I actually spent a few hours debugging my code because of this issue as I was assuming scalar == C layout).
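
For comparison, the C layout can be checked from Rust, whose `#[repr(C)]` structs follow the C rules of the target platform. The sketch below assumes a target where u64 is 8-byte aligned (such as x86_64). The helper names are made up for this example.

```rust
#[repr(C)]
#[derive(Default)]
struct A {
    m1: u64,
    m2: u32,
    // the compiler adds 4 bytes of padding here
}

#[repr(C)]
#[derive(Default)]
struct B {
    a: A,
    m3: u32,
}

/// sizeof(A) under C rules, including trailing padding.
fn size_of_a() -> usize {
    std::mem::size_of::<A>()
}

/// Byte offset of `m3` inside `B`, measured on a live value.
fn offset_of_m3() -> usize {
    let b = B::default();
    let base = &b as *const B as usize;
    let field = &b.m3 as *const u32 as usize;
    field - base
}
```

On such a target `size_of_a()` is 16 and `offset_of_m3()` is 16, compared to an offset of 12 for m3 in the GLSL scalar layout.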

@teoxoy
Author

teoxoy commented Aug 5, 2025

You are right, thanks for bringing this up! I updated the gist to reflect this. I think it should be correct now but let me know if you spot any issue.

Looking at all of this again, it seems the best way to match C layout would be to either use the scalar layout and pad the end of structs manually, or use std430 and increase the alignment (via alignas) of structs representing vectors and matrices.
