@Seas0
Created May 15, 2026 05:40

Generic G-Buffer ReShade Addon Library Writeup

Scope

This repository implements a ReShade addon library whose exported plugin name is Generic G-Buffer(Normal). The build product is a Windows dynamic library with ReShade addon extensions: .addon32 for Win32 and .addon64 for x64. At runtime, ReShade loads the library into the game process, the addon subscribes to ReShade API events, and the addon exposes a selected game render target to ReShade effect shaders through the texture binding name GATO.

The implementation is intentionally not a shader bytecode patcher. It does not rewrite, disassemble, or replace game shaders. Instead, it acts as a graphics API observer and effect-texture injector:

  1. Track render-target usage across command lists and queues.
  2. Rank or display likely G-buffer resources in an ImGui overlay.
  3. Let the user select a candidate resource.
  4. Create a shader-resource view for the selected render target.
  5. Bind that view into the ReShade effect runtime so an .fx shader can sample the texture.

Build Artifact and Dependencies

源码/normal.vcxproj (源码 is Chinese for "source code") defines a C++17 Visual Studio project that builds a dynamic library. The project defines Debug and Release configurations for both Win32 and x64. It selects platform toolset v141, v142, or v143 (Visual Studio 2017, 2019, or 2022) depending on the installed Visual Studio version. The include path expects ReShade headers at ..\..\include and ImGui headers at ..\..\deps\imgui relative to 源码/.

Important build properties:

  • ConfigurationType is DynamicLibrary.
  • TargetName is normal.
  • TargetExt is .addon32 on Win32.
  • TargetExt is .addon64 on x64.
  • LanguageStandard is stdcpp17.
  • ImTextureID=ImU64 is defined so ImGui texture identifiers match ReShade's expected 64-bit handle usage.

Typical build commands from a Developer PowerShell are:

msbuild "源码\normal.vcxproj" /p:Configuration=Release /p:Platform=x64
msbuild "源码\normal.vcxproj" /p:Configuration=Release /p:Platform=Win32

Public Addon Surface

The plugin surface is the standard ReShade addon DLL entry pattern:

  • NAME exports the visible addon name.
  • DESCRIPTION exports the visible addon description.
  • DllMain calls reshade::register_addon(hModule) on process attach.
  • DllMain calls reshade::unregister_addon(hModule) on process detach.

After successful addon registration, register_addon() subscribes to all runtime events needed by the library. The inverse unregister_addon() unsubscribes from the same event families.

The registered event groups are:

  • Device, command-list, command-queue, and effect-runtime lifetime.
  • Resource destruction.
  • Draw, indexed draw, and indirect draw.
  • Render-target binding and render-pass begin.
  • Render-target clear.
  • Command-list reset and command-list execution.
  • Present.
  • ReShade effect begin, finish, and reload.
  • Overlay drawing through ImGui.

This event set is the core of the library. It gives the addon enough visibility to infer how render targets are used before ReShade post-processing begins.

Core Data Model

The implementation uses ReShade private data blocks attached to API objects. Each private-data struct has a UUID so ReShade can store and retrieve it from the corresponding object.

resource_stats

resource_stats records per-resource counters:

  • vertices: accumulated vertex or index count multiplied by instance count.
  • drawcalls: direct and indirect draw count.
  • drawcalls_indirect: subset of draw calls produced by indirect commands.
  • clears: explicit render-target clears plus fullscreen quad draws treated as clear-like passes.

These values are heuristics. They do not prove that a texture is a normal buffer, but they provide useful evidence for distinguishing scene G-buffer passes from late fullscreen post-process targets.

candidate_flag

candidate_flag records two binary signals:

  • rendered_with_depth: the render target was bound while a depth-stencil view was also active.
  • multiple_render_targets: the render target was bound as part of an MRT set.

These signals matter because normal buffers in deferred or hybrid renderers are commonly written during geometry passes that also bind depth and often output multiple G-buffer attachments.

device_tracking

device_tracking is device-level state:

  • queues: graphics queues that should be merged at present time.
  • render_target_resources: last presented frame's render-target statistics.
  • transient_resources: resources that have seen a destroy event.
  • candidates: current-frame candidate flags.
  • last_frame_candidates: previous-frame candidate flags retained for display.

reset_on_present() swaps current-frame candidate flags into last_frame_candidates, clears the public render-target snapshot, and starts a new candidate map for the next frame.
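The frame rollover described above can be sketched as follows. This is a minimal standalone model, not the addon's code: resource_handle, the trimmed structs, and device_tracking_model are stand-ins assumed for illustration, and the real implementation stores this state in ReShade private data.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <utility>

using resource_handle = std::uint64_t;  // stand-in for a ReShade resource handle

struct candidate_flag {
    bool rendered_with_depth = false;
    bool multiple_render_targets = false;
};

struct resource_stats {
    std::uint64_t vertices = 0;
    std::uint32_t drawcalls = 0;
};

// Hypothetical, trimmed model of the device-level state.
struct device_tracking_model {
    std::unordered_map<resource_handle, resource_stats> render_target_resources;
    std::unordered_map<resource_handle, candidate_flag> candidates;
    std::unordered_map<resource_handle, candidate_flag> last_frame_candidates;

    void reset_on_present() {
        // Keep this frame's flags for display in the overlay.
        last_frame_candidates = std::move(candidates);
        // A moved-from map is valid but unspecified, so empty it explicitly.
        candidates.clear();
        // Drop the previously published snapshot before the next publish.
        render_target_resources.clear();
    }
};
```

Moving the map instead of copying it avoids reallocating the candidate table on every present.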

state_tracking

state_tracking is attached to both command lists and command queues. It tracks:

  • Whether the state belongs to a queue (is_queue).
  • The currently bound render targets.
  • Per-render-target counters collected while those targets are active.

The merge() method copies the source's current render-target binding and accumulates all counters into the target. This is how recorded command-list activity is propagated into queue-level state when command lists are executed.
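A merge of this shape might look like the sketch below. The type names mirror the writeup, but the field layout and the state_tracking_model name are assumptions; only the behavior (copy the binding, add the counters) comes from the description above.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

using resource_handle = std::uint64_t;  // stand-in for a ReShade resource handle

struct resource_stats {
    std::uint64_t vertices = 0;
    std::uint32_t drawcalls = 0;
    std::uint32_t drawcalls_indirect = 0;
    std::uint32_t clears = 0;
};

// Hypothetical model of the per-command-list / per-queue state.
struct state_tracking_model {
    bool is_queue = false;
    std::vector<resource_handle> current_render_targets;
    std::unordered_map<resource_handle, resource_stats> counters;

    // Fold a recorded command list (source) into this state (target).
    void merge(const state_tracking_model &source) {
        current_render_targets = source.current_render_targets;
        for (const auto &[handle, stats] : source.counters) {
            resource_stats &dst = counters[handle];  // zero-initialized on first use
            dst.vertices += stats.vertices;
            dst.drawcalls += stats.drawcalls;
            dst.drawcalls_indirect += stats.drawcalls_indirect;
            dst.clears += stats.clears;
        }
    }
};
```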

runtime_tracking

runtime_tracking is attached to each ReShade effect_runtime. It holds the shader-injection state:

  • selected_resource_handle: the game resource selected from the overlay.
  • selected_shader_resource: the ReShade resource view created for sampling.
  • last_resource_handle: intended cache key for avoiding redundant SRV rebuilds.
  • srv_desc: view description derived from the selected resource format.

This state is separate from device_tracking because ReShade effect runtimes are the objects that own effect texture bindings.

Render-Target Discovery Pipeline

The discovery logic has five stages.

1. Track Current Render Targets

on_bind_render_targets() handles classic render-target binding. It clears the current target list for the command list, resolves each non-null render-target view to its underlying resource, and stores those resources in current_render_targets.

If a depth-stencil view is present, each target is marked rendered_with_depth. If more than one render target is bound, each target is marked multiple_render_targets.

on_begin_render_pass() performs the same job for render-pass based APIs. This covers modern command models where the active attachments are introduced by a render pass rather than an immediate render-target binding call.
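The binding step can be modeled without any graphics API as below. track_bound_targets is a hypothetical helper, handles are plain integers with 0 standing for a null view, and two details are assumptions rather than confirmed behavior: the MRT test counts only non-null targets, and flags stay set once raised within a frame.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

using resource_handle = std::uint64_t;  // 0 models a null view or resource

struct candidate_flag {
    bool rendered_with_depth = false;
    bool multiple_render_targets = false;
};

// Hypothetical helper: rebuild the bound-target list and update candidate flags.
// `rtvs` holds already-resolved resource handles; `dsv` is 0 when no depth is bound.
inline std::vector<resource_handle> track_bound_targets(
    const std::vector<resource_handle> &rtvs, resource_handle dsv,
    std::unordered_map<resource_handle, candidate_flag> &candidates) {
    std::vector<resource_handle> current;
    for (resource_handle rt : rtvs) {
        if (rt != 0) current.push_back(rt);  // skip null slots
    }
    for (resource_handle rt : current) {
        candidate_flag &flag = candidates[rt];
        // Assumed: flags are sticky, one qualifying bind per frame is enough.
        flag.rendered_with_depth |= (dsv != 0);
        flag.multiple_render_targets |= (current.size() > 1);
    }
    return current;
}
```

The same helper would serve both the classic binding callback and the render-pass callback, since both reduce to a list of color attachments plus an optional depth attachment.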

2. Count Draw Activity

on_draw() increments draw count and vertex count for every currently bound render target. It also treats vertices == 6 && instances == 1 as a fullscreen draw and increments clears. Six vertices corresponds to a quad drawn as two triangles, so this is a practical heuristic for engines that clear or overwrite attachments with a fullscreen quad; a single oversized fullscreen triangle (three vertices) would not match the check.

on_draw_indexed() reuses the same logic, treating the index count as the vertex count signal.

on_draw_indirect() ignores dispatch commands. For every other indirect command it increments both drawcalls and drawcalls_indirect for every current render target.
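The counting rules for direct and indirect draws can be sketched as follows. pass_state, count_draw, and count_draw_indirect are hypothetical names, and the is_dispatch flag stands in for whatever command-type check the real callback performs.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

using resource_handle = std::uint64_t;  // stand-in for a ReShade resource handle

struct resource_stats {
    std::uint64_t vertices = 0;
    std::uint32_t drawcalls = 0;
    std::uint32_t drawcalls_indirect = 0;
    std::uint32_t clears = 0;
};

struct pass_state {
    std::vector<resource_handle> current_render_targets;
    std::unordered_map<resource_handle, resource_stats> counters;
};

// Direct (or indexed) draw: `vertices` is the vertex or index count.
inline void count_draw(pass_state &state, std::uint32_t vertices, std::uint32_t instances) {
    // Six vertices in one instance is treated as a fullscreen two-triangle quad.
    const bool clear_like = (vertices == 6 && instances == 1);
    for (resource_handle rt : state.current_render_targets) {
        resource_stats &s = state.counters[rt];
        s.vertices += static_cast<std::uint64_t>(vertices) * instances;
        s.drawcalls += 1;
        if (clear_like) s.clears += 1;
    }
}

// Indirect command: argument counts live on the GPU, so only draw counts change.
inline void count_draw_indirect(pass_state &state, bool is_dispatch) {
    if (is_dispatch) return;  // compute dispatches are ignored
    for (resource_handle rt : state.current_render_targets) {
        resource_stats &s = state.counters[rt];
        s.drawcalls += 1;
        s.drawcalls_indirect += 1;
    }
}
```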

3. Count Explicit Clears

on_clear_render_target_view() resolves the cleared render-target view back to a resource. It only counts the clear if the cleared resource is currently bound as a render target. This prevents unrelated clear calls from polluting the active pass statistics.
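A minimal sketch of that filter, assuming the view has already been resolved to a resource handle. count_clear and the flat clears map are hypothetical simplifications of the per-resource counters.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

using resource_handle = std::uint64_t;  // stand-in for a ReShade resource handle

// Hypothetical helper: count a clear only when the cleared resource is bound.
// Returns true when the clear was counted.
inline bool count_clear(const std::vector<resource_handle> &current_render_targets,
                        std::unordered_map<resource_handle, std::uint32_t> &clears,
                        resource_handle cleared) {
    const bool bound = std::find(current_render_targets.begin(),
                                 current_render_targets.end(),
                                 cleared) != current_render_targets.end();
    if (!bound) return false;  // unrelated clear, leave the statistics alone
    ++clears[cleared];
    return true;
}
```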

4. Merge Recorded Command Lists

on_execute_primary() merges counters from a recorded command list into the queue-level state when that command list is executed. Immediate command lists are ignored because their work is already queue-level.

on_execute_secondary() merges secondary command-list state into the parent command list. If the secondary list has no explicit target state while the parent has active render targets, the implementation records one synthetic indirect draw. This covers a narrow case where secondary command-list work inherits render targets from the parent.
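The secondary-list case might be modeled as below. The exact condition and the choice to record the synthetic draw once per parent target are assumptions; list_state and execute_secondary are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

using resource_handle = std::uint64_t;  // stand-in for a ReShade resource handle

struct draw_stats {
    std::uint32_t drawcalls = 0;
    std::uint32_t drawcalls_indirect = 0;
};

struct list_state {
    std::vector<resource_handle> current_render_targets;
    std::unordered_map<resource_handle, draw_stats> counters;
};

inline void execute_secondary(list_state &parent, const list_state &secondary) {
    // Assumed condition: the secondary list recorded nothing of its own,
    // while the parent has active render targets it would inherit.
    const bool inherits_targets = secondary.current_render_targets.empty() &&
                                  secondary.counters.empty() &&
                                  !parent.current_render_targets.empty();
    if (inherits_targets) {
        // Record one synthetic indirect draw against each inherited target.
        for (resource_handle rt : parent.current_render_targets) {
            draw_stats &s = parent.counters[rt];
            s.drawcalls += 1;
            s.drawcalls_indirect += 1;
        }
        return;
    }
    // Ordinary case: copy the binding and add the recorded counters.
    parent.current_render_targets = secondary.current_render_targets;
    for (const auto &[handle, stats] : secondary.counters) {
        parent.counters[handle].drawcalls += stats.drawcalls;
        parent.counters[handle].drawcalls_indirect += stats.drawcalls_indirect;
    }
}
```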

5. Publish the Frame Snapshot

on_present() is the frame boundary. It creates a temporary queue-state accumulator, merges every tracked graphics queue into it, resets per-queue counters, and publishes the merged render_target_counters into device_tracking::render_target_resources.

The overlay reads this published snapshot rather than live in-flight command state. That design keeps the UI focused on the last completed frame and avoids presenting partially recorded command-list data.
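The publish step reduces to "merge every queue, reset the queues, hand back the merged map". The sketch below models that with hypothetical types (queue_state, counter_map, publish_frame) and omits locking for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

using resource_handle = std::uint64_t;  // stand-in for a ReShade resource handle

struct resource_stats {
    std::uint64_t vertices = 0;
    std::uint32_t drawcalls = 0;
};

using counter_map = std::unordered_map<resource_handle, resource_stats>;

struct queue_state {
    counter_map counters;
};

// Merge all tracked queues into one snapshot and reset their counters.
inline counter_map publish_frame(std::vector<queue_state> &queues) {
    counter_map merged;  // temporary accumulator for this frame
    for (queue_state &queue : queues) {
        for (const auto &[handle, stats] : queue.counters) {
            merged[handle].vertices += stats.vertices;
            merged[handle].drawcalls += stats.drawcalls;
        }
        queue.counters.clear();  // start the next frame from zero
    }
    return merged;  // the caller stores this as the published snapshot
}
```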

Overlay Selection UI

draw_settings_overlay() is registered as the ReShade overlay callback. It is both the inspection UI and the user-facing selection mechanism.

The overlay pipeline is:

  1. Acquire a shared lock.
  2. Copy the published render-target snapshot into a temporary vector.
  3. Release the lock before calling get_resource_desc() for each resource.
  4. Skip descriptor reads for resources already marked transient.
  5. Sort resources by descending width and height, then by resource handle.
  6. Disable invalid, unknown-format, or non-shader_resource resources.
  7. Auto-select the first usable resource if no resource is currently selected.
  8. Render one checkbox row per candidate.
  9. Display size, format, clear count, draw count, indirect draw count, vertex count, shader-resource capability, depth-binding flag, and MRT flag.
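Steps 5 to 7 can be sketched as a sort followed by a first-usable scan. rt_row, sort_rows, and auto_select are hypothetical names; the usable flag stands for "valid, known format, and created with shader-resource usage", and ascending handle order for ties is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

using resource_handle = std::uint64_t;  // 0 models "nothing selected"

struct rt_row {
    resource_handle handle;
    std::uint32_t width;
    std::uint32_t height;
    bool usable;  // valid, known format, shader-resource capable
};

// Largest targets first; ties broken by ascending handle for a stable listing.
inline void sort_rows(std::vector<rt_row> &rows) {
    std::sort(rows.begin(), rows.end(), [](const rt_row &a, const rt_row &b) {
        // Swapping a and b on the size fields yields descending order there.
        return std::tie(b.width, b.height, a.handle) <
               std::tie(a.width, a.height, b.handle);
    });
}

// Keep an existing selection; otherwise pick the first usable row.
inline resource_handle auto_select(const std::vector<rt_row> &rows,
                                   resource_handle selected) {
    if (selected != 0) return selected;
    for (const rt_row &row : rows) {
        if (row.usable) return row.handle;
    }
    return 0;
}
```

Including the handle in the comparison keeps row order deterministic between frames, so rows of equal size do not swap places while the user is trying to click one.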

When the user checks a row, the selected resource handle is updated and the shader-resource view descriptor is set to the resource's texture format. After the table is drawn, the function compares selected_resource_handle against last_resource_handle; if they differ, it destroys the previous SRV and creates a new SRV for the selected resource.
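The rebuild-on-change check can be sketched as below. create_view and destroy_view are hypothetical callbacks standing in for device->create_resource_view() and device->destroy_resource_view(), and view handles are modeled as integers with 0 meaning "no view". Unlike the behavior flagged under Known Implementation Risks, this sketch records the new cache key after creating the view and skips destruction of a null view.

```cpp
#include <cassert>
#include <cstdint>

using resource_handle = std::uint64_t;  // 0 models "nothing selected"

// Hypothetical, trimmed model of the per-runtime injection state.
struct runtime_tracking_model {
    resource_handle selected_resource_handle = 0;
    resource_handle last_resource_handle = 0;
    std::uint64_t selected_shader_resource = 0;  // 0 models "no view"
};

// Returns true when the view was rebuilt, false when the cached view was kept.
template <typename Create, typename Destroy>
bool refresh_view(runtime_tracking_model &data, Create create_view, Destroy destroy_view) {
    if (data.selected_resource_handle == data.last_resource_handle) {
        return false;  // selection unchanged, keep the existing view
    }
    if (data.selected_shader_resource != 0) {
        destroy_view(data.selected_shader_resource);
    }
    data.selected_shader_resource =
        (data.selected_resource_handle != 0)
            ? create_view(data.selected_resource_handle)
            : std::uint64_t{0};
    // Remember the cache key so the next overlay frame does not rebuild again.
    data.last_resource_handle = data.selected_resource_handle;
    return true;
}
```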

This makes the overlay the bridge between analysis and injection. Render-target tracking finds candidates; user selection converts one candidate into a concrete shader resource view.

Shader Injection Path

The injection path has three operations: create a view, bind the view, and manage resource state around ReShade's effect pass.

View Creation

The selected game texture is not directly passed to ReShade effects. The addon first calls device->create_resource_view() with resource_usage::shader_resource. The created resource_view is stored in runtime_tracking::selected_shader_resource.

This requires the underlying resource to have shader-resource usage. The overlay explicitly disables resources without resource_usage::shader_resource, because sampling a render target that was not created with SRV capability is not valid on many graphics APIs.

Effect Texture Binding

update_effect_runtime() performs the actual binding:

runtime->update_texture_bindings(
    "GATO",
    data.selected_shader_resource,
    data.selected_shader_resource);

From an effect author's perspective, this means a texture declared in a ReShade .fx file with the GATO semantic samples the selected render target. The addon provides the resource; the effect shader decides how to visualize or consume it.

The same view is passed for both view arguments. In ReShade's effect_runtime::update_texture_bindings(), the first view is used for regular sampling and the second for sRGB sampling, so passing the same view twice makes both sampling modes read the selected resource through one view, without creating a separate sRGB view.

Resource Barriers

on_begin_render_effects() runs immediately before ReShade effects execute. If a selected SRV exists, it inserts a barrier from shader_resource | render_target to shader_resource, then refreshes the GATO binding. This prepares the selected game render target for sampling by the effect pass.

on_finish_render_effects() runs after ReShade effects finish. It inserts a barrier from shader_resource back to render_target | shader_resource. This attempts to restore the resource to a state compatible with continued game or runtime use.

This barrier pair is especially important for explicit APIs such as D3D12 and Vulkan, where sampling from a render-target resource without a correct state transition can produce validation errors, undefined output, or device loss.

Threading and Synchronization Model

The addon uses one global std::shared_mutex.

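Shared locks are taken when immediate queue-level draw or clear callbacks mutate queue state, and when the overlay snapshots device-level data. Unique locks are taken when command-list execution merges recorded state into queues, when present publishes the frame snapshot, and when resource destruction records transient resources. Note that a shared lock excludes only unique holders, not other shared holders, so the mutations performed under it are safe only if those callbacks never run concurrently with each other on the same state.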

The implicit model is:

  • Command-list-local state can usually be mutated without locking because a recorded command list is expected to be manipulated by one recording thread at a time.
  • Queue-level state can be touched from callbacks that may race with present or command-list execution, so queue-level access is guarded.
  • The overlay should not hold the lock while querying resource descriptors, because descriptor queries may cross API boundaries and should not block render event collection longer than needed.

This model is pragmatic and low overhead, but it depends on ReShade's callback ordering guarantees and on command-list recording being externally serialized by the graphics API or engine.
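The publish and snapshot halves of this model can be illustrated with the standard library alone. The sketch below wraps the mutex in a hypothetical snapshot_store class, whereas the addon is described as using one global std::shared_mutex; the counter payload is reduced to a single integer.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <unordered_map>
#include <utility>
#include <vector>

using resource_handle = std::uint64_t;  // stand-in for a ReShade resource handle
using counter_map = std::unordered_map<resource_handle, std::uint32_t>;
using row_list = std::vector<std::pair<resource_handle, std::uint32_t>>;

class snapshot_store {
public:
    // Writer side (present): exclusive access while swapping in the new frame.
    void publish(counter_map next) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        published_ = std::move(next);
    }

    // Reader side (overlay): copy under a shared lock, then work on the copy
    // so slow descriptor queries never run while the lock is held.
    row_list snapshot() const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        return row_list(published_.begin(), published_.end());
    }

private:
    mutable std::shared_mutex mutex_;
    counter_map published_;
};
```

Returning a copy trades a small allocation per overlay frame for a short critical section, which matches the stated goal of not blocking render event collection.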

Resource Lifetime Handling

on_destroy_resource() marks destroyed resources in transient_resources. Before the overlay queries a descriptor, it checks whether the resource appears in that set. This reduces the chance of dereferencing stale handles from the previous frame's published snapshot.

Runtime teardown destroys the selected shader-resource view before destroying the runtime private data. This prevents the addon-created view from surviving the effect runtime that owns the binding.

The lifetime model is conservative but incomplete. A resource may become unsafe before the destroy callback is observed, depending on API ownership and ReShade's internal wrapping behavior. The code therefore treats transient tracking as a safety filter, not as a proof of validity.
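The safety filter amounts to a set lookup before each descriptor query. In the sketch below, transient_filter is a hypothetical wrapper, and on_create is an addition that is not part of the described code: it illustrates one possible way to stop the set from growing forever, assuming the backend reports resource creation.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

using resource_handle = std::uint64_t;  // stand-in for a ReShade resource handle

struct transient_filter {
    std::unordered_set<resource_handle> transient;

    // Mirrors the described behavior: remember every destroyed handle.
    void on_destroy(resource_handle handle) { transient.insert(handle); }

    // Hypothetical addition: a handle that is created again is live again.
    void on_create(resource_handle handle) { transient.erase(handle); }

    // Checked before querying a resource descriptor from the snapshot.
    bool safe_to_query(resource_handle handle) const {
        return transient.count(handle) == 0;
    }
};
```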

Detection Heuristics and Their Meaning

The addon does not claim to automatically prove which render target is the normal buffer. It surfaces evidence so a user can select the most plausible target.

Useful positive signals include:

  • Screen-sized or near-screen-sized dimensions.
  • Rendered during a pass with a depth-stencil view.
  • Rendered as part of multiple render targets.
  • Significant draw-call and vertex counts.
  • One or a small number of clears per frame.
  • A format commonly used for packed vectors or HDR intermediate data, such as R8G8B8A8, R10G10B10A2, or R16G16B16A16.

Weak or ambiguous signals include:

  • Fullscreen draw counts, because post-processing passes also use fullscreen geometry.
  • Shader-resource capability, because many non-G-buffer render targets are also sampleable.
  • Large dimensions, because final color, history buffers, and post-process intermediates are also often screen-sized.

The strongest practical workflow is to sort candidates by size, inspect MRT and depth flags, then use a debug .fx shader bound to GATO to visually confirm the selected resource.

Compatibility Notes

The implementation is designed to observe both traditional render-target binding and render-pass based APIs, which makes it conceptually applicable to D3D9, D3D11, D3D12, Vulkan, and OpenGL through ReShade's abstraction layer.

The repository notes indicate practical validation on D3D9 and D3D11 paths, with D3D12, Vulkan, and OpenGL less complete or less tested. The most fragile areas for explicit APIs are resource barriers, aliasing, transient resource reuse, and the timing of command-list execution relative to ReShade's effect pass.

Known Implementation Risks

Several implementation details deserve attention before treating the addon as a production-quality library:

  • runtime_tracking::last_resource_handle is compared before recreating the SRV, but the current code does not update it after create_resource_view(). This can cause repeated destruction and recreation of the selected SRV in the overlay callback.
  • on_begin_render_pass() dereferences dsv directly. If ReShade can call this callback with a null depth-stencil descriptor, the code needs a null check.
  • The overlay calls destroy_resource_view() even when the previous selected_shader_resource is zero. This may be harmless under ReShade's API, but relying on null-handle destruction should be confirmed.
  • transient_resources only grows. If handles are reused by the backend, a new valid resource could be incorrectly treated as transient.
  • unregister_addon() unregisters events but does not visibly unregister the overlay callback. If ReShade requires explicit overlay unregistration, the register_overlay() call should be paired with a matching unregister_overlay() call.
  • The synthetic draw in on_execute_secondary() is a heuristic and may under- or over-count engines that use secondary command lists in unusual ways.
  • The resource barrier restores to render_target | shader_resource, which is a broad state. Some APIs or ReShade backends may require a more precise previous state for correctness.

These are not reasons the design is invalid. They are the main places where the current prototype's assumptions should be verified or hardened.

Validation Workflow

A rigorous validation pass should include:

  1. Build Release|x64 and Release|Win32.
  2. Load the produced addon in a ReShade-enabled game.
  3. Confirm the overlay lists render targets after the first present.
  4. Select a candidate with ShaderResource, RenderedWithDepth, and preferably MultiRenderTargets.
  5. Use a ReShade debug effect that samples GATO.
  6. Confirm that the visualized texture changes when selecting different overlay rows.
  7. Confirm that ReShade effect reload preserves or refreshes the binding through reshade_reloaded_effects.
  8. Repeat across at least one D3D9 32-bit title and one D3D11 64-bit title.
  9. For D3D12 or Vulkan, additionally watch for resource-state errors, flicker, stale texture views, and aliasing artifacts.

Architectural Summary

The library is best understood as a frame-local render-target profiler plus a ReShade effect-resource injector. Its central insight is that normal buffers are hard to identify by format alone, because they use the same generic texture formats as many other render targets. Instead, the addon records how resources are used: whether they are written with depth, whether they belong to MRT sets, how many draw calls write them, and how large they are. It then exposes the candidate set to the user and binds the selected resource into ReShade effects as GATO.

That design keeps the addon generic across engines and APIs. It also honestly preserves the hard part of the problem: final normal-buffer selection still requires either user confirmation or stronger heuristics than the current code implements.
