@mlugg
Created October 9, 2025 12:22
well-commented basic freestanding SelfInfo implementation
//! This whole file is meant to be exposed as `@import("root").debug`. Everything `pub` here is an
//! override which the standard library will use.
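//!
//! As a minimal sketch of that wiring, assuming this file is saved as `debug.zig` next to the root
//! source file (the file name is an assumption for illustration), the root source file would
//! contain:
//!
//!     pub const debug = @import("debug.zig");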
/// This is the allocator which `std.debug` will use to allocate debug info. On freestanding you
/// usually just want to use a `std.heap.FixedBufferAllocator` for this---unless you have a proper
/// memory manager working, of course!
pub fn getDebugInfoAllocator() Allocator {
    const Global = struct {
        /// If you see OOM errors in stack traces, try increasing the size of this buffer. This is
        /// 16 MiB, which is about 4x what ClashOS has been observed to use. The amount of memory is
        /// large because `std.debug.Dwarf` caches all line number information.
        var buf: [16 * 1024 * 1024]u8 = undefined;
        var fba: std.heap.FixedBufferAllocator = .init(&buf);
    };
    return Global.fba.allocator();
}
/// Implements the printing of the actual source line in the stack traces. Returning successfully
/// means we printed the line, so the stack tracing logic will print the column indicator ('^')
/// below it. Returning any error means we didn't print the line so the indicator is omitted.
pub fn printLineFromFile(w: *std.Io.Writer, sl: std.debug.SourceLocation) !void {
    // You could do what Andrew's blog post does where you `@embedFile` your source file, but for
    // the initial implementation, let's just not support this at all.
    _ = w;
    _ = sl;
    return error.FileNotFound;
}
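/// A hedged sketch of the `@embedFile` alternative mentioned above; nothing in this file calls it.
/// It assumes a single source file named `main.zig` sitting next to this one (both that file name
/// and this function's name are placeholders, not part of any standard library interface). To try
/// it, call it from `printLineFromFile` instead of returning `error.FileNotFound`.
fn printLineFromEmbeddedSource(w: *std.Io.Writer, sl: std.debug.SourceLocation) !void {
    const file_name = "main.zig"; // hypothetical: replace with your own source file
    // Only the one embedded file is known; anything else is reported as missing.
    if (!std.mem.endsWith(u8, sl.file_name, file_name)) return error.FileNotFound;
    const source = @embedFile(file_name);
    // `sl.line` is 1-based, so count lines starting from 1 until we reach it.
    var lines = std.mem.splitScalar(u8, source, '\n');
    var line_number: u64 = 1;
    while (lines.next()) |line| : (line_number += 1) {
        if (line_number != sl.line) continue;
        try w.writeAll(line);
        try w.writeByte('\n');
        return;
    }
    return error.FileNotFound;
}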
/// This is the API which `std.debug` will call into to query the debug information.
pub const SelfInfo = struct {
    /// The standard library's handling of DWARF (the most commonly used debug information format)
    /// is pretty handy: it works fine on freestanding provided we just give it the raw data! This
    /// field will be `null` at first and we'll populate it once we're queried for anything.
    dwarf: ?Dwarf,
    pub const init: SelfInfo = .{ .dwarf = null };
    pub fn deinit(si: *SelfInfo, gpa: Allocator) void {
        if (si.dwarf) |*dwarf| dwarf.deinit(gpa);
    }
    /// This is asking for the module name to print next to an address if full debug info isn't
    /// available ("module" here means the name of the executable or shared library image). It can
    /// be whatever you want, doesn't really matter; I'd just set it to the project name.
    pub fn getModuleName(_: *SelfInfo, _: Allocator, _: usize) SelfInfoError![]const u8 {
        return "clashos";
    }
    /// This is the actual interesting function, which is asking what the source location of an
    /// instruction address is. We just need to initialize the `dwarf` field if not already done,
    /// and then call a method on it which finds this for us.
    pub fn getSymbol(si: *SelfInfo, gpa: Allocator, address: usize) SelfInfoError!std.debug.Symbol {
        if (si.dwarf == null) si.dwarf = try initDwarf(gpa);
        return si.dwarf.?.getSymbol(gpa, builtin.target.cpu.arch.endian(), address) catch |err| switch (err) {
            error.InvalidDebugInfo,
            error.MissingDebugInfo,
            error.OutOfMemory,
            => |e| return e,
            error.ReadFailed,
            error.EndOfStream,
            error.Overflow,
            error.StreamTooLong,
            => return error.InvalidDebugInfo,
        };
    }
    pub const can_unwind = false;
};
fn initDwarf(gpa: Allocator) SelfInfoError!Dwarf {
    // DWARF debugging information is split across different sections in the binary. We need to tell
    // `std.debug.Dwarf` where each one of those is; we'll loop over the section names to populate
    // each element of this `sections` array.
    var dwarf: Dwarf = .{ .sections = undefined };
    inline for (@typeInfo(Dwarf.Section.Id).@"enum".fields) |f| {
        // `section` is the actual interesting function here; it's defined just below. Don't mind
        // this `owned` thing, it's dumb and should be changed (see #25418).
        dwarf.sections[f.value] = if (section(f.name)) |bytes| .{
            .data = bytes,
            .owned = false,
        } else null;
    }
    // `open` is a really badly named method; it basically scans the debug information to compute
    // some caches and lookup tables. We need to call it before we actually use `dwarf`.
    dwarf.open(gpa, builtin.cpu.arch.endian()) catch |err| switch (err) {
        error.InvalidDebugInfo, error.MissingDebugInfo, error.OutOfMemory => |e| return e,
        else => return error.InvalidDebugInfo,
    };
    return dwarf;
}
/// Here, we want to return a slice which refers to the data in the DWARF section named `name`. This
/// is where things get a little tricky. These sections aren't normally made directly available to
/// the binary, because they aren't loaded into memory. To get access to them at runtime, we need to
/// do some trickery in our build logic. Ideally that trickery should just be in the linker script,
/// but linker scripts are really terrible, so we're also going to need some extra stuff. The result
/// will be that for every section we're interested in, two symbols are exposed, which tell us where
/// the data has been put. For instance, for the section named `.debug_info`:
///
/// * The symbol `__debug_info` will be a pointer to the start of the section data
/// * The symbol `__debug_info_len` will be the length in bytes of the section data
///
/// So, this function just needs to look at those two symbols with `@extern`.
fn section(comptime name: []const u8) ?[]const u8 {
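    const start = @extern([*]u8, .{ .name = "__" ++ name });
    // The length is encoded as the *address* of the `_len` symbol rather than as a value stored in
    // memory, so we convert that address to an integer instead of dereferencing it.
    const len = @intFromPtr(@extern(*anyopaque, .{ .name = "__" ++ name ++ "_len" }));
    // A zero length means the section is absent (or empty) in this binary.
    if (len == 0) return null;
    return start[0..len];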
}
const std = @import("std");
const Allocator = std.mem.Allocator;
const Dwarf = std.debug.Dwarf;
const SelfInfoError = std.debug.SelfInfoError;
const builtin = @import("builtin");