@deanmlittle
Last active February 24, 2025 11:45
fib.so teardown

Deconstructing fib.so, a minimal sBPF program

For this deconstruction, we will be examining fib.so, a hand-rolled sBPF assembly program that calculates a Fibonacci number from a u8 input.

Structure of an sBPF program

An sBPF program is made up of four parts:

1. ELF Header

The starting point of the file, describing the overall file format, target environment, and offsets for program and section headers.

2. Program Headers

Define the memory segments and their attributes (readable, writable, executable) for runtime execution.

3. Sections

Contain the actual program content, such as:

  • .text for executable code
  • .data for writable data
  • .rodata for read-only data

4. Section Headers

Metadata explaining each section, used during the linking process, and/or for debugging purposes.

1. ELF header:

Below is the ELF header for fib.so:

ELFHeader {
   ei_magic: [127, 69, 76, 70,],
   ei_class: 2,
   ei_data: 1,
   ei_version: 1,
   ei_osabi: 0,
   ei_abiversion: 0,
   ei_pad: [0, 0, 0, 0, 0, 0, 0],
   e_type: 3,
   e_machine: 247,
   e_version: 1,
   e_entry: 232,
   e_shoff: 872,
   e_phoff: 64,
   e_flags: 0,
   e_ehsize: 64,
   e_phentsize: 56,
   e_phnum: 3,
   e_shentsize: 64,
   e_shnum: 8,
   e_shstrndx: 7,
}
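
As a rough illustration of what the identification fields above encode, the sketch below checks the first 20 bytes of a file against the values in this dump. The constant values come from the generic ELF specification; the function name `looks_like_sbpf_elf` is hypothetical and not part of any loader.

```rust
// Identification constants from the generic ELF specification.
const ELF_MAGIC: [u8; 4] = [0x7f, b'E', b'L', b'F']; // ei_magic: [127, 69, 76, 70]
const ELFCLASS64: u8 = 2; // ei_class: 64-bit object
const ELFDATA2LSB: u8 = 1; // ei_data: little-endian
const ET_DYN: u16 = 3; // e_type: shared object
const EM_BPF: u16 = 247; // e_machine: BPF

/// Hypothetical helper: returns true if the start of `bytes` matches the
/// identification values shown in the fib.so header dump.
fn looks_like_sbpf_elf(bytes: &[u8]) -> bool {
    // e_ident is 16 bytes, followed by e_type (u16) and e_machine (u16).
    if bytes.len() < 20 {
        return false;
    }
    let e_type = u16::from_le_bytes([bytes[16], bytes[17]]);
    let e_machine = u16::from_le_bytes([bytes[18], bytes[19]]);
    bytes.starts_with(&ELF_MAGIC)
        && bytes[4] == ELFCLASS64
        && bytes[5] == ELFDATA2LSB
        && e_type == ET_DYN
        && e_machine == EM_BPF
}
```

This only inspects the fields discussed here; a real loader performs many more checks.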

To create an ELF header of our own from scratch, we must dynamically update the following fields with their correct values/offsets:

  1. e_entry - the offset of our entrypoint, defined by our entrypoint symbol
  2. e_shoff - the offset of our section headers
  3. e_phnum - the number of our program headers (Min 1, typically 3)
  4. e_shnum - the number of section headers - (Min 3 - Null, Progbits, Strtab, typically more)
  5. e_shstrndx - index of Strtab in our section header table - (Min 2 - Null/Progbits must precede Strtab)
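
The offsets in the list above can be derived from the layout. The following is a minimal sketch, assuming (as in fib.so) that the sections are packed contiguously right after the program headers, that the entrypoint is the first instruction of .text, and that the section header table is aligned to 8 bytes. The function `header_offsets` is hypothetical.

```rust
const EHDR_SIZE: u64 = 64; // e_ehsize
const PHDR_SIZE: u64 = 56; // e_phentsize

/// Rounds `value` up to the next multiple of `align`.
fn align_up(value: u64, align: u64) -> u64 {
    (value + align - 1) / align * align
}

/// Hypothetical helper: returns (e_entry, e_shoff) for a file whose sections
/// follow the program headers back to back, in the order given.
fn header_offsets(e_phnum: u64, section_sizes: &[u64]) -> (u64, u64) {
    // Assumption: .text comes first and the entrypoint is at its start.
    let e_entry = EHDR_SIZE + e_phnum * PHDR_SIZE;
    let sections_end = e_entry + section_sizes.iter().sum::<u64>();
    // Assumption: the section header table starts on an 8-byte boundary.
    let e_shoff = align_up(sections_end, 8);
    (e_entry, e_shoff)
}
```

With three program headers and the fib.so section sizes (200, 32, 176, 96, 24, 48, 59), this yields an e_entry of 232 and an e_shoff of 872, matching the header above.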

2. Program headers:

Program headers define how sections of an ELF file are mapped into memory, as well as access control over these memory regions (e.g., read, write, execute). These headers are used by the system loader to set up the program's memory image when executing the binary.

  1. Readable-Executable Program Header
  2. Read-only Program Header
  3. Dynamic Program Header

NOTE: For a program with no dynamically linked symbols (e.g. syscalls), it is possible to reduce this to just a single Readable-Executable program header.

2.1 Readable-Executable header

The Read-Execute header points to the offset of our .text section containing our entrypoint, and also encapsulates our .rodata section:

/// 1. Read-execute program header of fib.so
ProgramHeader {
    p_type: PT_LOAD,
    p_flags: ProgramFlags(5), // Read-Execute
    p_offset: 232,
    p_vaddr: 232,
    p_paddr: 232,
    p_filesz: 232,
    p_memsz: 232,
    p_align: 4096,
},

In this case, p_filesz and p_memsz matching the offset value of 232 is purely a coincidence; the size fields are independent of p_offset. Our .text section is 200 bytes in length and our .rodata section is 32 bytes in length, which together just so happen to equal our offset value of 232.

2.2 Read-only program header:

If our program contains any dynamically linked symbols, we must include a read-only program header:

/// 2. Read-only program header of fib.so
let readonly_header = ProgramHeader {
    p_type: PT_LOAD,
    p_flags: ProgramFlags(4), // Readonly
    p_offset: 640,
    p_vaddr: 640,
    p_paddr: 640,
    p_filesz: 168,
    p_memsz: 168,
    p_align: 4096,
};

The offset 640 points to the start of our .dynsym section, defined in our SHT_DYNSYM Section Header. The .dynsym section is a subset of the .symtab symbol table, containing only the symbols needed for dynamic linking. The segment's 168 bytes cover .dynsym (96 bytes), .dynstr (24 bytes), and .rel.dyn (48 bytes).

2.3 Dynamic program header:

If our program contains any dynamically linked symbols, we must include a dynamic program header:

/// 3. Dynamic program header of fib.so
let dynamic_header = ProgramHeader {
    p_type: PT_DYNAMIC,
    p_flags: ProgramFlags(6), // Read-write
    p_offset: 464,
    p_vaddr: 464,
    p_paddr: 464,
    p_filesz: 176,
    p_memsz: 176,
    p_align: 8,
};

The offset 464 points to the start of our .dynamic section, defined in our SHT_DYNAMIC Section Header. The .dynamic section acts as a metadata table for dynamic linking.
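
The p_flags values 5, 4, and 6 used in the three headers are bit masks. As a small sketch, using the standard ELF permission bits (PF_X = 1, PF_W = 2, PF_R = 4), they can be decoded as shown below. The function `describe_flags` is a hypothetical helper for illustration.

```rust
// Segment permission bits from the generic ELF specification.
const PF_X: u32 = 1; // executable
const PF_W: u32 = 2; // writable
const PF_R: u32 = 4; // readable

/// Hypothetical helper: renders p_flags in an `rwx`-style string.
fn describe_flags(p_flags: u32) -> String {
    let mut out = String::new();
    out.push(if p_flags & PF_R != 0 { 'r' } else { '-' });
    out.push(if p_flags & PF_W != 0 { 'w' } else { '-' });
    out.push(if p_flags & PF_X != 0 { 'x' } else { '-' });
    out
}
```

So ProgramFlags(5) is read-execute, ProgramFlags(4) is read-only, and ProgramFlags(6) is read-write.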

Linking our program headers

We must dynamically update the following fields of each program header based upon their relative offsets and sizes in the binary:

  1. p_offset - Offset of the segment in the file image
  2. p_vaddr - Virtual address of the segment
  3. p_paddr - Physical address of the segment
  4. p_filesz - Size in bytes of the segment in the file image
  5. p_memsz - Size in bytes of the segment in the memory
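
One way to compute these fields is to take the span of the sections each segment covers. The sketch below assumes, as in fib.so, that p_vaddr and p_paddr equal p_offset and that p_memsz equals p_filesz, so only two values need to be derived. The `Segment` type and `segment_covering` function are hypothetical.

```rust
/// Hypothetical result type: in fib.so, p_vaddr and p_paddr mirror p_offset,
/// and p_memsz mirrors p_filesz.
#[derive(Debug, PartialEq)]
struct Segment {
    p_offset: u64,
    p_filesz: u64,
}

/// Hypothetical helper: computes the segment spanning the given sections,
/// each described as (file offset, size in bytes).
fn segment_covering(sections: &[(u64, u64)]) -> Segment {
    let start = sections.iter().map(|&(off, _)| off).min().unwrap_or(0);
    let end = sections
        .iter()
        .map(|&(off, size)| off + size)
        .max()
        .unwrap_or(0);
    Segment {
        p_offset: start,
        p_filesz: end - start,
    }
}
```

Feeding in .text and .rodata gives offset 232 and size 232; .dynsym, .dynstr, and .rel.dyn give offset 640 and size 168; .dynamic alone gives offset 464 and size 176.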

3. Section Headers

The bare minimum for sBPF section headers is:

  1. null Section Header
  2. .text Section Header
  3. .shstrtab Section Header

This is because null is required by solana-rbpf, .text contains our executable code, and .shstrtab contains the section names, such as .text, by which sections are identified.

Our fib.so includes several other headers as detailed below.

3.1 null Header

Our first Section Header must always be null. This is not a requirement specific to eBPF: sBPF inherits it from rBPF, which inherits it from uBPF, which treats the output of the GNU linker as the "standard" implementation of the eBPF specification. The linker emits this entry because the ELF specification reserves section header index 0 (SHN_UNDEF) for an all-zero header.

SectionHeader {
    sh_name: 0, // \0
    sh_type: SHT_NULL,
    sh_flags: 0,
    sh_addr: 0,
    sh_offset: 0,
    sh_size: 0,
    sh_link: 0,
    sh_info: 0,
    sh_addralign: 0,
    sh_entsize: 0,
}

3.2 .text Header

Our second section header must be our .text section. This contains our executable code.

SectionHeader {
    sh_name: 1, // .text
    sh_type: SHT_PROGBITS,
    sh_flags: 6, // SHF_ALLOC | SHF_EXECINSTR
    sh_addr: 232,
    sh_offset: 232,
    sh_size: 200,
    sh_link: 0,
    sh_info: 0,
    sh_addralign: 4,
    sh_entsize: 0,
}

TODO: Figure out why sh_addralign is aligned to 4 and not 8 or 1.

3.3 .rodata Header

Our third section header points to our .rodata. This contains the read-only data we use when we invoke sol_log_, namely:

"Sorry, u64 maxes out at F(93) :("

This data would need to be padded to a multiple of 8 bytes if it were not already 32 bytes in length.

SectionHeader {
    sh_name: 7, // .rodata
    sh_type: SHT_PROGBITS,
    sh_flags: 2, // SHF_ALLOC
    sh_addr: 432,
    sh_offset: 432,
    sh_size: 32,
    sh_link: 0,
    sh_info: 0,
    sh_addralign: 1,
    sh_entsize: 0,
}
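
To illustrate the padding rule mentioned above, here is a minimal sketch that pads read-only data to a multiple of 8 bytes. Padding with zero bytes is an assumption made for illustration, and `pad_rodata` is a hypothetical helper.

```rust
/// The log message stored in fib.so's .rodata section.
const MESSAGE: &str = "Sorry, u64 maxes out at F(93) :(";

/// Hypothetical helper: appends zero bytes (an assumed filler) until the
/// length is a multiple of 8.
fn pad_rodata(data: &[u8]) -> Vec<u8> {
    let mut out = data.to_vec();
    while out.len() % 8 != 0 {
        out.push(0);
    }
    out
}
```

The message is exactly 32 bytes long, so it passes through unchanged, whereas a 5-byte string such as "hello" would grow to 8 bytes.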

3.3 .dynamic Header

Our dynamic header points to the following entries. Each entry is a 16-byte pair of little-endian u64 values: a tag followed by its value.

DT_FLAGS - DF_TEXTREL - indicating text relocation is required
1e00000000000000 0400000000000000

DT_REL - offset of .rel.dyn (760)
1100000000000000 f802000000000000

DT_RELSZ - size of DT_REL table (48 bytes)
1200000000000000 3000000000000000

DT_RELENT - size of the relocation entry (16 bytes)
1300000000000000 1000000000000000

DT_RELCOUNT - relative relocation count (1)
faffff6f00000000 0100000000000000

DT_SYMTAB - offset of .dynsym (640)
0600000000000000 8002000000000000

DT_SYMENT - size of DT_SYMTAB symbol entry (24 bytes)
0b00000000000000 1800000000000000

DT_STRTAB - offset of the string table .dynstr (736)
0500000000000000 e002000000000000

DT_STRSZ - size of the DT_STRTAB (24 bytes)
0a00000000000000 1800000000000000

DT_TEXTREL - one or more relocation entries might request modifications to a non-writable segment
1600000000000000 0000000000000000

DT_NULL
0000000000000000 0000000000000000

SectionHeader {
    sh_name: 15, // .dynamic
    sh_type: SHT_DYNAMIC,
    sh_flags: 3, // SHF_WRITE | SHF_ALLOC
    sh_addr: 464,
    sh_offset: 464,
    sh_size: 176,
    sh_link: 5, // .dynstr
    sh_info: 0,
    sh_addralign: 8,
    sh_entsize: 16,
}
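
The hex pairs above can be decoded mechanically. The sketch below reads each 16-byte entry as two little-endian u64 values and names the tags that appear in fib.so. The tag numbers are the standard ELF values; `read_u64`, `parse_dynamic`, and `tag_name` are hypothetical helpers.

```rust
/// Reads a little-endian u64 starting at byte `at`.
fn read_u64(bytes: &[u8], at: usize) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&bytes[at..at + 8]);
    u64::from_le_bytes(buf)
}

/// Hypothetical helper: splits a .dynamic section into (tag, value) pairs.
/// Any trailing bytes that do not fill a 16-byte entry are ignored.
fn parse_dynamic(section: &[u8]) -> Vec<(u64, u64)> {
    section
        .chunks_exact(16)
        .map(|entry| (read_u64(entry, 0), read_u64(entry, 8)))
        .collect()
}

/// Names for the dynamic tags that appear in fib.so.
fn tag_name(tag: u64) -> &'static str {
    match tag {
        0 => "DT_NULL",
        5 => "DT_STRTAB",
        6 => "DT_SYMTAB",
        10 => "DT_STRSZ",
        11 => "DT_SYMENT",
        17 => "DT_REL",
        18 => "DT_RELSZ",
        19 => "DT_RELENT",
        22 => "DT_TEXTREL",
        30 => "DT_FLAGS",
        0x6fff_fffa => "DT_RELCOUNT",
        _ => "unknown",
    }
}
```

For example, the bytes `1200000000000000 3000000000000000` decode to tag 18 (DT_RELSZ) with value 48.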

3.4 .dynsym Header

A dynamic symbol table. Each entry looks like this:

pub struct DynSym {
    pub st_name: u32,  // Symbol name (offset in .dynstr)
    pub st_info: u8,   // Symbol type and binding
    pub st_other: u8,  // Symbol visibility
    pub st_shndx: u16, // Section header index related to this object
    pub st_value: u64, // Symbol value
    pub st_size: u64,  // Symbol size
}

impl DynSym {
    // Binding is stored in the high 4 bits of st_info
    pub const fn st_bind(&self) -> u8 {
        self.st_info >> 4
    }

    // Type is stored in the low 4 bits of st_info
    pub const fn st_type(&self) -> u8 {
        self.st_info & 0x0f
    }

    // Visibility is stored in the low 2 bits of st_other
    pub const fn st_visibility(&self) -> u8 {
        self.st_other & 0x03
    }
}

This section of our program contains four 24-byte entries:

null
00000000 00 00 0000 0000000000000000 0000000000000000

e, STB_GLOBAL, section header 1 (.text)
01000000 10 00 0100 e800000000000000 0000000000000000

sol_log_, STB_GLOBAL
03000000 10 00 0000 0000000000000000 0000000000000000

sol_log_64_, STB_GLOBAL
0c000000 10 00 0000 0000000000000000 0000000000000000

SectionHeader {
    sh_name: 24, // .dynsym
    sh_type: SHT_DYNSYM,
    sh_flags: 2,
    sh_addr: 640,
    sh_offset: 640,
    sh_size: 96,
    sh_link: 5, // .dynstr
    sh_info: 1, // index of the first non-local symbol
    sh_addralign: 8,
    sh_entsize: 24,
}
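
As a sketch of how one of the 24-byte entries above maps onto the symbol fields, the following decodes a raw entry using little-endian reads. The `Sym` type, `read_u64`, and `parse_sym` are hypothetical names chosen for this example.

```rust
/// Hypothetical decoded form of one 24-byte .dynsym entry.
struct Sym {
    st_name: u32,
    st_info: u8,
    st_other: u8,
    st_shndx: u16,
    st_value: u64,
    st_size: u64,
}

/// Reads a little-endian u64 starting at byte `at`.
fn read_u64(bytes: &[u8], at: usize) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&bytes[at..at + 8]);
    u64::from_le_bytes(buf)
}

/// Decodes the fields in the order they appear on disk:
/// name (4), info (1), other (1), shndx (2), value (8), size (8).
fn parse_sym(e: &[u8; 24]) -> Sym {
    Sym {
        st_name: u32::from_le_bytes([e[0], e[1], e[2], e[3]]),
        st_info: e[4],
        st_other: e[5],
        st_shndx: u16::from_le_bytes([e[6], e[7]]),
        st_value: read_u64(e, 8),
        st_size: read_u64(e, 16),
    }
}
```

Applied to the `e` entry, this gives a name offset of 1, a binding of 1 (STB_GLOBAL), section header index 1, and a value of 232, which is our entrypoint offset.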

3.5 .dynstr Header

This holds the names of all our dynamic symbols, each terminated by a zero byte and preceded by a leading zero byte for the null symbol. It contains: e, sol_log_, sol_log_64_

SectionHeader {
    sh_name: 32, // .dynstr
    sh_type: SHT_STRTAB,
    sh_flags: 2,
    sh_addr: 736,
    sh_offset: 736,
    sh_size: 24,
    sh_link: 0,
    sh_info: 0,
    sh_addralign: 1,
    sh_entsize: 0,
}
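
The st_name values in .dynsym (1, 3, and 12) are byte offsets into this table. A minimal sketch of the lookup, with the 24-byte table reconstructed from the names listed above, might look like this. `name_at` is a hypothetical helper.

```rust
/// The .dynstr contents reconstructed from the names above: a leading NUL,
/// then each name followed by a NUL terminator (24 bytes in total).
const DYNSTR: &[u8] = b"\0e\0sol_log_\0sol_log_64_\0";

/// Hypothetical helper: returns the NUL-terminated string starting at
/// `offset`, or None if the offset is out of range, the string is
/// unterminated, or it is not valid UTF-8.
fn name_at(strtab: &[u8], offset: usize) -> Option<&str> {
    let rest = strtab.get(offset..)?;
    let end = rest.iter().position(|&b| b == 0)?;
    std::str::from_utf8(&rest[..end]).ok()
}
```

Offset 1 resolves to `e`, offset 3 to `sol_log_`, and offset 12 (0x0c) to `sol_log_64_`.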

3.6 .rel.dyn Header

The .rel.dyn section contains a list of relocations and their types. Each entry is 16 bytes: an 8-byte offset, followed by a 4-byte relocation type and a 4-byte symbol index into .dynsym. In this case, we have two types:

  1. 0x08 A relative relocation
  2. 0x0a A syscall relocation

Our three entries are:

7001000000000000 - Offset 368/0x0170 - points to jump to finalize which calls sol_log_64_
0800000000000000 - R_SBF_64_RELATIVE

9001000000000000 - Offset 400/0x0190 - points to call sol_log_
0a000000 - "Syscall" relocation type
02000000 - Points to .dynsym entry 2, sol_log_

a001000000000000 - Offset 416/0x01a0 - points to call sol_log_64_
0a000000 - "Syscall" relocation type
03000000 - Points to .dynsym entry 3, sol_log_64_

SectionHeader {
    sh_name: 40, // .rel.dyn
    sh_type: SHT_REL,
    sh_flags: 2,
    sh_addr: 760,
    sh_offset: 760,
    sh_size: 48,
    sh_link: 4, // .dynsym
    sh_info: 0,
    sh_addralign: 8,
    sh_entsize: 16,
}
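
To make the entry layout concrete, the sketch below decodes a 16-byte relocation entry. It follows the standard ELF64 convention that r_info holds the type in its low 32 bits and the symbol index in its high 32 bits. The `Rel` type and `parse_rel` function are hypothetical.

```rust
/// Hypothetical decoded form of one 16-byte .rel.dyn entry.
#[derive(Debug, PartialEq)]
struct Rel {
    r_offset: u64, // file offset of the instruction to patch
    r_type: u32,   // 8 = relative, 10 = syscall
    r_sym: u32,    // index into .dynsym
}

/// Reads a little-endian u64 starting at byte `at`.
fn read_u64(bytes: &[u8], at: usize) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&bytes[at..at + 8]);
    u64::from_le_bytes(buf)
}

/// Splits r_info into its type (low 32 bits) and symbol (high 32 bits).
fn parse_rel(e: &[u8; 16]) -> Rel {
    let r_info = read_u64(e, 8);
    Rel {
        r_offset: read_u64(e, 0),
        r_type: (r_info & 0xffff_ffff) as u32,
        r_sym: (r_info >> 32) as u32,
    }
}
```

Decoding the second entry gives offset 400, type 10, and symbol 2, which is the sol_log_ entry in .dynsym.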

3.7 .shstrtab Header

Contains our section header string table, as follows: .text .rodata .dynamic .dynsym .dynstr .rel.dyn .shstrtab

SectionHeader {
    sh_name: 49, // .shstrtab
    sh_type: SHT_STRTAB,
    sh_flags: 0,
    sh_addr: 0,
    sh_offset: 808,
    sh_size: 59,
    sh_link: 0,
    sh_info: 0,
    sh_addralign: 1,
    sh_entsize: 0,
}
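
The sh_name values used throughout the section headers (1, 7, 15, 24, 32, 40, 49) are byte offsets into this string table. As a sketch of how they fall out of the table's construction, the hypothetical helper `build_shstrtab` below concatenates NUL-terminated names after a leading NUL and records where each one starts.

```rust
/// Hypothetical helper: builds a section header string table and returns it
/// together with the sh_name offset of each name, in input order.
fn build_shstrtab(names: &[&str]) -> (Vec<u8>, Vec<u32>) {
    // The leading NUL serves as the (empty) name of the null section.
    let mut table = vec![0u8];
    let mut offsets = Vec::with_capacity(names.len());
    for name in names {
        offsets.push(table.len() as u32);
        table.extend_from_slice(name.as_bytes());
        table.push(0); // NUL terminator
    }
    (table, offsets)
}
```

Building the table from the seven names listed above produces 59 bytes, matching sh_size, and the same offsets that appear as sh_name in each section header.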