@KirillLykov
Last active March 24, 2025 06:19
Systems performance book by B. Gregg

Summary for Systems Performance book by Brendan Gregg

The book itself is available on O'Reilly.

OS-related things

  • Kernel -- software that manages the system by providing access to hardware and other resources (memory, network stack, CPU scheduling, etc.). Runs in a privileged CPU mode, kernel mode, and uses a kernel-space stack.

  • Process -- an OS abstraction for running programs. Runs in user mode, with access to kernel mode via system calls. It consists of a memory address space, file descriptors, thread stacks, and registers. In user mode a thread runs on its user-space stack; syscalls, for example, use a separate kernel exception stack associated with each thread.

  • System call -- a protocol for user programs to request that the kernel perform privileged operations

  • Trap -- a signal sent to the kernel to request a system routine (system calls, processor exceptions, interrupts)

  • Hardware interrupt -- a signal sent by a device to the kernel; it is a type of trap

  • Socket -- an endpoint through which user-level applications access the network

  • Network interface -- a physical device that connects the system to a network

  • IPI -- inter-processor interrupt, a way to coordinate processors in a multiprocessor system

  • systemd -- the service manager on Linux. Features include dependency-aware service startup and analytics (systemd-analyze)

  • KPTI -- kernel page table isolation, a set of patches that mitigate the "Meltdown" vulnerability. The cost is 0.1% - 6% depending on the number of syscalls (context switches become more expensive)
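
The split between user mode and kernel mode described in the bullets above can be observed from inside a process. The following is a minimal sketch, not taken from the book: it uses Python's standard `resource.getrusage` to compare user and system CPU time for two workloads. The helper names (`cpu_times`, `compute_bound`, `syscall_bound`) and the loop sizes are illustrative assumptions, and it assumes a Unix-like system where `/dev/zero` exists.

```python
import os
import resource


def cpu_times(fn):
    """Run fn() and return (user_seconds, system_seconds) it consumed."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    fn()
    after = resource.getrusage(resource.RUSAGE_SELF)
    return (after.ru_utime - before.ru_utime,
            after.ru_stime - before.ru_stime)


def compute_bound():
    # Pure arithmetic: stays in user mode almost all the time.
    total = 0
    for i in range(3_000_000):
        total += i * i
    return total


def syscall_bound():
    # Many tiny reads: every os.read() is a system call into the kernel.
    fd = os.open("/dev/zero", os.O_RDONLY)
    try:
        for _ in range(200_000):
            os.read(fd, 1)
    finally:
        os.close(fd)


if __name__ == "__main__":
    for name, fn in [("compute", compute_bound), ("syscalls", syscall_bound)]:
        user, system = cpu_times(fn)
        print(f"{name:9s} user={user:.3f}s sys={system:.3f}s")
```

The syscall-heavy loop should report a noticeably larger share of system time than the arithmetic loop. Exact numbers depend on the machine, the kernel, and interpreter overhead.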

CPU

  • CPU functional units perform the following steps: instruction fetch, instruction decode, execute, (optionally) memory access, and register write-back.

  • The user-time/kernel-time ratio is high for computationally intensive applications and low for I/O-bound ones (web servers, etc.).

  • MMU -- memory management unit. Performs virtual memory management. Memory is paged, and the cache for page translations is the TLB (translation lookaside buffer). On a TLB miss, the MMU goes to the page tables (in main memory).
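
The MMU bullet above can be made concrete with a toy model. This is only a sketch under stated assumptions: `TinyMMU` is a hypothetical class, the page size is fixed at 4096 bytes, the page table is a flat dictionary rather than a multi-level structure, and the TLB uses LRU replacement, which real hardware may not.

```python
from collections import OrderedDict


class TinyMMU:
    """Toy address translation: a small TLB in front of a flat page table."""

    def __init__(self, page_table, tlb_entries=2, page_size=4096):
        self.page_table = page_table      # virtual page number -> physical frame number
        self.tlb = OrderedDict()          # cached translations, oldest first
        self.tlb_entries = tlb_entries
        self.page_size = page_size
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        # Split the virtual address into page number and offset within the page.
        vpn, offset = divmod(vaddr, self.page_size)
        if vpn in self.tlb:
            self.hits += 1
            self.tlb.move_to_end(vpn)     # mark as most recently used
        else:
            self.misses += 1
            # TLB miss: fall back to the page table (the slow path).
            self.tlb[vpn] = self.page_table[vpn]
            if len(self.tlb) > self.tlb_entries:
                self.tlb.popitem(last=False)  # evict the least recently used entry
        return self.tlb[vpn] * self.page_size + offset


if __name__ == "__main__":
    mmu = TinyMMU({0: 7, 1: 3})
    for vaddr in (0x10, 0x20, 4096 + 5):
        print(hex(vaddr), "->", hex(mmu.translate(vaddr)))
    print("hits:", mmu.hits, "misses:", mmu.misses)
```

Two accesses to the same page cost one miss and one hit, which is why workloads with good locality pay the page-table lookup only rarely.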

Performance tools

Some tips and tricks

  • futex -- fast user-space mutex, which uses an atomic integer under the hood. The basic operations are WAIT(addr, val) (if *addr == val, the current thread sleeps) and WAKE(addr, num) (wakes up num threads waiting on the address addr).

  • toplev -- a tool for finding bottlenecks in software: https://github.com/andikleen/pmu-tools/wiki/toplev-manual
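
The WAIT/WAKE semantics from the futex bullet above can be sketched as a user-space model. This is not the Linux futex syscall: `FutexModel` is a hypothetical class built on Python's `threading.Condition`, and its `value` attribute stands in for the integer stored at `addr`. It only illustrates the two rules: WAIT sleeps only if the value still matches, and WAKE releases at most `num` sleepers.

```python
import threading
import time


class FutexModel:
    """Toy model of futex WAIT/WAKE semantics (not the real syscall)."""

    def __init__(self, value=0):
        self.value = value                # stands in for the integer at addr
        self._cond = threading.Condition()
        self._waiters = 0                 # threads currently inside wait()
        self._granted = 0                 # wake-ups issued but not yet consumed

    def wait(self, expected):
        """Sleep only if value == expected; return False if it already changed."""
        with self._cond:
            if self.value != expected:
                return False
            self._waiters += 1
            while self._granted == 0:     # guards against spurious wake-ups
                self._cond.wait()
            self._granted -= 1
            self._waiters -= 1
            return True

    def wake(self, num):
        """Wake at most num sleeping threads; return how many were woken."""
        with self._cond:
            n = min(num, self._waiters - self._granted)
            if n:
                self._granted += n
                self._cond.notify(n)
            return n

    def sleeping(self):
        """Number of threads asleep in wait() that have not been granted a wake-up."""
        with self._cond:
            return self._waiters - self._granted


def run_demo():
    f = FutexModel(0)
    threads = [threading.Thread(target=f.wait, args=(0,)) for _ in range(3)]
    for t in threads:
        t.start()
    while f.sleeping() < 3:               # wait until all three are asleep
        time.sleep(0.001)
    first = f.wake(2)                     # wakes two of the three sleepers
    second = f.wake(5)                    # only one sleeper is left to wake
    for t in threads:
        t.join()
    return first, second


if __name__ == "__main__":
    print(run_demo())
```

In a real futex-based lock the fast path is an atomic operation in user space, and the kernel is entered only when a thread actually has to sleep or wake another thread. The model above skips that distinction and keeps only the blocking behavior.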

KirillLykov commented Sep 15, 2023

Alternative way of proceeding mgeier/rtrb#39

but at the moment crossbeam-queue is as efficient as his SPSC
