Performance issues on Linux systems can arise from multiple sources: disk I/O bottlenecks, memory pressure, network latency, or file system inefficiencies. To diagnose and resolve these problems, engineers need observability tools that show how the system behaves in each of these areas.
This wiki page serves as a central reference for Linux observability tools that can be used to investigate performance issues from four critical perspectives:
- Disk: Analyze I/O patterns, latency, and throughput to identify storage-related bottlenecks.
- Memory: Monitor physical and virtual memory usage, NUMA behavior, and reclaim activity.
- File System: Trace file operations, inspect cache statistics, and locate latency sources to optimize file access.