@erhangundogan
Last active May 14, 2025 10:18

There are numerous open-source tools available for troubleshooting operating systems across various aspects like running processes, open ports, network connections, hardware usage (CPU, memory, disk, network), and logs. Below is a categorized list of popular open-source tools that can help with these tasks, primarily focusing on Linux/Unix systems, but some are cross-platform and work on Windows or macOS as well.

1. Running Processes

  • top: Real-time system-monitoring tool for Unix systems, displays CPU and memory usage by processes.
    • Alternative: htop (enhanced, user-friendly version with a colorful interface and process management features).
  • ps: Displays a snapshot of current processes, highly customizable for filtering and formatting output.
  • pidstat (part of sysstat): Monitors individual process statistics like CPU, memory, and I/O usage over time.
  • Glances: Cross-platform system-monitoring tool that provides a comprehensive view of processes, CPU, memory, and more in a single interface.
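
As a rough illustration of the ps-based approach, the sketch below ranks processes by CPU share. The function name top_by_cpu is hypothetical, and it assumes three-column input with a header row, as produced by ps -eo pid,pcpu,comm.

```shell
#!/bin/bash
# top_by_cpu N: read "PID %CPU COMMAND" rows (header row first) on stdin and
# print the N rows with the highest %CPU, highest first.
# Hypothetical helper; the input layout is assumed to match
# `ps -eo pid,pcpu,comm`.
top_by_cpu() {
    local n="${1:-5}"
    # Drop the header, sort numerically on column 2 (descending), keep N rows.
    tail -n +2 | sort -k2,2nr | head -n "$n"
}

# Typical use on a live system:
#   ps -eo pid,pcpu,comm | top_by_cpu 5
```

Reading from stdin keeps the ranking logic separate from the ps call, so the same function can be run against saved snapshots.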

2. Open Ports

  • netstat (part of net-tools): Displays open ports, active connections, and routing tables. The net-tools package is deprecated on many modern distributions in favor of ss and ip.
  • ss: Modern replacement for netstat, faster and provides detailed socket statistics, including open ports and listening services.
  • nmap: Network exploration tool and port scanner, useful for discovering open ports and services on local or remote systems.
  • lsof: Lists open files, including network sockets, to identify which processes are using specific ports.
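
A minimal sketch of turning ss output into a plain list of listening TCP ports follows. The helper name listening_ports is an assumption, and it expects the headerless five-column layout of ss -H -tln (State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port); mixing -t and -u adds a leading Netid column and would shift the fields.

```shell
#!/bin/bash
# listening_ports: read lines in the layout of `ss -H -tln` on stdin and
# print each distinct local port once, sorted numerically.
# Hypothetical helper; column 4 is assumed to be Local Address:Port.
listening_ports() {
    # The port is whatever follows the last colon, which also covers
    # IPv6 forms such as [::]:22.
    awk '{ n = split($4, part, ":"); print part[n] }' | sort -un
}

# Typical use on a live system:
#   ss -H -tln | listening_ports
```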

3. Open Network Connections

  • netstat and ss: As mentioned, both show active network connections, including TCP/UDP states and associated processes.
  • iftop: Displays real-time network bandwidth usage by connection, showing active network connections and their data rates.
  • nload: Monitors network traffic and bandwidth usage, useful for identifying high-traffic connections.
  • Wireshark: Open-source packet analyzer for detailed inspection of network connections and traffic (cross-platform).
  • tcpdump: Command-line packet analyzer for capturing and analyzing network traffic, useful for troubleshooting connection issues.
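
To get a quick feel for connection health, one option is to tally TCP connections by state. The sketch below assumes input shaped like ss -tan output (one header row, state in the first column); count_by_state is a hypothetical name.

```shell
#!/bin/bash
# count_by_state: read `ss -tan`-style output (header row first) on stdin
# and print "<count> <state>" pairs, most frequent state first.
# Hypothetical helper, not part of ss itself.
count_by_state() {
    awk 'NR > 1 { seen[$1]++ } END { for (s in seen) print seen[s], s }' |
        sort -k1,1nr
}

# Typical use on a live system:
#   ss -tan | count_by_state
```

A large TIME-WAIT or CLOSE-WAIT count in this summary is often a cue to look more closely with tcpdump or Wireshark.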

4. Hardware Usage (CPU, Memory, Disk, Network)

  • CPU Usage:
    • top/htop: Real-time CPU usage per process and system-wide.
    • mpstat (sysstat): Detailed CPU usage statistics, including per-core metrics.
    • vmstat: Reports virtual memory and CPU usage statistics.
  • Memory Usage:
    • free: Displays memory usage (total, used, free, and cached).
    • vmstat: Provides memory statistics alongside CPU.
    • htop/Glances: Visualizes memory usage by processes.
  • Disk Usage:
    • df: Reports disk space usage for mounted filesystems.
    • du: Estimates file and directory space usage.
    • iotop: Monitors disk I/O usage by processes in real-time.
    • iostat (sysstat): Provides disk I/O and CPU statistics.
  • Network Usage:
    • iftop/nload: Real-time network bandwidth monitoring.
    • vnstat: Lightweight network traffic monitor with historical data tracking.
    • bmon: Bandwidth monitor for network interfaces.
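
As one possible use of df in a check, the sketch below flags filesystems above a usage threshold. It assumes the POSIX layout of df -P (capacity in column 5, mount point in column 6); the name disks_over and the default limit of 80 are illustrative choices.

```shell
#!/bin/bash
# disks_over LIMIT: read `df -P` output on stdin and print
# "<mount point> <used%>" for every filesystem at or above LIMIT percent.
# Hypothetical helper; mount points containing spaces are not handled.
disks_over() {
    local limit="${1:-80}"
    awk -v limit="$limit" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)            # "91%" -> "91"
        if (pct + 0 >= limit) print $6, pct "%"
    }'
}

# Typical use on a live system:
#   df -P | disks_over 90
```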

5. Logs

  • journalctl (systemd): Queries and displays logs from the systemd journal, widely used in modern Linux distributions.
  • tail: Monitors log files in real-time (e.g., tail -f /var/log/syslog).
  • less/more: Paginates and searches through large log files.
  • grep: Filters log files for specific patterns or errors (e.g., grep "error" /var/log/messages).
  • logrotate: Manages and rotates log files to prevent disk space issues (not for analysis but for log maintenance).
  • rsyslog/syslog-ng: Centralized log management systems for collecting, filtering, and storing logs.
  • ELK Stack (Elasticsearch, Logstash, Kibana): Open-source suite for centralized log aggregation, analysis, and visualization (more complex setup).
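
For a quick first pass over a log file, the grep approach can be wrapped so that several patterns are counted in one go. This is only a sketch: count_matches is a hypothetical helper, and the log path in the usage comment is just the example path used above.

```shell
#!/bin/bash
# count_matches FILE PATTERN...: for each pattern, print
# "<pattern>: <number of matching lines>" (case-insensitive).
# Hypothetical helper built on grep -c, which counts lines, not occurrences.
count_matches() {
    local file="$1"
    shift
    local pattern
    for pattern in "$@"; do
        # grep -c prints 0 (and exits non-zero) when nothing matches.
        printf '%s: %s\n' "$pattern" "$(grep -ci -- "$pattern" "$file")"
    done
}

# Typical use on a live system:
#   count_matches /var/log/syslog error warn "out of memory"
```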

6. Comprehensive Monitoring Tools

  • Nagios Core: Open-source monitoring system for tracking system resources, network services, and logs (requires setup).
  • Zabbix: Enterprise-grade monitoring tool for hardware, network, and application performance with log monitoring capabilities.
  • Prometheus: Time-series-based monitoring tool with powerful querying, often paired with Grafana for visualizing CPU, memory, disk, and network metrics.
  • Cockpit: Web-based interface for Linux server management, including process, network, storage, and log monitoring.

7. Cross-Platform Tools

  • Wireshark: Packet analysis for network troubleshooting (Linux, Windows, macOS).
  • Glances: System monitoring with a web interface (Linux, Windows, macOS).
  • Sysdig: System-level monitoring and troubleshooting with detailed insights into processes, network, and disk (Linux, with some Windows support).
  • Process Explorer (Windows-only; Microsoft Sysinternals freeware rather than open source): A more detailed alternative to Task Manager for inspecting processes and resource usage on Windows systems.

Installation Notes

  • Most of these tools are available in the package repositories of Linux distributions (e.g., apt install htop on Debian/Ubuntu, yum install htop on CentOS, where htop may require the EPEL repository, or pacman -S htop on Arch).
  • For cross-platform tools like Wireshark or Glances, check their official websites or package managers (e.g., Homebrew on macOS, Chocolatey on Windows).
  • Some tools like ELK Stack, Nagios, or Zabbix require more complex setup but offer powerful features for large-scale systems.
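
Because the install command differs per platform, a script that installs these tools usually has to find out which package manager is present first. The sketch below shows one way to do that; detect_pkg_manager is a hypothetical name, and the candidate list simply mirrors the managers mentioned above.

```shell
#!/bin/bash
# detect_pkg_manager: print the first package manager found on PATH,
# or "unknown" (with a non-zero status) when none of the candidates exist.
# Hypothetical helper; the candidate list is an assumption, not exhaustive.
detect_pkg_manager() {
    local candidate
    for candidate in apt-get dnf yum pacman brew choco; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    printf 'unknown\n'
    return 1
}

# Typical use:
#   echo "Package manager: $(detect_pkg_manager)"
```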

Recommendations

  • For quick troubleshooting: Use htop, ss, iotop, and journalctl for a lightweight, real-time overview.
  • For network issues: Combine nmap, Wireshark, and tcpdump for detailed analysis.
  • For comprehensive monitoring: Set up Prometheus with Grafana or Zabbix for long-term insights.
  • For logs: Start with journalctl or grep for quick checks, and scale to ELK Stack for centralized log management.

There are open-source tools that combine multiple troubleshooting capabilities (e.g., monitoring processes, ports, network connections, hardware usage, and logs) into a single solution, allowing on-demand execution to diagnose and report system issues. These tools typically provide a unified interface or scriptable workflows to identify problems and generate reports, making them ideal for troubleshooting. Below, I’ve listed some of the most relevant open-source tools that meet your criteria, focusing on their ability to run on demand, troubleshoot across multiple system aspects, and report issues.

1. Sysdig

  • Description: Sysdig is an open-source, system-level monitoring and troubleshooting tool that provides deep visibility into processes, network connections, file I/O, CPU, memory, and logs. It’s designed for on-demand diagnostics and can generate detailed reports.
  • Features:
    • Monitors running processes with CPU/memory usage and parent-child relationships.
    • Captures network connections and open ports (similar to netstat or ss).
    • Tracks disk I/O and file access.
    • Analyzes system calls and logs for debugging.
    • Generates JSON or human-readable reports of system activity.
  • On-Demand Use: Run sysdig with filters (e.g., sysdig -c topprocs_cpu for top CPU processes) to troubleshoot specific issues and output results to a file for reporting.
  • Reporting: Supports exporting traces or summaries to files, which can be parsed or shared.
  • Platform: Primarily Linux, with limited Windows support.
  • Installation: Available via package managers (e.g., apt install sysdig on Ubuntu) or Docker.
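  • Example: sudo sysdig -p "%proc.name %thread.cpu" > report.txt to log CPU usage by process name until interrupted with Ctrl+C. Field names take the form class.name; sysdig -l lists the available fields.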
  • Why It Fits: Combines process, network, disk, and log analysis in one tool with flexible, on-demand execution and reporting.

2. Glances

  • Description: Glances is a cross-platform, open-source system monitoring tool that provides a comprehensive view of system resources (CPU, memory, disk, network, processes, and logs) in a single interface. It’s lightweight and can be run on demand.
  • Features:
    • Displays top processes by CPU/memory usage.
    • Shows network usage and open connections.
    • Monitors disk I/O and filesystem usage.
    • Alerts on high resource usage or potential issues (e.g., memory saturation).
    • Can export data to CSV, JSON, or integrate with external tools like Grafana.
  • On-Demand Use: Run glances in terminal or web mode (glances -w) for real-time diagnostics. Use --export for reports.
  • Reporting: Exports metrics to files or databases, with customizable thresholds for issue detection.
  • Platform: Linux, Windows, macOS.
  • Installation: Install via pip (pip install glances) or package managers (apt install glances).
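  • Example: glances --export csv --export-csv-file /tmp/glances-report.csv to write system metrics to a CSV file (Glances 3 and later; older releases used glances --export-csv /tmp/glances-report.csv).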
  • Why It Fits: Provides a unified view of system health, detects anomalies, and supports on-demand reporting.

3. Cockpit

  • Description: Cockpit is an open-source, web-based server management tool for Linux that allows on-demand monitoring and troubleshooting of processes, network, storage, and logs. It’s user-friendly for both admins and developers.
  • Features:
    • Monitors running processes and resource usage (CPU, memory).
    • Displays disk usage and I/O activity.
    • Shows network interfaces and traffic.
    • Views and searches system logs (via journalctl integration).
    • Alerts on critical issues like high CPU or disk space.
  • On-Demand Use: Access Cockpit via a browser (http://localhost:9090) to check system status and diagnose issues in real time.
  • Reporting: Logs and metrics can be exported manually, or you can integrate with external tools for automated reporting.
  • Platform: Linux (primarily RHEL, Fedora, Ubuntu, Debian).
  • Installation: apt install cockpit on Debian/Ubuntu or dnf install cockpit on Fedora.
  • Example: Use the web interface to view logs and save them as text files for reporting.
  • Why It Fits: Combines multiple monitoring aspects in a web interface, ideal for quick, on-demand troubleshooting.

4. Prometheus with Node Exporter

  • Description: Prometheus is an open-source monitoring system with a time-series database, and its Node Exporter collects host-level metrics (CPU, memory, disk, network, and aggregate process counts rather than per-process detail). While typically used for continuous monitoring, it can be run on demand for troubleshooting.
  • Features:
    • Collects detailed metrics on CPU, memory, disk, and network usage.
    • Monitors open file descriptors and process counts.
    • Tracks network interface statistics.
    • Integrates with Grafana for visualizing issues.
    • Alerts on thresholds (e.g., high CPU usage).
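  • On-Demand Use: Start Node Exporter (./node_exporter) and read its /metrics endpoint directly, or, if a Prometheus server is already scraping it, query Prometheus via its web UI or HTTP API to analyze metrics and identify issues.
  • Reporting: Node Exporter serves metrics as plain text; the Prometheus HTTP API returns query results as JSON, and Grafana dashboards provide visual reports.
  • Platform: Linux and other Unix-like systems, including macOS; Windows hosts are covered by the separate windows_exporter.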
  • Installation: Download binaries from the Prometheus website or use package managers.
  • Example: curl http://localhost:9100/metrics > metrics.txt to save system metrics for analysis.
  • Why It Fits: Comprehensive metrics collection with flexible querying for on-demand diagnostics, though setup is slightly more involved.
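
The raw /metrics output is long, so for on-demand checks it helps to pull out a single metric. The sketch below filters Prometheus text-format input by metric name; metric_samples is a hypothetical helper, and the port 9100 in the usage comment is the one used in the example above.

```shell
#!/bin/bash
# metric_samples NAME: read Prometheus text exposition format on stdin and
# print only the sample lines for metric NAME, skipping "# HELP" and
# "# TYPE" comment lines. Hypothetical helper, not part of Node Exporter.
metric_samples() {
    local name="$1"
    # A sample line starts with the metric name followed by "{" (labels)
    # or a space (no labels), so node_load1 does not match node_load15.
    awk -v name="$name" '
        /^#/ { next }
        index($0, name "{") == 1 || index($0, name " ") == 1 { print }
    '
}

# Typical use on a live system:
#   curl -s http://localhost:9100/metrics | metric_samples node_load1
```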

5. Monit

  • Description: Monit is an open-source utility for monitoring and managing system processes, files, directories, and devices. It can be used on demand to check system health and report issues.
  • Features:
    • Monitors processes for CPU/memory usage.
    • Checks network ports and connections.
    • Tracks disk usage and filesystem health.
    • Analyzes logs for errors.
    • Sends alerts via email or executes scripts on issues.
  • On-Demand Use: Run monit status to get a snapshot of system health or use its web interface for interactive troubleshooting.
  • Reporting: Generates status reports and logs alerts to files or email.
  • Platform: Linux, BSD, macOS, and other Unix-like systems (no native Windows support).
  • Installation: apt install monit on Debian/Ubuntu or compile from source.
  • Example: monit report > system-report.txt to save a system status report.
  • Why It Fits: Lightweight and focused on actionable troubleshooting with reporting capabilities.

6. Custom Scripts with Standard Tools

  • Description: You can combine standard open-source tools (e.g., htop, ss, iostat, journalctl) into custom Bash or Python scripts to run on demand, troubleshoot issues, and generate reports. This approach is highly flexible.
  • Features:
    • Use ps or top for process monitoring.
    • Use ss or netstat for ports and connections.
    • Use iostat or df for disk/CPU metrics.
    • Use journalctl or grep for log analysis.
    • Aggregate outputs into a single report file.
  • On-Demand Use: Execute the script (e.g., ./troubleshoot.sh) to collect and analyze data.
  • Reporting: Scripts can output to text, CSV, or JSON files for reporting.
  • Platform: Linux (cross-platform with modifications).
  • Example Script:
    #!/bin/bash
    # Collect a one-shot troubleshooting snapshot into a single report file.
    report="report.txt"
    {
        echo "System Troubleshooting Report - $(date)"
        echo "=== Processes (top 10 by CPU) ==="
        ps aux --sort=-%cpu | head -n 11    # header row plus 10 processes
        echo "=== Open Ports ==="
        ss -tuln
        echo "=== Disk Usage ==="
        df -h
        echo "=== Recent Logs (priority err and above) ==="
        journalctl -p 3 -n 50 --no-pager
    } > "$report"
    echo "Report generated: $report"
  • Why It Fits: Fully customizable to your needs, leveraging existing tools for a tailored troubleshooting solution.
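
One weakness of a script like the example above is that a missing tool (for instance, journalctl on a non-systemd host) leaves a silent gap in the report. A possible refinement, sketched here with a hypothetical section helper, is to wrap each command so the report records what was skipped or failed.

```shell
#!/bin/bash
# section TITLE CMD [ARGS...]: print a heading followed by the command's
# output, or a note when the command is missing or exits non-zero.
# Hypothetical helper for illustration only.
section() {
    local title="$1"
    shift
    printf '=== %s ===\n' "$title"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || printf '(%s exited with status %s)\n' "$1" "$?"
    else
        printf '(skipped: %s not found)\n' "$1"
    fi
}

# Typical use, writing everything to one report file:
#   {
#       section "Processes"   ps aux
#       section "Open Ports"  ss -tuln
#       section "Disk Usage"  df -h
#       section "Recent Logs" journalctl -p 3 -n 50 --no-pager
#   } > report.txt
```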

Recommendations

  • For Simplicity: Use Glances or Cockpit for a quick, unified interface that covers most troubleshooting needs with minimal setup.
  • For Deep Diagnostics: Use Sysdig for detailed system-level insights, especially for complex issues involving processes, network, or I/O.
  • For Flexibility: Write a custom script combining tools like ss, iostat, and journalctl for targeted, on-demand reports.
  • For Advanced Reporting: Set up Prometheus with Node Exporter and Grafana for detailed metrics and visual reports, though it requires more configuration.

Notes

  • Installation: Most tools are available in Linux package repositories (e.g., apt, yum, pacman) or via source compilation. For cross-platform tools like Glances, use pip or platform-specific installers.
  • Reporting: Ensure output formats (e.g., CSV, JSON, text) match your needs for sharing or further analysis.
  • Platform: Most tools are Linux-focused, but Glances and custom scripts can be adapted for Windows/macOS with tools like PowerShell or Process Explorer.
  • Automation: For recurring issues, tools like Monit or Prometheus can be configured to run periodically and alert on problems.