Last active
November 8, 2024 17:00
Excellent discussion on OOM-Killer
Out of Memory (OOM) refers to a computing state where all available memory, including swap space, has been allocated.
Left unhandled, this causes the system to panic or stop functioning as expected.
There is a switch that controls OOM behavior in /proc/sys/vm/panic_on_oom.
When set to 1, the kernel will panic on OOM.
A setting of 0 instructs the kernel to call a function named oom_killer on an OOM.
Usually, oom_killer can kill rogue processes and the system will survive.
The easiest way to change this is to echo the new value to /proc/sys/vm/panic_on_oom.
# cat /proc/sys/vm/panic_on_oom
1
# echo 0 > /proc/sys/vm/panic_on_oom
# cat /proc/sys/vm/panic_on_oom
0
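The two settings described above can also be read back with a small helper.
This is only a minimal sketch: oom_policy is a hypothetical name, and the optional file argument exists just so the function can be tried against a scratch file instead of the live /proc entry.

```shell
# Print which OOM behavior is currently selected.
# $1 (optional): file to read; defaults to the real switch in /proc.
oom_policy() {
    policy_file="${1:-/proc/sys/vm/panic_on_oom}"
    policy_value="$(cat "$policy_file")" || return 1
    case "$policy_value" in
        0) echo "oom_killer" ;;            # kernel calls oom_killer on OOM
        1) echo "panic" ;;                 # kernel panics on OOM
        *) echo "other: $policy_value" ;;  # any value not covered above
    esac
}
```

Calling oom_policy with no argument reports the setting of the running system.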
The RHEL5 kernel includes 2 files for each process to control when that process will be considered for termination when the system must start OOM killing.
/proc/<pid>/oom_adj - Adjust the oom-killer score.
This file can be used to adjust the score used to select which processes shall be killed in an out-of-memory situation.
Giving a process a high score increases the likelihood of this process being killed by the oom-killer.
Valid values are in the range [-16:15], plus the special value -17, which disables oom-killing that process altogether.
Example: echo 15 > /proc/<pid>/oom_adj significantly increases the likelihood that process <pid> will be OOM killed.
Example: echo -16 > /proc/<pid>/oom_adj significantly decreases the likelihood that process <pid> will be OOM killed.
Example: echo -17 > /proc/<pid>/oom_adj disables OOM killing for process <pid> entirely.
NOTE: The oom_adj value is inherited by child processes during fork() operations.
/proc/<pid>/oom_score - Display current oom-killer score.
This file can be used to check the current score that the oom-killer uses for any given <pid>.
Use it together with /proc/<pid>/oom_adj to tune which process will be killed in an out-of-memory situation.
Example: cat /proc/<pid>/oom_score will display the current OOM score for process <pid>.
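To see which processes the oom-killer would currently favor, the oom_score files can be read for every PID and sorted.
The following is a rough sketch, not a standard tool: top_oom_scores is a hypothetical name, the optional second argument (a /proc-like directory) exists only for testing, and it assumes a kernel new enough to provide /proc/<pid>/comm.

```shell
# Print "score pid name" for the N highest-scoring processes, highest first.
# $1 (optional): how many lines to print, default 10.
# $2 (optional): proc root to scan, default /proc.
top_oom_scores() {
    top_count="${1:-10}"
    top_root="${2:-/proc}"
    for top_dir in "$top_root"/[0-9]*; do
        # Processes may exit while we iterate, so skip unreadable entries.
        [ -r "$top_dir/oom_score" ] || continue
        top_score="$(cat "$top_dir/oom_score" 2>/dev/null)" || continue
        top_name="$(cat "$top_dir/comm" 2>/dev/null)" || top_name="?"
        printf '%s %s %s\n' "$top_score" "${top_dir##*/}" "$top_name"
    done | sort -k1,1nr -k2,2n | head -n "$top_count"
}
```

Running top_oom_scores 5 prints the five most likely victims on the live system.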
In this RHEL5-era kernel, a function called badness() is used to determine the actual score for each process.
This is done by adding up 'points' for each examined process.
The process scoring is done in the following way:
The basis of each process's score is its memory size.
The memory size of any of the process's children (not including a kernel thread) is also added to the score.
The process's score is increased for niced processes and decreased for long-running processes.
Processes with the CAP_SYS_ADMIN and CAP_SYS_RAWIO capabilities have their scores reduced.
The final score is then bitshifted by the value saved in the oom_adj file.
Thus, a process with the highest oom_score value will most probably be a non-privileged, recently started process that, along with its children, uses a large amount of memory, has been niced, and handles no raw I/O.
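The last scoring step, the bitshift by oom_adj, is easy to underestimate: every step of oom_adj doubles or halves the score.
The sketch below illustrates only that final step and is not the kernel's code. The name apply_oom_adj is hypothetical, the raw points value is assumed to be already computed, and the special value -17 is modeled as a score of 0 to reflect that the process is exempt.

```shell
# Apply an oom_adj value to a raw points total.
# $1: raw points (non-negative integer), $2: oom_adj (-17..15).
apply_oom_adj() {
    adj_points="$1"
    adj_value="$2"
    if [ "$adj_value" -eq -17 ]; then
        # Assumption for this sketch: an exempt process is shown as score 0.
        echo 0
    elif [ "$adj_value" -gt 0 ]; then
        # Positive values shift left: the score doubles per step.
        echo $(( adj_points << adj_value ))
    elif [ "$adj_value" -lt 0 ]; then
        # Negative values shift right: the score halves per step.
        adj_shift=$(( 0 - adj_value ))
        echo $(( adj_points >> adj_shift ))
    else
        echo "$adj_points"
    fi
}
```

For example, apply_oom_adj 1000 3 prints 8000, while apply_oom_adj 1000 -2 prints 250.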
Each process in Linux has an OOM score assigned to it.
Its value is primarily based on the amount of memory a process uses.
Whenever the system is about to run out of memory, the OOM killer terminates the program with the highest score.
To prevent it from killing a critical application, such as a database instance, the score can be manually adjusted.
This is possible through /proc/[pid]/oom_score_adj (or /proc/[pid]/oom_adj for kernels older than 2.6.36).
The range of values which oom_score_adj accepts is from -1000 to 1000, or from -17 to 15 in the deprecated interface that relies on oom_adj.
The score is either reduced or increased by the adjustment value.
For example, to reduce the chances of losing a postgres process:
# ps axf | grep postgres
107425 pts/4 S+ 0:00 \_ grep postgres
107191 ? S 0:00 /opt/.../lib/psql/bin/postgres -D /var/spool/tractor/psql -p 9876
# cat /proc/107191/oom_score
124
# echo -1000 > /proc/107191/oom_score_adj
# cat /proc/107191/oom_score
0
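The manual echo above can be wrapped in a helper that rejects values outside the -1000 to 1000 range before touching /proc.
This is a sketch under stated assumptions: set_oom_score_adj is a hypothetical name, the optional third argument (a /proc-like directory) is there only for testing, and lowering a score on a real system typically needs root.

```shell
# Write an adjustment to <proc_root>/<pid>/oom_score_adj after validating it.
# $1: pid, $2: adjustment (-1000..1000), $3 (optional): proc root, default /proc.
set_oom_score_adj() {
    set_pid="$1"
    set_adj="$2"
    set_root="${3:-/proc}"
    # Reject empty strings, a lone "-", and anything that is not an integer.
    case "$set_adj" in
        ''|-|*[!0-9-]*|?*-*)
            echo "oom_score_adj must be an integer, got '$set_adj'" >&2
            return 2 ;;
    esac
    if [ "$set_adj" -lt -1000 ] || [ "$set_adj" -gt 1000 ]; then
        echo "oom_score_adj must be between -1000 and 1000, got $set_adj" >&2
        return 2
    fi
    set_target="$set_root/$set_pid/oom_score_adj"
    if [ ! -w "$set_target" ]; then
        echo "cannot write $set_target (no such process, or not permitted)" >&2
        return 1
    fi
    echo "$set_adj" > "$set_target"
}
```

With this in place, the postgres example becomes set_oom_score_adj 107191 -1000.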
@makeittotop brilliant! thanks for sharing.
This is outdated. The current score is just RSS+swap usage for a process. No adjusting score for nice or long-running processes since kernel v2.6.36, no bonuses for CAP_SYS_ADMIN since kernel v4.17.
Thanks @AlexeyDemidov, good to know!
@makeittotop Great explanation about oom killer.