The system reboots with the below error message:
-bash: fork: Cannot allocate memory
Unable to log in as the user, with the following error on the console:
System error for create: “Cannot allocate memory”
The system does not respond and gives the below error at login or while running commands:
-bash: fork: Cannot allocate memory
To get out of this condition without rebooting, you can trigger the OOM killer manually as follows:
echo 1 > /proc/sys/kernel/sysrq
echo f > /proc/sysrq-trigger
echo 0 > /proc/sys/kernel/sysrq
How does the OOM killer decide which process to kill first?
If processes use up so much memory that the stability of the system is threatened, the OOM killer comes into the picture.
NOTE: It is the task of the OOM Killer to continue killing processes until enough memory is freed for the smooth functioning of the rest of the processes that the kernel is attempting to run.
The OOM Killer has to select the best process(es) to kill. Best here refers to that process which will free up the maximum memory upon killing and is also the least important to the system.
The primary goal is to kill the smallest number of processes, minimizing the damage done while maximizing the amount of memory freed.
To facilitate this, the kernel maintains an oom_score for each of the processes. You can see the oom_score of each of the processes in the /proc filesystem under the pid directory.
$ cat /proc/10292/oom_score
The higher the value of oom_score of any process, the higher is its likelihood of getting killed by the OOM Killer in an out-of-memory situation.
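As a quick sanity check, you can read the score of your own shell and list the current top candidates. The one-liner below is only an illustrative sketch; process names and scores will differ per system:

```shell
# Read the oom_score of the current shell ($$ expands to its PID)
cat /proc/$$/oom_score

# Illustrative: list the five processes with the highest oom_score
for pid in /proc/[0-9]*; do
    printf '%s %s\n' "$(cat "$pid/oom_score" 2>/dev/null)" \
                     "$(cat "$pid/comm" 2>/dev/null)"
done | sort -rn | head -5
```

The `2>/dev/null` redirections simply ignore processes that exit while the loop is running.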
How is the oom_score calculated?
The calculation is a simple question of what percentage of the available memory is being used by the process. If the system as a whole is short of memory, then “available memory” is the sum of all RAM and swap space available to the system.
If the OOM situation is instead caused by exhausting the memory allowed to a given cpuset/control group, then “available memory” is the total amount allocated to that control group. A similar calculation is made if limits imposed by a memory policy have been exceeded. In each case, the memory use of the process is deemed to be the sum of its resident set (the number of RAM pages it is using) and its swap usage.
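On a system using cgroup v2 you can see which control group a process belongs to and what memory limit bounds its “available memory”. The slice path below is only an example and may not exist on your machine:

```shell
# Which control group does this shell belong to?
cat /proc/self/cgroup

# Memory limit bounding "available memory" for an example cgroup v2
# group ("max" means unlimited); the path is illustrative
cat /sys/fs/cgroup/system.slice/memory.max 2>/dev/null || true
```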
This calculation produces a percent-times-ten number as a result; a process which is using every byte of the memory available to it will have a score of 1000, while a process using no memory at all will get a score of zero. There are very few heuristic tweaks to this score, but the code does still subtract a small amount (30) from the score of root-owned processes on the notion that they are slightly more valuable than user-owned processes.
One other tweak which is applied is to add the value stored in each process’s oom_score_adj variable, which can be adjusted via /proc. This knob allows the adjustment of each process’s attractiveness to the OOM killer from user space; setting it to -1000 will disable OOM kills for that process entirely, while setting it to +1000 is the equivalent of painting a large target on the associated process.
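For example, to expose or shield a process by hand (the PID 1234 below is a placeholder; lowering the value requires root, while raising it needs no special privilege):

```shell
# Make the current shell a more attractive OOM victim
echo 500 > /proc/$$/oom_score_adj
cat /proc/$$/oom_score_adj

# Shield a critical process entirely (placeholder PID; requires root):
# echo -1000 > /proc/1234/oom_score_adj
```

Children inherit the parent’s oom_score_adj across fork, so setting it on a shell also affects the commands it launches.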
Will Linux start killing processes without asking if memory gets short?
There are two different out-of-memory conditions you can encounter in Linux. Which one you encounter depends on the value of the sysctl vm.overcommit_memory (/proc/sys/vm/overcommit_memory).
The kernel can perform what is called ‘memory overcommit’. This is when the kernel allocates programs more memory than is really present in the system, in the hope that the programs won’t actually use all the memory they have allocated, which is quite a common occurrence.
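You can inspect the current policy directly (and, as root, change it); a quick sketch:

```shell
# Read the current overcommit policy:
#   0 = heuristic overcommit (the default), 1 = always overcommit, 2 = never
cat /proc/sys/vm/overcommit_memory

# Equivalent, via sysctl:       sysctl vm.overcommit_memory
# Switch to strict accounting at runtime (requires root):
#   sysctl -w vm.overcommit_memory=2
```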
overcommit_memory = 2
When overcommit_memory is set to 2, the kernel does not perform any overcommit at all. Instead, when a program is allocated memory, it is guaranteed to have that memory. If the system does not have enough free memory to satisfy an allocation request, the kernel will just return a failure for the request. It is up to the program to handle this situation gracefully. If it does not check whether the allocation succeeded and the allocation actually failed, the application will often encounter a segfault.
In the case of the segfault, you should find a line such as this in the output of dmesg:
[1962.987529] myapp: segfault at 0 ip 00400559 sp 5bc7b1b0 error 6 in myapp[400000+1000]
The “at 0” means that the application tried to access a NULL pointer, which can be the result of an unchecked, failed memory allocation call (but it is not the only possible cause).
overcommit_memory = 0 and 1
When overcommit_memory is set to 0 or 1, overcommit is enabled, and programs are allowed to allocate more memory than is really available.
However, when a program wants to use the memory it was allocated, but the kernel finds that it doesn’t actually have enough memory to satisfy it, it needs to get some memory back. It first tries to perform various memory cleanup tasks, such as flushing caches, but if this is not enough it will then terminate a process. This termination is performed by the OOM-Killer. The OOM-Killer looks at the system to see what programs are using what memory, how long they’ve been running, who’s running them, and a number of other factors to determine which one gets killed.
After the process has been killed, the memory it was using is freed up, and the program which just caused the out-of-memory condition now has the memory it needs.
However, even in this mode, programs can still be denied allocation requests. When overcommit_memory is 0, the kernel tries to take a best guess at when it should start denying allocation requests. When it is set to 1, I’m not sure exactly what criteria it uses, but it can still deny very large requests.
You can see if the OOM-Killer was involved by looking at the output of dmesg and finding messages such as:
[11686.043641] Out of memory: Kill process 2603 (flasherav) score 761 or sacrifice child
[11686.043647] Killed process 2603 (flasherav) total-vm:1498536kB, anon-rss:721784kB, file-rss:4228kB
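A quick way to search the kernel ring buffer for such events (a sketch; on hardened systems reading dmesg may require root):

```shell
# Search the kernel log for OOM-killer activity
dmesg 2>/dev/null | grep -iE 'out of memory|oom-killer' \
    || echo "no OOM events found"
```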