Debugging Linux oom-killer – little to no swap use


We build a system that’s intended to be on all the time: it collects and displays graphs of data. If we leave it running unchanged for long enough, we end up with an oom-killer event. That kills our main process (it has the highest oom score) and our software gets restarted.

Basics: The system is CentOS 6, kernel is 2.6.32.26. The system has 2 GB of RAM and 4 GB of swap. The application is written in C++ with Qt 3.

I’ve set a cron job to grab the contents of /proc/meminfo and /proc/slabinfo every minute. Here are the traces I find most interesting from the meminfo data (the most recent oom-killer event is on the right side of the graph):
[Graph: meminfo traces]
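
For anyone wanting to reproduce this kind of per-minute trace, here is a minimal sketch of turning one /proc/meminfo snapshot into numbers that can be graphed. The function name `parse_meminfo` is hypothetical and not part of the original setup; it assumes the usual `Name:   value kB` layout of that file.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field_name: value} dict.

    Values are the integers as printed (kB for most fields; a few,
    such as HugePages_Total, are plain counts with no unit).
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if not parts:
            continue
        try:
            fields[name.strip()] = int(parts[0])
        except ValueError:
            continue  # skip anything that is not a plain integer
    return fields


# Illustrative snapshot only; the numbers are made up, not measured.
SAMPLE = """MemTotal:        2058000 kB
SwapFree:        4190000 kB
Slab:             600000 kB
SReclaimable:      88000 kB
SUnreclaim:       512000 kB
"""

snapshot = parse_meminfo(SAMPLE)
print(snapshot["SUnreclaim"], "kB of unreclaimable slab")
```

On a live system the same function could be fed the contents of /proc/meminfo once a minute, with SUnreclaim and SwapFree appended to a log for plotting.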

Note that SUnreclaim grows until the oom-killer hits. The change in slope on SUnreclaim is where I switched displays.

Here are some interesting traces from the slabinfo data:
[Graph: slabinfo sizes]
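
One way to narrow down which slab cache is growing is to compare two of the per-minute /proc/slabinfo snapshots. The sketch below is only an illustration: `parse_slabinfo` and `top_growers` are hypothetical names, the size estimate is simply `num_objs * objsize` (it ignores per-slab overhead), and it assumes the version 2.1 column layout where the first four columns are name, active_objs, num_objs and objsize.

```python
def parse_slabinfo(text):
    """Return {cache_name: approximate bytes} from /proc/slabinfo text."""
    sizes = {}
    for line in text.splitlines():
        # Skip blank lines, the "slabinfo - version" line and the "# name" header.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        parts = line.split()
        if len(parts) < 4:
            continue
        try:
            num_objs, objsize = int(parts[2]), int(parts[3])
        except ValueError:
            continue  # unexpected layout; ignore the line
        sizes[parts[0]] = num_objs * objsize
    return sizes


def top_growers(before, after, limit=10):
    """Rank caches by how many bytes they gained between two snapshots."""
    growth = {name: after[name] - before.get(name, 0) for name in after}
    ranked = sorted(growth.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]
```

Feeding it the oldest and newest snapshots from the cron job would list the caches that account for most of the SUnreclaim growth, which in turn hints at the kernel subsystem or driver involved.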

What this looks like to me is that something’s leaking or fragmenting. Whatever it is does seem to get cleaned up when my processes die, but I honestly have no idea what’s going on here.

How do I figure out what’s leaking?

Updated:
Early on in this process, I started with ps output (not shown here). All of our processes’ RSS values ramp up quickly to their ‘normal’ level and then stay put. If this were a process running away with normal memory, I wouldn’t need assistance. Instead, there’s something we’re doing that’s causing unswappable memory to be allocated.

As to the upgrade suggestion: The codebase has a lot of dependencies on old libraries, and I can’t make a transition to even a 3 series kernel right now.

Answer

You’ve asked two questions.

1) If the OOM killer runs and you see no swapping, this likely relates to your vm.swappiness setting. Try setting it to 1. On your antiquated and highly hackable kernel (shudder), setting it to 0, as I recall, disables swapping completely, which likely isn’t what you’re after.

2) Determining which program is leaking might be as easy as running ps auxww repeatedly and looking for constantly increasing RSS values or some other steadily growing metric.
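
The repeated-ps check in point 2 can be sketched roughly as follows. This is only an illustration under stated assumptions: `parse_ps_rss` and `rss_growth` are hypothetical helper names, and the parser assumes the standard `ps aux` column order (USER, PID, %CPU, %MEM, VSZ, RSS, TTY, STAT, START, TIME, COMMAND) with RSS reported in kB.

```python
def parse_ps_rss(text):
    """Return {pid: (rss_kb, command)} from `ps auxww` output text."""
    procs = {}
    for line in text.splitlines():
        # Split into at most 11 fields so the command keeps its spaces.
        parts = line.split(None, 10)
        if len(parts) < 11:
            continue
        try:
            pid, rss_kb = int(parts[1]), int(parts[5])
        except ValueError:
            continue  # header line or unexpected layout
        procs[pid] = (rss_kb, parts[10])
    return procs


def rss_growth(before, after):
    """List (pid, delta_kb, command) for processes whose RSS increased."""
    grown = []
    for pid, (rss_kb, command) in after.items():
        if pid in before and rss_kb > before[pid][0]:
            grown.append((pid, rss_kb - before[pid][0], command))
    # Largest increase first.
    return sorted(grown, key=lambda item: item[1], reverse=True)
```

Comparing snapshots taken hours apart makes a slow leak easier to see than eyeballing individual runs. If no process shows growth, as the question's update reports, the memory is probably being consumed outside ordinary process RSS.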

All this said…

Your kernel is very old. PHP is capped at 5.3 (highly hackable). OpenSSL is buggy. Many related libraries are old and may be the source of memory leaks.

It’s likely best to upgrade to a recent distro. A simple upgrade may install more recent code which addresses your memory leak.
