sysadmin desperation

You know it's bad when I'm posting random cries for tech help here... So yeah. It's that bad.

The webcast machine at the club loses its mind at least once a week: it appears to run out of memory and crash, but I can't figure out what the culprit is.

The machine is a dual CPU Athlon 2400+ with 1GB RAM and 500MB swap. It's running Fedora Core 3, but I was also experiencing this problem on FC2 and RH9. Memtest86 says the RAM is fine. It's got an Osprey 100 BT848 video capture card and an SB Live EMU10k1 audio card.

I set up a cron job that once a minute captures the output of "top -bn1" and "ps auxwwf" to a file. Here are a pair of those files from when it loses its mind. Note that the load goes from 3.44 to 22.73 in a minute and a half.
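
The capture script is nothing fancy -- something along these lines (the paths here are made up, not the actual ones):

    #!/bin/sh
    # invoked from cron once a minute: snapshot top and the full process tree
    out=/var/log/memsnap/`date +%Y%m%d-%H%M`
    { top -bn1; ps auxwwf; } > "$out"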

I've compared the two files character by character, and I don't see a smoking gun. The differences look quite trivial to me.

So while I was sitting there staring at this, I saw something very interesting happen: "top" was running on the machine's console, and showed 380MB swap available -- and the oom-killer woke up and shot down an xemacs and an httpd.

So, how's that even possible? Does this mean that some process has gone nuts and started leaking wired pages, so that it can't swap at all? Or what?

So, any ideas?


Update, Dec 29: It looks like something is leaking in the kernel; /proc/slabinfo shows the size-256 slab growing to 3,500,000 entries (over 800MB.) Current suspect is the bttv/v4l driver (since one of the things this machine does is run "streamer" to grab a video frame every few seconds.) That would be about 525 leaked allocations per minute, or around 26 leaks per frame.
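
If you want to watch it happen, something like this will do (the slabinfo column layout varies between kernel versions):

    # log the size-256 slab once a minute; the first two numeric fields are object counts
    while sleep 60; do date; grep '^size-256 ' /proc/slabinfo; done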

kernel 2.6.9-1.681_FC3, xawtv-3.81-6.


Update, Jan 12: That was the culprit. This is the fix:

    --- ./drivers/media/video/bttv-driver.c.orig    2005-01-11 14:54:15.477911088 -0800
    +++ ./drivers/media/video/bttv-driver.c 2005-01-08 13:49:44.000000000 -0800
    @@ -2992,6 +2992,9 @@
                    free_btres(btv,fh,RESOURCE_VBI);
            }
     
    +       videobuf_mmap_free(file, &fh->cap);
    +       videobuf_mmap_free(file, &fh->vbi);
    +
     #ifdef VIDIOC_G_PRIORITY
            v4l2_prio_close(&btv->prio,&fh->prio);
     #endif
    
    --- ./drivers/media/video/video-buf.c.orig      2004-10-18 14:54:08.000000000 -0700
    +++ ./drivers/media/video/video-buf.c   2005-01-08 13:50:04.000000000 -0800
    @@ -889,6 +889,7 @@
            int i;
            
            videobuf_queue_cancel(file,q);
    +        videobuf_mmap_free(file, q);
            INIT_LIST_HEAD(&q->stream);
            for (i = 0; i < VIDEO_MAX_FRAME; i++) {
                    if (NULL == q->bufs[i])

52 Responses:

  1. ajaxxx says:

    what kernel version? the oom killer behaviour is rumored to be quite stupid in some of the 2.6 series.

    • jwz says:

      Currently 2.6.9-1.681_FC3, though like I said, this has been happening since RH9, which was ~2.4.20.

  2. Add to /etc/sysctl.conf:

    kernel.sysrq = 1

    Execute:

    sysctl -p

    When the system starts going psycho, hit Alt-SysRq-T, Alt-SysRq-M, and Alt-SysRq-W. If the system is alive enough, it'll write a whole bunch of nearly nonsensical shit into /var/log/messages. If not, you'll want to get a serial console hooked up to the system to capture it instead. Post it here and we can take a look.
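
    If the console is too wedged to type at but you can still get a shell, the same dumps can be triggered without the keyboard, assuming your kernel has /proc/sysrq-trigger:

    echo t > /proc/sysrq-trigger   # same as Alt-SysRq-T
    echo m > /proc/sysrq-trigger   # same as Alt-SysRq-M
    echo w > /proc/sysrq-trigger   # same as Alt-SysRq-W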

    • brad says:

      SysRq-T often overflows the dmesg ring buffer before syslog gets to it, though.... it helps to increase it, if he's going to be building a new kernel anyway.

      Jamie, you might also log /proc/meminfo on the same interval.

      • Erm, who said anything about building a new kernel?

        Anyhow, if he can throw a serial console onto it (and said serial console is something like another system running minicom, instead of a vintage VT100 or somesuch) it shouldn't have trouble catching it. At least, IME I can't recall seeing customers have problems catching all of the output.

        • brad says:

          I didn't think you said anything that implied a new kernel (though I've been on enough machines without sysreq support), but I gotta imagine he's going to be building a new kernel eventually...

  3. Swap space seems light from what I recall of modern kernels. You might be better off with swap disabled entirely, though I would tend to put it at 2GB. Heck, make a 1.5GB swap file, give it a lower (numerically smaller) swap priority, and see what happens.

    My only real qualifications at this point are being a regular reader of kernel traffic.

  4. Are the 5 zombie processes a cause, or an effect?

    I am so damn tempted to suggest changing distros, just to watch the inevitable jwzrant.

    • jwz says:

      Beats me.

      You may have noticed that I no longer rant about that, now I merely delete-and-ban on the first offense.

    • bodyfour says:

      They're probably an effect -- I bet by this point enough of the parent "httpd" process is swapped out that it just isn't calling wait() very promptly when its children are being OOM-killed.

      A few zombie processes won't cause many problems by themselves -- all the process's resources are already released; it just can't disappear from the process table until its parent calls wait() and retrieves the exit code.

      As I said in my other reply, this looks exactly like a kernel memory leak to me.

      • treptoplax says:

        I bow to your superior nerdliness.

        Given that it sounds like a kernel/driver issue, I'd try running that system in single-CPU mode and see if the problem goes away. It looks like he might have just enough horsepower to get away with this, and I wouldn't be surprised to see that some random video capture driver has bugs with SMP Athlon systems...

    • treptoplax says:

      Effect, I think; those are the OOM-slaughtered apache processes you're seeing there.

      So, plenty of swap, but no RAM, the OOM killer is running rampant (there's nothing to swap out?), and the load is 22. How the hell is the load 22? There aren't twenty-two runnable processes!

      • bodyfour says:

        There probably are -- remember that most of the ones in diskwait "D" state are there because they're trying to run but are swapped out. They count towards load average too.

    • inoshiro says:

      Once my memory usage reaches close to the 768mb of RAM my machine has, I see a similar spike in system load until the OOM kills Mozilla or Thunderbird (the usual large memory chewers here). It, thankfully, won't eat X11. This is on Slackware 10 and Slackware current, both using the latest 2.6.x kernel.

      I think what really needs to happen is to have a good review of the Linux MM system and get these strange problems out (because, at least in my case, shrinking the disk cache would release ~150-200mb, depending on the size it's at). They seem to be an effect of half-baked MM behaviour to me.

      • bodyfour says:

        There has been a lot of talk lately on lkml about making the OOM killer a little less "twitchy" in certain circumstances. Maybe that is the problem you are having, but it isn't what jwz is seeing. He's actually seeing the machine die due to lack of resources -- the OOM kills are just symptoms of that (and they can't help, since the kernel is leaking away the RAM).

        There is one misunderstanding here, though...

        > because, at least in my case, shrinking the disk cache would release ~150-200mb,

        Yes, but is the I/O heavy at the point that the OOM kill happens? It could well be that the system needs that amount of cache to hold your working set.

        People hear "out of memory" and assume that it really means completely out of all RAM and swap with all caches purged. Which is understandable, but it's wrong.

        Imagine a machine with a couple gigabytes of swap and a process that's quickly leaking RAM. Pretty soon all of the inactive processes will end up on swap and you'd be in a thrashing scenario where even the active processes (including the leaker) are pretty much running in swap and going about a million times slower than they should. Now the memory leak is only growing very slowly (since everything is going very slowly) and there's basically no way to log in and kill it. In theory eventually the leaker will run totally out of swap but this could literally take years of thrashing. Obviously this would make the OOM killer pretty useless since the admin would have to hit the Big Red Switch long before it would trigger; this is exactly what OOM-kill is trying to avoid.

        So rather than meaning "I'm totally out of memory" it means "I seem to be approaching thrashing; I'm worried that if I continue I might soon be unresponsive so I should do something NOW before that happens"

        This obviously requires some heuristics to determine and they can never be perfect (simply because the kernel can't predict the future -- for instance there could be a big simulation that's a millisecond away from completing and releasing all its RAM). Like any MM heuristic it's a constant tuning battle to make it work reasonably well on all workloads. As I said earlier there is a lot of talk that it's too trigger-happy for some people, and various tweaks are being tried both in 2.4 and 2.6 that will hopefully make it work better.

        So I guess what I'm trying to say is this -- maybe it really is busted for you, but how much disk cache you currently have isn't really evidence one way or another. Thrashing is thrashing regardless of how much cache or swap you have.

  5. bodyfour says:

    I suspect that your hunch is correct -- you're leaking kernel memory. Add "cat /proc/slabinfo /proc/meminfo" to your periodic stats scripts; hopefully that will show a steady leak.
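
    A minimal version, piggybacking on whatever periodic job you already have (the log path is an assumption):

    { date; cat /proc/slabinfo /proc/meminfo; } >> /var/log/memsnap/slab.log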

    • jwz says:

      Ok, how do I interpret this? The numbers seem to be getting bigger, but I don't know what that means, really.

      (These are both from before I added more swap.)

      • bodyfour says:

        Well to really know for sure you'll need to wait until the problem is clearly exhibiting itself. However I do think the growth of the size-256 slab looks interesting -- up from 66300 to 417480 objects. That's a rate of about 200MB/day which sounds fairly in line with what you're experiencing.

        So what that tells us is that whatever seems to be leaking is a kmalloc() allocation of a size that rounds up to 256 bytes -- bigger than the next-smaller generic cache, but no more than 256.
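
        Back-of-the-envelope, at 256 bytes per object, the growth between your two snapshots (a rough check, since we only know the object counts):

        echo $(( (417480 - 66300) * 256 / 1024 / 1024 ))   # => 85 (MB)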

        Isn't debugging kernels fun? Now you can see how I became such a bitter person.

  6. bodyfour says:

    Oh, also:

    > Dec 23 17:12:36 cerebellum kernel: Active:508 inactive:6352 dirty:0 writeback:5845 unstable:0 free:233 slab:243624 mapped:4779 pagetables:2606

    Yeah, it looks pretty clear there that almost all of the pages on the machine are in the slab cache (i.e. what the kernel uses for dynamic memory allocations... its version of malloc(), so to speak). On your 1G system you have a grand total of 262140 4K pages and 243624 of them are in the slab.
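
    In megabytes, that's:

    echo $(( 243624 * 4 / 1024 ))   # => 951 (MB of slab, out of ~1024)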

    The next step is watching /proc/slabinfo to see what is getting really large before the box croaks -- that at least narrows the bug down a bit.
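
    If you want the worst offenders at a glance, something like this should work on a 2.6 /proc/slabinfo (double-check the column positions against the header line, since they vary between versions):

    # name is field 1, num_objs field 3, objsize field 4; skip the two header lines
    awk 'NR > 2 { printf "%10d KB  %s\n", $3 * $4 / 1024, $1 }' /proc/slabinfo | sort -rn | head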

    Also could you send me an "/sbin/lsmod" output so I can see exactly what's loaded on the box? Thanks.

    • jwz says:
      Module                  Size  Used by
      parport_pc             24705  0
      parport                41737  1 parport_pc
      bttv                  150541  1
      video_buf              21701  1 bttv
      i2c_algo_bit            8521  1 bttv
      v4l2_common             5953  1 bttv
      btcx_risc               4425  1 bttv
      videodev                9665  2 bttv
      md5                     4033  1
      ipv6                  232705  18
      i2c_dev                10433  0
      i2c_core               22081  3 bttv,i2c_algo_bit,i2c_dev
      dm_mod                 54741  0
      button                  6481  0
      battery                 8517  0
      ac                      4805  0
      uhci_hcd               31449  0
      ehci_hcd               31685  0
      snd_emu10k1_synth       7873  0
      snd_emux_synth         38977  1 snd_emu10k1_synth
      snd_seq_virmidi         6593  1 snd_emux_synth
      snd_seq_midi_event      8385  1 snd_seq_virmidi
      snd_seq_midi_emul       6593  1 snd_emux_synth
      snd_seq                56785  4 snd_emux_synth,snd_seq_virmidi,snd_seq_midi_event,snd_seq_midi_emul
      snd_emu10k1            93769  2 snd_emu10k1_synth
      snd_rawmidi            26725  2 snd_seq_virmidi,snd_emu10k1
      snd_pcm_oss            47608  1
      snd_mixer_oss          17217  1 snd_pcm_oss
      snd_pcm                97993  2 snd_emu10k1,snd_pcm_oss
      snd_timer              29765  2 snd_seq,snd_pcm
      snd_seq_device          8137  5 snd_emu10k1_synth,snd_emux_synth,snd_seq,snd_emu10k1,snd_rawmidi
      snd_ac97_codec         64401  1 snd_emu10k1
      snd_page_alloc          9673  2 snd_emu10k1,snd_pcm
      snd_util_mem            4801  2 snd_emux_synth,snd_emu10k1
      snd_hwdep               9413  2 snd_emux_synth,snd_emu10k1
      snd                    54053  12 snd_emux_synth,snd_seq_virmidi,snd_seq,snd_emu10k1,snd_rawmidi,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer,snd_seq_device,snd_ac97_codec,snd_hwdep
      soundcore               9889  2 snd
      b44                    22341  0
      8139too                26305  0
      mii                     4673  2 b44,8139too
      floppy                 58609  0
      sr_mod                 17381  0
      ext3                  116809  17
      jbd                    74969  1 ext3
      aic7xxx               150681  0
      sd_mod                 16961  0
      scsi_mod              118417  3 sr_mod,aic7xxx,sd_mod
      • edolnx says:

        The bttv driver in the kernel is known to be quite buggy, and I don't know if Fedora is patching it or not. My MythTV/Freevo box had similar problems until I applied a lot of v4l patches. Latest and greatest patches can be found here: http://dl.bytesex.org/patches/

    • transgress says:

      hrm, but as I understood it, pretty much everything hits the slab first, even kmalloc() and the likes, and if the request can be satisfied from that, then it is. I am not a super kernel hacker, just a hobbyist, but I was somewhat under the impression there was no way to say 'i want a chunk of memory and it had better not come from the slab damn you'

      nonetheless, your advice about watching slabinfo probably isn't bad.

      jwz, a less scientific method, if the problem proves to be in user space, is to start slaughtering processes one by one until it no longer happens.

      • bodyfour says:

        > it, pretty much everything hits the slab first, even kmalloc() and the likes

        Yes exactly, kmalloc is implemented on top of slab... it basically just converts the size you ask for into the next-larger slab so if you say kmalloc(600,GFP_KERNEL) it says "ah, you want something from the size-1024 slab"
        (It's actually a touch more complicated than that, but that's the basic idea.)
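
        You can see the generic caches for yourself:

        grep '^size-' /proc/slabinfo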

        Other high-volume users of the allocators make their own slab caches to allow them to get allocations of EXACTLY the size they want (plus some other cool features of having your own cache, too esoteric to get into here).

        Slab caching originally appeared in Solaris around 1993 and rapidly spread to most other current OSes. If you're interested in this sort of thing I'd highly recommend you read Jeff Bonwick's original paper on slab caching and his 2001 follow-up paper describing some of the more recent enhancements.

        So watching what slab grows will at least indicate about how big the leaking allocation is which might narrow it down a bit. If we're really lucky it'll be leaking from a special-purpose slab which would be even more revealing.

        > but I was somewhat under the impression there was no way to say 'i want a chunk of memory and it had better not come from the slab damn you'

        Under Solaris there's pretty much not (at least for the kernel heap area) but linux is pretty different in that regard. I'm 99% certain that __get_free_pages() and friends bypass the slab allocator. Also linux has vmalloc() which builds larger-than-1-page contiguous allocations by using the MMU to glue a bunch of pages together; it is also separate. There is a lot more in Chapter 7 of the Linux Device Drivers book, which is conveniently online.

        • transgress says:

          > Slab caching originally appeared in Solaris around 1993 and rapidly spread to most other current OSes.

          Yeah, I knew that. So many people knock Solaris, but it has been 'industry-strength' (tm) for a long time now; then again, maybe I only think people are like that because I sit next to two people at work who hate Solaris.

          > So watching what slab grows will at least indicate about how big the leaking allocation is which might narrow it down a bit. If we're really lucky it'll be leaking from a special-purpose slab which would be even more revealing.

          Ah, for some reason I thought you were suggesting that all of his memory was in slab so that he was running out of memory, which I suppose could be happening. With your explanation above, however, it makes total sense.

          Going from memory, I am pretty sure most of the calls that deal with pages bypass the slab, vmalloc() as well -- because, again going from memory, it uses some of those *pages() functions. Also, I am probably wrong here, but I was under the impression vmalloc() took a bunch of non-contiguous memory and mapped it so that it was contiguous from the caller's point of view; however, that may have been what malloc() itself does. I find myself not using this stuff often enough to remember it.

          Those papers you point out are good, as is 'Understanding the Linux Virtual Memory Manager' (which I seem to remember is also online somewhere). I'm probably about half way through that book, but have set it and Linux Device Drivers aside for a Hermann Hesse book at the moment.

          Didn't he say he had more than one of these boxes? If the others are doing fine and they are the same software-wise, then I would be inclined to think it may be hardware related.

  7. holytramp says:

    1) Do add swap to cover your ram -- I believe recent VMM gets upset if it cannot swap all dirty pages while memory usage gets high. Just add an extra 1G swap file of lower priority than your swap partition and see if it helps (see the sketch below).

    2) Some people are concerned about the stability of apache 2.0.x and still recommend 1.3.x. Personally I do not have any problems with a very similar setup, but my load is much lower (and I have 6GB of swap space -- 2GB on each harddisk I have in the machine).
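
    Re (1), a rough sketch of adding a lower-priority swap file (the size and path are just examples; higher pri numbers win, so give the file a smaller one than the partition gets in /etc/fstab):

    dd if=/dev/zero of=/swapfile bs=1M count=1024
    chmod 600 /swapfile
    mkswap /swapfile
    swapon -p 1 /swapfile
    swapon -s    # verify both areas and their priorities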

    • bodyfour says:

      > 1) Do add swap to cover your ram

      See, sysadmins are superstitious types. For instance back in Version 7 on the PDP-11 there wasn't a real way to formally shut the system down; you just had to make sure the system was quiet, then run "sync" to get all the buffered data out to disk.

      The problem was that when "sync" returned there were still some cases where things might still be streaming from the disk controller or whatnot, so if you were too quick on the switch you could still lose some data. A rule of thumb was created -- just run sync three times. By the time you were done typing the third command the first one would definitely be done. Simple and easy to remember -- even a barely-literate tape monkey could handle it.

      But here's the weird part -- people still do it. Hell, there's people who weren't even alive when it was needed who still dutifully run three syncs when shutting a machine down. Even funnier is that they often run "sync; sync; sync" which never would have made sense because the whole point of the rule was just to slow down the process a tiny bit.

      The "have more swap than physical RAM" thing is the same deal.

      There's a grain of truth to it: back in SunOS 4 (and maybe some other 80's-vintage *NIXs; I don't know) this really was a requirement. The swap allocation was done at page mapping time so for every non-pinned location in RAM you needed a space reserved to swap it to.

      This isn't the case in any modern OS I'm aware of. Yet still I hear this claim made a couple times a year -- it's just a rumor that's been traveling in circles among sysadmins for the last 15 years. I'll probably be hearing it for another 15 too.

      > 2) Some people are concerned about the stability of apache 2.0.x and still recommend 1.3.x.

      Only a problem if you're using weird PHP modules *AND* you're using one of the multi-thread models of apache 2 (you can configure apache 2 to use the "pre-fork" model just like apache 1.3 if you think this might be a problem for you). Even then you'll just have an apache core-dump, not a hosed computer.

      RedHat has been shipping apache 2 exclusively since at least RHL9. It has been stable for years now.

      • holytramp says:

        > The "have more swap than physical RAM" thing is the same deal.

        No it is not. In the early days of kernel 2.2 and before, linux used to have a unified VM, so swap was just like poor man's "RAM". That changed at 2.4. Here is some relevant discussion on the kernel list:
        http://www.kerneltraffic.org/kernel-traffic/kt20010126_104.html#2

        The bottom line is that the current VM does its best when it has essentially unlimited swap. swap >= 2*RAM seems a reasonable approximation of that. If you have trouble allocating a couple gigs off your harddisk, you probably need to add another drive.

        Merry Christmas and Happy New Year.

      • drbrain says:

        Apache2 may be great, but I've had problems with mod_fastcgi and apache2 that went away when using apache1.3. When a fastcgi child process died the fastcgi process manager also died during the process of spawning a new child.

      • jwz says:
          sysadmins are superstitious types.

        I think you mean "a superstitious and cowardly lot."

        But what the hell, I've got space, so I added another 2GB of swap.

      • go_team_ari says:

        It's still a good idea to have at least as much swap as RAM, at least for smaller desktop systems. Other things can use swap too besides just offloading huge memory hogging programs - swsusp, for one; VMWare 5 Beta can also optionally put a portion of the running guest's memory directly into swap so as not to hog so much of the physical RAM.

    • edge_walker says:

      Lower with which one -- apache 1.3 or 2.0? It's not clear from your comment.

  8. ciphergoth says:

    Putting aside the usual use a different distribution/recompile your kernel/sacrifice a chicken suggestions: can you afford to switch off overcommit altogether? See vm/overcommit-accounting in your kernel docs.
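
    For reference, switching it off looks something like this (mode 2 is strict accounting: the commit limit becomes swap plus overcommit_ratio percent of RAM):

    sysctl -w vm.overcommit_memory=2
    sysctl -w vm.overcommit_ratio=50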

    • bodyfour says:

      No, changing the overcommit policy is just voodoo. I actually argued against even adding this tunable to the kernel because 99% of people who think they need it really don't.

      It basically just changes "memory allocations fail when the machine is truly hosed" to "memory allocations fail even though maybe there's no problem at all"

  9. gchpaco says:

    I was about to say "well, clearly you're running into one of the relentlessly stupid OOM situations recent Linux kernels frequently produce" but you have plenty of swap. The high idle percentage and 22.7 load suggest that your machine is spending a great deal of time thrashing; this is common behavior (unfortunately).

    While there could be a kernel memory leak, I suspect there's a more pedestrian explanation. It would be better if your ps/top monitoring overlapped the OOM notifications in the syslog, but here's my best guess, based on very similar experiences. First, a fairly busy system has a burst of activity (for whatever reason), runs out of memory, and the system starts thrashing. Something happens to demand real RAM. Then the OOM comes up and kills stuff.

    Some things that are different from what I'm used to with similar situations on my little server: you have swap available. With the CPU at 56% idle with a load of 22.73, you're certainly maxing out your I/O bandwidth hitting the swap partition, and the OOM events may be triggered by the kernel concluding that it couldn't possibly write data out as fast as it comes in. But the only situation I can think of where that might come up is in real time events.

    So, how do you fix this? Well, my first reaction is "add more RAM". The OOM I'm not so sure about (although it's indicative of this general class of problem), but going from 3.44 to 22.73 with a 56% idle time is very characteristic of thrashing, and even if there weren't any OOMs the system would still be unusably slow.

    Additionally, I would suggest that you add more swap. Yes, even though you're not running out. Since you don't have enough swap to back all of RAM (and some change) there are different stupid thrashing problems that can occur with that.

    Incidentally, the above highlights that I am particularly unhappy with the Linux OOM behavior, although I find the general VM performance a damn sight better than the BSDs I've run. I deal with it by choking the machine with RAM and swap, which is not a real answer but is frequently a lot cheaper than my time.

    • gchpaco says:

      Now that I read bodyfour's post, the idea that the OOM is coming up because the kernel itself is snarfing all the RAM is more plausible. I would check that out before throwing more RAM at the problem; if there is a big kernel leak more RAM will just put off the crash.

    • bodyfour says:

      I know you already replied again on this thread but I thought I'd point a couple other things out.

      > but you have plenty of swap.

      OOM is almost entirely unrelated to how much swap you have

      > So, how do you fix this? Well, my first reaction is "add more RAM".

      Look at the top output after the machine has been up for an hour -- there's over 400M of RAM that hasn't even been touched yet (so the page cache probably still has every fs page we've ever even looked at, for instance). I'd say this box probably has at least twice as much RAM as it really needs (not that that's a bad thing, mind you; having plenty of RAM is great, especially at today's prices).

      • gchpaco says:

        Thanks for the comment; it's been some years since I seriously followed kernel development, and on my main server I've only ever seen OOM errors after it has eaten every bit of real and swap. I've also never had the Linux kernel seriously leak memory on me -- the fact that I didn't think to look for it, or even know what to look for, embarrasses me somewhat.

        • chromal says:

          Yeah, I think there's something to this idea. We recently experienced a problem where two systems running the same software (C/Perl and mysql based network stats software) experienced a memory leak after a FC2 errata kernel autoupgrade. That same kernel running on other hardware ran without a leak, but, for whatever reason, the load or hardware on these machines produced a memory behavior that would fill swap and then physical memory and cause the system to become unresponsive to user-land tasks.

          In particular, FC2 kernel 2.6.5-1.358 was fine, and both 2.6.9-1.3_FC2 and 2.6.9-1.6_FC2 experienced a steady apparent memory leak. Booting on the older kernel removed the problem.

          Beyond that, it's hard for me to say much of merit. I'm pretty unhappy that the problem is apparently the *kernel* or a major device driver.

  10. saintnobody says:

    I've seen drastic increases in load average like that before. The basic problem was that processes waiting for IO are counted, and there was an unreliable nfs mount.

    You have quite a few processes waiting for IO in the "going down" list. I wouldn't be the least bit surprised if the pages of the waiting-for-IO processes are locked in physical memory as well, so maybe your oom problem is actually a problem with an IO bottleneck. I'm probably full of shit, but it's a theory I would check out.
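
    An easy way to check (the ps format keywords may vary slightly with your procps version):

    # list processes stuck in uninterruptible sleep, plus what they're waiting on
    ps axo stat,pid,wchan,comm | grep '^D'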

    Are any of these processes using remote mounted filesystems, or is it local filesystems?

  11. baconmonkey says:

    I dunno if you bothered adding up the memory usage totals, but the total sums of the VSZ column from ps for the files are:
    C1: 435704
    C2: 315816
    C3: 376136

    Which, if I understand things right (and I could very well be wrong), means that VSZ = code+data+stack, thus your running processes have not even requested half of the available memory.
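
    (Summing the column from a saved "ps aux" capture is a one-liner, by the way; the filename here is made up:)

    awk 'NR > 1 { s += $5 } END { print s " KB" }' ps-output.txt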

    • bodyfour says:

      You're generally better off looking at the RSS (resident set size) numbers since that's how much physical RAM is really being taken up by the process. VSZ includes any virtual memory areas in the process including things that aren't paged in (like parts of the program that have never been accessed) or things that aren't even RAM (unused parts of mmaped files) If you look at an X server the VSZ will be huge since it usually includes all of the video card's memory.
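
      The live equivalent, summing resident set sizes across all processes:

      ps -eo rss= | awk '{ s += $1 } END { print s " KB" }'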

      But, yeah, the overall story is that none of the userland processes are using a particularly large amount of RAM; the kernel is eating it all.

      • baconmonkey says:

        I figured the VSZ would be the worst-case scenario, showing the absolute most memory the processes could be eating, which, presumably, they are in the "Boot +1hour" file, as there is no swap used yet. It seemed that if those numbers weren't even approaching the system's memory, that would definitely lend weight to the naughty kernel theory.

        > none of the userland processes are using a particularly large amount of RAM; the kernel is eating it all.

        JWZ, feed your kernel, it's hungry!

  12. exiledbear says:

    Every once in a while, I need a reminder of why I wanted to forget so much about sysadminning. It's a thankless job - people only notice you when something goes wrong.

    Memory looks OK. CPU usage from the processes is near zero. I'd say the oom-killer is probably just checking the total free memory, and if it falls below a certain threshold, it finds the process taking up the most memory and kills it.

    I'd say the problem is somewhere in the kernel. Wouldn't be the first time a kernel has leaked memory, and oh God, that brings back some bad memories. I'm stopping now.

    Thank you for this little trip down nightmare on sysadmin street.

  13. mark242 says:

    Random crashes often point to some sort of hardware/driver failure, unless there's some sort of attack being perpetrated upon the machine.

    Can you afford to unplug the ethernet cable for a week and see if the problem persists?

  14. pault12345 says:

    If it is not software (it persists across different distros) then try changing the hardware.

  15. go_team_ari says:

    I just happened to find this today; maybe it will help (especially the part about Kernel Core Dumps).

    http://www.samag.com/documents/s=7762/sam0301f/0301f.htm

  16. sunsetdriver says:

    you've had this problem in 2.4 and 2.6 kernels. this machine does video capture. i seem to recall that video capture cards on x86 boxes had very specific ram requirements - had to dump data to specific areas of physical ram which conflicted with how the vm people wanted to do things.

    if this was my box i'd look into that angle.

  17. angryskul says:

    Buh, probably some stupid linux bug in the swapper/oom code.

    Add another 512 megs of ram and you should be fine.

    What's happening is something buggy is waking up all your swapped out processes and making it shit the bed because suddenly you have 100megs to swap in.

    Another weird thing that can be happening is that the oom killer is running because you're low on kernel memory, not system memory, killing processes will usually free up kernel (non-swappable) memory.

    I am talking out of my butt a bit here, but just some ideas.