sysadmin desperation

You know it's bad when I'm posting random cries for tech help here... So yeah. It's that bad.

The webcast machine at the club loses its mind at least once a week: it appears to run out of memory and crash, but I can't figure out what the culprit is.

The machine is a dual CPU Athlon 2400+ with 1GB RAM and 500MB swap. It's running Fedora Core 3, but I was also experiencing this problem on FC2 and RH9. Memtest86 says the RAM is fine. It's got an Osprey 100 BT848 video capture card and an SB Live EMU10k1 audio card.

I set up a cron job that once a minute captures the output of "top -bn1" and "ps auxwwf" to a file. Here are a pair of those files from when it loses its mind. Note that the load goes from 3.44 to 22.73 in a minute and a half.
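For anyone who wants to set up the same kind of logging, here is a minimal sketch of a snapshot script. It is not the script actually running on the webcast machine. The file naming, the output directory and the `snapshot` function are made up for illustration; only the two commands come from the description above.

```python
#!/usr/bin/env python3
"""Write one timestamped snapshot of `top` and `ps` per run (meant for cron)."""
import subprocess
import sys
import tempfile
import time
from pathlib import Path

# The two commands named in the post; the labels are only used in file names.
COMMANDS = (
    ("top", ["top", "-bn1"]),
    ("ps", ["ps", "auxwwf"]),
)


def snapshot(out_dir, commands=COMMANDS, now=None):
    """Run each command once and save its output to its own timestamped file.

    Returns the list of files written. A command that cannot be run is
    recorded in its file instead of aborting the whole snapshot.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(now))
    written = []
    for label, argv in commands:
        try:
            result = subprocess.run(
                argv, capture_output=True, text=True, errors="replace", timeout=30
            )
            body = result.stdout + result.stderr
        except (OSError, subprocess.SubprocessError) as exc:
            body = f"could not run {argv!r}: {exc}\n"
        path = out_dir / f"{stamp}.{label}.txt"
        path.write_text(body)
        written.append(path)
    return written


if __name__ == "__main__":
    # Hypothetical default location; pass a directory as the first argument.
    default_dir = Path(tempfile.gettempdir()) / "snapshots"
    target = Path(sys.argv[1]) if len(sys.argv) > 1 else default_dir
    for path in snapshot(target):
        print(path)
```

Run from cron once a minute, this leaves a pair of files per minute that can be diffed after a crash. Writing each snapshot to its own file means the last pair before the machine falls over is easy to find.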

I've compared the two files character by character, and I don't see a smoking gun. The differences look quite trivial to me.

So while I was sitting there staring at this, I saw something very interesting happen: "top" was running on the machine's console, and showed 380MB swap available -- and the oom-killer woke up and shot down an xemacs and an httpd.

So, how's that even possible? Does this mean that some process has gone nuts and started leaking wired pages, so that it can't swap at all? Or what?

So, any ideas?

Update, Dec 29: It looks like something is leaking in the kernel; /proc/slabinfo shows the size-256 slab growing to 3,500,000 entries (over 800MB). Current suspect is the bttv/v4l driver, since one of the things this machine does is run "streamer" to grab a video frame every few seconds. That would be about 525 leaked allocations per minute, or around 26 leaks per frame.
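The slab numbers are easy to check with a few lines of Python. This is a rough sketch, assuming the 2.6-era /proc/slabinfo layout where the first four columns are name, active_objs, num_objs and objsize. The sample text is invented for illustration, apart from the 3,500,000 entries and the 256-byte object size; `parse_slabinfo` is a hypothetical helper, not an existing tool.

```python
def parse_slabinfo(text):
    """Return (name, active_objs, objsize, total_bytes) tuples, biggest first.

    Assumes the columns are: name active_objs num_objs objsize ...
    """
    rows = []
    for line in text.splitlines():
        # Skip blank lines, the version banner and the column header.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        name = fields[0]
        active_objs = int(fields[1])
        objsize = int(fields[3])
        rows.append((name, active_objs, objsize, active_objs * objsize))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows


# Made-up sample; only the size-256 count and object size come from the post.
SAMPLE = """\
slabinfo - version: 2.0
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <batchcount> <limit> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
size-512     300     320 512  8 1 : tunables  54 27 8 : slabdata     40     40 0
size-256 3500000 3500010 256 15 1 : tunables 120 60 8 : slabdata 233334 233334 0
"""

if __name__ == "__main__":
    # On a live machine, pass open("/proc/slabinfo").read() instead
    # (reading it usually requires root).
    name, count, objsize, total = parse_slabinfo(SAMPLE)[0]
    print(f"{name}: {count} objects x {objsize} bytes = {total / 2**20:.0f} MiB")

    # Back-of-the-envelope figures, assuming a constant leak rate from zero.
    leaks_per_minute = 525
    leaks_per_frame = 26
    print(f"frames per minute: {leaks_per_minute / leaks_per_frame:.1f}")
    print(f"days to reach {count} entries: {count / leaks_per_minute / 1440:.1f}")
```

3,500,000 objects of 256 bytes is 896,000,000 bytes, which matches the "over 800MB" figure. At 525 leaks per minute that takes a bit under five days to accumulate, which lines up with the box dying about once a week. Slab memory is kernel memory and can't be swapped out, which would explain the oom-killer firing with swap still free.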

kernel 2.6.9-1.681_FC3, xawtv-3.81-6.

Update, Jan 12: That was the culprit. This is the fix:

    --- ./drivers/media/video/bttv-driver.c.orig    2005-01-11 14:54:15.477911088 -0800
    +++ ./drivers/media/video/bttv-driver.c 2005-01-08 13:49:44.000000000 -0800
    @@ -2992,6 +2992,9 @@
    +       videobuf_mmap_free(file, &fh->cap);
    +       videobuf_mmap_free(file, &fh->vbi);
    --- ./drivers/media/video/video-buf.c.orig      2004-10-18 14:54:08.000000000 -0700
    +++ ./drivers/media/video/video-buf.c   2005-01-08 13:50:04.000000000 -0800
    @@ -889,6 +889,7 @@
            int i;
    +        videobuf_mmap_free(file, q);
            for (i = 0; i < VIDEO_MAX_FRAME; i++) {
                    if (NULL == q->bufs[i])