10.6 memory corruption

I'm working on getting xscreensaver running on 10.6, and roughly 2/3rds of the time any of the savers launch, they crash here:

Program received signal: "EXC_BAD_ACCESS".
#0 0x00007fff825df445 in +[NSLayoutManager(NSPrivate) _doSomeBackgroundLayout] ()
#1 0x00007fff825df23f in _NSPostBackgroundLayout ()
#2 0x00007fff84cfe437 in __CFRunLoopDoObservers ()
#3 0x00007fff84cda6e4 in __CFRunLoopRun ()
#4 0x00007fff84cda03f in CFRunLoopRunSpecific ()
#5 0x00007fff86e50c4e in RunCurrentEventLoopInMode ()
#6 0x00007fff86e50a53 in ReceiveNextEventCommon ()
#7 0x00007fff86e5090c in BlockUntilNextEventMatchingListInMode ()
#8 0x00007fff824a5570 in _DPSNextEvent ()
#9 0x00007fff824a4ed9 in -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] ()
#10 0x00007fff8246ab29 in -[NSApplication run] ()
#11 0x00007fff82463844 in NSApplicationMain ()
#12 0x000000010000342c in main (argc=1, argv=0x7fff5fbff380) at /Users/jwz/src/xscreensaver/OSX/main.m:16

I'm not even using NSLayoutManager (as far as I know), so presumably this is some random memory corruption that happened somewhere else entirely, but turning on the MallocCheckHeap and related environment variables doesn't reveal anything. Any ideas how to debug this?

None of the Xcode "Performance Tools" seem even remotely useful for tracking down memory corruption, but maybe I just don't understand them. Anyway, I thought the garbage collector was supposed to make everything sweetness and light?

11 Responses:

  1. evan says:

    The browser vendors have been funding valgrind for the mac. I think we use it already. Looks like you might need to build it from their svn repo though.

    • jwz says:

      % port install valgrind
      ...
      configure: error: Valgrind works on Darwin 9.x (Mac OS X 10.5)

      These days when I hear "build it from svn" you might as well be saying "try it under Windows".

  2. treptoplax says:

    I'm pretty sure the Xcode "Performance Tools" are just a GUI slapped over some trivial DTrace scripts. While I'm as big a DTrace fanboy as anyone (a Turing-complete tracing tool is really, really nifty), I'm not sure it actually buys you much for this.

  3. duskwuff says:

    Ordinarily I'd suggest MallocDebug.app, but it looks as though that's 32-bit only. Ugh. :(

  4. ncmike4 says:

    Try the MallocGuardEdges environment variable. It's not as elegant as valgrind, but as long as your program doesn't allocate loads of memory, it should do the trick.

    MallocGuardEdges If set, add a guard page before and after
    each large block.

    check 'man malloc' for details of this and other fun malloc options.

    • jwz says:

      Yes, I set every one of those malloc variables. That's what the "and related" was. It caught nothing.

      • primaleph says:

        Is it possible the packagers of the Really Slick Screensavers, or maybe the Electric Sheep team, could offer you advice on how to handle this bug?

  5. chanson says:

    NSLayoutManager will be used by all text drawing in a Cocoa process; it's how the text system decides where to place glyphs.

    Given that this is a crash in some code you don't actually call, I suspect that your code needs to be adapted to GC and won't just magically work with it. autozone is not just a malloc replacement like the Boehm collector; the GC heap and the malloc heap are separate and have distinct rules. There's a Programming Guide that discusses what needs to be done to make Objective-C code GC-compatible, especially in situations where it's being mixed with C code.

    For example, if you have a malloc'd struct with a void * field, and you store a pointer to an Objective-C object into it, you need to CFRetain that object so the collector knows there's an additional reference to the object. (Stores into Objective-C objects get write barriers generated by the compiler, not so for stores into C structs.) Otherwise the collector may think there are no more references to the object, collect it, and sooner or later you're writing through a wild pointer.
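
    A minimal sketch of that pattern, assuming a build with -fobjc-gc on 10.5/10.6. The struct `saver_state` and the helpers `state_create` and `state_destroy` are hypothetical names made up for illustration, not anything from xscreensaver. Only `CFRetain` and `CFRelease` are real CoreFoundation calls.

    ```objc
    #import <Foundation/Foundation.h>
    #include <stdlib.h>

    /* Hypothetical C-side state that lives in the malloc heap,
       which the collector does not scan. */
    typedef struct {
        void *view;   /* pointer to an Objective-C object */
    } saver_state;

    static saver_state *state_create(id obj)
    {
        saver_state *st = calloc(1, sizeof(*st));
        if (!st) return NULL;

        /* This store gets no write barrier, so tell the collector
           explicitly that there is one more reference to the object. */
        if (obj) CFRetain((CFTypeRef)obj);
        st->view = (void *)obj;
        return st;
    }

    static void state_destroy(saver_state *st)
    {
        if (!st) return;

        /* Balance the CFRetain so the object can be collected again. */
        if (st->view) CFRelease((CFTypeRef)st->view);
        free(st);
    }
    ```

    Every `CFRetain` needs a matching `CFRelease` when the struct goes away or the field is overwritten; otherwise the object is never collected.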

    • jwz says:

      Doh! I hadn't thought of that. Your theory appears to be correct, because the crashes go away if I do [[NSGarbageCollector defaultCollector] disable] in initWithFrame. But without that, even after adding some CFRetain calls to the few places I store ObjC objects into C structs, I still get the crashes. Are there any tools for debugging GC lossage like this?
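
      For reference, the diagnostic workaround described above might look roughly like this. The class name `HypotheticalSaverView` is invented for illustration, and the sketch assumes the view is a `ScreenSaverView` subclass built with garbage collection enabled. It only confirms that the collector is involved; it does not fix the missing references.

      ```objc
      #import <ScreenSaver/ScreenSaver.h>

      /* Hypothetical saver view, named for illustration only. */
      @interface HypotheticalSaverView : ScreenSaverView
      @end

      @implementation HypotheticalSaverView

      - (id)initWithFrame:(NSRect)frame isPreview:(BOOL)isPreview
      {
          self = [super initWithFrame:frame isPreview:isPreview];
          if (self) {
              /* Suspend collections for this process. defaultCollector
                 returns nil when GC is off, so this is then a no-op. */
              [[NSGarbageCollector defaultCollector] disable];
          }
          return self;
      }

      @end
      ```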