Invisibles

This is great!

A quiz where you have to guess the movie from the sets and clothing, because all of the flesh has been photoshopped out! These pictures are really well done, and it's surprisingly hard...

Tags:

really hard bug

I found a reproducible xscreensaver crash, and I'm still at a loss as to how to find a fix for it. Help me figure this out and I will concede that your kung fu is the best. <lj-cut text="how I spent the last six hours"> To reproduce:

  • have `fade' and `unfade' turned on;
  • xscreensaver-command -activate
  • after it starts fading out, click the mouse (to stop it);
  • when it starts fading back in, click the mouse (to stop it again);
  • Xlib: sequence lost (0x2b714 > 0x1b714) in reply type 0x0!

"Sequence lost" means "game over." What it means is that Xlib went to read a reply from the X server (on the file descriptor that is its connection to the server) and it got nonsense back. There are three typical causes of this:

  • Making calls to Xlib routines from a signal handler (since Xlib is (usually) not re-entrant, this can cause a write or read to happen on the fd out of turn, confusing everything); or
  • Doing a printf to the X server's fd (this can happen if your program was launched with its stdin/stdout/stderr descriptors closed (that is, >&- rather than >/dev/null.) What can happen in that case is that the Xlib file descriptor gets allocated as fd #3, which is the magic stderr number. Then your program tries to write to stderr, and you blow up X); or

  • Similar lossage to the previous way, except involving forked subprocesses that inherited the Xlib fd and then did bad things to it.

So, of course, I have a high degree of confidence that none of these three things are happening. Yet, when I do that set of actions, I get "sequence lost" in _XSetLastRequestRead() from XlibInt.c in the bowels of Xlib. The backtrace is always:

    _XSetLastRequestRead (Xlib/XlibInt.c)
    _XError (Xlib/XlibInt.c)
    _XGetWindowProperty (Xlib/GetProp.c)
    kill_xsetroot_data_1 (xscreensaver/driver/windows.c)

Just after the "sequence lost" error, xscreensaver goes down with a BadImplementation error from XGetProperty(). This error is likely spurious: a result of the client trying to interpret nonsense data from the server, and noting that it's nonsense.

There's only one call to _XError() in XGetWindowProperty(), and it's getting called because the reply has format = 0, which is nonsense:

    type = 1,
    format = 0,
    sequenceNumber = 1472,
    length = 0,
    propertyType = 20971521 (0x1400001),
    bytesAfter = 1,
    nItems = 4

That 0x1400001 looks suspicious, but it's neither an atom ID nor a window ID. The property it's trying to read (actually, read-and-delete) is _XSETROOT_ID, and the window does not, in fact, have that property on it. (This is normal and often expected.) Which means XGetWindowProperty() should just return non-zero (meaning, "could not read-and-delete that property.") I guess this means that it should have gotten a non-zero value from _XReply(), but instead, it's getting 0, and trying to read a (nonsense) reply. Which is consistent with the theory that the stream of bytes coming back from the server has been horked (the server presumably really did send back a 1, but we're seeing a 0, and thus running off into the weeds.)

If I set a breakpoint on kill_xsetroot_data_1 and don't allow it to ever call XGetWindowProperty(), then lose sequence later with "unexpected async reply (sequence 0x664)!" from the next call to _XReply(), XEventsQueued(), XtAppNextEvent(). So the problem is still upstream somewhere.

What a total nightmare! I'm running with -sync, so it can't be far upstream, but I'm stumped as to how to narrow it down any farther.

I suppose the next step is to re-learn how to run Xnest or a protocol monitor. I can't remember the last time I've had to resort to that!

Tags: , , ,
Current Music: the disco crapfest we call "DNA Lounge friday night"