now that's some fancy debugging

Galileo Gets a Long-Distance Repair Job

[...] Damage from naturally strong radiation near Jupiter had left Galileo's tape recorder inoperable for weeks. Galileo's flight team traced the problem to a light-emitting diode in the electronics controlling the motor drive, and then gradually and carefully completed a successful long-distance repair job.

[...] The recovery was achieved by running a current through the damaged diode to anneal, or repair, radiation-caused damage. The first annealing attempt of six hours produced barely discernible improvement. Three additional treatments, for a total of 83 more hours of annealing treatment, produced progressive improvements, to the point that the tape recorder can run for about an hour at a time.

[...] The diode that radiation apparently damaged in the tape recorder is a gallium-arsenide semiconductor component that emits light. The motor-drive control has three of them. Light from them shines through windows in a rotating wheel onto detectors on the other side of the wheel. That setup senses the turning of the wheel and feeds digital logic that controls drive signals for the motor.

The damage apparently came from high-energy protons from Jupiter's radiation belt displacing atoms in the semiconductor's crystalline molecular lattice. Passing a current through the diode for hours serves as a way for electron flow to cause some of the displaced atoms to shift back to their original lattice positions.

Galileo has nearly depleted its supply of the propellant needed for pointing its antenna toward Earth and controlling its flight path. While still controllable, it has been put on a course for impact into Jupiter next September. The maneuver prevents the risk of Galileo drifting to an unwanted impact with the moon Europa, where it has discovered evidence of a subsurface ocean that is of interest as a possible habitat for extraterrestrial life.

Brin on Lord of the Rings

J.R.R. Tolkien -- enemy of progress

[...] Now ponder something that comes through even the party-line demonization of a crushed enemy -- this clear-cut and undeniable fact: Sauron's army was the one that included every species and race on Middle Earth, including all the despised colors of humanity, and all the lower classes.

Hmm. Did they all leave their homes and march to war thinking, "Oh, goody, let's go serve an evil Dark Lord"?

Or might they instead have thought they were the "good guys," with a justifiable grievance worth fighting for, rebelling against an ancient, rigid, pyramid-shaped, feudal hierarchy topped by invader-alien elfs and their Numenorean-colonialist human lackeys?

If you haven't read his criticisms of Star Wars, they're well worth the time:

"Star Wars" despots versus "Star Trek" populists

Just what bill of goods are we being sold, between the frames? Elites have an inherent right to arbitrary rule; common citizens needn't be consulted. They may only choose which elite to follow. "Good" elites should act on their subjective whims, without evidence, argument or accountability. Any amount of sin can be forgiven if you are important enough. True leaders are born. It's genetic. The right to rule is inherited. Justified human emotions can turn a good person evil.

And parts 2 and 3:

Timekeeping in the Interplanetary Internet

I think they're not taking relativistic effects into account, but I'm not totally sure...

Expanding the Network Time Protocol for interplanetary use:

[...] The current NTP technology has no provisions for mobile servers and clients, where range and range rates can vary with time, and only minimal provisions for intermittent connectivity. In the Mars internet, orbiters and surface stations may have only intermittent connectivity, while in the DSN segment real-time connectivity is possible only at scheduled opportunities and then only with very long delays. These considerations are mitigated by the fact that ranges and range rates can be predicted with some accuracy from the known positions of the spacecraft bus, orbiters and surface stations using ephemerides maintained by astronomical means.

[...] It may well happen that residual clock frequency offsets may introduce considerable error if the time between updates is relatively long, as would be expected during communication opportunities between Earth and mission spacecraft. After a few measurements the frequency can be disciplined in the usual way, but this affects the position and velocity vectors and residuals with respect to the ephemeris. What makes frequency-induced errors more nasty is that the frequency may fluctuate due to spacecraft thermal cycles and power management.
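To make the frequency-offset problem concrete, here is a minimal sketch of how a small residual frequency error turns into a large time error when updates are far apart. This is not NTP's actual clock discipline algorithm; the function names and all the numbers are made up for illustration, and it assumes the frequency error stays constant between updates (which the excerpt says it won't, thanks to thermal cycles).

```python
def estimate_frequency_offset(t1, offset1, t2, offset2):
    """Estimate fractional frequency error (seconds per second)
    from two clock-offset measurements taken at times t1 < t2."""
    if t2 <= t1:
        raise ValueError("measurements must be in increasing time order")
    return (offset2 - offset1) / (t2 - t1)


def predicted_offset(offset_at_update, freq_offset, seconds_since_update):
    """Predict the clock offset some time after the last update,
    assuming the frequency error is constant (an idealization)."""
    return offset_at_update + freq_offset * seconds_since_update


# Hypothetical measurements one day apart: the offset grew
# from 10.0 ms to 96.4 ms.
freq = estimate_frequency_offset(0.0, 0.0100, 86400.0, 0.0964)

# Hypothetical gap of one week before the next contact, starting
# from a freshly corrected clock (offset 0).
error_after_week = predicted_offset(0.0, freq, 7 * 86400.0)

print(f"frequency offset: {freq * 1e6:.2f} ppm")
print(f"accumulated error after a week: {error_after_week:.3f} s")
```

With these invented numbers, a frequency error of about one part per million grows to roughly six tenths of a second over a week without contact.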

Dali Clock, week one.

An update to last week's retrocomputing adventure: the Mac ("Model M0001"!) has been running Dali Clock like a trooper since last week, except for having inexplicably crashed once. (But hey, four days is still probably a record uptime for a Mac, right?)

But I'm sad to report that it can't keep time worth a damn: it's running ten minutes ahead already! After four days! That's pretty bad.
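For the record, here's just how bad that is, as a back-of-the-envelope sketch. It only uses the two numbers above (ten minutes, four days); the function name is my own.

```python
def drift_ppm(error_seconds, elapsed_seconds):
    """Clock drift in parts per million (positive means running fast)."""
    return error_seconds / elapsed_seconds * 1e6


error = 10 * 60           # ten minutes ahead, in seconds
elapsed = 4 * 24 * 3600   # four days, in seconds

ppm = drift_ppm(error, elapsed)
per_day = error / 4       # seconds gained per day

print(f"{ppm:.0f} ppm, about {per_day:.0f} seconds per day")
# prints "1736 ppm, about 150 seconds per day"
```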

(Wait, I'll just run NTP. Right after I get TCP and PPP going.)

m-m-m-max

Frewer Eyes More Headroom?

Matt Frewer told a chat on SCIFI.COM that he's trying to resurrect his most famous character: Max Headroom. "We're putting together a deal on a new Max Headroom project," Frewer told fans. "Then I'm doing a film with my brother. The Headroom project is still in the deal-making process, so I can't say anything about it."

Frewer played the "computer-generated" character and his human counterpart, Edison Carter, in a British TV series, TV movie and subsequent American series set "22 minutes in the future." Frewer said that he's pleasantly surprised by the character's continuing popularity. "When we were making it, we knew it was way ahead of its time," he said. "I think if it was on the air [now] it would still look cutting-edge. I don't think the network was ready for it. It made a huge splash over a short time. It went as quickly as it came. That in a way was probably a plus. It never had time to go stale. Always leave 'em wanting more."

I am worthless and weak.

My kung fu is not the best.

This X bug is still kicking my ass, for the third day. I was able to reproduce it on a second machine, and I watched it happen literally hundreds of times, and I still have no idea what's causing it. I even got a debug build of Xlib going, and have been single-stepping through the library, watching it pull bits off the wire and assemble them into events, and I still haven't been able to catch it in the act of going south. For a long time, it looked like it was malfunctioning every time it tried to call XGetWindowProperty() with the `delete' flag set: for a while, it was always getting a BadImplementation error down in XGetWindowProperty() because the reply it was seeing had a `type' of 1 (XA_PRIMARY, which is nonsense) but a `format' of 0 (also nonsense).

But no, sometimes it only fails much later, after it has gone back to the main loop and run some Xt timer functions (which are polled, not signal-based.) But only if XGetWindowProperty() has already been called three times. (Yeah, sure.)

No matter what I've tried, I've not been able to narrow it down to the exact spot where things go wrong: timing influences it. Single-stepping changes the behavior. Attaching commands to breakpoints (to dump variables, print backtraces) changes the behavior. Yet memory checkers (memprof and valgrind) report no reads or writes of freed memory.

Running it through xmon (an X protocol-monitoring proxy) changes where the problem occurs, but it still happens -- and nothing that xmon prints out looks out of place. In particular, the last GetProperty reply that comes through is totally sensible while on the wire, then somehow turns to shit by the time XGetWindowProperty() gets the result from _XReply():

        REQUEST: GetProperty
sequence number: 033e
         delete: True
 request length: 0006
         window: WIN 00400020
       property: ATM 00000103
           type: AnyPropertyType
    long-offset: 00000000
    long-length: 00000001
          REPLY: GetProperty
         format: 00
sequence number: 033e
   reply length: 00000000
           type: <NONE>      <-- notably not 1
    bytes-after: 00000000
length of value: 00000000
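For comparing lots of these dumps across runs, a small parser would help. This is a minimal sketch, assuming xmon's `field: value` layout exactly as shown above and assuming the numeric fields are hex; `parse_xmon` and `reply_is_empty` are names I made up, and the `<--` stripping is only there to drop my own annotation. The check encodes what a sane reply for a nonexistent property looks like: no type, format 0, no data.

```python
XMON_DUMP = """\
        REQUEST: GetProperty
sequence number: 033e
         delete: True
 request length: 0006
         window: WIN 00400020
       property: ATM 00000103
           type: AnyPropertyType
    long-offset: 00000000
    long-length: 00000001
          REPLY: GetProperty
         format: 00
sequence number: 033e
   reply length: 00000000
           type: <NONE>      <-- notably not 1
    bytes-after: 00000000
length of value: 00000000
"""


def parse_xmon(text):
    """Split an xmon-style dump into one dict per REQUEST/REPLY record."""
    records = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "field: value" line
        key = key.strip()
        value = value.split("<--")[0].strip()  # drop hand-written notes
        if key in ("REQUEST", "REPLY"):
            records.append({"kind": key, "name": value})
        elif records:
            records[-1][key] = value
    return records


def reply_is_empty(reply):
    """True if the reply describes a property that does not exist."""
    return (reply["type"] == "<NONE>"
            and int(reply["format"], 16) == 0
            and int(reply["length of value"], 16) == 0)


records = parse_xmon(XMON_DUMP)
request, reply = records
same_sequence = request["sequence number"] == reply["sequence number"]
print(same_sequence, reply_is_empty(reply))
```

On the dump above both checks pass, which is the point: on the wire, the reply is fine.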

Of course, I haven't actually been able to watch _XReply() perform this reverse-alchemical trick, because to do that, I'd have to know which of the thousands of calls to _XReply() was the one that was about to go wrong: because if I look at more than one of them, I throw the timing off, and the problem doesn't occur.

Attempting to make a small test case program was fruitless, for the same reason; I've not yet found a sequence of small-number-of-hundreds of events that causes this to happen reliably.

I'm just totally flailing at this point, changing things at random. If I could find a way to make it always die in the same place, I could start tediously binary-searching from there, looking at the contents of the read buffer, comparing memory dumps between subsequent runs, something. But instead I just keep running it over and over, watching it fail in a different place each time, and hoping an idea occurs to me.

I used to be good at this. I think someone stole my mojo.

look, other people besides me still hate X too

This is a good rant about X.

Though of course it only scratches the surface.
