hell yeah


71 Responses:

  1. carus_erus says:

    Oh hell yea.

    I want to take this and hand it out to the Cognitive Science students at my old university.

    This looks like a game of "1000 blank white cards" actually.

  2. caprine says:

    Bwahahahaaaaa! Excellent!

  3. ladykalessia says:

    This one explains why geek boys often have so much trouble in the dating world.

    Well, that, and poor personal hygiene.

    • The square root of love is an imaginary number. The typical geek boy just isn't normally that socially complex, so love is at right angles to his worldview.

      (And yes, unfortunately, so is soap.)

      • ladykalessia says:

        Bother. I always did have trouble grokking imaginary numbers.

        • If a boy ever tries to explain how he feels for you by graphing it on an Argand diagram...

          ...then I suggest you run, do not walk, to the nearest mental sanitation facility.

          [As an aside, the caption for this particular graph includes the phrases "ideal resonance", "elastic scattering process" and "all points X must lie within or on the unitarity circle". I think this particular physicist does not have gentlemanly intentions at all.]

          • duskwuff says:

            This graph needs more σs.

            (If you don't get it, you probably haven't read Cryptonomicon, and/or you don't have a sufficiently dirty mind.)

            • ladykalessia says:

              That, or, as in my case, you're a theater major and have had no truck with physics since high school. (Crap, I think I'm going to have to go back and re-read Cryptonomicon again.)

              (Unless you count the stuff required for rigging or keeping burning things from falling on oneself, but that's considerably less theoretical.)

              • duskwuff says:

                This one's actually entirely non-technical, no matter what the text may suggest... here, I'll quote.

                Waterhouse seeks happiness. He achieves it by breaking Nip code systems and playing the pipe organ. But since pipe organs are in short supply, his happiness level ends up being totally dependent on breaking codes.

                He cannot break codes (hence, cannot be happy) unless his mind is clear. Now suppose that mental clarity is designated by Cm, which is normalized, or calibrated, in such a way that it is always the case that

                0 ≤ Cm ≤ 1

      • valacosa says:

        Great. Do you have any idea what the Fourier transform of love would be?

        (Would it give you the frequency of...nevermind.)

  4. evan says:

    (I have a degree in linguistics so I can claim to know something!) Since computational linguistics is mostly about attempts at models to achieve some ends beyond the models, dozens of contradictory models is fine. Nobody complains about machine learning people trying different ML approaches for the same tasks.

    Linguistics, on the other hand, and particularly syntax and semantics, is mostly people just making shit up.

    • jwz says:

      I spent a bit over a year working in a C.L. group at UCB, and I feel I can say with confidence that the word "bullshit" was invented to describe the entire field. For two words, I'd go with "intellectually corrupt". For a sentence, I guess I'd go with "we don't have to actually believe in our research as long as DARPA keeps giving us money". As far as I could tell, there hadn't been a single advance in the field since the mid 70s, and nobody involved was even remotely interested in solving any actual problems (unless that problem was "how do I parrot out enough words to get my PhD and get the hell out of here without producing a single working line of code".)

      So, yeah, that cartoon has a certain resonance for me.

      (Oh, and has Cyc woken up yet? Hahahahahahahahaha.)

      • I read the first phrase there as as "I spent a bitter year"... anticipating the rest of the paragraph perhaps :)

      • blackavar says:

        Heh. They hadn't woken up the last time I dealt with them, but mentioning the name was always a ticket to a few million more in free govt/DHS/TIA money, at least as of a few years back. Honestly, they don't have to wake up, as long as money keeps getting shoveled at them - their approach is apparently working.
        Now actually getting the KB to work in any application that stresses it at all, even after the rewrite? That's interesting, in the Chinese curse sense.

        • jwz says:

          Um. By "has woken up" I did not mean in the sense of "producing useful results", but rather in the sense of "Skynet".

          • blackavar says:

            Ahh, "has achieved sentience". In that case, the laughter is understandable.
            While some of us might welcome the robot overlords, if they're anything like the Cyc KB, we've got a good long wait ahead.

          • blackavar says:

            To clarify - I was referring to "has woken up" in the sense of "realizing that they're not going to get this system to function anywhere near fast enough to do what they're promising to DARPA, and making some sort of decision based on that", or on DARPA/DHS/TIA's successor/NSA/NRO/etc coming to their senses and counting the billions they've spent for essentially no useful results except in highly specialized cases. Clearly far too optimistic a hope.

      • jonabbey says:

        Some years ago, CycCorp laid off a bunch of workers. Our lab is located about half a mile away from MCC/CycCorp, and we received lots of resumes from those guys, all of them extolling their skills in 'Ontological Engineering'.

        Basically, entirely useless to us.

      • luserspaz says:

        I remember talking to one of the guys at a DARPA conference, and he actually had a graph of CPU power vs. time, accounting for speed improvements, multiple cores, etc., and there was a point on his graph at which AI "happened".

        It was ok, I thought the whole Cyc concept was ridiculous the first time anyone explained it to me anyway.

        • jwz says:

          To be fair, if you see Kurzweil speak in person, he can drag you along that curve pretty easily. He starts off small and easily believable, and 30 smooth minutes later he's saying things like (numbers elided to compensate for hazy, drunken memory) -- "so in 10 years, we have CPUs with 1eN bits, which is enough power to model the data output of every neuron in a human brain, so we get uploaded immortality without needing to actually understand the brain at all. Of course, some people say modeling the neurons isn't enough, and you'd also need to model every sub-cellular chemical interaction. That'd be 1eN+3 bits. Advancing to that level along the curve takes an additional six months."

          I have this fundamental internal conflict between my credulity toward the concept of the singularity, and my contempt for AI researchers.

          • luserspaz says:

            Yeah, I did read "The Singularity is Near" and found his argument pretty compelling, but he at least admits that we need some understanding of the brain to emulate it, even if we have ridiculous computing power.

            If CycCorp was right, then Wal-Mart's data mining facility would have achieved sentience a while ago anyway.

          • It's so neat the way the first part of a sigmoid/logistic curve tracks an exponential, isn't it?
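            That early agreement is easy to see numerically. A minimal sketch (plain stdlib Python; the parameter values are arbitrary, chosen only for illustration): well before the midpoint, a logistic curve is nearly indistinguishable from the exponential it shadows, and the divergence only appears as the curve saturates.

```python
import math

def logistic(t, L=1.0, k=1.0, t0=0.0):
    """Logistic curve with carrying capacity L, growth rate k, midpoint t0."""
    return L / (1.0 + math.exp(-k * (t - t0)))

def exponential(t, L=1.0, k=1.0, t0=0.0):
    """The pure exponential that the logistic's early portion approximates."""
    return L * math.exp(k * (t - t0))

# Well before the midpoint the two agree to within a few percent...
for t in (-8.0, -6.0, -4.0):
    rel_err = abs(logistic(t) - exponential(t)) / exponential(t)
    assert rel_err < 0.05

# ...but past the midpoint the logistic saturates below L while the
# exponential keeps exploding.
assert logistic(8.0) < 1.0 < exponential(8.0)
```

            Which is the trouble with extrapolating "the curve": the data available today can't tell you which of the two functions you're on.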

          • jwjr says:

            Two points in Kurzweil's favor:

            He has actually, more than once in his life, built successful products based on his know-how and insights about what was newly technically feasible.

            His shtick is sold as entertainment / education -- not, AFAIK, marketing material to raise funds for a gigantic, amorphous research program.

            • Well, sure, but Cray did just fine commercially with products the details of which he assured people were conveyed to him by the gnomes in the tunnels he dug in his back yard.

              "It sells" does not imply "the guy who designed it is sane."

              • jwjr says:

                Kurzweil's various successes in designing and selling newly feasible products do not prove he is sane. They merely prove that at various points he knew a thing or two about new technologies and had sufficient drive and imagination to see his projects through. This does, however, mightily distinguish him from the fantasizers who could not achieve anything practical if their lives depended on it yet still see fit to opine on the future history of technology as though anyone should care what they think.

          • dr_memory says:

            "enough power to model the data output of every neuron in a human brain"

            ...assuming an accurate model. Ho ho ho, there's the rub, rotsa ruck, has an AI researcher ever been seen within 50 yards of a neurobiology lab etc -- erring on the side of contempt seems like the safe bet in this context.

            • greyface says:

              Welcome to Cognitive Science.

              They're not AI researchers in a way that Computer Science departments recognize... specifically because they have this irritating habit of talking to psychologists and neurologists and saying, "How does this compare to your human research?"

        • blackavar says:


          I am reminded of an old OMNI cartoon:

          (blackboard full of equations)
          "And then a miracle happens!"
          (blackboard full of equations)

      • I don't know, NLP has been used effectively to make telephone support even more painful.

      • sps says:


        But it's all so much more complicated than that.

        On the one hand, the establishment controls the funding, and the establishment wants intelligence-gathering tools (and possibly phone-answering systems), and definitely not an improved understanding of the mechanisms of culture.

        On the other hand, theoretical linguistics, which should be driving matters, has (in North America) wilted under the cult of Chomsky.

        But I worked at the University of Chicago, and I worked at the DFKI Saarbruecken. I have little bad to say about either - for the most part - or about the people I met from Edinburgh.

        ...And then again, more recently I did streaming media 'research' at McGill University and - again this is down to personalities and priorities - I would score that project higher on funding than integrity, too.

    • strspn says:

      So, how come commercial MT is only about as good as it was ten years ago? Seriously, I have an MT (English-to-Spanish) program I bought off the shelf in 1994, and it works just as well as Google Translate and present-day Babelfish. They must all share a heritage, too, since all three make the exact same mistakes.

      Computers have gotten much faster, memory and disks have grown exponentially, and software engineering techniques have improved, somewhat. Why isn't MT any better?

      Plus, computational linguistics papers are the same as they were twenty years ago: one tiny micro-domain of theory per paper, averaging about a dozen actual examples; very few corpus-based approaches -- probably about the same proportion. And the corpuses (corpi?) haven't gotten any bigger.

      I'm with the cartoonist.

      • evan says:

        I guess the sort of comp. ling. I'm thinking of is almost a separate field from the one you and Jamie describe. The interesting work (and maybe it's just because I'm surrounded by a certain sort of people) is very much corpus- and statistics-based. I think the cheapening of computing has really changed the field.

        Regarding MT, my impression was that there was no economic pressure to improve the state of the art. Hopefully Google is changing this.
        Franz (the researcher behind this) bristles if you use the word "linguistics" around him, probably in no small part due to the negative associations many have with the term.

        The actual translations still leave a bit to be desired, though. ("It's hard work.")

      • shephi says:

        Google's Arabic and Chinese translation is statistically based and in-house; for most of their other languages, Google outsources to the commercial Systran system, which has been around since the days when you bought your commercial MT system -- a system likely to share Systran's successes and flaws. For especially popular documents, Google is said to employ human (gasp?) translators. You should look to Google for some new advances; they've employed the talented Franz Och to lead the way.

        The corpora have gotten bigger: consider the EU, formed in 1993, which by law has to translate its parliamentary data into each member country's official languages (and spends billions of dollars per year employing human translators to do so). This parallel corpus, freely available, and others like it, are new burgeoning resources for the MT community.

        MT has gotten better: given a decent size parallel corpus, in a few days one can build a statistical machine translation system in an obscure language pair (e.g. Finnish-Greek) which approaches the quality of commercial systems, limited to well funded language pairs, developed rule by rule over the space of decades.
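        The heart of such a system -- word-translation probabilities estimated with EM, in the style of IBM Model 1 -- really does fit in a screenful of code, which is part of why a new language pair takes days instead of decades. This is a toy sketch over a made-up three-sentence parallel corpus, not anyone's production system:

```python
from collections import defaultdict

# Hypothetical toy parallel corpus (English, French); any pair works the same.
corpus = [
    ("the house".split(), "la maison".split()),
    ("the book".split(), "le livre".split()),
    ("a book".split(), "un livre".split()),
]

# t[(f, e)] = P(foreign word f | English word e), initialized uniformly.
f_vocab = {f for _, fs in corpus for f in fs}
t = defaultdict(lambda: 1.0 / len(f_vocab))

for _ in range(20):  # EM iterations
    count = defaultdict(float)   # expected counts c(f, e)  (E-step)
    total = defaultdict(float)   # normalizer per English word e
    for es, fs in corpus:
        for f in fs:
            z = sum(t[(f, e)] for e in es)  # how strongly f is claimed overall
            for e in es:
                c = t[(f, e)] / z           # fractional alignment of f to e
                count[(f, e)] += c
                total[e] += c
    for (f, e), c in count.items():         # M-step: re-normalize
        t[(f, e)] = c / total[e]

# Co-occurrence alone teaches it that "livre" aligns with "book", not "the".
assert t[("livre", "book")] > t[("livre", "the")]
```

        The same expected-count loop, run over millions of Europarl sentence pairs instead of three toy ones, is essentially what bootstraps alignment for an obscure language pair.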

        Not to say there isn't a chunk of BS in the flotsam; today I heard a fancy talk from a cognitive linguist. The best post talk comment from the computational linguists was "is there actually any data which shows your model of the brain to be any more accurate than positing that the human brain does all its processing in C++?" The speaker stammered his tensor products down his throat.

        • jwz says:

          "the human brain does all its processing in C++?"

          Nonsense! Everyone knows it's all hydraulics, driven by the pressure of sex-goodies-icecream!

        • kehoea says:

          "is there actually any data which shows your model of the brain to be any more accurate than positing that the human brain does all its processing in C++?"

          That's a very un-Computational-Linguist question to pose; are you sure it wasn't a student or a CS professor who wandered in by mistake?

          • I think that's the point. The question forces the computational linguist to step outside the bullshit for a moment. Or failing that, at least points out the big steaming heap.

            • kehoea says:

              I got the point; you missed mine, I think. See, my experience of Computational Linguists (one of whom posed that question, according to our correspondent from Massachusetts) is that they are not in a big hurry to point out any steam from anything recently-expelled. Which makes lots of sense, because pointing out to your colleagues that they're full of shit does not make for tenure nor for a good working relationship.

    • kehoea says:

      Since computational linguistics is mostly about attempts at models to achieve some ends beyond the models, dozens of contradictory models is fine.

      Not when there's no pressure to select for the more correct models, it isn't. It's like the approach of the granddaddy of taking money from the DoD and doing things of negative value for it. Noam on corpus linguistics: "True sciences like physics, chemistry, and biology don't just collect lots and lots of data." Fuck the data, this model has a superficial elegance I like!

      Linguistics, on the other hand, and particularly syntax and semantics, is mostly people just making shit up.

      Syntax and semantics are people making shit up; corpus linguistics, diachronics, descriptive fieldwork, descriptive atlases of English dialects are mostly not.

      My BA was more or less in computational linguistics; when and if I do a masters, it won't be, because I fear JWZ is mostly correct in his judgement of the field. I would love to work with Michael Tomasello; he seems to be doing what the CL/Chomskyan people should have been doing long ago, and doing it well. Excuse the repost; I got the Chomsky quote wrong.

  5. duskwuff says:

    Syndicated at <lj user=xkcd_comic> if you haven't noticed already.

  6. der_die_das says:

    I doubt that any one researcher can subscribe to a set of contradictory models at the same time and still be taken seriously. The field as a whole has produced such contradictory models -- just like just about any other science. So where's the problem?

    (posting from the Annual Meeting of the Association for Computational Linguistics 2006, incidentally)

  7. spike says:

    Clearly, it's some kind of VC-money-laundering scheme.

    • relaxing says:

      Sorry, Lexxe has just experienced Internet connection problem. Please try a few minutes later. Thank you for your cooperation.

    • 205guy says:

      That name-logo thing (containing no spaces, four colors, three font sizes, one made-up word of ambiguous pronunciation and a non-occurring letter combination, one abbreviation, and one piece of geek-speak) is the most unnatural verbiage I've seen in a while. How do they expect people to take them seriously? Not to mention the idea of releasing something alpha.

      If I were them, I would've called myself "Answers to your questions.com"

      By the way, I feel compelled to copyright 2006 and reserve the use of "Answers to your questions" in relation to search engines, if it hasn't been done already.

      • 205guy says:

        I had to go and try: Lexxe can't answer the question "What does lexxe mean?" unless by "find answer" they meant "make me sift through pages of obviously word-based search results with excerpts that may or may not contain an answer." You'd think that search engines would somehow figure out, from where the user clicks, which of the result pages contain what he or she was looking for -- and then use that to present better results next time. Hey, can I apply for a patent on that, too?