scrmable

One of the memes making the rounds in the last couple days goes:
"Aoccdrnig to rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at the rghit pclae. The rset can be a total mses and you can sitll raed it wouthit a porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe."

I haven't found a link to the research (if there actually is any) but I did write a little perl script, scrmable, to test the hypothesis. It wroks pertty good!

I think the scrmabled words taht are least readable are the ones that end up with a lot of consecutive vowels, or that split the initial phoneme.
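For anyone who wants to try the trick without the original script, here is a minimal sketch of the idea in Perl. It is not the scrmable script itself: the names `scramble_word` and `scramble_text` are made up for illustration, and it assumes a "word" is simply a run of ASCII letters.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical helper: keep the first and last letter of a word in place
# and shuffle everything in between.
sub scramble_word {
    my ($word) = @_;
    return $word if length($word) < 4;    # fewer than two middle letters: nothing to shuffle
    my @letters = split //, $word;
    my $first   = shift @letters;
    my $last    = pop @letters;
    return join '', $first, shuffle(@letters), $last;
}

# Hypothetical helper: scramble every run of ASCII letters in a string,
# leaving spaces and punctuation where they are.
sub scramble_text {
    my ($text) = @_;
    $text =~ s/([A-Za-z]+)/scramble_word($1)/ge;
    return $text;
}

# Output differs from run to run because the shuffle is random.
print scramble_text("According to research at Cambridge University"), "\n";
```

Because an apostrophe ends a run of letters, "doesn't" is treated as "doesn" plus "t", which matches how the meme text scrambles it ("deosn't").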

Update: March of teh Slashdorks!


44 Responses:

  1. altamira16 says:

    Aoccdrnig to a rscheearch at an Elingsh uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht frist and lsat ltteer is at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit a porbelm. Tihs is bcuseae we do not raed ervey lteter by it slef but the wrod as a wlohe.

    Clearly there is some university morphing going on in this study.

    • jwz says:

      Yeah, that sort of thing always happens with things like this.

      I wonder: did someone change it from Cambridge, in an attempt to add "clarity", or did they change it to Cambridge to add more fictitious authority?

      Either way, I'm betting there was no such "study", just someone noting a neat trick. (The difference between those? "Funding.")

      • caitlinburke says:

        There is data or at least reported experience, of varying quality, on this phenomenon in general. It is the basic idea behind the deprecated "see and say" or "whole word" method of teaching kids to read in a sort of scanning approach instead of taking the trouble to teach them to sound words out.

        While it's great that the brain can compensate so easily for imperfect use of symbols to represent words, this really only works for readers who already know the words. Unfamiliar words in languages that use alphabets can more easily be decoded by people who learn to read phonetically. (And I suspect that research would show that this is true for scrmabled words, too, since "see and say" learners confuse words with similar shapes more easily.)

  2. ralesk says:

    That said I have an acquaintance on one of the messengers who types similarly.  Not with this many letter-swapping, but to add to the problems, some spacing is in the wron gplace.  I'd say I'm pretty trained...

  3. etcetera5 says:

    had no trouble at all reading your post.. how great is that!

  4. king_mob says:


    # Premssioin to use, cpoy, mdoify, drusbiitte, and slel this stafowre and its
    # docneimuatton for any prsopue is hrbeey ganrted wuihott fee, prveodid taht
    # the avobe cprgyioht noicte appaer in all coipes and that both taht
    # cohgrypit noitce and tihs premssioin noitce aeppar in suppriotng
    # dcoumetioantn.

    You are the wind beneath my wings.

  5. kylec says:

    hah, I was waiting for someone to do a perl version, I did it last night in php :)

    http://junglist.org/jumble.php

    src: http://junglist.org/jumble.phps

    • phyxeld says:

      In the interest of multiple implementations, here's an abbreviated perl version, almost suitable for -e action:

      #!/usr/bin/perl -p
      # slcrambe.pl - by phyxeld. Inspired by http://www.jwz.org/hacks/scrmable.pl
      $A=50; s{(?<=\W\w)(\w\w+)(?=\w\W)}{ # A=0: no shuffle; A=100: all backwards;
      my $s; $s= rand(100) > $A ? qq|$s$_| : qq|$_$s| for (split //, $1); $s; }eg;

      same usage as jwz's, plus the addition of $A with which you can adjust how much it shuffles...
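The replacement block in that one-liner is dense, so here is a spelled-out sketch of what it appears to do to the middle of each word. The name `partial_shuffle` and the parameter `$amount` (standing in for `$A`) are introduced here for illustration only.

```perl
use strict;
use warnings;

# Hypothetical readable expansion of the one-liner's replacement block.
# $amount plays the role of $A: each letter is either appended to the
# result (keeping the original order) or prepended (pushing toward reversal).
sub partial_shuffle {
    my ($middle, $amount) = @_;
    my $out = '';
    for my $ch (split //, $middle) {
        if (rand(100) > $amount) {
            $out = $out . $ch;    # append: original order preserved
        } else {
            $out = $ch . $out;    # prepend: this letter jumps to the front
        }
    }
    return $out;
}

print partial_shuffle("crambl", 50), "\n";     # somewhere in between, varies per run
print partial_shuffle("crambl", 100), "\n";    # always prepends, so prints "lbmarc"
```

With `$amount` at 0 nearly every letter is appended, so the middle comes back unchanged unless `rand` happens to return exactly 0. Note that this is not a uniform shuffle: it can only produce orderings reachable by adding letters at either end.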

      • ronbar says:

        One-liner inverse bubblesorts like this are what drive control freaks toward python. Real one-liners don't use 'my'.

        Just to add my own control freak issue, the ? control operator (not the regex operator) should be illegal, or at least cause you to get lots of parking tickets.

      • jwz says:

        Get the fuck out of my house. I won't have that kind of talk in here.

  6. brad says:

    Love the code. :)

  7. stimps says:

    I can certainly read it. It just drives me insane to do so. It's WRONG, damn it! =)

  8. kiad says:

    I have written about 5 different applications, but none of them are useful or significant for me to commit to a lj comment.

    Did you come up with any?

  9. ronbar says:

    Of course a Lisp programmer would write a script to modify itself.

    Did you cheat and name the functions manually? Or did you write the script, then write another script to do a global shuffle and replace on function and variable names to scrmable.pl?

    I'm guessing you cheated; that would optimize for laziness. If you didn't cheat, I hope I don't have to look at any of your source code in the near future.

    • violentbloom says:

      at least he got rid of his lisp machines.

      • phygelus says:

        he got rid of his lisp machines? say it isn't so!
        say, how IS emacs on a lisp machine keyboard?

        • violentbloom says:

          I'm not sure they actually ever ran.
          besides I'm in the vi camp. :P I prefer fewer bells and whistles on my editor.

          So I just read the script to Casablanca. It was TYPED on a typewriter in 1942, then I presume it's been photocopied to death for the last 60 years... it was a nightmare to read. First, e turns into something that could be an o or maybe something else, and lots of the character set, like g for example, gets very hard to read with an old font set and bits missing (added ink blobs too) from photocopying. So you would think that if reading by shape helped with ignoring the misspelling (there were plenty of typos too) it wouldn't have been a big deal, as the shape of most of the badly photocopied pages should be the same. But it made it quite difficult to read, actually. I read at 690 words a minute, which should put me in the read by shape camp, but now I wonder. I certainly didn't read that fast tonight. It took me an hour and 50 minutes to read 147 pages of script (which have more than normal amounts of blank space due to formatting), so that's quite a lot slower than normal for me. It did matter that the details were not filled in.

        • jwz says:

          I gave them to Noah, because he was likely to provide them with a more nurturing home than I. And yes, they worked, though something new tended to go wrong with the hardware each time I powered them up... They were not made to last.

          Lispm Emacs was ZMACS (ZWEI), and it was great: nearly every user-visible feature I added to Lucid Emacs (font-lock mode, active regions, etc.) was cribbed from there.

          I wonder how the Explorer emulator project is going... (Page last modified April 2002, says it's not booting all the way yet.)

    • jwz says:

      Hey, I only "cheated" in that, if you scrmable the Perl reserved wrods, it doesn't run so good.

      • ronbar says:

        I was going to say that modifying the script a bit to change only non-reserved words would be easy except for ignoring reserved words, but then I realized how big an exception that is. So I think you should write it in Python instead. Python makes everything all better. And write the web version in PHP and use procmail to filter those evites.

        • Rather than teaching the script how to parse Perl, you could teach the script to ask Perl how the text will be parsed, and if a particular permutation changes the meaning of a text, undo it. The tricky part is making this efficient for large input texts.

          And then you have a source code shrouder. Another interesting constraint is to make it reversible if you know the seed for the PRNG.
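The reversibility idea can be sketched as follows: if the permutation comes from a PRNG with a known seed, the same seed regenerates the same permutation, which can then be undone. This is only an illustration under stated assumptions. The names `permutation`, `scramble_seeded`, and `unscramble_seeded` are hypothetical, the sketch reseeds for every word (so all words of the same length get the same permutation), and it relies on Perl's `rand` producing the same sequence for the same `srand` seed on the same perl build.

```perl
use strict;
use warnings;

# Hypothetical helper: Fisher-Yates permutation of 0 .. $n-1 driven by rand().
sub permutation {
    my ($n) = @_;
    my @idx = (0 .. $n - 1);
    for (my $i = $n - 1; $i > 0; $i--) {
        my $j = int rand($i + 1);
        @idx[$i, $j] = @idx[$j, $i];
    }
    return @idx;
}

# Scramble the middle of a word using a permutation derived from $seed.
sub scramble_seeded {
    my ($word, $seed) = @_;
    return $word if length($word) < 4;
    srand($seed);                          # side effect: reseeds the global PRNG
    my @mid  = split //, substr($word, 1, -1);
    my @perm = permutation(scalar @mid);
    return substr($word, 0, 1) . join('', @mid[@perm]) . substr($word, -1);
}

# Rebuild the same permutation from $seed and invert it.
sub unscramble_seeded {
    my ($word, $seed) = @_;
    return $word if length($word) < 4;
    srand($seed);
    my @mid  = split //, substr($word, 1, -1);
    my @perm = permutation(scalar @mid);
    my @orig;
    @orig[@perm] = @mid;                   # output slot k held input letter $perm[k]
    return substr($word, 0, 1) . join('', @orig) . substr($word, -1);
}

my $scrambled = scramble_seeded("university", 42);
print "$scrambled\n";
print unscramble_seeded($scrambled, 42), "\n";    # prints "university"
```

A real shrouder would still need the part described above, checking with Perl that a given permutation does not change how the text parses.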

  10. greyface says:

    The problem with the article is this phrase "The rset can be a total mses and you can sitll raed it wouthit a porbelm." Without problem does not solely imply "with success." The fact is, it takes longer, and more energy to read things that are less correct.

    Cognitive Scientists (I have a degree in it, but am not really one of them) can explain it all pretty easily. Research shows that people decide what letter a given inscribed symbol is, by the inscribed symbols around it... so it wouldn't surprise me if the same went for words.

    There's also all that research that shows that after sufficient training in reading, people read words when they see them, even if they are trying not to. Anyway, it's cute, and funny, but far from ground-breaking science.

    The script however, is genius... if geniuses HATE ALL LIFE.

    • Funny, people never believe me when I tell them I have no choice but to read subtitles if they are present. (And then I notice differences between the subtitles and the audio, which drive me nuts.)

    • pne says:

      There's also all that research that shows that after sufficient training in reading, people read words when they see them, even if they are trying not to.

      Probably different for dyslexics (who, as I understand it, read "differently" from other people anyway) - my wife says she can look at words without reading them unless she chooses to. I think I can't do that.

      • pthalogreen says:

        i am not dyslexic (But i had to look at your comment in order to spell it), and I can generally choose not to read words if I see them, but it takes conscious effort to choose not, rather than to choose to.

        what i can't seem to do, is listen to kamilla talking in english while trying to read in Hungarian. or to read in hungarian while listening to english music.

      • soleklypse says:

        There's something called the Stroop effect where they show you words like "red" or "green" but write the word "red" with green ink, etc. and ask you to name the color ink. Everyone (with rare and highly notable exceptions) finds it much more difficult to name the color of a word that says a color other than the one it's written in. In other words, if you see "car" written in green, you will say "green" much faster than if you see the word "red" written in green. This effect is one of the most well documented and repeatable experiments in cognitive psychology. I don't know if they've tried it with dyslexics though.

  11. abates says:

    Meanwhile a hundred internet spelling pedants are having seizures.

  12. scjody says:

    <stewie> Email from dan: 16:34

    Aoccdrnig to a rscheearch at an Elingsh uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is that frist and lsat ltteer is at the rghit pclae.

    The rset can be a total mses and you can sitll raed it wouthit porbelm.

    Tihs is bcuseae we do not raed ervey lteter by it slef but the wrod as a wlohe.
    <mostafah> bihullst 16:38

  13. enf says:

    Have you seen the letter from Chuck Moore when the Forth standards committee proposed matching against entire tokens instead of the first three characters and character count?

    It begins:

    Dea- Edi---

    I am afr--- tha- the let--- in the las- iss-- abo-- for-- inc- usi-- onl- thr-- let--- nam- fie--- has had the opp----- eff--- fro- wha- the wri--- wan---

    (from the second ACM History of Programming Languages conference proceedings)
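The rendering in that letter follows a simple rule: keep the first three characters of each word and replace the rest with dashes, so only the prefix and the length survive. A small sketch of that rule, with the hypothetical helper name `three_and_count`:

```perl
use strict;
use warnings;

# Hypothetical helper: show only what a "first three characters plus
# character count" name field would remember about a word.
sub three_and_count {
    my ($word) = @_;
    return $word if length($word) <= 3;
    return substr($word, 0, 3) . ('-' x (length($word) - 3));
}

my $text = "Dear Editor";
(my $shown = $text) =~ s/(\w+)/three_and_count($1)/ge;
print "$shown\n";    # prints "Dea- Edi---"
```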

  14. l2g says:

    As words get lngeor, it bomeecs icsiarelgnny dfflciiut for one's mnatel fcliaetus to ctllpomeey irgnoe the agmmntciaaaarl oufotsacbin. Eeatvllnuy it bmceeos an eeexigncldy hoodunrres chagnelle (touhgh not an ugnlltcceioraay ilrcaanbtte one).

  15. Now all we need is for compilers to recognize code written this way.

  16. l2g says:

    --- scrmable.orig.pl 2003-09-15 08:59:09.000000000 -0700
    +++ scrmable.pl 2003-09-15 11:05:21.000000000 -0700
    @@ -27,9 +27,7 @@
    my $Z = pop @w;
    print $A;
    if (defined ($Z)) {
    - my %tt;
    - foreach (@w) { $tt{$_} = rand; }
    - @w = sort { $tt{$a} <=> $tt{$b}; } @w;
    + @w = sort { rand() <=> rand() } @w;
    foreach (@w) {
    print $_;
    }

    Doesn't change the functionality, just makes it a tad clearer and shorter.

    • jwz says:

      sort algorithms are likely to misbehave if you change the sort key partway through.

      Though I see the latest Perl FAQ recommends yet a different way.

      • l2g says:

        Do you mean "misbehave" in a way other than "not get the sort order right"?

        • jwz says:

          What other way would there be for a sort algorithm to misbehave? Changing the key could screw up the O(n log n) nature or could leave big pieces unsorted, depending on the implementation. Since we're only sorting ~8 elements at a time, it's the last one that matters.

          • l2g says:

            Ah, I see. Since randomizing is the point, I was confused as to why a flaw in the sort order would matter. :-)

            Thank you, O wise guru!
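To make the distinction in this thread concrete, here is a sketch of two ways to randomize a list without handing `sort` a comparator that changes its answer on every call. The first mirrors the removed lines of the diff (one fixed random key per element), except that keys are stored by position rather than by value, which is a change made in this sketch so that repeated letters get separate keys. The second uses `shuffle` from the core `List::Util` module, which is one of the approaches perlfaq4 suggests for shuffling an array.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

my @w = split //, "crambl";

# Fixed random key per position, then sort positions by key.
# The keys never change during the sort, so the comparator stays consistent.
my @keys   = map { rand() } @w;
my @by_key = @w[ sort { $keys[$a] <=> $keys[$b] } 0 .. $#w ];

# Library shuffle: no sort involved at all.
my @shuffled = shuffle(@w);

# Output varies from run to run.
print join('', @by_key), "\n";
print join('', @shuffled), "\n";
```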

  17. hotabay says:

    I can read it, but it gives me a headache. Makes me feel dyslexic.

  18. kiad says:

    So, this was brought up in my Complexity class. They wondered about a scenario:

    Would the Declaration of Independence still be recognisable or even perhaps readable (from memory?) if the first and last letters of every word were intact, but random letters were inserted between the first and last?
    I can't believe this is actually a class topic. Insanity.

  19. sidelobe says:

    Ya really gotta wonder if this could be a new avenue for spelling checkers and spelling fixers. Add the function to fix misplace dspaces and you'd really have something!