Internet Archive's book scanning

The glass rises and falls. Quickly and efficiently, a woman turns the pages to the rhythmic beep of the cameras.

Some books, like the Bureau of Land Management publication featured in the video, have myriad fold-outs. Eliza must insert a slip of paper to remind her to go back and shoot each fold-out page, while at the same time inputting the page numbers into the item record. The job requires keen concentration. [...]

Eliza is one of about 70 Scribe operators at the Internet Archive, working in digitization centers embedded in libraries across the United States, United Kingdom, and Canada. [...] "We try to meet libraries where they are," said MacLeod, who manages remote operations from her home office in North Carolina. "From digitizing a few shipments a year at one of our regional centers to setting up and staffing full-service digitization within the library itself, we have a flexible approach to our library partnerships."

Across Twitter, another common question arose: "Why hasn't this job been automated?" To many, the repetitive act of turning the pages in a book and photographing them seems like the natural task for a robot. In fact, some 20 years ago, we tested commercial book scanners that feature a vacuum-powered page-turning arm. It turns out those automated scanners didn't really work well for brittle books, rare volumes, and other special collections -- the kinds of material our library partners ask us to digitize.

"Clean, dry human hands are the best way to turn pages," said Mills, from her socially-distanced office at the University of Toronto. In her 15 years on the job, she has worked with hundreds of librarians to hone our digitization operations, balancing our need to preserve the original pages with minimal impact during the imaging process. "Our goal is to handle the book once and to care for the original as we work with it," Mills explained.

By the way, please note that the Internet Asshole Vats have already overflowed with insights ranging all the way from "Pfff, I could do that with lego" to "lol get a REAL job", and here are a couple of Jason Scott's threads that address these observations: "why u no use robot", "is book scanning job bad?" and more on lego.

The Archive is a treasure, give them your money.



17 Responses:

  1. Dude says:

    Yet another reason why I trust the Internet Archive over any of Sergey Brin's bullshit, copyright-blurring experiments.

    I actually just learned about these machines the other day when I reviewed the upcoming film The Book of Vision for SF Indiefest. The one in that film (used in a European medical museum) was much narrower, kinda like this DIY version: ...but fascinating to watch, nonetheless.

  2. kellyu says:

    They're not wrong on the need for delicacy - I have a few books in the law firm library collection I run that are only a bit over 200 years old, and they're too fragile to use a photocopier on.

    You need to gently pry the book open to the correct section, and carefully photograph each page without having the book open flat.

    And you will always get decayed leather crumbs on you, to the point that your fingers will turn orange and you'll probably stain your shirt. We probably should use white cotton gloves, but it's an active collection in a law firm, so that's just not happening.

    Some books don't have to be particularly old to be delicate; anything published from about 1942 onwards through to the end of WWII was probably printed on poor quality wood pulp paper and has thus turned incredibly acidic and become flaky and delicate.

    • Thomas Lord says:

      I'm told by someone who works in libraries adjacent to some priceless archive collections that the gloves have fallen out of use and are no longer considered best practice. Something about the coarseness of the fabric doing more harm than good. Clean, dry hands are apparently the modern standard.

      • Chris says:

        From what I've gathered, the main issue is that wearing gloves reduces both dexterity and sensitivity, making mishandling more likely. It's also not necessarily true that gloves are any cleaner than properly washed hands, as the fabric can easily pick up dust and dirt from other surfaces, as well as absorbing oils and liquids, including sweat wicked from cloth-covered hands.

        • tfb says:

          The whole white-cotton-glove thing is often done as a performance, I think.

          I spend a lot of time (or, in fact, spent a lot of time pre CV19) making prints from film, and one of the things 'serious' people do is wear white cotton gloves to handle negs. Unless you use a fresh, carefully-stored, pair for each neg the result of that is that they pick up crud and turn into sandpaper, slightly doped with the chemistry. And using a new pair for every neg is ... not practical if you're making more than a tiny number of prints. Instead you just wash your hands (which you're doing anyway since they've just been in the chemistry) dry them really carefully and you're fine.

          • jwz says:

            One thing the plague has taught me is that wearing gloves or a mask has the psychological effect of making me not touch my face, so I can imagine gloves might help a bit purely as a ritual.

            • tfb says:

              Yes, I think so. When making prints I certainly do rely on rituals (rinse & dry my hands if I've even stood in the wet side of the room & various others). Just not gloves in that case. But even in outside spaces I wear a mask if there are other people because it reminds me to be more careful.

      • Carlos says:

        On the other hand, my wife is a conservator and works with old and delicate objects, including books. Wearing gloves can be a very good idea for other reasons - she once got a painful, difficult-to-cure fungal infection that crawled up her arms from working on a leather-bound book that had been stored in damp conditions.


  3. nooj says:

    Does the Internet Archive still take floppies and email you a download link to the contents? I looked on their website a few weeks ago, but couldn't confirm or deny.

  4. Anonymous says:

    If Brewster were to happen upon these comments and wanted to open another scanning center in the comparatively cheaper Austin area, and you allowed for ridehail-style, make-your-own-schedule gig workers to come in and burn off some time (provided they've been trained and remain in good standing), that'd be swell. I know I've seen people across the net express interest in scanning for the Archive, and if this kind of program were available, I'd definitely spend dozens (probably hundreds) of hours there myself, bring in books of my own, etc.

    • Bookie says:

      Why would you want to boot a scanner out of a job?

      • Anonymous says:

        Is that what you say when you go work somewhere: "what about the person who would otherwise get this job if I didn't apply?"

  5. Lloyd says:

    Vernor Vinge's Rainbows End does not come off well in the book-scanning department.

    Still, it's not as if science fiction writers can predict the future, or anything.
