The recordings exist in a variety of historical formats, including wax cylinders, 78 rpms, and LPs. They span musical genres including classical, pop, rock, and jazz, and contain obscure recordings like this album of music for baton twirlers, and this record of radio's all-time greatest bloopers. [...]
Once cataloged, the LPs are then digitized. The Internet Archive partners with Innodata Knowledge Services, an organization focused on machine learning and digital data transformation, to complete the digitization process at their facilities in Cebu, Philippines. An Innodata worker digitizes 12 LPs at a time, setting turntables to play and record by hand, then turning each record over to the next side. Since each LP is digitized in real time, it takes a full 20 minutes to record an average LP side. By operating 12 turntables simultaneously, the team expects to be able to digitize ten LPs per hour.
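For anyone checking the quoted numbers, here is a rough back-of-envelope sketch. The 12 turntables, 20 minutes per side, and ten LPs per hour come from the article; the two-sides-per-LP assumption and the function names are mine, purely for illustration.

```python
# Back-of-envelope throughput check for the quoted digitization figures.
# Assumption (not from the article): every LP has exactly two sides.

def lps_per_hour(turntables: int, minutes_per_side: float,
                 sides_per_lp: int = 2,
                 handling_minutes_per_lp: float = 0.0) -> float:
    """LPs finished per hour when all turntables run in parallel."""
    minutes_per_lp = minutes_per_side * sides_per_lp + handling_minutes_per_lp
    return turntables * 60.0 / minutes_per_lp


def implied_overhead_minutes(turntables: int, target_lps_per_hour: float,
                             minutes_per_side: float,
                             sides_per_lp: int = 2) -> float:
    """Non-playing minutes per LP implied by a target hourly rate."""
    cycle_minutes = turntables * 60.0 / target_lps_per_hour
    return cycle_minutes - minutes_per_side * sides_per_lp


if __name__ == "__main__":
    # Ideal rate with zero handling time between sides and records.
    print(lps_per_hour(12, 20))
    # Slack per LP that the stated ten-per-hour target leaves for
    # flipping, cleaning, cueing, and redoing bad takes.
    print(implied_overhead_minutes(12, 10, 20))
```

Under those assumptions the ideal rate is 18 LPs per hour, so the ten-per-hour target leaves about 32 minutes of non-playing time per LP on each turntable. Whether that is enough slack for skips and re-recordings is a separate question.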
This is awesome. However, as someone who has put many hours into manually digitizing records using what I think is the very same turntable pictured (they look like Panasonic 1200s to me), the idea that you can just fire and forget seems... fantastically optimistic. Perhaps the records in the BPL collection are all of a never-been-played, white-glove level of archival quality, but if these records have ever been owned by a human... no. No, that's just not going to work. It's going to be a wobbly mess of oscillating playback speeds and skips.
Decades ago I read about someone's project to rip vinyl by scanning the disc on a flatbed scanner and then processing the image. Did that ever go anywhere? That would be kind of analogous to the way IA is archiving old floppy disks with flux scans, which record a ludicrously high-resolution image of the analog waveform rather than the bits comprising the file system.