We're slowly groveling through terabytes of unsorted archival storage at CSAIL: mainframe and server backups going back to the 1970s and earlier.
The raw bits will shrink as we redact private and duplicate data, but a full-text search index may grow large. If you have experience with long-term storage of digital media of any kind, please share what you know about making it last.
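
As a rough illustration of how duplicate data might be spotted before it is removed, here is a minimal Python sketch that groups files by the SHA-256 digest of their contents. It is only an assumption about how such a pass could work, not a description of our actual tooling; the function names `sha256_of` and `find_duplicates` and the one-megabyte chunk size are made up for the example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large images need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Map each digest to the files that share it, keeping groups of two or more."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[sha256_of(path)].append(path)
    return {d: paths for d, paths in groups.items() if len(paths) > 1}
```

Byte-identical files land in the same group; near-duplicates (say, the same file with different trailing padding) would need a different technique.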

We loaded approximately 12 tons of magnetic tape for processing at a state-of-the-art data recovery facility. We are studying tape images produced by this process.
We need OCR software to read photos of printed and handwritten tape reel labels, and software to read punched cards fed through a sheet-feeding scanner.
We are scanning the handwritten dump logs of archival backups from the AI Lab, the Lab for Computer Science, Project MAC, the Free Software Foundation's GNU project, ... What do they have in common? They all happened in MIT building NE43, aka Tech Square, where computing history was made for many decades before we moved in 2004 to MIT building 32, aka the Stata Center.

Once MIT's lawyers agree, publication might go something like this: authors sign a form to verify their identity and their legal right to authorize publication of their files, and receive a password to something like this sample web interface. Authors browse their files and mark some as public; we mail a manifest, they confirm, and we publish on our internal server and later perhaps in DSpace and/or the Science Commons and/or the wonderful Wayback Machine.
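
The steps above could be modeled as a small state machine. The Python sketch below is purely hypothetical: the `Stage` and `Submission` names are invented for illustration, and nothing here reflects the sample web interface or any agreed process. It only shows the ordering constraint, that files are marked before the manifest is mailed and nothing is published before the author confirms.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Stage(Enum):
    FORM_SIGNED = auto()      # identity and rights verified, password issued
    FILES_MARKED = auto()     # author has marked at least one file public
    MANIFEST_MAILED = auto()  # we mailed the list of marked files
    CONFIRMED = auto()        # author confirmed the manifest
    PUBLISHED = auto()        # files are on the internal server


ORDER = list(Stage)  # Enum members iterate in definition order


@dataclass
class Submission:
    author: str
    public_files: set = field(default_factory=set)
    stage: Stage = Stage.FORM_SIGNED

    def mark_public(self, filename):
        """Mark a file public; only allowed before the manifest is mailed."""
        if self.stage not in (Stage.FORM_SIGNED, Stage.FILES_MARKED):
            raise ValueError("manifest already mailed; files are frozen")
        self.public_files.add(filename)
        self.stage = Stage.FILES_MARKED

    def manifest(self):
        """Return the marked files in a stable order for mailing."""
        return sorted(self.public_files)

    def advance(self):
        """Move to the next stage, one step at a time."""
        if self.stage is Stage.FORM_SIGNED:
            raise ValueError("mark at least one file public first")
        if self.stage is Stage.PUBLISHED:
            raise ValueError("already published")
        self.stage = ORDER[ORDER.index(self.stage) + 1]
```

Freezing the file list once the manifest is mailed means what the author confirms is exactly what gets published.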

