Friday, December 4, 2009

Preserving Data in a Digital Format Forever

Henry Newman, in his article "Error Correction: An Urgent Need for Files" (December 3, 2009), proposed one method for preserving digital information in a way that better withstands corruption.

I had another idea, though. With optical filmstrips, you can often tell the format by examining any part of the strip. What if the same were true of files? Throughout the file (before every frame of a film and before every chunk of audio), the compression type and format could be noted in a standardized way. If this were the case, a few bits being mangled here and there would no longer make the file's format unrecoverable. It is certainly not efficient, but this sort of duplication would mean that even a severely corrupted file could very likely still be read, either by finding a frame header that is still readable or by averaging the bits across all the frame headers (in effect, a majority vote on each bit position) to produce a good or "likely" type for the media.
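To make the "average of the bits" idea concrete, here is a minimal sketch in Python. It assumes every frame header is the same fixed length and that the reader has already located each copy; the function name `majority_header` and the four-byte tag "MP3A" are made up for illustration and do not come from any real format.

```python
def majority_header(headers: list[bytes]) -> bytes:
    """Rebuild a likely header by majority vote on each bit position.

    Assumes all copies have the same length. A tie resolves to 0.
    """
    if not headers:
        raise ValueError("need at least one header copy")
    length = len(headers[0])
    if any(len(h) != length for h in headers):
        raise ValueError("all header copies must have the same length")

    result = bytearray(length)
    for i in range(length):
        for bit in range(8):
            mask = 1 << bit
            # Count how many copies have this bit set.
            ones = sum(1 for h in headers if h[i] & mask)
            # Keep the bit only if a strict majority of copies agree on 1.
            if ones * 2 > len(headers):
                result[i] |= mask
    return bytes(result)


# Five copies of a hypothetical format tag, two of them damaged.
copies = [b"MP3A", b"MP3A", b"MP7A", b"\x00P3A", b"MP3A"]
print(majority_header(copies))
```

The vote only works while most copies agree on each individual bit, so it tolerates scattered damage but not damage that hits the same bit in more than half of the headers.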

