Tag Archives: archive

digital preservation

I have been an active member on the Stack Exchange family of sites [nearly] since StackOverflow started a few years ago.

Recently a new proposal has been made for Digital Preservation. Many of the proposed questions are interesting (including one of mine) – and I would strongly encourage anyone interested in the topic to check it out.

The topic has resparked a question I have had for a long time – why is it important to archive data?

Not that I think it’s inherently bad to hold onto digital information for some period of time – but what is the impetus for storing it more-or-less forever?

In tech popculture we have services like Google’s gmail which starts users at a mind-boggling 7+ gigabytes of storage! For email! Who has 7GB of email that needs to be stored?! For a variety of reasons, I hold onto all of my work email for the duration of my employment with a given company – you never know when it might be useful (and it turns out it’s useful fairly frequently). But personal email? Really? Who needs either anywhere near that much, or to hold onto it for that long? And those few people who arguably DO need that much, or to keep it forever, can afford to store it somewhere safely.

I think there is a major failing in modern thinking that says we have to save everything we can just because we can. Is storage “cheap”? Absolutely. But the hoard / “archive” mentality that pervades modern culture needs to be combated heavily. We, as a people, need to learn how to forget – and how to remember properly. Our minds are, more and more, becoming “googlized“. We have decided it’s more important to know how to find what we want rather to know it. And for some things, this is good:

If you are a machinist, is it better to know how to reverse-thread the inside of a titanium pipe end-cap, or to go look up what kind of tooling and lathe settings you will need when you get around to making that part? I suppose that if all you ever do in life is mill reverse-threaded titanium pipe end-caps, you should probably commit that piece of information to memory.

But we need to remember to forget, too:

when you need to make two of these things. Ever. In your entire life. In the entire history of every company you ever work for. Well, then I would say it’s better to go look up that particular datum when you need it. And then promptly forget it.

The historical value, interest, and amazing work that is contained in the “Domesday Books” is amazing – and something that has been of immense value to historians, archivists, politicians, and the general public. Various and sundry public records (census data, property deeds, genealogies, etc) are fantastic pieces to hold onto – and to make as available and accessible as possible.

Making various other archives available publicly is great too (eg the NYO&WRHS) – and I applaud each and every one of those efforts; indeed, I contribute to them whenever I can.

I continuously wonder, though, how many of these records and artifacts truly need to be saved – certainly it is true of physical artifacts that preservation is important, but how many copies of the first printing of Moby Dick do we need (to pick an example)?

I don’t know what the best answer is to digital hoarding, but preservation is a topic which needs to be considered carefully.