Internet Archive Fire Shows Vulnerability Of The World's Online Memory
from the taking-things-for-granted dept
The Internet Archive is a jewel of the digital world:
The Internet Archive is a 501(c)(3) non-profit that was founded to build an Internet library. Its purposes include offering permanent access for researchers, historians, scholars, people with disabilities, and the general public to historical collections that exist in digital format.
Founded in 1996 and located in San Francisco, the Archive has been receiving data donations from Alexa Internet and others. In late 1999, the organization started to grow to include more well-rounded collections. Now the Internet Archive includes: texts, audio, moving images, and software as well as archived web pages in our collections, and provides specialized services for adaptive reading and information access for the blind and other persons with disabilities.
Here’s the amazing scale of the project today:
The Internet Archive Wayback Machine contains almost 2 petabytes of data and is currently growing at a rate of 20 terabytes per month. This eclipses the amount of text contained in the world’s largest libraries, including the Library of Congress.
The Internet Archive is the world’s online memory, holding the only copies of many historic (and not-so-historic) Web pages that have long disappeared from the Web itself.
This morning at about 3:30 a.m. a fire started at the Internet Archive’s San Francisco scanning center.
Good news:
no one was hurt and no data was lost. Our main building was not affected except for damage to one electrical run. This power issue caused us to lose power to some servers for a while.
Bad news:
Some physical materials were in the scanning center because they were being digitized, but most were in a separate locked room or in our physical archive and were not lost. Of those materials we did unfortunately lose, about half had already been digitized. We are working with our library partners now to assess.
That loss is unfortunate, but imagine if the fire had been in the main server room holding the Internet Archive’s 2 petabytes of data. Wisely, the project has placed copies at other locations:
We have copies of the data in the Internet Archive in multiple locations, so even if our main building had been involved in the fire we still would not have lost the amazing content we have all worked so hard to collect.
That’s good to know, but it seems rather foolish for the world to depend on the Internet Archive always being able to keep all its copies up to date, especially as the quantity of data that it stores continues to rise. This digital library is so important in historical and cultural terms: surely it’s time to start mirroring the Internet Archive around the world in many locations, with direct and sustained support from multiple governments. They can also help provide the Internet Archive with a wider, more international range of content, to make an even more representative store of the world’s digital activity.
Unfortunately, that’s not likely to happen anytime soon, as people seem happy to take for granted the amazing work of Brewster Kahle and his team. The next best thing would be to donate so that they can continue with their indispensable project — and perhaps create a few more backup copies.
Follow me @glynmoody on Twitter or identi.ca, and +glynmoody on Google+
Filed Under: culture, fire, history, internet archive, preservation
Companies: internet archive
Comments on “Internet Archive Fire Shows Vulnerability Of The World's Online Memory”
One Of Their Redundant Sites Is In Alexandria, Egypt
The same Alexandria that was the site of the Great Library back in Classical times. The same Library where Hypatia, daughter of Theon, was in charge. The same Hypatia that was put to a horrible death by the Christians who didn’t hold with all that pagan learning. Even more unnatural if it came from a woman.
Some things haven’t changed that much…
Re: One Of Their Redundant Sites Is In Alexandria, Egypt
Neglected a relevant fact or 2 did we?
“Upon hearing of this, Cyril threatened the Jews of Alexandria with “the utmost severities” if harassment of Christians was not ceased at once. In response, the Jews of Alexandria grew only more furious over Cyril’s threat, and in their anger they eventually resorted to violence against the Christians. They plotted to flush the Christians out at night by running through the streets, claiming that the Church of Alexander was on fire. When the Christians responded to what they were led to believe was the burning down of their church, “the Jews immediately fell upon and slew them”, using rings to recognize one another in the dark, while killing everyone else in sight. When the morning came, the Jews of Alexandria could not hide their guilt, and Cyril, along with many of his followers, took to the city’s synagogues in search of the perpetrators of the night’s massacre.”
I think it was maybe her defense and counsel of this slaughter, and not her learning, that caused her to be killed. But way to spread the hatred.
Re: Re: One Of Their Redundant Sites Is In Alexandria, Egypt
Blaming the Jews, another old Christian tradition…
Re: Re: Re: One Of Their Redundant Sites Is In Alexandria, Egypt
No, Christians blame everyone, and if someone puts blame on them they just say “God made me do it”.
Re: Re: Re: One Of Their Redundant Sites Is In Alexandria, Egypt
It’s not a specific group that’s being targeted in statements like this; PRMan was taking a pop at authoritarianism. It just so happens that certain Jewish activists were attacking Christians in the same way and for the same reasons as the Roman Catholic Church attacked “heretics.”
It was the same idea: Christians were a heretic group, to their minds, and they hit them for that.
What I’m saying is, nobody is hating on anybody, Lawrence D’Oliveiro. We’re just stating facts and letting them speak for themselves.
Re: One Of Their Redundant Sites Is In Alexandria, Egypt
There is no historical basis for the idea that the library at Alexandria was burned, by Christians or anyone else. Hypatia was indeed killed by a Christian mob, at the tail end of decades of back and forth violence that saw many killed on both sides. Contrary to popular belief, Hypatia was not a significant figure in the history of thought. Doesn’t mean she deserved to die, but rather that history doesn’t need to be abused and rewritten just because you personally dislike Christians in the 21st century.
It is a shame that the U.S. government doesn’t have a massive data center that has been intended to be used for collection and storage of its citizens private data — in violation of the U.S. Constitution — and instead could be re-purposed by Congress for archiving of such historic public collections.
Re: Re:
Given Thor, Zeus, and every other lightning-related deity seems to have a grudge against that building, I’m not sure putting a highly valuable backup system there would be the wisest idea.
Seems like this may be a good use for a program such as sync torrent, where people donate portions of their own personal storage (who can’t spare a terabyte or so these days?), with lots of redundancy.
This way, backups of the library are spread around the world.
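A rough sketch of the arithmetic behind that idea, not anything the Internet Archive or any sync tool actually does: it takes the article's "almost 2 petabytes" as 2,000 TB and the "terabyte or so" per donor from the comment, and it assumes three copies of every chunk. The function names and the round-robin placement are hypothetical, chosen only for illustration.

```python
import math

ARCHIVE_TB = 2000          # "almost 2 petabytes" from the article, taken as 2,000 TB
DONATED_TB_PER_PERSON = 1  # "a terabyte or so" from the comment
REPLICAS = 3               # assumption: how many copies of each chunk to keep


def volunteers_needed(archive_tb, donated_tb, replicas):
    """Minimum number of donors so every chunk can be stored `replicas` times."""
    return math.ceil(archive_tb * replicas / donated_tb)


def assign_chunks(num_chunks, num_volunteers, replicas):
    """Round-robin placement: chunk i is stored on donors i, i+1, ... (mod n).

    Because replicas <= num_volunteers, the copies of any one chunk
    always land on distinct donors.
    """
    if replicas > num_volunteers:
        raise ValueError("need at least as many volunteers as replicas")
    return {
        chunk: [(chunk + k) % num_volunteers for k in range(replicas)]
        for chunk in range(num_chunks)
    }


print(volunteers_needed(ARCHIVE_TB, DONATED_TB_PER_PERSON, REPLICAS))  # 6000
print(assign_chunks(num_chunks=4, num_volunteers=5, replicas=3))
```

Under those assumptions a few thousand donors would cover the whole collection, though a real system would also have to handle donors going offline and verify that the stored copies are intact.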
wow 20TB per month !!!!!
Is it just me, or do others also think 20 terabytes a month seems VERY small!!
Re: wow 20TB per month !!!!!
Bigger than your brain, darryl.
Can you imagine how much data they’d have if the NSA started donating!
“Internet Archive Fire Shows Vulnerability Of The World’s Online Memory”
No, it shows that people that actually care about preserving history are smart enough to have multiple backups. It shows that the Internet archive is in good hands.
These aren’t the guys that wanted to “preserve” IsoHunt just so they could open a few clones of the site the very next week.
this is the modern day equivalent of what happened to so many records of bygone ages. but they are far more concerned with copyright than with preserving anything. i’m trying to figure out how having a copyright on something is going to preserve it and enable it to be reproduced if necessary. i think that once it’s lost, it’s lost. look what happened to Atlantis.
Re: Re:
Atlantis?
http://classics.mit.edu/Plato/timaeus.html
It’s allegorical fiction.
But the Internet Archive sucks balls now
Seriously, have you used it at all these past few years? It sucks, its usefulness has passed. Due to its overzealous enthusiasm to please the ever powerful Copyright Gods, and the fact that the coders are all lazy good-for-nothings, the wayback machine and search functions have been crippled with retroactive content removal. They allow current owners of sites and content to use bots to retroactively block websites from back in time to when they didn’t even own said content. Half of all things they have archived is inaccessible, blocked by domain squatters who don’t own the old versions of the sites they block. If TechDirt goes bankrupt today, and I buy this domain, I can flag all the old sites in the archive with a simple bot and nobody will be able to view them.
Re: But the Internet Archive sucks balls now
The Internet Archive definitely has its problems and, as Glyn Moody said, should not be depended upon to always keep all its copies up to date, or to keep every single copy functional, especially as the quantity of data that it stores continues to rise. However, it is far more exhaustive than anything else out there at the moment.
i love the inertnet archive...
said it before, and i’ll say it again:
archive.org is THE most important website on the inertnet tubes…
P.E.R.I.O.D.
(AND it is also GREAT FUN to browse, you can get lost in those ‘stacks’ FOREVER…)
I think we can take it that the UK Government won’t be offering any help…
http://www.theguardian.com/politics/2013/nov/13/conservative-party-archive-speeches-internet
Ummm… the solution for redundant storage and quick access to huge amounts of data has existed for years now. It’s called BitTorrent.
Wayback Machine
I’ve always appreciated what the Internet Archive (Wayback Machine) offers. They truly are the only way to go back and read pages from the late 90s. It almost feels prehistoric when you find some GeoCities webpage.