Dear Internet, We Need Better Image Archives
from the The-Public-Domain-should-be-Public dept
Cross-posted from ninapaley.com
Dear Internet,
You know what should be really easy to find online? Good quality, Public Domain vintage illustrations. You know, things like this:
I found this on Flickr, where someone claims full copyright on it. That’s copyfraud, but understandable because Flickr’s default license is full copyright (all the more reason to ignore copyright notices!). But copyfraud isn’t not the main problem. The main problem is that images like this are painfully difficult to find online, especially at high resolutions (and this image is only available at medium resolution – up to 604 pixels high, which is barely usable for most purposes but higher than much of what you find online).
The images are out there – and with zillions of antique books being scanned, their vintage illustrations are being scanned right along with them. But the images are buried in the text, and often the scan quality is poor. Images should be scanned at high quality, and tagged for searchability.
Are archives ignoring the value of images?
Take the American Memory archive of the Library of Congress. Lots and lots of historical documents here, but no way for me to find an image of, say, a horse.
Most book-scanning projects focus on texts, not illustrations. Many interesting and useful illustrations are buried within these scans, uncatalogued and inaccessible. Scan quality is set for text, not illustrations, so even if one can find a choice illustration buried within, its quality is usually too low to use.
Archive.org is great (I love you, archive.org!) but does not have an image archive. Still images are not among their “Media Types” (which consist of Moving Images, Texts, Audio, Software, and Education). So I went spelunking through their texts, starting with “American Libraries,” and searched for something easy: “horse.” Surely I could find a nice usable etching of a horse in there somewhere. I eventually found “The Harness Horse” by Sir Walter Gilbey, from 1898.
Nice illustrations! Can I use them? Unfortunately, no. The book is downloadable as PDF and various e-publication formats, but when I try to extract the illustrations, I get a mess (which you can see, after the jump):
Copied and pasted from Adobe Acrobat. WTF?
The same image, inverted. Doesn't work.
"Save Image as..." from Acrobat. This worked, except where it didn't: part of the image is simply missing.
Clearly something is messed up here. Was it just that page? Alas, no:
This sad image from another page has the same problem.
The scans have some flaws that PDFs and Photoshop can't cope with:
Screen grab of zoomed-in view from Acrobat. What looks like a blur in the PDF renders the image unusable when extracted.
These images are not usable, which is a pity because they are very nice illustrations. And they seem to be among the higher quality scans, which again isn't saying much.
Let me add that it's great these books are being scanned at all! That's definitely better than losing them entirely. But as an artist, it saddens me that we're neglecting this wealth of visual art. I'd like to see our rich visual history properly archived. Our bias favoring text over pictures is especially ironic considering how much more efficiently information is communicated to humans through images; "A picture is worth a thousand words," or more. That's why I'm a cartoonist, after all.
I was able to extract one clean image from the book, on page 48:
Unfortunately I can't use this illustration for my purposes, but maybe someone else can. I've already gone through the trouble of finding it in a text, extracting it, and rotating it. If only there were some image archive I could upload it to at high resolution, so someone else could use it. I could tag it, to make it easier to find. I could include all kinds of useful metadata, like what book it was from and when it was published; but even if that was too bothersome, I could at least include tags like "horse," "rider" and "engraving." Wouldn't it be nice if such an archive existed? Wikimedia Commons is close, although I dread uploading things there after having all my open-licensed comics deleted by an overzealous editor. But maybe they're our best hope.
Continuing my searches on archive.org, I found this ostensibly Public Domain, vintage horse book with line illustrations. Unfortunately this is controlled by Google Books. It's "free" to read online in Google's reader, which doesn't allow any image export. It also doesn't allow me to zoom in.
All those illustrations, trapped at low resolution, unusable (even if they were tagged/catalogued, which they aren't). This is our "Public Domain." Who exactly is benefiting from having these 18th Century illustrations inaccessible to today's artists?
Then there's Dover Books. I loved Dover books growing up - they introduced me to the idea of the Public Domain. Dover reproduces vintage illustrations in books for artists and designers. Their paper books were reasonably priced, and you could use the illustrations for anything, without restriction. Browsing was free, so I would flip through the pages in the book store, and if it had what I needed, I'd buy it.
Dover is still selling books, but the prices are now relatively high, few are carried in bookstores, and they prohibit browsing online. You have to shell out $15 to find out if what you need is in the book, and how could you know? They seem to be clinging to an outdated copyright model, and rather than selling things of added value, they are simply blocking access to existing Public Domain works, in order to collect a toll.
What else has kept a good public archive of Public Domain images from existing? Some artists and archivists do make high quality scans of vintage illustrations - and keep them to themselves. I guess we could call this "image hoarding." I assume the reasoning is, "I went through all the trouble to scan it, why should I share? Others can pay me if they want a copy." Also there's the "finders, keepers" reasoning: "anyone else is free to find the same illustration in another antique book, but I found this one, so it's mine." And so these images remain inaccessible, not part of any public archive.
Wikimedia Commons is the best public image archive I know of right now. A bit of searching led me to their "Engravings of Horses" category, which yielded some nice images. Unfortunately, many of these are not available at sufficiently high resolutions.
The maximum size of this image is 800 × 608 pixels, which limits its use. Limited image sizes and limited selection have been the biggest obstacles to my relying more on Wikimedia Commons; but it can get better. Maybe it will. It would be nice if something became the public vintage image archive I and so many other artists need.
Filed Under: archives, images, public domain
Comments on “Dear Internet, We Need Better Image Archives”
I bet if someone could charge monopoly rates for access to high-res images of such public domain works, you wouldn’t have as much trouble finding them.
Just sayin’
Re: Re:
Right then – find me a hi res image from a point in time “just” inside copyright.
Re: Re: Re:
Ok, if you fetch me a cup of coffee first.
Re: Re: Re: Re:
Here you are!
Re: Re: Re:2 Re:
Damn that coffee looks good….
Re: Re:
You could charge monopoly rates for access to them if you want..
Re: Re: Re:
touche
Re: Re:
In that case you would not be charging for the photograph, but the service that you provide to distribute them.
Re: Re:
I bet if someone could charge monopoly rates for access to high-res images of such public domain works, you wouldn’t have as much trouble finding them.
That’s true, but then they would cease to be public domain, which kind of kills the point of the whole thing.
Still, I do see what you’re saying. But what I don’t understand is why your mind goes straight to selling “access” to non-scarce goods – the hardest thing to put a price on in the digital era. It’s like you have a mountain full of gold, and instead of mining it you decide to tax everyone in the countryside for looking at it. Why not focus on selling the scarcities? There are a few that seem obvious right off the bat:
– High-res scanning services
– Manual vectorization services
– Archive/library/research services (“I need you to find me high-res engravings of horses, here are my requirements…”)
– Printing/mounting/framing/canvas-transfer/etc (we are talking about a wealth of artwork just waiting to be tapped)
I bet there are more too – including some pretty clever and disruptive ones. But to figure them out you’d have to put your mind to it, instead of relying on outdated laws in the hopes of barely lifting a finger.
Re: Re: Re:
I fail to see how selling any of the services you refer to would result in greater access to high-res copies of public domain works (though they may very well be a good business for the proprietor).
I’m not trying to make an argument for doing away with the public domain, but I find it amusing that Nina would, without a hint of irony, bemoan how hard it is to get access to good quality copies of things in the public domain.
I think the incentive not just to create, but to distribute, market, publicize, etc. works is one of the most often underappreciated incentives of copyright protection.
Re: Re: Re: Re:
Have you tried Omemo?
Also she can always, always make a torrent file and distribute that, and as long as someone shares it, it will never die.
Re: Re: Re:2 Re:
No. What is Omemo?
Re: Re: Re:3 Re:
http://en.wikipedia.org/wiki/Omemo
A social network storage.
The biggest hard drive on earth.
Re: Re: Re:4 Re:
Ok, that’s cool, but that’s just a tool, like a scanner. The fact that it could, potentially be used to store access to hi-res scans of old images and make them available to the masses doesn’t mean it is being used that way, or will be.
Re: Re: Re:5 Re:
Well, the tool makers made their part, now is up to people to start using those tools 🙂
Besides if people really want to store images they can find solutions.
Make a high resolution video of still images and upload to archive.org, upload to PD.org, upload to the other dozen websites that accept PD material, use Flickr and let search engines index those images, use distributed storage, use distributed websites that are hard to kill by any government unless they can remove all copies from all over the world.
Did you see the size of the list of places where one can find PD material?
2 years ago you could just count them with your fingers now it has grown to more than a hundred places.
You know what that means right?
Re: Re: Re: Re:
You don’t want others to know that there are free alternatives to your way of incentivizing things, because people are already doing it in a variety of forms.
– Using social images sharing websites like Flickr.
– Torrents.
– Dedicated PD websites, which there are more than one.
– Creating a movie with still images and uploading to archive.org.
– Using distributed storage solutions.
The real problem is not that there are no solutions, the problem is that there is no marketing involved so few people know about it.
But with others bringing attention to the issue that soon may change.
Re: Re: Re:2 Re:
Um…you have no idea what I want. Check your prejudices.
“The real problem is not that there are no solutions, the problem is that there is no marketing involved so few people know about it.”
I agree that’s a problem. That’s one of my points.
Re: Re: Re:3 Re:
I just did.
Re: Re: Re:3 Re:
http://meta.wikimedia.org/wiki/Help:Public_domain_image_resources
About the problem, it appears that it will not be one for much longer, as sources for free images are popping up everywhere.
If you tried to search that just a few years ago you would find just a dozen or so, now it is a list containing ten times what you would have found before.
There is something driving that.
Re: Re: Re: Re:
I fail to see how selling any of the services you refer to would result in greater access to high-res copies of public domain works (though they may very well be a good business for the proprietor).
– Scanning and vectorizing: customers pay to have this done so they can be the first to use something that has never been digitized (an illustration they’ve found in a book, for example), but the work is PD so once the deed is done it increases access for everyone. Thus a core group of people who hunt out source material incrementally increase the wealth of quality PD images by funding digitization on an as-needed basis
– Archive/library/research: if there proves to be a demand for such services (and I’m not saying I’m certain, but it seems likely) then naturally that incentivizes whoever offers those services to constantly increase their database of PD works, in order to make curation all the more valuable.
– Printing: again, if there proves to be a demand for printed reproductions of PD artwork, that incentivizes the creation of larger databases of such artwork and increased access to those databases in order to drive sales
Re: Re: Re:2 Re:
Vectorizing etchings is a bad idea. All those lines and vertices add up to big memory-heavy files that crash graphics programs. Simple line art is better vectorized, but if it has a lot of hatching, high res raster images are superior.
Re: Re: Re:3 Re:
Well, I could see vectorizing images having a purpose. I once had inkscape automatically do this one time with an xkcd comic to use as tool paths for the cnc machine. How else am I supposed to read xkcd in plywood?
Re: Re: Re:4 Re:
XKCD doesn’t have a lot of hatching. Old etchings do. It really depends on the type of image. Simpler line art works great as vectors! Most of my own illustrations work better as vector art. But things with hatched shading are much less manageable as vectors than as raster images. Old etchings have a lot of lines.
Re: Re: Re:2 Re:
That said, properly scanning and preparing black and white line art does require some skill, so I’d just say “scanning and cleaning up” instead of “scanning and vectorizing.”
Re: Re: Re:3 Re:
Well, actually I meant both scanning and vectorizing as two entirely separate things – as in my original comment 🙂 I do know what you are talking about with vectorized etchings – however I have found that it’s not so bad depending on what level of detail you are talking about, and can still have certain advantages. However, I also wasn’t really limiting my thoughts to this specific type of etching, but public domain artwork in general.
Re: Re: Re:3 Re:
btw, http://www.vintagevectors.com has some cool stuff – but it’s a very small collection, not a big archive. Fun though – and a pretty good gauge of what we are talking about, because some of the vectorized etchings are indeed too detailed and hard on the computer, while others are quite nice to work with (though of course this depends to some degree on the computer in question, too)
Re: Re: Re:4 Re:
Thanks for the link to vintagevectors.com. I downloaded this caloric engine illustration. The vector file is 12.6 MB. The high res photoshop file is 11.3 MB. This is a fairly typical etching, and the vector version is larger than the (large) raster. It would be a monster to work with in any of my vector graphics programs; I’d get minutes of the “spinning rainbow” if I tried to edit it. Of course this isn’t true for all images, and I respect vintagevector’s handling of their images – breaking them into smaller chunks, for example, as in the case of these highly detailed border parts.
Re: Re: Re: Re:
Wait, I thought the argument for copyright was that without protection, works could be copied with reckless abandon? Actually, if you thought about it for even a moment, you would realize that copyright has nothing to do with the need to distribute, market, publicize, etc.; everyone has those costs, regardless of whether they own the copyright or not. The only difference that copyright makes is the need to pay someone else before doing any of the above.
Re: Re:
No need. We didn’t need monopolies for archivers to catalog every other kind of public domain work, no reason why we can’t do the same for images.
In fact, resources are listed in the comments below that do this very thing.
Yet the maximists will still try to claim that this is a failure of the sharing model. *sigh*
Re: Re: Re:
I think sharing PD works is great. I just think that you’re not as likely to get as much effort put forth to such work based on nonprofit goodwill toward man as you would as toward an effort based on cold, hard, greedy, money-grubbing profit.
Hell, you can’t escape popular music. It’s everywhere. I don’t think that’s because of its inherent qualities. I think that’s because people stand to make a lot of money by making such music popular.
Re: Re: Re: Re:
Hmmm…I can escape popular music, movies and other media, I live under a rock.
I don’t know who is hot or not, what is hot on TV or in the theaters, I have to go to torrentbuttler to see what is new everytime I need to be snarky about piracy.
About the odds of having people do something about it, well open source just proved that people can put as much effort as multi billion dollar companies can.
Re: Re: Re: Re:
Here:
http://singularityhub.com/2010/05/10/13-open-source-hardware-companies-making-1-million-or-more-video/
Apparently open source is a growth market and they give all their secrets for free and still manage to make money.
Likely someone will see a potential market in PD and fill in the need for services in that area.
Re: Re: Re:2 Re:
Well, we’ll see. The Internet certainly makes such things easier, but I’m not yet convinced that you’ll get the same level of access, promotion, distribution, quality in the case of old PD works.
Re: Re: Re:3 Re:
http://librivox.org/
http://www.archive.org/
http://www.gutenberg.org/
http://vodo.net/
http://www.jamendo.com/en/
https://www.miroguide.com/
http://www.mininova.org/
http://www.bing.com/search?q=Public+Domain+Images&FORM=QSRE1
http://www.bing.com/search?q=Public+Domain+Library&FORM=QSRE6
http://www.bing.com/search?q=Public+Domain+Music&FORM=QSRE5
http://www.public-domain-image.com/
http://openflix.com/
http://mimiandeunice.com/
http://www.loc.gov/rr/mopic/pubdomain.html
http://www.pdsounds.org/
http://www.choralwiki.org/wiki/
http://publicdomainflix.com/
10 minutes of search, maybe my Google-fu is just better but I don’t see a problem finding anything PD out there.
Finding those is not a problem at all, what is a problem is that they are all over the place, there is no one place that got big and everybody use it to find things yet, but give it time it will happen and then you will see people all over the world sending in a lot of material to it.
Re: Re: Re: Re:
In one phrase:
It’s not the content, stupid, it’s the service.
Re: Re: Re:2 Re:
My understanding is that Nina is looking for the content.
Re: Re: Re:3 Re:
I thought she said that it was hard to find the content?
That is a job for search engines.
Have you tried typing “public domain” in Google?
it brings up 143 million hits.
Wikimedia has a list of websites as long as my arm.
Re: Re: Re: Re:
Wait… what?
Since when did I talk about popularity?
The thing that matters is that the content is there if you want it and are willing to look, not whether you take the internet up on that offer or not.
Mass sharing/archiving of PD/open source/CC-BY-(SA) works happens all the time, even though it’s not well advertised on the whole.
Sure, I used to ignore free/open source content at first because the descriptions or hype were underwhelming, but I found myself very happy with such things once I gave them a chance.
The fact I could get what I wanted from freely shared works is a huge success in my book.
Re: Re: Re: Re:
I just think that you’re not as likely to get as much effort put forth to such work based on nonprofit goodwill toward man as you would as toward an effort based on cold, hard, greedy, money-grubbing profit.
Why do you assume profit requires copyright?
Re: Re: Actually there is a reason
Images are one of the old data formats on the web, much older than music or video. If such a repository was going to be created by the good will of people, it would already exist.
But there’s a problem here. Scanning takes work. It’s not like ripping a file or copying it with an automated program. Someone has to pick something up and put it in a box. It’s much harder to share these things.
And don’t be fooled by the vast array of pirated material. There are reasons to believe that companies seed it to increase their revenues from people who pay for access to the pirated material. They’re running stores, they’re just not sharing anything with the artists. Many of these aren’t the grassroots efforts that you would like to believe.
Trust me. Big Piracy is a big business. If they want people to keep coming back each month and paying for access, they’ve got seeders putting up the stuff.
Also don’t be fooled by the existence of open source software. Many of the companies that release software to the open source stacks are doing it for selfish reasons. They share with other programmers because they hope that the other programmers will do some of the work and share the development costs. It often works for some areas.
But don’t think it works for all. There are precious few open source games and it looks like open source productivity software could be heading south now that Sun/Oracle is giving up on getting anyone to pay.
Face it. The cases when people share successfully are rare. I’m pretty sure that the free scanning services that Mike would like aren’t going to appear any time soon.
Re: Re: Re: Actually there is a reason
https://market.android.com/details?id=com.intsig.camscanner&feature=search_result
OMG is so difficult to scan something today.
Re: Re: Re:2 Actually there is a reason
Did you read the article? He doesn’t want any old picture. He wants someone competent to do a good job and clean up all of the skeezy parts of the image. In other words, he wants a talented artist to work for free.
Re: Re: Re:3 Actually there is a reason
Scan Tailor
http://sourceforge.net/project/screenshots.php?group_id=227253
Scribus
http://www.scribus.net/canvas/Scribus
Is not that hard.
Re: Re: Re:3 Actually there is a reason
And by “he” you mean Nina?
Re: Re: Re: Actually there is a reason
http://www.libreoffice.org/download/
Are you sure Sun/Oracle is important?
Re: Re: Re:2 Actually there is a reason
Take a look at the change logs. Trust me. The work isn’t getting done any more because the workers aren’t getting paid by Sun’s dream of selling hardware with the free software. There’s a reason why Oracle took one look at this market and ran!
Re: Re: Re:3 Actually there is a reason
Sorry, can’t trust you.
Re: Re: Re:3 Actually there is a reason
Take a look at the change logs. Trust me. The work isn’t getting done any more because the workers aren’t getting paid by Sun’s dream of selling hardware with the free software. There’s a reason why Oracle took one look at this market and ran!
Actually, although Sun did a lot to push Oo in the early days – by the end their behaviour had become a problem. When Oracle took over they made things a whole lot worse and caused the community to split.
It was only after the split that they realised that what they had left themselves with (as a result of their own bad actions) wasn’t worthwhile and bailed out.
Now when a movement has a split like that there will be some fallout and it will take time for things to recover. It follows that your observation is not sufficient to support your conclusions – and hence there is no reason to trust you.
Re: Re:
What’s great is that no one is picking up the sarcasm here. They’re just assuming…
Re: Re:
They can. You can do whatever you want with public domain works, that’s what makes ’em public domain.
Maximize view on screen. PrtScr. Paste into Photoshop. Crop. Good enough for website or inclusion in small media.
Eh
I picked up “The Harness Horse” and selected an illustration at random.
Using Foxit Reader’s snapshot tool I got a pretty good image out of it (636×451), with none of the problems you encountered.
Re: Eh
636×451 is much too small for many artistic uses.
Re: Re: Eh
Nina,
What would you consider a good resolution? In the case of video, anything above 1920×1080 should be “good enough”, though higher resolutions would allow improved scaling. Perhaps other uses need even higher?
Re: Re: Re: Eh
At least 2400 pixels wide. Artists need to be able to use details in the illustrations.
Re: Re: Re:2 Eh
The one I got, according to Gimp, has 4368 × 2912 pixels
But copyfraud isn’t not the main problem.
Aha! So it is the main problem.
You might want to try Adobe Acrobat Professional.
It has an export function which would allow you to export each page (as shown on screen) to a PNG, TIFF, JPG or other formats at various DPI values.
For example, you could export it to a 600dpi PNG file, which would probably give you a 4000 by 2000 image or even larger.
I know I’ve used this to export scanned newspaper articles, to import them in OCR software later on (these programs require high dpi images)
It is an expensive piece of software though, other pdf readers may be able to do the same thing but I can’t vouch for any.
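To make the DPI arithmetic concrete, here is a rough sketch in Python. The 6 × 9 inch page size is purely an assumption for illustration (the book's real page size isn't given here), and the function names are made up. The 2400-pixel figure is the minimum width Nina mentions elsewhere in these comments.

```python
from typing import Tuple


def export_pixels(width_in: float, height_in: float, dpi: int) -> Tuple[int, int]:
    """Pixel dimensions of a page exported at a given DPI.

    Pixels = inches * dots per inch, rounded to whole pixels.
    """
    return round(width_in * dpi), round(height_in * dpi)


def dpi_for_width(target_px: int, width_in: float) -> float:
    """DPI needed for an export to be at least target_px pixels wide."""
    return target_px / width_in


# Hypothetical 6 x 9 inch page (an assumption, not the real book's size).
print(export_pixels(6, 9, 600))   # (3600, 5400)
print(dpi_for_width(2400, 6))     # 400.0
```

Keep in mind that exporting at a higher DPI than the original scan only makes a bigger file; it can't add detail the scan never captured.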
Re: Re:
Sounds like we should start suing Adobe then. They are an instrument that may be used to violate someones copyright.
Re: Re:
You might want to try Adobe Acrobat Professional.
That’s what I used.
lessons learned
i completely agree with you. but i am about to do something that a mentor used to do to me until i figured how to keep my mouth shut.
nina, if you want it that badly, then do it. maybe you can even figure out a way to make money off it. but don’t expect us to do it.
Google has options to search images by type of licenses and size.
http://www.google.com/support/bin/answer.py?answer=29508
http://images.google.com/search?as_st=y&tbm=isch&hl=en&as_q=horse&as_epq=&as_oq=&as_eq=&as_sitesearch=&cr=&safe=off&btnG=Search+images&tbs=isz:lt,islt:svga&gbv=1&ei=ze6ETtSBN9HtsgavhaHhAQ
For being able to upload and thus archive those findings.
http://pddepot.com/
Found it here
http://meta.wikimedia.org/wiki/Help:Public_domain_image_resources
There are several websites dedicated to PD content apparently for images and some let users upload images.
But there are also other solutions you could also use Flickr and let search engines do the work like this one.
http://www.everystockphoto.com/about.php
It indexes only images with liberal licenses that let you use it or PD, at least that is what it says.
"I'd like to see our rich visual history properly archived."
You need an archivist. But as I pointed out in one of the JSTOR threads, the advocates of “free” here aren’t willing for librarians or archivists to get any income from their services. They’d rather sneak in and “liberate” the data, oblivious to the efforts of scanning and classifying: JSTOR can beg for contributions. — You should re-consider your stance on “free”, as you’re simply wishing for people to give their time for your possible convenience.
Not intended as overly personal or emphatic. I’m sure you’re a “good” person, in short. But those services don’t come for free.
My solution to the problem of paying archivists and librarians is gov’t subsidies, and the argument concludes that it’d be far better spent than on killing people in needless wars.
Re: "I'd like to see our rich visual history properly archived."
…or you go look here for sources and places to upload free content.
http://meta.wikimedia.org/wiki/Help:Public_domain_image_resources
Re: "I'd like to see our rich visual history properly archived."
the advocates of “free” here aren’t willing for librarians or archivists to get any income from their services.
Actually I’d very much like some of the funding that currently exists for text, audio and motion picture archives to go towards making a PD image library, at least of black and white line illustrations, etchings, engravings and woodcuts. People need to be paid for something like that to work. The Library of Congress pays its staff; most archives have professional staff that are paid. But image archiving isn’t valued the way text archiving is, and so it isn’t funded as well. I assume most funders just don’t think there’s a need for it. I’m pointing out that yes, there is a need. A funded archive could include contributions from unpaid participants as well, but I don’t think a proper image archive is going to happen without some real money.
Re: Re: "I'd like to see our rich visual history properly archived."
I think you’re just being cynical. If the images just manage to Connect with their Fans and give them a real Reason to Buy, the Image Archive is going to be flying high! Don’t be cynical and talk about money. This is the Internet. All of the cool dudes are going to be running to copy this stuff for you because they’re so grateful that they were able to snarf some free MP3s. Yup. That’s how the web rolls all right.
Re: Re: Re: "I'd like to see our rich visual history properly archived."
Fail. Missed the point by a few hundred light years troll.
But yes, I’d be one to donate money for such a service. I’ve helped other archives already. Unfortunately the donations have to fit my earnings and expenditures or I’d give them more =(
Re: "I'd like to see our rich visual history properly archived."
“the advocates of “free” here aren’t willing for librarians or archivists to get any income from their services”
The problem is, when you start out from an assumption that’s unmitigated bullshit, the conclusions you draw from that tend to be bullshit as well…
PDF is the most convoluted, horrible format possible.
Why your extraction results are random.
Don’t get me started on Adobe and its insane format. If there’s one company and format that should be literally outlawed, that’s it.
Re: PDF is the most convoluted, horrible format possible.
If there’s one company and format that should be literally outlawed, that’s it.
Personally I think PowerPoint is worse. And I don’t have any problem with Adobe as a whole. But dangit blue, for once I completely agree with you on something: PDFs are a pain in the ass.
Re: PDF is the most convoluted, horrible format possible.
I agree with OOTB on this one, too. Kumbaya!
Also there is a index on the Gimp yay!
http://gimp-savvy.com/PHOTO-ARCHIVE/
Another one for sharing free photos and images.
http://www.freephotos.se/
A list of websites that offer or let you upload things for free and are in the PD.
http://www.xpase.net/index.php?category=public%20domain
Now if people want distributed storage on the cheap to store petabytes of information maybe things like OMEMO are the solution.
http://www.omemo.com/
It is encrypted and fairly anonymous.
http://en.wikipedia.org/wiki/Omemo
There are other projects that are less known like Osiris Serverless Portal that don’t even depend on DNS on the normal internet.
http://en.wikipedia.org/wiki/Osiris_%28Serverless_Portal_System%29
Microsoft has their own distributed tech too.
http://en.wikipedia.org/wiki/BitVault
Tahoe-LAFS
http://en.wikipedia.org/wiki/Tahoe-LAFS
http://en.wikipedia.org/wiki/GlusterFS
best pd archive on the net:
http://www.artrenewal.org
Extracting images from PDF files.
http://www.cyberciti.biz/faq/easily-extract-images-from-pdf-file/
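For anyone who would rather script it, here is a minimal sketch in Python. It assumes the `pdfimages` command-line tool from Poppler is installed and on the PATH. The function names and the `img` output prefix are made up for illustration. It only pulls out the image data as stored in the PDF, so it won't repair a bad scan.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_pdfimages_command(pdf_path: str, out_prefix: str, as_png: bool = True) -> List[str]:
    """Build the pdfimages command line.

    With -png the embedded images are written as PNG files; without it,
    pdfimages falls back to its default PPM/PBM output.
    """
    cmd = ["pdfimages"]
    if as_png:
        cmd.append("-png")
    cmd += [pdf_path, out_prefix]
    return cmd


def extract_images(pdf_path: str, out_dir: str) -> List[Path]:
    """Extract every embedded image from pdf_path into out_dir."""
    if shutil.which("pdfimages") is None:
        raise RuntimeError("pdfimages (Poppler) was not found on PATH")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Output files are named <prefix>-000.png, <prefix>-001.png, ...
    subprocess.run(build_pdfimages_command(pdf_path, str(out / "img")), check=True)
    return sorted(out.glob("img-*"))
```

Usage would be something like `extract_images("harness_horse.pdf", "extracted")`, where both names are placeholders.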
I have a family bible that’s over 100 years old. It’s nearly a foot thick with over 1500 pages and as many illustrations. It has both the old and new testaments side by side and huge concordance(index) at the back.
The last copyright I can find in it is from 1890. Could I scan the pages and put them into some public domain web site or does copyright still stop me from doing that?
Re: Re:
every image before 1920 are in PD
Re: Re:
I suppose it might depend on what country it was printed in, but almost certainly yes!
Re: Re: Re:
(er, that’s “almost certainly yes” directed to the “can you do it” part – not to the “will copyright stop you” part)
Re: Re:
The last copyright I can find in it is from 1890. Could I scan the pages and put them into some public domain web site or does copyright still stop me from doing that?
I don’t think anything stops you from scanning and displaying such images at all. Any such work, copyrighted in 1890, is now public domain.
Re: Re:
Depends where you live. If you live in the US, yes you can.
http://www.copyright.gov/help/faq/faq-duration.html#duration
Re: Re:
If you need help scanning here are some cheap solutions:
http://www.redferret.net/?p=22459 (10 cheap solutions from building a book scanner to smartphone based scanner apps)
http://www.diybookscanner.org/
Beware that digitizing the book is just part of the process. You need to pass the scans through a post-processing program to fix alignment, vignetting and other issues, and probably run OCR (Optical Character Recognition) software to make the text searchable. But just the raw images are mighty fine; I’m sure others can group together to do the rest (post-processing, OCR, and proofreading of the OCR output) to get a perfect digital copy out of it.
Here are 2 places where people help others do it.
http://librivox.org/
http://www.gutenberg.org/
Can you say "death of the commons"?
I can.
So Mike, why don’t you step up and run this? Isn’t that how it’s supposed to work? You’ve got a great idea. You’ve got the vision. So jump on it! I’m happy to support you by using it to illustrate my blog and skip paying real artists. I’ll even fill out some review somewhere that says you’re really great.
Oh what? You need some help doing the work. Don’t worry. I’m sure someone’s going to step forward. There are tons of cool artists and they’re all pissed off at the man and those big corporate machines that only give them a small percentage of what their art is worth. They’ll be rushing over to help you because zero is somehow better than a small percentage.
So stick it to Dover and their outdated business model. Show us how it’s done for free. Maybe you can get Silent Bob to tell you how to create a paywall and collect a toll without calling it a paywall or a toll. Yeah. That’s the ticket. Just change the words.
But whatever you do, blame evil copyright for creating this death of the commons. I know that all of the public domain work is free of copyright and it’s going to take a bit of work to actually blame copyright, but I’m sure that somehow you’ll find a way to blame Rightshaven or the RIAA or the artists who somehow want to cash a check from time to time.
Go for it. Then get to work showing us how it’s done.
Re: Can you say "death of the commons"?
You are afraid aren’t you?
Re: Can you say "death of the commons"?
“skip paying real artists”
yeah, 150-year-old artists
Re: Can you say "death of the commons"?
Why would he pay dead artists whose works are public domain?
Re: Can you say "death of the commons"?
So stick it to Dover and their outdated business model. Show us how it’s done for free.
OK.
I’ve got some Dover CDs. Let’s take the scans, re-title them, and add metadata. The US has no ‘sweat of the brow’ BS, and we’d be making our own compilation, so we should be good to go.
Re: Re: Can you say "death of the commons"?
Hey, I’m a big believer in getting you to do the work for me and give it to me for free. I’ll be sitting here with a gin and tonic cheering you on. Go for it! You’re doing great! We’re all rooting for you!
Re: Re: Re: Can you say "death of the commons"?
Of course you are drunk, when was the last time you saw any drunk do any work in their life?
How to store anything and make it fast to deliver to everybody on earth?
Use Gmail or other email accounts, overlay a virtual filesystem on them and let those servers do the distributing.
http://en.wikipedia.org/wiki/GmailFS
Re: Unfortunately ...
“Unfortunately the GmailFS project has come to an end. libgmail has ceased being maintained by its developers, and as a result libgmail no longer works with the latest Gmail interface (and has not done so for many weeks). Without a working libgmail, GmailFS does not function, so the end of libgmail also spells the end of GmailFS.”
Re: Re: Unfortunately ...
Oh, and “Note that Google’s terms of use prohibit the use of their services by any automated means or any means other than through the interface provided by Google. These restrictions would make use of GmailFS a direct violation of the Service agreement.”
As we know from reading TechDirt, violating any company’s terms of service is now a criminal offence … so maybe it’s good that GmailFS doesn’t work any more.
That Harness Horse PDF is bizarre. None of the freeware image extraction programs I have can extract the images correctly. A couple of them won’t even open it. The only program that seems to work on it is “PDF To JPG”
Re: Worked for me ...
See my post farther down.
Re: Re: Worked for me ...
See my post farther down.
You just used screen capture to grab what was on the screen. I was trying to extract the actual data from the file.
Seems the complaint is that someone else hasn’t done the work to make the scans you want, but you can’t be bothered to go scan them yourself. Quit complaining.
Dover doesn’t have an outdated business model. Those people do this thing called ‘work’ to make those PD images usable for others. If they posted them all online so you can “see” them first, people would just copy them and not pay Dover for the time they invested in making them, but you don’t seem to care about that.
Re: Re:
http://images.google.com/search?as_st=y&tbm=isch&hl=en&as_q=&as_epq=horse+drawing&as_oq=&as_eq=&as_sitesearch=&cr=&safe=images&btnG=Search+images&tbs=isz:lt,islt:2mp&gbv=1&ei=PhOFTvyuIMLliAL_2N3RDA
Somebody did; she just didn’t find it.
Re: Re: Re:
You really don’t know what vintage etchings are, do you?
Re: Re: Re: Re:
Nope, but I do know you didn’t find the original images of that book.
http://ia700406.us.archive.org/6/items/harnesshor00gilb/harnesshor00gilb_orig_jp2.tar
Re: Re: Re: Re:
Come on Nina, surely this one qualifies!
http://fc04.deviantart.net/fs70/f/2010/073/b/1/Horse_Drawing___Aura_by_D3L0RU5.jpg
Re: Re:
Sigh. You just don’t understand. Magic little elves are going to come out each night and scan the documents for Mike. Then magic elves will maintain the database, write the software and make it all pretty so we can all use it for free. They’ll do the work so Mike won’t have to do it.
Re: Re: Re:
It’s fun watching you repeatedly prove that you haven’t actually read the article you’re trolling.
Re: Re: Re: Re:
And it’s fun to see you and your anonymous friends prove you haven’t thought about it.
Re: No, she's been too busy ...
making original art, in high resolution, and putting that into the public domain. Things like the full length animated feature film, “Sita Sings the Blues” (http://www.sitasingstheblues.com/) and the full set of eleven images, “The Avatars of Vishnu” (http://blog.ninapaley.com/2011/05/06/the-avatars-of-vishnu/), in vector format so they have infinite resolution and many many others.
So what have you, Mr. Smug AC, contributed to the Public Domain this year? Huh? What did you say? Nothing! I’m shocked.
Instead, like me, you’re reading about being creative. (Oh, I’d better get back to my little video editing project.)
Peace,
Rob:-]
Another option get a 3D model that is CC0, pose it and have your horsie.
http://www.blendswap.com/3D-models/characters/alex-2-5-2/
Other 3D models that are PD.
http://www.blendswap.com/?license=cc0
Re: Re:
Another option get a 3D model that is CC0, pose it and have your horsie.
How does that get you a digital version of an old etching? Did you think Nina’s problem is she couldn’t find any pictures of a horse?
Re: Re: Re:
I dunno, using a bump map perhaps?
http://www.herbalcell.com/blog/blender-tutorial-layering-bump-maps
Using a normal map?
http://www.blender.org/development/release-logs/blender-236/normal-maps/
Warping textures?
http://www.blender.org/development/release-logs/blender-236/normals-and-texture/
Acquisition of topology information through 3D scanners, for storing historical data like people doing it for Cuneiform tablets.
http://blog.sculpteo.com/2011/05/25/cornell-university-researchers-replicate-cuneiform-tablets-with-3d-scanning-and-3d-printing/
This last option can be used if you really want to save historical data.
Now I don’t know what she wants the horsie for. Is it to learn something by looking at how it was done? Is it to recreate that exact same thing for some sort of preservation? Or is it to use it as a template for something else?
If it is for use as a template, you don’t need high resolution images to etch anything. You can probably make one from a 256×256 pixel image and create interference patterns on the drawing using the tools available in Gimp or another image editor. Print the result on a laser printer and heat transfer it to any surface you want, then use chemical etching (just like etching a PCB), sandblasting, or a 3-axis CNC machine, or carve it by hand.
http://en.wikipedia.org/wiki/File:MoireGrid.png
http://splicer.com/2009/02/03/little-slice-big-spiral
You can create any pattern as detailed as you want using modern image editors.
Re: Re: Re: Re:
I don’t know what questions you are trying to answer with this. I also don’t know what Nina wanted to do with the art, but I take her at her word when she says it was hard to find high enough resolution images of some of the things she was looking for. Which, again, is not simply “a picture of a horse”.
I had no problem ...
getting a fairly nice high resolution copy of that horse.
I’m not disagreeing with you about the Public Domain … I do think it’s a travesty that nothing goes into the Public Domain any more … in fact stuff is getting removed.
Here’s a little video of the first way I tried (which I created using the free version of Jing):
http://screencast.com/t/PFQ8Kt1RNd
Next I downloaded the PDF and opened it in Adobe Acrobat Reader v.10.1.1. I went to the page, rotated it 90 degrees and then used Jing again to capture a still image which you can see here:
http://screencast.com/t/G6amW0xG6
Next I futzed around using Adobe Reader to make the picture as big as I could on my screen. Then I used Jing to capture this larger image and saved it to my local disk. I opened it in Gimp and found that the pixel size is 1350×9034. So I zoomed into the picture where you had seen much blurring and took another snap with Jing. I think mine looks much sharper than yours. Here’s the link to it.
http://screencast.com/t/harjCYlxgea1
Of course you are limited to your screen resolution when capturing from your screen. Even 1920×1080 isn’t really high enough for good print design of any size.
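To put a number on that screen-resolution limit, here is a rough sketch of the arithmetic. The 300 dpi figure is an assumption (a common rule of thumb for print work, not something stated in this thread), and the function name is made up for illustration.

```python
def print_size_inches(width_px, height_px, dpi=300):
    """Return the (width, height) in inches that a bitmap covers at a given dpi.

    dpi=300 is an assumed print target, not a value from the thread.
    """
    return (width_px / dpi, height_px / dpi)


# A full-screen capture on a 1920x1080 display:
width_in, height_in = print_size_inches(1920, 1080)
print(f"{width_in} x {height_in} inches at 300 dpi")
```

Under that assumption a full 1920×1080 capture covers only about 6.4 by 3.6 inches of paper, which is why screen grabs run out of pixels quickly for anything larger.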
Peace,
Rob:-]
p.s. I’ve been enjoying planting your Intellectual Pooperty pamphlets in some strategic places.
Re: I had no problem ...
Very nice. But since the scans were originally captured as image files, wouldn’t it be sensible if they could be obtained as such, rather than being converted back and forth? Can you imagine going through that for every image, when it’s totally unnecessary?
Fortunately, Rick Prelinger left this comment on my blog:
It’s not quite that simple, but close. I replied:
Re: Re: I had no problem ...
The first option that you didn’t list was “read online”. If you had chosen that one you would be able to save a JPG copy, although smaller than the original.
http://ia600406.us.archive.org/BookReader/BookReaderImages.php?zip=/6/items/harnesshor00gilb/harnesshor00gilb_jp2.zip&file=harnesshor00gilb_jp2/harnesshor00gilb_0006.jp2&scale=2&rotate=90
Because it is coming from the reduced images in the ZIP and not the originals in the TAR.
Re: Re: I had no problem ...
http://ia600406.us.archive.org/BookReader/BookReaderImages.php?zip=/6/items/harnesshor00gilb/harnesshor00gilb_jp2.zip&file=harnesshor00gilb_jp2/harnesshor00gilb_0006.jp2&scale=1&rotate=90
High resolution image from archive.org (3167 pixels x 1934 pixels)
Going to “read online” then clicking on the zoom button inside the page until it doesn’t zoom any more and saving the image.
Re: Re: Re: I had no problem ...
Note: I changed the “&rotate=0” in the url to “&rotate=90”.
Re: Re: I had no problem ...
Very nice. But since the scans were originally captured as image files, wouldn’t it be sensible if they could be obtained as such, rather than being converted back and forth? Can you imagine going through that for every image, when it’s totally unnecessary?
As I understand it, PDF has its own image compression methods, and all images are converted to one of them when the PDF file is created. I believe that there are both lossy and lossless compression methods. So when the images are extracted from a PDF file, they have to be converted to a normal format anyway. If they were stored in a lossless format and extracted to a lossless format, then you get an exact copy of the data. However, if they’re extracted to JPEG format, or were stored in a lossy format, you’ll never be able to get an exact copy of the original file.
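A toy sketch of that lossless versus lossy difference, for anyone who wants to see it rather than take it on faith. The lossless half uses zlib (the same deflate scheme behind PDF’s FlateDecode filter). The lossy half is only a crude stand-in: it rounds each sample down to a multiple of 16, which is an assumption for illustration and not how JPEG actually works. The function names are made up.

```python
import zlib


def lossless_roundtrip(pixels: bytes) -> bytes:
    """Compress and decompress with zlib; the bytes come back unchanged."""
    return zlib.decompress(zlib.compress(pixels))


def lossy_roundtrip(pixels: bytes, step: int = 16) -> bytes:
    """Crude stand-in for lossy coding: snap each 8-bit sample down to a
    multiple of `step`. Detail thrown away here cannot be recovered."""
    return bytes((p // step) * step for p in pixels)


# A fake 256-sample grayscale ramp standing in for image data.
pixels = bytes(range(256))

print(lossless_roundtrip(pixels) == pixels)  # exact copy survives
print(lossy_roundtrip(pixels) == pixels)     # original values are gone
```

The point is only that once a lossy step has happened anywhere in the chain, no later choice of output format brings the original samples back.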
Fortunately, Rick Prelinger left this comment on my blog:
That is easier, however it will only work for PDF files obtained from the Internet Archive. Getting a file from anywhere else still leaves you with the problem of how to get the files out of the PDF file.
Bob loves making an idiot of himself.
And he’s your uncle.
Re: Re:
my retarded drunken uncle
artists and techies
More comments than I expected on this article. One thing is clear: TD commenters are not well versed about graphics and how artists use them.
This may explain why there are no really good public image archives online: the leaders of public/open source projects are mostly techies, who (in general) don’t understand images so well. And most visual artists, who do understand images, tend to cling to proprietary models and disdain public archives.
Re: artists and techies
the leaders of public/open source projects are mostly techies, who (in general) don’t understand images so well.
Nor music, unfortunately, which is why Open Source music software has been about ten years behind the times.
Fortunately, things are getting a lot better, very quickly. I’m sure if you got some actual graphic artists on board with this, things would eventually take off.
Perhaps some kind of SETI@home type deal? It’s kind of what Wikimedia Commons does, but there should be a service that is focused mainly on the images themselves.
Or, perhaps, some sort of incentive for book stores (those that are left) to help out? Scan in the drawings from a PD book, and you can have some sort of “sponsorship” ad on the site, or something.
Re: artists and techies
TD commenters are not well versed about graphics and how artists use them.
This may explain why there are no really good public image archives online: the leaders of public/open source projects are mostly techies
The first may be true, but does not explain shit. I bet the largest reason is because there is no money in trying to appease a niche group, who also happens to be ridiculously anal and arrogant about stuff that nobody but the niche group notices or cares about. It seems more likely that
And most visual artists, who do understand images, tend to cling to proprietary models and disdain public archives.
explains that they are far less technical (meaning: can’t figure out how to load an image and right click) than the technical people are visual artists.
Re: artists and techies
Who needs images? A text-only VT100 terminal is all one would ever need.
😉
Re: artists and techies
I think this is the heart of the problem. Artists can’t wait for other people to give them what they want. The great thing about the internet is if you see a need for something, you can fill that need yourself. You probably have the clout to start such an image archive and get a group of people going filling it with quality scans.
It would be a vintage clip art collection for artists and designers. I would love to have such an archive for design work too.
I tried the Smithsonian – not much luck with high quality images there either.
Re: artists and techies
Nice comments there Nina. Why not just call them fucking morons?
The truth is people who live a leeching life rarely learn the tools to actually make anything for themselves. You have made it all the way up to bad cartoons, which puts you in the top 1 or 2 percent on this site, considering most people here (like the talentless schmuck Marcus Carab) think that taking something and chanting bad poetry over it is somehow “art”.
Don’t be lazy – if there is a need, make it your life’s work. Give up your time and really give back to the Tardian world. Stop all this other stuff you are wasting your time on, and give back to the community that has so well rewarded you by ignoring all your previous works.
Re: Re: artists and techies
Funny, I don’t think I saw you post your resume at any point…
How about you do and then we can contrast your work with Nina and Marcus’?
Re: Re: Re: artists and techies
In his mind, it is better to not create anything at all than to risk creating something that isn’t 100% “original” or something that a random anonymous weirdo on the internet might (gasp!) make fun of you for.
We should cut him some slack though – when the human fire in your belly is all but extinguished by bile and uncle-sperm, the world must seem like a cruel ironic place: so many people walking around, mocking you, making it look so easy to just be happy and not have dicks in their mouths while you struggle with the mystery of how that is accomplished. It must be a sad little life in his basement, with nothing to keep him company but porn blogs and photos of Mike with the eyes scratched out – he deserves our pity more than anything.
Re: Re: artists and techies
Did you not see Sita Sings the Blues? It was awesome and mostly done by just Nina.
In fact, I think your “hate” of Nina’s work is purely political. You just want to believe that anyone who disagrees with you is a lazy leech even though the facts don’t bear that out.
Archive.org apparently has the original images for all their books. This one uses JPEG 2000 compression, which may be the cause of the fuzziness in the PDF version.
http://ia700406.us.archive.org/6/items/harnesshor00gilb/harnesshor00gilb_orig_jp2.tar
Using openjpeg to extract the image to TGA, I got a 30 MB file that looks clear to me here.
http://code.google.com/p/openjpeg/wiki/DocJ2KCodec
j2k_to_image -i /home/thepirate/Documents/harnesshor00gilb_orig_0006.jp2 -o /home/thepirate/Documents/harnesshor00gilb_orig_0006.tga
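If you want every page rather than just one, something like the following sketch could build the same command for each file. It only constructs the argument lists; the `-i`/`-o` flags are copied from the command above, while the helper names and the idea of feeding the lists to `subprocess.run` are my assumptions. As far as I know, newer OpenJPEG releases ship the decoder under the name `opj_decompress`, so the tool name is left as a parameter.

```python
from pathlib import PurePosixPath


def build_convert_command(jp2_path, out_ext=".tga", tool="j2k_to_image"):
    """Build the argument list for converting one JP2 file.

    The output file sits next to the input, with the extension swapped.
    """
    src = PurePosixPath(jp2_path)
    dst = src.with_suffix(out_ext)
    return [tool, "-i", str(src), "-o", str(dst)]


def build_batch(jp2_paths, out_ext=".tga", tool="j2k_to_image"):
    """Build one command per .jp2 path, skipping anything else."""
    return [
        build_convert_command(p, out_ext, tool)
        for p in jp2_paths
        if str(p).lower().endswith(".jp2")
    ]


commands = build_batch([
    "/home/thepirate/Documents/harnesshor00gilb_orig_0006.jp2",
    "/home/thepirate/Documents/notes.txt",  # ignored: not a JP2 file
])
for cmd in commands:
    print(" ".join(cmd))
```

Each list can then be handed to `subprocess.run(cmd, check=True)` on a machine where the decoder is actually installed.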
http://en.wikisource.org/wiki/Help:DjVu_files
Scan Tailor
http://sourceforge.net/project/screenshots.php?group_id=227253
Open source post processor for scanned images 🙂
Nina, quit being a lazy git. Use your time and your efforts to make better images, and put them in the public domain. Make it your life’s work.
PLEASE?
Re: Re:
I am certain that you under the microscope would have nothing original under your belt.
Nina, use screenshots to get the image from pdfs.
Re: Re:
What’s on the screen is a projection of the image, not the image.
“I assume the reasoning is, “I went through all the trouble to scan it, why should I share? Others can pay me if they want a copy.””
That reasoning is common in the real world, where the rest of us live. If someone takes the trouble to do painstaking work and expects to be compensated, why is that a bad thing? Google scanned a great many books and put them up online, because they have the effing money, and stand to make more money from their efforts.
Instead of whining about the lack of images, why don’t you do something about it? Start a project. Who knows, Google might acquire it.
“Anyone else is free to find the same illustration in another antique book, but I found this one, so it’s mine.” And so these images remain inaccessible, not part of any public archive.”
Because no one wants to work for free. At least in this area. Maybe someone will read this post and take the cue, pumping in a great deal of money, energy and enthusiasm to create a wonderful free online archive that everyone has instant access to.
+100000000
When we rebuilt Encyclopedia Dramatica we got a huge number of the articles from Archive.org (We love you too!), but we lost sooooo many images. We are still missing something like 60 thousand image files, and many more of the files that we have are only thumbnails we pulled from Google cache snapshots of ED articles. The Internet needs an image archive database. I guess it comes down to who would pay for it. I would donate whatever server space I could just to make sure that something like this never happens again, and I know many others would do the same.
Re: +100000000
Wait, are you the one who revived ED on the .ch domain?
All hail the troll god!
*Places an offering of LULZ on altar*
Here it is
The Internet Archive allows you to export directly via the web interface. Just right click on the image, copy the URL, then edit it to change the rotation and scale:
http://ia600406.us.archive.org/BookReader/BookReaderImages.php?zip=/6/items/harnesshor00gilb/harnesshor00gilb_jp2.zip&file=harnesshor00gilb_jp2/harnesshor00gilb_0006.jp2&scale=1&rotate=90
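For anyone who would rather not hand-edit that query string, here is a small sketch that rewrites the `scale` and `rotate` parameters of a BookReader image URL. The parameter names come from the URLs posted in this thread; the function name is made up, and whether the server still answers at these addresses is not something this code checks. Going by the comments above, `scale=1` appears to give the largest image and `scale=2` a reduced one.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def set_reader_params(url, scale=None, rotate=None):
    """Return `url` with its scale and/or rotate query parameters replaced.

    Other parameters (zip, file) are kept in their original order.
    """
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    if scale is not None:
        params["scale"] = str(scale)
    if rotate is not None:
        params["rotate"] = str(rotate)
    # safe="/" keeps the slashes inside the zip and file paths readable.
    query = urlencode(params, safe="/")
    return urlunsplit(parts._replace(query=query))


reduced = (
    "http://ia600406.us.archive.org/BookReader/BookReaderImages.php"
    "?zip=/6/items/harnesshor00gilb/harnesshor00gilb_jp2.zip"
    "&file=harnesshor00gilb_jp2/harnesshor00gilb_0006.jp2&scale=2&rotate=0"
)
print(set_reader_params(reduced, scale=1, rotate=90))
```

Applied to a `scale=2&rotate=0` address, this produces the same `scale=1&rotate=90` URL posted above.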
You can capture the picture from the original source.
the URL of the picture shown is: https://ia800207.us.archive.org/BookReader/BookReaderImages.php?zip=/2/items/harnesshor00gilb/harnesshor00gilb_jp2.zip&file=harnesshor00gilb_jp2/harnesshor00gilb_0006.jp2&scale=1&rotate=90
Unfortunately, the process of getting to it is cumbersome, and they are saved in JP2 format, which apparently no one can access except through their site one page at a time.
If it isn’t obvious by now, many corporations see value in public domain works, so they will do everything to make sure their version is far superior to what’s available online. It’s text and monochromatic images. Most of it would be top quality at 600dpi 1 bit lossless TIFF/PNG/ETC at 50-200kb/page. But no, they make sure lossy compression is used and that even the GB sized archive.org files are useless for getting better quality.
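As a sanity check on those numbers, here is a rough sketch of how big an uncompressed 600 dpi, 1-bit page would be. The 6 by 9 inch page size is purely an assumption for illustration (the comment gives none), and the function name is made up.

```python
def raw_bilevel_bytes(width_in, height_in, dpi=600):
    """Uncompressed size in bytes of a 1-bit-per-pixel scan.

    Rows are padded up to whole bytes, as in typical bilevel bitmaps.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    row_bytes = (width_px + 7) // 8
    return row_bytes * height_px


# Assumed page size: 6 x 9 inches (not taken from the thread).
raw = raw_bilevel_bytes(6, 9)
print(raw)             # uncompressed bytes per page
print(raw / 200_000)   # compression ratio needed to reach 200 kB
print(raw / 50_000)    # compression ratio needed to reach 50 kB
```

Under that assumed page size the raw bitmap is about 2.4 MB, so the quoted 50–200 kB per page would mean lossless compression of roughly 12x to 49x, which depends heavily on how much of the page is blank or plain text.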
Anyone can do better quality with scanner or even camera at home. They do keep the high quality images and serve low quality that also includes their own ads in every page. How much would this kind of advertising cost in the real world? Millions and millions. One impression per page for each book?
They scan the images for their own use, get advertising (which pays a lot more than their scanning cost) and win in all possible ways.