DailyDirt: Categorizing Everything
from the urls-we-dig-up dept
People just keep creating stuff — books, movies, music, you name it… so it’s (more than) a full-time job to keep up with all the cool new stuff. How do we classify music into hip-hop, heavy metal or Krautrock? What can we learn from mapping all these seemingly separate media genres? People and machines are working together to cobble together categorization systems that try to keep up with the flood of new content. Here are just a few examples.
- Netflix hired a bunch of people to try to categorize every movie, and it has 76,897 separate sub-genres for films, ranging from “romantic comedies” to “Violent Action Thrillers Starring Bruce Willis”. You’d probably never guess which actor crosses into the most movie sub-genres (it’s not Kevin Bacon): it’s Raymond Burr. [url]
- A brief history of music genre origins tells us that categories of music are often named after popular bands or lyrics. For example, bluegrass music is named after “the Blue Grass Boys”… and guess what lyrics inspired “Doo-wop” music? [url]
- Melvil Dewey came up with the Dewey Decimal Classification (DDC) system in 1873. The DDC system is now published by OCLC Online Computer Library Center, Inc., which owns the copyrights and manages the licensing of this method of organizing general knowledge. [url]
Filed Under: archives, books, categorization, ddc, dewey decimal, films, genres, libraries, movies, music, oclc, recommendation systems, sky is rising
Companies: netflix
Comments on “DailyDirt: Categorizing Everything”
1873 wtf
how can that still then be copyrighted or patented…no like fuuck you
Re: 1873 wtf
They get away with it because the system relies on the company defining the Dewey code for each new book. It is a system long past its use-by date, kept going only by a hundred years of inertia. Libraries should use keywords and abstracts, as with any science paper; that would work quite well with any collection.
Re: Re: 1873 wtf
And that should make it factual data, which should not be eligible for copyright, the same way a phone book cannot be copyrighted.
140 years and still under copyright, and libraries of all things are being charged licensing fees.
And the maximalists wonder why NO ONE respects copyright.
Re: Re: Re: 1873 wtf
It’s mildly ironic in that libraries are typically filled with open information-type advocates, yet the Dewey Decimal System is “ours”, so the rules don’t apply. Let’s look the other way while we obviously abuse the intent of copyright, shall we?
Re: Re: 1873 wtf
When it comes to a catalog search, all libraries do, in fact, use keywords. We call them subject headings. However, there still needs to be a method for arranging the physical materials on the shelves in a way that people can browse if they want to. That’s where systems like Dewey come into play.
There’s an “open source”, crowd-sourced replacement for the Dewey system that is a branch off the last public domain version of the system in the early 20s. It’s the Open Shelves Classification. (There’s also the Library of Congress classification system, which is PD and which most universities use).
The OCLC has been pretty litigious about its trademarks and copyrights in the past. A few years ago, they sent a C&D to the Library Hotel in NYC, whose floors and rooms are based on the Dewey Decimal system (and whose rooms contain books on particular subjects). After what would now be called the Streisand effect, they backed down, let the hotel give a small amount of money to a charity, and officially licensed the hotel as a DDCS user.
Are there even any libraries out there still using Dewey for new accessions? My impression is that the vast majority of libraries (in the US at least) are using the Library of Congress system, and that it supplanted Dewey as the system of choice quite some time ago. Most libraries still maintain their old collections under the Dewey system because it’s a PITA to reclassify them to LC (though a number of libraries are doing it anyway). But this also means that unless you’re doing new accessions under Dewey, you don’t really need to run out and update your reference materials each time OCLC releases a new edition.
Re: Re:
Our libraries have been using both Dewey and LC for years now. My assumption had long been that they were aiming to dump Dewey, but it never seems to happen.
Re: Re:
If that’s your impression, I’m wondering when was the last time you were in a public, rather than an academic, library. LCC is a more complex system normally used for very large collections, like university holdings. It’s overkill for places like the one-room neighborhood branch I work at.
Dewey is being updated regularly, including such things as the 30 or so years they spent trying to figure out where books about computers should go, and there’s talk of changing the 200s (religion) so that 80% of the numbers do not solely belong to Christianity.
That said, most librarians would much prefer if OCLC issued the DDC updates under a more open license (and charged less to access the information).