DailyDirt: Computers Are Editing Our Double-Plus-Ungood Content

from the urls-we-dig-up dept

More and more digital media is being edited and prioritized in datacenters by intangible algorithms. As usual, this can be good and bad, depending on how the technology is used. On the one hand, algorithms can do laborious tasks that humans don’t want to do. But at the same time, algorithms might introduce all kinds of errors or inadvertent biases on a scale that no group of humans could ever accomplish without automation. Here are just a few links on bots tinkering with online content.

If you’d like to read more awesome and interesting stuff, check out this unrelated (but not entirely random!) Techdirt post via StumbleUpon.


Comments on “DailyDirt: Computers Are Editing Our Double-Plus-Ungood Content”

6 Comments
zip says:

vandalism-by-bot on Wikipedia

“the biggest bot job on Wikipedia is detecting vandalism.”

And the second-biggest bot “job” on Wikipedia is perpetrating vandalism.

I’ve seen much content, often high-quality content, deleted by bots. It seems to follow a familiar pattern: a new, unregistered person comes in and writes a substantial addition to an article, but due to a minor violation of some arbitrary rule, everything the person wrote is automatically deleted. (And oftentimes, new writers don’t return to “defend” their edits.)

Here’s just one example of vandalism-by-wikibot that caught my attention:

https://en.wikipedia.org/w/index.php?title=Skiptrace&diff=302349485&oldid=302349312

Anonymous Coward says:

Re: vandalism-by-bot on Wikipedia

IMHO, in that case it’s not vandalism-by-bot but a correct removal of a bucket of spam links.

When editing as an unregistered user, it’s common courtesy to change only a single section, not to make major revisions across several sections in one edit and dump a bucketload of URL links at the end of the article. That behavior is extremely common among URL link spammers, and it’s what triggered the revert.

Adding the URLs usually belongs in a separate edit… check the Talk page of that IP address; the reason for the revert is stated clearly there: URL link dump.

zip says:

Re: Re: vandalism-by-bot on Wikipedia

“IMHO, in that case it’s not vandalism-by-bot but a correct removal of a bucket of spam links. When editing as an unregistered user, it’s common courtesy to change only a single section, not to make major revisions across several sections in one edit and dump a bucketload of URL links at the end of the article. That behavior is extremely common among URL link spammers, and it’s what triggered the revert. Adding the URLs usually belongs in a separate edit… check the Talk page of that IP address; the reason for the revert is stated clearly there: URL link dump.”

Bots don’t argue “reasons”; they spit out canned responses (and enforce non-negotiable blanket rules) when triggered. In this case, the trigger was the inclusion of a single word: “myspace”.

This was the offending line that nuked everything:

MySpace (http://www.myspace.com)- a “self-promotion” site where people often provide substantial details about themselves

So because of a single line containing the URL of the home page of a highly popular site on the ‘ban’ list, the entire body of work by that author was thrown out. Although well-intentioned, the bot ‘crashed and burned’ here because its programmer failed to distinguish between links to personal pages on MySpace (of which there are millions) and the front page of MySpace. The bot was obviously programmed to assume that any link to MySpace (homepage or not) was put there by a spammer trying to googlebomb his personal vanity page to increase its search-engine ranking. As judge, jury, and executioner, the bot pronounced that Wikipedia editor guilty of link-spamming, and as punishment, deleted not just the offending word but all edits ever made by that person (even those that broke no “rules”), going all the way back to his first appearance on Wikipedia.

That’s severe overkill, based on an invalid assumption and triggered by the bot’s sloppily programmed ruleset. As a result, the Wikipedia bot vandalized (in this case permanently) an entire two-thirds of an article about an informative subject.
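To make that blanket rule concrete, here is a minimal, hypothetical sketch in Python (not Wikipedia’s actual bot code; the blacklist and function names are invented for illustration) of the difference between reverting on any link to a blacklisted domain and reverting only on deep links that look like vanity-page spam:

import re

# Hypothetical blacklist; real anti-spam bots maintain much larger lists.
BLACKLISTED_DOMAINS = {"myspace.com"}

# Matches a URL and splits it into domain (group 1) and path (group 2).
URL_RE = re.compile(r"https?://(?:www\.)?([^/\s)]+)(/[^\s)]*)?", re.IGNORECASE)

def blanket_rule_reverts(edit_text):
    """Blanket rule: revert the whole edit if ANY link points to a
    blacklisted domain, whether it's the front page or not."""
    return any(
        m.group(1).lower() in BLACKLISTED_DOMAINS
        for m in URL_RE.finditer(edit_text)
    )

def narrower_rule_reverts(edit_text):
    """Narrower rule: only flag deep links (e.g. personal vanity pages);
    a bare link to the site's front page is left alone."""
    for m in URL_RE.finditer(edit_text):
        domain = m.group(1).lower()
        path = (m.group(2) or "").rstrip("/")
        if domain in BLACKLISTED_DOMAINS and path:
            return True
    return False

# The kind of line that triggered the revert in the Skiptrace edit:
edit = 'MySpace (http://www.myspace.com)- a "self-promotion" site ...'
print(blanket_rule_reverts(edit))   # True  -> the whole edit gets nuked
print(narrower_rule_reverts(edit))  # False -> the front-page link survives

Under the blanket rule, a lone citation of a site’s front page is indistinguishable from a spammed vanity link, which is exactly the failure mode described above.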

But just think about it for a moment… the Wikipedia article “Skiptrace” is about research methods used to locate people. Doesn’t it seem counterproductive that Wikipedia’s search-and-destroy bots would (mis)identify the URLs of these related search engines and online research tools used by investigators for data mining, including the website of the US Post Office, and consider them all to be “link spam”, even when they are precisely on-topic and relevant to the subject?

I find it amusing that “Anonymous Coward” would find the Wikipedia bot’s draconian enforcement action justified because a new user was not aware of the various customs peculiar to Wikipedia. I think this is one of the main problems with Wikipedia: the site has become very unfriendly and unforgiving to new visitors, who are somehow expected to know a long list of esoteric rules before they ever start, rules that are often counter-intuitive and illogical to an outsider not steeped in the “culture” of Wikipedia.

Anonymous Coward says:

Re: Re: Re: vandalism-by-bot on Wikipedia

It’s a very poor-quality edit that is correctly being reverted by the bot.

The biggest problem with the edit is actually that the mass of material is unsourced original research; it shouldn’t stay even if it’s “defended” (unless the defense is adequate sourcing, which seems unlikely).

The biggest problem with the bot program is that well-intentioned edits such as this one get reverted without much explanation, which alienates potential new editors early. However, even with major efforts at mentoring, it’s really rare to convert someone with a “link dump” mentality, such as that evinced by the reverted edit, into a good editor. They tend to have preconceived notions about how Wikipedia should be that are at odds with the views of most other editors.
