Content Moderation Case Study: Automated Copyright Takedown Bot Goes Haywire (2018)
from the take-it-all-down dept
Summary: For years, Google and YouTube have included a trusted flagger program by which certain entities that have shown they “are particularly effective at notifying YouTube” of content violations are given more powerful tools with which to do so.
This is used often in the copyright context, and companies with a good history may be given access to things like bulk flagging tools and priority review of flagged content. One such trusted flagger for copyright was a company called Topple Track, which offered an automated service for musicians, searching the internet for infringing works and dashing off automated DMCA notices.
In May of 2015, digital music distribution company Symphonic purchased Topple Track, but appeared to keep the service running under its own brand.
In the summer of 2018, some people noticed that Topple Track’s automated DMCA notices appeared to go a bit haywire, sending DMCA notices for all kinds of perfectly legitimate content. Among those targeted with DMCA notices were the Electronic Frontier Foundation (EFF), the American Bar Association, NYU’s Law Review, the Crunchbase article about the company MP3Tunes and many, many more, including many artists’ own web stores. EFF’s summary of the wild takedowns gives a sample:
Among others, these notices improperly target:
- EFF’s case page about EMI v MP3Tunes
- The authorized music stores on the official homepages of both Beyoncé and Bruno Mars
- A fundraising page on the Minneapolis Foundation’s website
- The Graceland page at Paul Simon’s official website
- A blog post by Professor Eric Goldman about the EMI v MP3Tunes case
- A Citizen Lab report about UC Browser
- A New Yorker article about nationalism and patriotic songs
Other targets include an article about the DMCA in the NYU Law Review, an NBC News article about anti-virus scams, a Variety article about the Drake-Pusha T feud, and the lyrics to “Happier” at Ed Sheeran’s official website. It goes on and on.
EFF published an article about this, noting that it seemed like yet another example of an automated DMCA reporting bot “running amok.” The group also questioned why such a company was in Google’s “trusted flagger” program.
Decisions to be made by Google / YouTube:
- What qualifications must a partner meet to be considered a “trusted flagger”?
- How often are trusted flaggers reviewed to make sure they still belong in the program?
- What does it take to get a trusted flagger removed from the program?
Questions and policy implications to consider:
- With more emphasis on the speed of removals, it is often tempting for regulators to promote “trusted flagging” or “priority” accounts that are able to get content removed at a much quicker pace. What are the benefits and risks of such programs?
- Automated flagging, and now AI/machine-learning flagging, are an increasingly large part of the content moderation landscape. How are these systems calibrated? How frequently are they reviewed?
- What should the response be when an automated bot is mistakenly flagging many accounts?
Resolution: After the EFF published its article about Topple Track, the parent company Symphonic Distribution apologized to the organization, blaming “bugs within the system that resulted in many whitelisted domains receiving these notices unintentionally.” As EFF pointed out in response, this explanation was difficult to believe, since the problem was not simply that domains that shouldn’t have been scanned were scanned; the notices claimed content that had nothing to do with the underlying copyrighted material.
A few weeks after the article, YouTube also told EFF that Topple Track had been removed from its Trusted Flagger program “due to a pattern of problematic notices.” Some time later, Topple Track appeared to disappear as a distinct organization, and the service and technology have apparently been subsumed into Symphonic Distribution’s catalog of services.
Originally published on the Trust & Safety Foundation website.
Filed Under: automated takedowns, content moderation, copyright, dmca, trusted flagger
Companies: google, symphonic, topple track, youtube
Comments on “Content Moderation Case Study: Automated Copyright Takedown Bot Goes Haywire (2018)”
How to write a bot
One day, when I sit down to write a bot that automatically sends out legal notices, I am going to make sure to include an "if" statement that checks to see if the recipient is the Electronic Frontier Foundation. I would have thought that would be obvious.
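The safeguard this comment jokes about, a do-not-flag check before any notice goes out, could be sketched as follows. This is a hypothetical illustration only; the domain list, function name, and exact-match logic are all assumptions, not anything Topple Track actually ran.

```python
# Hypothetical sketch of the "if statement" the comment describes:
# before a bot dispatches an automated DMCA notice, it checks the
# target domain against a do-not-flag list of known-legitimate sites.
DO_NOT_FLAG = {
    "eff.org",            # Electronic Frontier Foundation
    "americanbar.org",    # American Bar Association
    "nyulawreview.org",   # NYU Law Review (assumed domain)
}

def should_send_notice(target_domain: str) -> bool:
    """Return False for any domain on the do-not-flag list.

    Only exact matches (after stripping a leading "www.") are caught;
    a real system would also need subdomain and URL-level checks.
    """
    domain = target_domain.lower().removeprefix("www.")
    return domain not in DO_NOT_FLAG
```

Even a trivial exact-match check like this would have caught several of the targets listed in the article above.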
Re: How to write a bot
The estate of Janet Jackson and the lawyers for Bread, and for Davido are lining up to send DMCA notices to this blog because you have referred to the title of their songs.
… you might want to be skeptical of the ones saying they represent Jackson’s estate, though; they might be a little over-eager…
'You're a known liar, so...'
For years, Google and YouTube have included a trusted flagger program by which certain entities that have shown they “are particularly effective at notifying YouTube” of content violations are given more powerful tools with which to do so.
What’s in dire need is a program, or even better a change to the law to ever so slightly balance it out, that allows companies to create and use a list of untrustworthy DMCA filers. If someone sends out bogus DMCA claims left and right there should be a penalty for that, even if it’s only having any of their claims assumed to be bogus until proven otherwise or given the absolute lowest priority in resolving, as without a penalty there’s no incentive for accuracy, something which is demonstrated time and time again.
Google is not hosting the allegedly infringing content… still not sure how they are a valid destination for a DMCA takedown notice.
Re: Re:
well if you ask hollywood, google IS "the internet" and really should give at least 75% of its revenues to hollywood film studios, for "reasons" that it would be anti-american to investigate….
Re: Re:
They just want the search results removed in that case. Which is pretty stupid if they don’t send notice to whomever is actually hosting the putatively infringing content. Par for the course, either way.
Re: Re: Re:
Well, many of the sites aren’t in the US, and US law isn’t global.
A DMCA notice is meant to be sent to the host of the infringing content, and Google does not host the content.
They are abusing the law by demanding that a third party with no connection take action to protect their content, when the law tells them how to handle it. But that’s too hard, apparently.
It's just software, how hard can it be?
Machine Learning is a nightmare. There isn’t any review, and the test cases are never complete enough. Indeed, by definition they cannot be.
Now, add in some oddly connected corporate rat bait, and they just tweak their software a little bit. You know, to get a few more hits. It is very easy to do with ML. It is nearly impossible to do correctly without a well-defined implementation plan.
We don’t need no stinkin’ plan, said every Corpo Rat ever.
This comment has been flagged by the community.
32 attempts to get in, Maz! You won’t stop me hooting about copyright enforcement. Especially when it goes as well as DOES HERE, and makes you so mad because you’re a pirate! Hoot! Hoot! Goofy hooty hootitty HOOT! Ah iku iku kudasai dameeee!
Re: Re:
are you insane
Re: Re: Re:
Nope I’m the resident crazy, they give me pills and everything.
Re: Re:
Yeah, yeah, you’re one of the three trolls here who love censorship along with Koby and Restless.
Re: Re:
out_of_the_blue just hates it when due process is enforced.
I’m remind of the time when techdirt covered some people trying to scam their way into being able to abuse ContentID.
Any entity sending automated notices should be considered untrustworthy by default.