

Posted on Techdirt - 5 January 2022 @ 03:50pm

Content Moderation Case Study: Roblox Moderators Combat In-Game Reenactments Of Mass Shootings (2021)

Online game platform Roblox has gone from a niche offering to a cultural phenomenon over its 15 years of existence. Rivalling Minecraft in its ability to attract young users, Roblox is played by over half of American children and counts 164 million active users.

Roblox also gives players access to a robust set of creation tools, allowing users to create and craft their own experiences, as well as enjoy those created by others. 

A surge in users during the COVID-19 pandemic created problems that Roblox's automated moderation systems, as well as its human moderators, are still attempting to solve. Roblox employs 1,600 human moderators who handle not only content flowing through in-game chat features but also content created and shared with other users via Roblox's creation tools.

Users embraced the creation tools, some in healthier ways than others. If it happens in the real world, someone will try to approximate it online. Users have used a kid-focused game to create virtual red light districts where players gather to engage in simulated sex with other players. This activity tends to evade moderation because players share direct links to the content through out-of-game chat platforms like Discord.

Perhaps more disturbingly, players are recreating mass shootings, many of them with a racial element, inside the game and inviting players to step into the shoes of mass murderers. Anti-Defamation League researcher Daniel Kelley was easily able to find recreations of the Christchurch mosque shooting that occurred in New Zealand in 2019.

While Roblox proactively polices the platform for “terrorist content,” the continual resurfacing of content like this remains a problem without an immediate solution. As Russell Brandom of The Verge points out, 40 million daily users generate more content than human moderators can manually review. And a keyword blocklist would leave users unable to discuss (or recreate) Christchurch, the New Zealand city itself.
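Brandom's point about blocklists can be seen in a toy example. The sketch below uses an invented blocklist and invented messages (not Roblox's actual system) to show how a bare keyword filter cannot tell a reference to the 2019 attack apart from an innocuous mention of the city:

```python
# Toy illustration of keyword-blocklist over-blocking.
# The blocklist and messages here are invented for this sketch.

BLOCKLIST = {"christchurch"}

def is_blocked(message: str) -> bool:
    """Flag a message if any blocklisted term appears anywhere in it."""
    lowered = message.lower()
    return any(term in lowered for term in BLOCKLIST)

# An innocuous mention of the city trips the filter exactly like a
# reference to the shooting would: keywords carry no intent.
print(is_blocked("I grew up in Christchurch, anyone else from there?"))  # True
print(is_blocked("Come check out my new obby!"))                         # False
```

Real systems layer context-aware classifiers and human review on top of lists like this precisely because of these false positives.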

Company considerations:

  • How does catering to a younger user base affect moderation efforts?
  • What steps can be taken to limit access to or creation of content when users utilize communication channels the company cannot directly monitor? 
  • What measures can be put in place to limit unintentional interaction with potentially harmful content by younger users? What tools can be used to curate content to provide “safer” areas for younger users to explore and interact with?

Issue considerations:

  • How should companies respond to users who wish to discuss, or otherwise engage with, content involving newsworthy but violent events? 
  • How much can a more robust reporting process ease the load on human and AI moderation?
  • Can direct monitoring of users and their interactions create additional legal risks when most users are minors? How can companies whose user bases are mostly children address potential legal risks while still giving users freedom to create and communicate on the platform?


Resolution: Roblox updated its Community Standards to let users know this sort of content was prohibited. It also said it would engage in “proactive detection” that would put human eyes on content related to terms like this, allowing geographic references but not depictions of the mosque shooting. 

Originally posted to the Trust and Safety Foundation website.

Posted on Techdirt - 15 December 2021 @ 03:59pm

Content Moderation Case Study: Nintendo Blocks Players From Discussing COVID, Other Subjects (2020)

Summary: Nintendo has long striven to be the most family-friendly of game consoles. Its user base tends to skew younger and its attempts to ensure its offerings are welcoming and non-offensive have produced a long string of moderation decisions that have mostly, to this point, only affected game content. Many of these changes were made to make games less offensive to users outside of Nintendo’s native Japan. 

Nintendo’s most infamous content moderation decision involved a port of the fighting game Mortal Kombat. While owners of Sega’s console (Nintendo’s main rival at that point) were treated to the original red blood found in the arcades, Nintendo users had to make do with a gray-colored “sweat”, a moderation move that cemented Nintendo’s reputation as a console for kids. 

Nintendo still has final say on content that can be included in its self-produced products, leading to contributors finding their additions have been stripped out of games if Nintendo’s moderators feel they are possibly offensive. While Nintendo has backed off from demanding too many alterations from third-party game developers, it still wields a heavy hand when it comes to keeping its own titles clean and family-friendly.

With the shift to online gaming came new moderation challenges for Nintendo to address. Multiple players interacting in shared spaces controlled by the company produced some friction between what players wanted to do and what the company would allow. The first challenges arrived nearly a decade ago with the Wii U, which featured online spaces where players could interact with each other using text or voice messages. This was all handled by moderators who apparently reviewed content three times before allowing it to arrive at its destination, something that could result in an “acceptable” thirty-minute delay between a message’s sending and its arrival.

Thirty minutes is no longer an acceptable delay, considering the instantaneous communications allowed by other consoles. And there are more players online than ever, thanks to popular titles like Animal Crossing, a game with social aspects that are a large part of its appeal.

While it’s expected that Nintendo would shut down offensive and sexual language, given its perception of its target market, the company’s desire to steer users clear of controversial subjects extended to a worldwide pandemic and the Black Lives Matter movement in the United States.

Here’s what gaming site Polygon discovered after Nintendo issued a patch for Animal Crossing in September 2020:

According to Nintendo modder and tinkerer OatmealDome, Ver. 10.2.0 expands the number of banned words on the platform, including terms such as KKK, slave, nazi, and ACAB. The ban list also includes terms such as coronavirus and COVID. Polygon tested these words out while making a new user on a Nintendo Switch Lite and found that while they resulted in a warning message, the acronym BLM was allowed by the system. Most of these words seem to be a response to the current political moment in America.

Patricia Hernandez, Polygon

As this report from the Electronic Frontier Foundation notes, Nintendo often steers clear of political issues, even going so far as to ban the use of any of its online games for “political advocacy,” which resulted in the Prime Minister of Japan having to cancel a planned Animal Crossing in-game campaign event.

Company considerations:

  • How does limiting discussion of current/controversial events improve user experience? How does it adversely affect players seeking to interact?
  • How should companies respond to users who find creative ways to circumvent keyword blocking? 
  • How does a company decide which issues/terms should be blocked/muted when it comes to current events?

Issue considerations:

  • How should companies approach controversial issues that are of interest to some players, but may make other players uncomfortable? 
  • How can suppressing speech involving controversial topics adversely affect companies and their user bases?
  • How can Nintendo avoid being used by governments to control speech related to local controversies, given its willingness to preemptively moderate speech related to issues of great interest to its user base?

Resolution: Nintendo continues its blocking of these terms, apparently hoping to steer clear of controversial issues. While this may be at odds with what players expect to be able to discuss with their online friends, it remains Nintendo’s playground where it gets to set the rules.

But, as the EFF discovered, moderation could be easily avoided by using variations that had yet to end up on Nintendo’s keyword blocklist.
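The evasion the EFF describes is a structural weakness of exact-match blocklists: trivial character substitutions produce strings the list has never seen. A minimal sketch, with an invented blocklist and substitution table, shows both a common mitigation and its limits:

```python
# Illustrative sketch: exact-match blocklists miss trivial variants.
# Normalizing common look-alike substitutions recovers some of them.
# The blocklist and substitution table here are invented examples.

BLOCKLIST = {"covid"}

# Map common character swaps back to the letters they imitate.
LOOKALIKES = str.maketrans({"0": "o", "1": "i", "3": "e", "@": "a", "$": "s"})

def normalize(text: str) -> str:
    return text.lower().translate(LOOKALIKES)

def is_blocked(message: str) -> bool:
    return any(term in normalize(message) for term in BLOCKLIST)

print(is_blocked("c0v1d"))  # True: caught after normalization
print(is_blocked("covld"))  # False: novel variants still slip through
```

Each round of normalization only catches the substitutions it anticipates; users can always invent new variants, which is why keyword lists tend to grow indefinitely.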

Originally posted to the Trust and Safety Foundation website.

Posted on Techdirt - 8 December 2021 @ 03:45pm

Content Moderation Case Study: Twitter Briefly Restricts Account Of Writer Reporting From The West Bank (2021)

Summary: In early May 2021, writer and researcher Mariam Barghouti was reporting from the West Bank on escalating conflicts between Israeli forces and Palestinian protestors, and making frequent social media posts about her experiences and the events she witnessed. Amidst a series of tweets from the scene of a protest, shortly after one in which she stated “I feel like I’m in a war zone,” Barghouti’s account was temporarily restricted by Twitter. She was unable to post new tweets, and her bio and several of her recent tweets were replaced with a notice stating that the account was “temporarily unavailable because it violates the Twitter Media Policy”.

The incident was highlighted by other writers, some of whom noted that the nature of the restriction seemed unusual, and the incident quickly gained widespread attention. Fellow writer and researcher Joey Ayoub tweeted that Barghouti had told him the restriction would last for 12 hours according to Twitter, and expressed concern for her safety without access to a primary communication channel in a dangerous situation.

The restriction was lifted roughly an hour later. Twitter told Barghouti (and later re-stated to VICE’s Motherboard) that the enforcement action was a “mistake” and that there was “no violation” of the social media platform’s policies. Motherboard also asked Twitter to clarify which specific policies were initially believed to have been violated, but says the company “repeatedly refused”.

Company Considerations:

  • In cases where enforcement actions are taken involving sensitive news reporting content, how can the reasons for enforcement be better communicated to both the public and the reporters themselves?
  • How can the platform identify cases like these and apply additional scrutiny to prevent erroneous enforcement actions?
  • What alternatives to account suspensions and the removal of content could be employed to reduce the impact of errors?
  • How can enforcement actions be applied with consideration for journalists’ safety in situations involving the live reporting of dangerous events?

Issue Considerations:

  • With so much important news content, especially live reporting, flowing through social media platforms, what can be done to prevent policy enforcement (erroneous or otherwise) from unduly impacting the flow of vital information?
  • Since high-profile enforcement and reversal decisions by platforms are often influenced by widespread public attention and pressure, how can less prominent reporters and other content creators protect themselves?

Resolution: Though the account restriction was quickly reversed by Twitter, many observers did not accept the company’s explanation that it was an error, instead saying the incident was part of a broader pattern of social media platforms censoring Palestinians. Barghouti said:

“I think if I was not someone with visibility on social media, that this would not have garnered the attention it did. The issue isn’t the suspension of my account, rather the consideration that Palestinian accounts have been censored generally but especially these past few weeks as we try to document Israeli aggressions on the ground.”

Posted on Techdirt - 1 December 2021 @ 03:40pm

Content Moderation Case Study: Discord Adds AI Moderation To Help Fight Abusive Content (2021)

Summary: In the six years since Discord debuted its chat platform, it has seen explosive growth. Over that time, Discord’s chat options have expanded to include GIFs, video, audio, and streaming. This growth and these expanded offerings have brought a number of new moderation challenges and required adapting to changing scenarios.

Discord remains largely text-based, but even when limited to its original offering — targeted text-oriented forums/chat channels — users were still subjected to various forms of abuse. And, because the platform hosted multiple users on single channels, users sometimes found themselves targeted en masse by trolls and other malcontents. While Discord often relies on the admins of servers to handle moderation on those servers directly, the company has found that it needs to take a more hands-on approach to handling content moderation.

Discord’s addition of multiple forms of content created a host of new content moderation challenges. While it remained text-based, Discord was able to handle moderation using a blend of AI and human moderators.

Some of the moderation load was handed over to users, who could moderate their own channels to keep them free of content they didn’t like. For everything else (meaning content that violates Discord’s guidelines), the platform offered a mixture of human and AI moderation. The platform’s Trust & Safety team handled content created by hundreds of millions of users, but continued growth and expanded offerings forced the company to find a solution that could scale to meet future demands.

To continue to scale, Discord purchased Sentropy, an AI company launched only a year earlier with the goal of building AI tools to help companies moderate disruptive behavior on their platforms. Just a few months prior to the purchase, Sentropy had launched its first consumer-facing product, an AI-based tool to help Twitter users weed out and block potentially abusive tweets. After the purchase, however, Sentropy shut down the tool and is now focused on building out its AI content moderation tools for Discord.

Discord definitely has moderation issues it needs to solve, ranging from seemingly omnipresent spammers to interloping Redditors with a taste for tasteless memes, but it remains to be seen whether the addition of another layer of AI will make moderation manageable.

Company Considerations:

  • What advantages can outside services offer above what platforms can develop on their own? 
  • What are the disadvantages of partnering with a company whose product was not designed to handle a platform’s specific moderation concerns?
  • How do outside acquisitions undermine ongoing moderation efforts? Conversely, how do they increase the effectiveness of ongoing efforts? 
  • How should platforms handle outside integration of AI moderation as it applies to user-based moderation efforts by admins running their own Discord servers?
  • How much input should admins have in future moderation efforts? How should admins deal with moderation calls made by AI acquisitions that may impede efforts already being made by mods on their own servers?

Issue Considerations:

  • What are the foreseeable negative effects of acquiring content moderation AI designed to handle problems observed on different social media platforms?
  • What problems can outside acquisitions introduce into the moderation platform? What can be done to mitigate these problems during integration?
  • What negative effect can additional AI moderation efforts have on “self-governance” by admins entrusted with content moderation by Discord prior to acquisition of outside AI?

Resolution: So far, the acquisition has yet to produce much controversy. Indeed, Discord as a whole has managed to avoid many of the moderation pitfalls that have plagued other platforms of its size. Its most notorious action to date was its takeover of the WallStreetBets server as it went supernova during a week or two of attention-getting stock market activity. An initial ban was rescinded once the server’s own moderators began removing content that violated Discord guidelines, accompanied by Discord’s own moderators who stepped in to handle an unprecedented influx of users while WallStreetBets continued to make headlines around the nation.

Other than that, the most notable moderation efforts were made by server admins, rather than Discord itself, utilizing their own rules which (at least in one case) exceeded the restrictions on content delineated in Discord’s terms of use.

Originally posted to the Trust & Safety Foundation website.

Posted on Techdirt - 24 November 2021 @ 03:31pm

Content Moderation Case Studies: Facebook Suspends Account For Showing Topless Aboriginal Women (2016)

Summary: Facebook’s challenges of dealing with content moderation around “nudity” have been covered many times, but part of the reason the discussion comes up so often is that there are so many scenarios to consider that it is difficult to create policies that cover them all.

In March of 2016, activist Celeste Liddle gave the keynote address at the Queen Victoria Women’s Centre’s annual International Women’s Day event. The speech covered many aspects of the challenges facing Aboriginal women in Australia, and mentioned in passing that Liddle’s Facebook account had been repeatedly suspended for posting images of topless Aboriginal women shown in a trailer for a TV show.

“I don’t know if people remember, but last year the Indigenous comedy show 8MMM was released on ABC. I was very much looking forward to this show, particularly since it was based in Alice and therefore I knew quite a few people involved.

“Yet there was controversy because when 8MMM released a promotional trailer for the show prior to it going to air. This trailer was banned by Facebook because it featured topless desert women painted up for ceremony engaging in traditional dance.

“Facebook saw these topless women as “indecent” and in violation of their no nudity clause. On hearing this, I was outraged that Arrernte woman undertaking ceremony could ever be seen in this way so I posted the trailer up on my own page stating as such.

“What I didn’t count on was a group of narrow-minded little white men deciding to troll my page so each time I posted it, I not only got reported by them but I also got locked out and the video got removed.” — Celeste Liddle

The publication New Matilda ran a transcript of the entire speech, which Liddle then linked to herself, leading to her account being suspended for 24 hours and New Matilda’s post being removed, highlighting the very point Liddle was making. As she told New Matilda in a follow-up article about the removal and the suspension:

“My ban is because I’ve previously published images of nudity… I’m apparently a ‘repeat nudity poster offender’…

“I feel decidedly smug this morning, because everything I spoke about in my speech on this particular topic just seems to have been proven completely true…

“It’s actually a highly amusing outcome.” — Celeste Liddle

Facebook’s notice to New Matilda claimed that it was restricted for posting “nudity” and said that the policy has an exception if the content is posted for “educational, humorous or satirical purposes,” but did not give New Matilda a way to argue that the usage in the article was “educational.”

Many publications, starting with New Matilda, highlighted a contrast: the same day Liddle gave her speech (International Women’s Day), Esquire released a cover story about Kim Kardashian featuring an image of her naked but partially painted. Both images, then, involved topless women with their skin partially painted. However, those posting the Aboriginal women faced bans from Facebook, while the Kardashian image not only remained up, but went viral.

Company Considerations:

  • How can policies regarding nudity be written to take into account cultural and regional differences?
  • Is there a way to adequately determine if nudity falls into one of the qualified exemptions, such as “educational” use?
  • What would be an effective and scalable appeals process that would allow users like Liddle to inform Facebook of the nature of the content that resulted in her temporary suspension?

Issue Considerations:

  • Questions about moderating “nudity” have been challenging for many websites. Are there reasonable and scalable policies that can be put in place that adequately take context into account?
  • Many websites start out with a “no nudity” policy to avoid having to deal with adult material on their websites. What other factors should any website consider regarding why a more nuanced policy may make more sense?

Resolution: After this story got some attention, Liddle launched a petition asking Facebook to recognize that aboriginal women “practicing culture are not offensive.”

Facebook’s standards are a joke. They are blatantly racist, sexist and offensive. They show a complete lack of respect for the oldest continuing culture in the world. They also show that Facebook continually fails to address their own shortfalls in knowledge. Finally, they show that Facebook is more than willing to allow scurrilous bullying to continue rather than educate themselves. — Celeste Liddle

New Matilda requested comment from Facebook regarding the removal of the link to its story and was told that even if the sharing was for an “awareness campaign,” Facebook still believed it should be removed because some audiences in Facebook’s “global community” would be “sensitive” to such content. The company also noted that, in order to allow its content moderators to apply rules “uniformly,” those rules sometimes need to be “more blunt than we would like.”

“We are aware that people sometimes share content containing nudity for reasons like awareness campaigns, artistic projects or cultural investigations. The reason we restrict the display of nudity is because some audiences within our global community may be sensitive to this type of content – particularly because of cultural background or age. In order to treat people fairly and respond to reports quickly, it is essential that we have policies in place that our global teams can apply uniformly and easily when reviewing content. As a result, our policies can sometimes be more blunt than we would like, and restrict content shared for legitimate purposes. We encourage people to share Celeste Liddle’s speech on Facebook by simply removing the image before posting it.”

Originally posted to the Trust & Safety Foundation website.

Posted on Techdirt - 17 November 2021 @ 03:30pm

Content Moderation Case Study: Game Developer Deals With Sexual Content Generated By Users And Its Own AI (2021)

Summary: Dealing with content moderation involving user generated content from humans is already quite tricky — but those challenges can reach a different level when artificial intelligence is generating content as well. While the cautionary tale of Microsoft’s AI chatbot Tay may be well known, other developers are still grappling with the challenges of moderating AI-generated content.

AI Dungeon wasn’t the first online text game to leverage the power of artificial intelligence. For nearly as long as gaming has been around, attempts have been made to pair players with algorithmically-generated content to create unique experiences.

AI Dungeon has proven incredibly popular with players, thanks to its use of powerful machine learning models created by OpenAI, the latest of which was trained on substantially more data and is capable of generating text that, in many cases, is indistinguishable from content created by humans.

For its first few months of existence, AI Dungeon used an older version of OpenAI’s machine learning model. It wasn’t until OpenAI granted access to the most powerful version of this software (Generative Pre-trained Transformer 3, or GPT-3) that content problems began to develop.

As Tom Simonite reported for Wired, OpenAI’s monitoring of AI Dungeon input and interaction uncovered some disturbing content being crafted by players, as well as by the AI itself.

A new monitoring system revealed that some players were typing words that caused the game to generate stories depicting sexual encounters involving children. OpenAI asked Latitude to take immediate action. “Content moderation decisions are difficult in some cases, but not this one,” OpenAI CEO Sam Altman said in a statement. “This is not the future for AI that any of us want.”

While Latitude (AI Dungeon’s developer) had deployed only limited moderation during the game’s first few iterations, its new partnership with OpenAI, and the inappropriate content that surfaced, made it impossible to leave this content unmoderated. The inappropriate content wasn’t always a case of users feeding the AI input to lead it toward generating sexually abusive content: some users reported seeing the AI generate sexual content on its own, without any prompting from players. What may have originally been limited to a few users specifically seeking to push the AI toward questionable content had expanded due to the AI’s own behavior, which treated all input sources as valid and usable when generating its own text.

Company Considerations:

  • How can content created by a tool specifically designed to iteratively generate content be effectively moderated to limit the generation of impermissible or unwanted content?
  • What should companies do to stave off the inevitability that their powerful algorithms will be used (and abused) in unexpected (or expected) ways? 
  • How should companies apply moderation standards to published content? How should these standards be applied to content that remains private and solely in the possession of the user?
  • How effective are blocklists when dealing with a program capable of generating an infinite amount of content in response to user interaction?

Issue Considerations:

  • What steps can be taken to ensure a powerful AI algorithm doesn’t become weaponized by users seeking to generate abusive content?

Resolution: AI Dungeon’s first response to OpenAI’s concerns was to implement a blocklist that would prevent users from nudging the AI toward generating questionable content, as well as prevent the AI from creating this content in response to user interactions.

Unfortunately, this initial response generated a number of false positives and many users became angry once it was apparent that their private content was being subjected to keyword searches and read by moderators.

AI Dungeon’s creator made tweaks to the filters in hopes of mitigating collateral damage. Finally, Latitude arrived at a solution that addressed the over-blocking but still allowed it access to OpenAI’s model. This is from the developer’s latest update on AI Dungeon’s moderation efforts, published in mid-August 2021:

We’ve agreed upon a new approach with OpenAI that will allow us to shift AI Dungeon’s filtering to have fewer incorrect flags and allow users more freedom in their experience. The biggest change is that instead of being blocked from playing when input triggers OpenAI’s filter, those requests will be handled by our own AI models. This will allow users to continue playing without broader filters that go beyond Latitude’s content policies.

While the fix addressed the over-blocking problem, it did create other issues for players, as AI Dungeon’s developer acknowledged in the same post. Users who were shunted to AI Dungeon’s own models would see lower performance due to slower processing. On the other hand, routing around OpenAI’s filtering system would give AI Dungeon users more flexibility when crafting stories and limit false flags and account suspensions.
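The arrangement Latitude describes, rerouting requests that trip the upstream filter to its own models instead of blocking the player, amounts to a simple dispatch. All function names below are hypothetical stand-ins, not Latitude's or OpenAI's actual APIs:

```python
# Hypothetical sketch of the filter-and-reroute pattern described above:
# requests flagged by the upstream provider's filter fall back to an
# in-house model rather than being rejected. All names are invented.

def upstream_filter_flags(prompt: str) -> bool:
    """Stand-in for the upstream provider's content filter."""
    return "forbidden" in prompt.lower()

def upstream_model(prompt: str) -> str:
    return f"[upstream completion for: {prompt}]"

def in_house_model(prompt: str) -> str:
    # Slower and weaker, but governed only by the platform's own policy.
    return f"[in-house completion for: {prompt}]"

def generate(prompt: str) -> str:
    if upstream_filter_flags(prompt):
        return in_house_model(prompt)  # reroute instead of blocking
    return upstream_model(prompt)

print(generate("explore the cave"))    # served by the upstream model
print(generate("a forbidden ritual"))  # rerouted, not blocked
```

The trade-off the developer acknowledges falls out of this structure: flagged requests get the slower, weaker in-house model, but the player is never locked out of the game.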

Originally posted to the Trust & Safety Foundation website.

Posted on Techdirt - 10 November 2021 @ 03:41pm

Content Moderation Case Study: Electric Truck Company Uses Copyright Claims To Hide Criticism (2020)

Summary: There are many content moderation challenges that companies face, but complications arise when users or companies try to make use of copyright law as a tool to block criticism. In the US, the laws around content that allegedly infringes on a copyright holder’s rights are different than most other types of content, and that creates some interesting challenges in the content moderation space.

Specifically, under Section 512 of the Digital Millennium Copyright Act (DMCA), online service providers who do not wish to be held liable for user-posted material that infringes copyright need to take a few steps to be free of liability. Key among those steps is having a “notice-and-takedown” process, in which a copyright holder can notify the website of allegedly infringing material; and if the website removes access to the work, it cannot be held liable for the infringement.

This process creates a strong incentive for websites to remove content upon receiving a takedown notice, as doing so automatically protects the site. However, this strong incentive for the removal of content has also created a different kind of incentive: those who wish to have content removed from the internet can submit takedown notices claiming copyright infringement, even if the work does not infringe on copyright. This creates an interesting challenge for companies hosting content: determining when a copyright takedown notice has been submitted for illegitimate purposes.

In September of 2020, a research firm released a report alleging that a promotional video from Nikola, an alternative energy truck company, which showed its new hydrogen fuel cell truck driving along a highway, was faked: the truck did not move under its own propulsion. As it turned out, the truck did not actually have a working hydrogen fuel cell and was instead filmed rolling downhill, and Nikola admitted that it had faked the promotional video. In its response, Nikola conceded that the truck did not move on its own, but it still claimed the original report was “false and defamatory.” While Nikola’s response does highlight areas where it disagrees with the way the research firm characterized the company’s efforts, it does not identify any actual “false” statements of fact.

Soon after this, many YouTube creators who made videos about the situation discovered that their videos about the incident were being removed due to copyright claims from Nikola. While video creators did use some of the footage of the faked promotional video in their YouTube videos, they also noted that it was clearly fair use, as they were reporting on the controversy and just using a short snippet of Nikola’s faked promotional video, often presenting it in much longer videos with commentary.

When asked about the situation, Nikola and YouTube spokespeople seemed to give very different responses. Ars Technica’s Jon Brodkin posted the comments from each side by side:

“YouTube regularly identifies copyright violations of Nikola content and shares the lists of videos with us,” a Nikola spokesperson told Ars. “Based on YouTube’s information, our initial action was to submit takedown requests to remove the content that was used without our permission. We will continue to evaluate flagged videos on a case-by-case basis.”

YouTube offered a different description, saying that Nikola simply took advantage of the Copyright Match Tool that’s available to people in the YouTube Partner Program.

“Nikola has access to our copyright match tool, which does not automatically remove any videos,” YouTube told the [Financial Times]. “Users must fill out a copyright removal request form, and when doing so we remind them to consider exceptions to copyright law. Anyone who believes their reuse of a video or segment is protected by fair use can file a counter-notice.”

Company Considerations:

  • Given the potential liability from not taking down an infringing video, how much should YouTube investigate whether or not a copyright claim is legitimate?
  • Is there a scalable process that will allow the company to review copyright takedowns to determine whether or not they are seeking to take down content for unrelated reasons?
  • What kind of review process should be put in place to handle situations like Nikola’s, where videos were reported as copyright violations and taken down even though they featured the copyrighted material for news reporting or commentary, making the takedown requests improper?
  • Improper takedowns can reflect poorly on the platform that removes the content, but complying often makes sense to avoid potential liability. Are there better ways to balance these two competing pressures?

Issue Considerations:

  • Copyright is one of the few laws in the US that can be used to pressure a website to take down content. Given that the incentives support both overblocking and false reporting, are there better approaches that might protect speech, while giving companies more ability to investigate the legitimacy of infringement claims?
  • Under the current DMCA 512 structure, users can file a counternotice with the website, but the copyright holder is also informed of this and given 10 days to file a lawsuit. The threat of a lawsuit often disincentivizes counternotices. Are there better systems enabling those who feel wrongfully targeted to express their concerns about a copyright claim?

Resolution: After the press picked up the story of these questionable takedown notices, Nikola dropped its takedown demands against many of the YouTube creators.

In July of 2021, nine months after the news broke of the faked videos, Nikola’s founder Trevor Milton was charged with securities fraud by the SEC for the faked videos.

Originally posted to the Trust & Safety Foundation website.

Posted on Techdirt - 4 November 2021 @ 03:43pm

Content Moderation Case Study: Facebook Struggles To Correctly Moderate The Word 'Hoe' (2021)

Summary: One of the many challenges with content moderation is the flexibility of language. When applying blocklists — a list of prohibited terms considered not appropriate for the platform — moderators need to consider innocuous uses of words that, when removed from their context, appear to be violations of the platform’s terms of use.

Multiple platforms have run into the phenomenon known as the “Scunthorpe problem,” named for the English town whose name no one would mistake for offensive, yet which was deemed offensive by moderation blocklists simply because it contains the word “cunt,” a term many blocklists forbid.
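
The root of the Scunthorpe problem is naive substring matching. The sketch below is purely illustrative (no platform’s actual filter is shown) and contrasts substring matching with whole-word matching:

```python
import re

# Illustrative blocklist; real platforms maintain far larger, curated lists.
BLOCKLIST = ["cunt"]

def naive_filter(text):
    """Flag text if a blocked term appears anywhere, even inside another word."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKLIST)

def word_boundary_filter(text):
    """Flag only whole-word matches, avoiding substring false positives."""
    lowered = text.lower()
    return any(re.search(r"\b" + re.escape(term) + r"\b", lowered)
               for term in BLOCKLIST)

print(naive_filter("Welcome to Scunthorpe"))          # True: false positive
print(word_boundary_filter("Welcome to Scunthorpe"))  # False: town name passes
```

Whole-word matching resolves this class of false positive, though, as the incidents below show, it does nothing for words that are themselves legitimate terms in other contexts.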

Deploying automated blocklists can be even more challenging when dealing with specialized or niche content, which may rely on terms that sound offensive outside of a specific context but are essential to discussing the relevant subject matter. A paleontology conference was derailed when its moderation blocklist made it impossible for participants to use words like “bone,” “pubic,” “stream,” and “beaver.”

Facebook has worked continuously to refine its moderation processes, but it still occasionally makes the wrong call with its blocklists. In January 2021, residents of (and visitors to) a Devon, England landmark were surprised to find their posts and comments vanishing from the site. After a little investigation, it became clear Facebook was deleting posts containing references to the landmark known as Plymouth Hoe.

In addition to being the name of a common garden tool (more on that in a moment), “hoe” also refers to a “sloping ridge shaped like an inverted foot or heel,” such as Plymouth Hoe, which is known locally as the Hoe. Users were temporarily forced to self-censor the harmless term to avoid moderation, either by adding unnecessary punctuation or dropping the “h.” It appeared Facebook’s automated processes believed these comments and posts were deploying “hoe” as a derogatory slang term.
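
Whole-word matching cannot help here, because “hoe” in “Plymouth Hoe” is an exact token match for the blocked slang term. One common mitigation, sketched below with a hypothetical allowlist (the phrase list and function name are illustrative assumptions, not Facebook’s actual system), is to exempt known-benign phrases before applying the blocklist:

```python
import re

BLOCKED_TERM = re.compile(r"\bhoe\b", re.IGNORECASE)

# Hypothetical allowlist of known-innocuous phrases containing the term.
ALLOWED_PHRASES = ["plymouth hoe", "garden hoe", "push pull hoe"]

def should_flag(text):
    """Flag 'hoe' as a whole word unless it appears in an allowlisted phrase."""
    lowered = text.lower()
    if any(phrase in lowered for phrase in ALLOWED_PHRASES):
        return False  # known benign context: let the post through
    return bool(BLOCKED_TERM.search(lowered))

print(should_flag("Meet me at Plymouth Hoe"))  # False: allowlisted landmark
print(should_flag("Push pull hoe!"))           # False: gardening context
```

Phrase allowlists are brittle, since they must be curated per language and region, which is one reason platforms increasingly pair simple blocklists with context-aware classifiers and human review.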

Facebook soon apologized for the moderation error and stated it was “taking steps to rectify the error” and figure out what caused the mistaken moderation in the first place. Problem solved?

Not really.

The same problem popped up again, this time affecting a New York gardening group. WNY Gardeners, a group with more than 8,000 members, was the next to be affected by Facebook’s “hoe” pruning. A member responded to the prompt “most loved & indispensable weeding tool” with “Push pull hoe!” Not long after, the member was informed by Facebook that the comment violated the site’s policy on bullying and harassment.

Company Considerations:

  • How could blocklists and keyword searches be better utilized to detect and remove violations of site policies? 
  • How much collateral damage from automated moderation should be considered acceptable? Is this an acceptable trade-off for lower moderation costs, which often relies on more automated moderation and fewer human moderators?
  • Can AI-based moderation more reliably detect actual violations (rather than innocuous uses of blocklisted terms) as the technology advances? What are the trade-offs with AI-based moderation tools as compared to simple blocklists? 
  • What mitigation measures might be put in place to deal with a blocklist that catches words with different meanings depending on context?
  • Who should be in charge of reviewing a blocklist and how frequently should it be updated? 

Issue Considerations:

  • Does prohibiting words like “hoe” make a significant dent in online harassment and abuse? Does the tech have the capability to “catch up” (or surpass) the ability of humans to route around moderation efforts?
  • Should more resources go to staffing human moderators in order to prevent errors and/or allow for a more robust challenge process that allows content to remain “live” until the challenge process has concluded?
  • What ways might automation and human reviewers be used in combination to avoid the more egregious automated blocklist mistakes?

Resolution: Once again, Facebook apologized for not recognizing the word “hoe” in contexts where its use is innocuous. But after two highly publicized incidents in less than a year — both involving the same word — Facebook has added human moderators to backstop automated calls on flagged terms like these, in order to prevent unjustified removals of posts, accounts, or groups.

Originally posted to the Trust & Safety Foundation website.

Posted on Techdirt - 27 October 2021 @ 03:32pm

Content Moderation Case Study: LinkedIn Blocks Access To Journalist Profiles In China (2021)

Summary: A major challenge for global internet companies is figuring out how to deal with the differing rules and regulations of different countries. This has proven especially difficult for internet companies looking to operate in China — a country in which many of the most popular global websites are blocked.

In 2015, there was an article highlighting how companies like Evernote and LinkedIn had avoided getting blocked in China, mainly by complying with the Chinese government’s demands that they moderate certain content. In that article, LinkedIn’s then-CEO Jeff Weiner noted:

“We’re expecting there will be requests to filter content,” he said. “We are strongly in support of freedom of expression and we are opposed to censorship,” he said, but “that’s going to be necessary for us to achieve the kind of scale that we’d like to be able to deliver to our membership.”

Swedish journalist Jojje Olsson tweeted the article when it came out. Six years later, LinkedIn informed Olsson that his own LinkedIn profile would no longer be available in China because it referenced the Tiananmen Square massacre.

In Olsson’s tweet, he explains that his LinkedIn profile mentions that for his degree, he wrote an essay about the Tiananmen Square massacre. It quickly became clear that LinkedIn was in the process of blocking access to multiple journalists’ and academics’ accounts in China, including CNN Beijing bureau chief Steve Jiang and the editor-in-chief of the Taiwan Sentinel, J. Michael Cole. The Wall Street Journal found at least 10 other LinkedIn accounts that were blocked in China around the same time, and highlighted that LinkedIn officials were reprimanded in March of 2021 for keeping certain accounts available in China.

China’s internet regulator summoned LinkedIn officials in March to tell them to better regulate its content, according to people familiar with the matter. The social-networking site was given 30 days to clean up the content and promised to better regulate its site going forward, the people said.

Shortly after, LinkedIn said in a statement on its website that it would be pausing new member sign-ups as the platform worked “to ensure we remain in compliance with local law.” — Liza Lin, Wall Street Journal

The NY Times report on that meeting noted that the 30-day pause on sign-ups was part of what Chinese officials ordered.

The users whose profiles were blocked received a notice from LinkedIn about the block, saying “We will work with you to minimize the impact and can review your profile’s accessibility within China if you update the relevant sections of the profiles,” while also noting “the decision whether to update your profile is yours.” The notice also includes this paragraph:

While we strongly support freedom of expression, we recognized when we launched that we would need to adhere to the requirements of the Chinese government in order to operate in China. As a reminder, your profile will remain viewable throughout the rest of the countries in which LinkedIn is available.

It appears that LinkedIn was also directly removing some specific content as well. Former journalist Peter Humphrey told Bloomberg News that LinkedIn informed him that it had completely removed certain comments he made criticizing the Chinese government.

Company Considerations:

  • How important is it to remain accessible in China?
  • What compromises are worth making to remain accessible in China or other countries?
  • If the company agrees to take down, or block access to, certain content to appease government demands, how should those decisions be communicated to impacted users?
  • Under what conditions, if any, will the company push back on overbroad demands to block content in China?

Issue Considerations:

  • Is it better for a censored, but still mostly available, US-based service to operate in China than for the entire service to be blocked?
  • Local regulations differ across every country. What kind of framework should a company use to determine where they draw the line, and what compromises they will agree to?

Resolution: Since the initial flurry of notices that got attention from May through July of 2021, it appears that even more journalists have found their profiles blocked in China. In September, Sophia Yan, the China correspondent for the UK’s Telegraph, noted that her LinkedIn profile was now blocked in China. In replying to Yan’s tweet, Liza Lin, the Wall Street Journal’s China correspondent, and author of the article quoted earlier discussing LinkedIn officials being reprimanded by Chinese officials, noted that she too had her profile blocked in China.

LinkedIn, for its part, has continued to make similar statements throughout, saying that it supports the principles of free speech but that in order to continue operating in China, it is required by the government to block access to these accounts.

Update: Just weeks after this case study was originally published, and LinkedIn was called out for even more such activity, the company announced that it was mostly exiting the country, as the demands for censorship were becoming too much.

Originally posted to the Trust & Safety Foundation website.

Posted on Techdirt - 20 October 2021 @ 04:12pm

Content Moderation Case Studies: Snapchat Disables GIPHY Integration After Racist 'Sticker' Is Discovered (2018)

Summary: Snapchat debuted to immediate success a decade ago, drawing in millions of users with its playful take on instant messaging that combined photos and short videos with a large selection of filters and “stickers.” Stickers are graphics that can be applied to messages, allowing users to punch up their presentations (so to speak).

Snapchat’s innovations in the messaging space proved incredibly popular, moving Snapchat from upstart to major player in a few short years. It also created more headaches for moderators as sent messages soared past millions per day to billions.

Continuing its expansion of user options, Snapchat announced its integration with GIPHY, a large online repository of GIFs, in February 2018. This gave users access to GIPHY’s library of images to use as stickers in messages.

But the addition of thousands of images to billions of messages quickly resulted in an unforeseen problem. In early March of 2018, Snapchat users reported a search of the GIPHY image database for the word “crime” surfaced a racist sticker, as reported by Josh Constine for TechCrunch:

“We first reported Instagram was building a GIPHY integration back in January before it launched a week later, with Snapchat adding a similar feature in February. But it wasn’t long before things went wrong. First spotted by a user in the U.K. around March 8th, the GIF included a racial slur.” — Josh Constine, TechCrunch

Both platforms immediately pulled the plug on the integration while they sorted things out with GIPHY.

Company Considerations:

  • What measures can be put in place to prevent moderation problems from moving from one platform to another during cross-platform integration?
  • What steps should be taken prior to launch to integrate moderation efforts between platforms? 
  • What can “upline” content providers do to ensure content moving from their platforms to others meets the content standards of the “downline” platforms? 

Issue Considerations:

  • What procedures aid in facilitating cross-platform moderation? 
  • Which party should have final say on moderation efforts, the content provider or the content user?

Resolution: Instagram was the first to reinstate its connection with GIPHY, promising to use more moderators to examine incoming content from the image site:

“We’ve been in close contact with GIPHY throughout this process and we’re confident that they have put measures in place to ensure that Instagram users have a good experience” an Instagram spokesperson told TechCrunch.

GIPHY offered its own apology for the racist image, blaming the slipup on a bug in its filters. Here’s what GIPHY’s spokesperson told Gizmodo:

After investigation of the incident, this sticker was available due to a bug in our content moderation filters specifically affecting GIF stickers.

We have fixed the bug and have re-moderated all of the GIF stickers in our library.

The GIPHY staff is also further reviewing every GIF sticker by hand and should be finished shortly.

Snapchat was the last to reinstate its connection to GIPHY, stating it was working directly with the site to revamp both moderation systems to ensure offensive content would be prevented from being uploaded to GIPHY and/or making the leap to connected social media services.

Originally published to the Trust & Safety Foundation website.
