Just a few weeks ago, we pointed out that the purpose of a trust & safety team is not, as Elon Musk falsely claims, to “censor” users, but rather to make sure they’re safe on the site. We were highlighting this in the context of Elon’s site disguising posted links in a manner that made it easier for scammers to trick people into thinking they were going to a reputable site, when they were not.
It looks like things have gotten even worse on that front. As Matt Binder pointed out at Mashable, ExTwitter has been experimenting with forcing the word “twitter” to change to “x” throughout the app. This is because, since Musk’s hasty name change, which the company was clearly unprepared for, the word “twitter” still remains all over the app. So, it seems Musk was getting fed up with being reminded of the old app (you know, the one that actually worked most of the time and didn’t fall over every few days).
Scarily, this auto-change appears to happen even in the display of links in the iOS app, leading to crazy situations where people post domains with “twitter” in them, and ExTwitter makes them appear as if they say “x.”
But, this creates… a pretty massive security problem, as the article describes:
Let’s say someone owns the domain name “NetfliTwitter.com.” Why would they own that domain name? Because if X is automatically changing anything that includes “Twitter.com” to “X.com,” then that means posting “NetfliTwitter.com” on X would make it appear in posts as “Netflix.com,” the popular movie streaming service. And if a user clicked the linked “Netflix.com” text that appears in that post, it would really take them to “NetfliTwitter.com.” Because while X is changing the text that the user wrote, the URL it links and directs to remains the same as the user posted.
This is a dream scenario for someone looking to steal passwords through phishing campaigns.
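To make the mechanism concrete, here’s a minimal sketch, in Python, of what a naive display-layer rewrite like this looks like. To be clear, this is an illustration of the behavior Mashable describes, not X’s actual code, and the function name is made up:

```python
# A toy illustration of the bug described above -- NOT X's actual code.
# The visible text gets a cosmetic "twitter.com" -> "x.com" swap, but the
# href still points at whatever URL the user actually posted.
def render_link(posted_url: str) -> str:
    display_text = posted_url.replace("twitter.com", "x.com")
    return f'<a href="https://{posted_url}">{display_text}</a>'

print(render_link("netflitwitter.com"))
# <a href="https://netflitwitter.com">netflix.com</a>
print(render_link("setwitter.com"))
# <a href="https://setwitter.com">sex.com</a>
```

The post displays “netflix.com,” but the click goes to “netflitwitter.com.” That one-line gap between what’s shown and what’s linked is all a phisher needs.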
Luckily, two domains that the rewrite turns into popular domain names, making them prime candidates for phishing, have been grabbed by Good Samaritans (not ExTwitter, of course) to prevent them from being abused:
The example I just provided isn’t a hypothetical either. Some users on X noticed this very problem and found that it could quickly be utilized by scammers, hackers, and other bad actors. X user @yuyu0127_ quickly registered the domain name “NetfliTwitter.com” in order to prevent it from being weaponized and put up a warning page on the URL about the potential issues in X’s changes.
“This domain has been acquired to prevent its use for malicious purposes,” reads the headline text on “NetfliTwitter.com.”
Another domain name “seTwitter.com” was also registered due to its potential to be exploited as X would then change how the URL is viewed on the platform to “sex.com.” The X user, @amasato_mochi, who registered that domain name, also put up a warning page in order to put a spotlight on the issue.
“Please be very careful not to access suspicious URLs,” reads seTwitter.com. “I will hold onto this domain for a year to prevent any harm.”
But still, this is a hugely problematic “feature,” and the kind of thing that a good trust & safety team would have recognized before the product ever rolled off the line and was handed to everyone to abuse.
One key job of trust & safety is to red-team new features to think about how they might be abused and to prevent such abuses before they happen. But when you fire all the experienced trust & safety folks, you’re going to continue to make these kinds of mistakes that make users way less safe, leading to significantly decreased trust.
Forget Mars colonies and self-driving cars. Elon Musk’s greatest challenge yet? Defeating Twitter’s relentless ‘pussy in bio’ spam army. And let’s just say, it’s not going well.
It has really been quite incredible to watch Elon rediscover some of the basics of trust & safety best practices (while consistently messing them up) as ExTwitter just gets worse and worse. Before he had even taken over Twitter, he insisted he had two priorities: stopping spam and restoring free speech. Of course, some of us pointed out that those two things were in conflict.
Spam fighting is a core part of trust & safety, but Elon insisted that he knew better, fired basically everyone with any knowledge on the subject, and then repeatedly suggested that he had figured out how to solve it, only to see spam get worse and worse on ExTwitter, to the point that users are getting really frustrated.
Most recently, the spam has been in the form of posts with the following, or some variation on this: “░M░Y░P░U░S░S░Y░I░N░B░I░O░.” It’s been so unavoidable on ExTwitter that it’s become a meme.
John Herrman, over at New York Mag’s Intelligencer, has a ridiculously long investigation determining who is behind all that spam (basically a company doing the modern equivalent of an old phone sex line). But the reason the article caught my attention was that Herrman kicks off the article by highlighting Elon’s evolution on fighting spam. It’s kinda glorious:
“If our twitter bid succeeds, we will defeat the spam bots or die trying,” wrote Elon Musk in May 2022. “The bots are in for a surprise tomorrow,” he threatened shortly after the purchase closed. A new subscription service, he claimed, would “destroy the bots” with his new “anti-bot bots.” In 2023, the threat seemed to evolve. “We’re trying hard to stop bots & trolls on this platform,” he wrote in July. “Fighting bot and troll farms is hard,” he conceded. “The bot wars continue,” he posted in an August update. January of this year brought a shift in tone: “Bots are the devil (sigh).” In March, more signals of an extended, brutal campaign: “Stopping crypto/porn spam bots is not easy, but we’re working on it.”
Yeah, so, that was kinda the point of my old speed run post. One of the things that you often learn regarding trust & safety is that there are often good reasons why things are done, even if the end results are messy, and assuming you can magically do better with none of the experience or understanding of the tradeoffs means you’re going to make a ton of mistakes.
Now, it’s no surprise that Musk has failed to stop spam on the platform. Or even that the problem has gotten significantly worse. Content moderation at scale remains impossible to do well.
But, you know, Musk might have done a better job if he hadn’t fired everyone at the company who actually understood how to fight spam and replaced them with his own highly misguided “intuition” about where the spam was coming from. Content moderation is a constant struggle, and spam is a part of that. If Musk could reflect for just one moment, it might be nice if he realized how stupidly overconfident he was that he would be able to solve it. And also, how wrong he was to insist that the previous management wasn’t taking the issue seriously.
I know that some people, including Elon Musk recently, have claimed that “trust & safety” is some sort of “euphemism for censorship.”
That is not true, and has never been true. The role of trust & safety has always been about building trust in the platform and making sure that users feel safe on the platform. This goes way, way beyond “ensuring compliance with the laws that already exist.”
Let’s give just one example that might highlight why you shouldn’t fire or drive out your entire trust & safety team and falsely claim that they were only there to “censor” political speech.
Just as a purely hypothetical example, let’s say that your CEO decides on a whim that showing the destination of URLs to news articles is aesthetically unpleasing when people post them. He then orders the remaining team to remove the URLs and headlines. After that fails miserably, he modifies it to make it a little clearer where the links are going. Sort of.
But, because there’s no real trust & safety team, and whoever is left is told only to focus on “compliance with the laws that already exist,” you have no one to red team how all of this might be abused.
Users of the social media platform X (formerly Twitter) have often been left puzzled when they click on a post with an external link but arrive at an entirely unexpected website from the one displayed in the post.
A Twitter ad spotted below by a security researcher shows forbes.com as its destination but instead takes you to a Telegram account purportedly promoting crypto scams.
As Bleeping Computer describes (the issue was originally called out by Will Dormann), a “verified” (lol) account posts something as an ad with a link that looks like it’s going to a reputable source (in this case, Forbes):
The fact that it was boosted as an ad suggests why it has over a million views and only 191 likes. Mucho organic. But, if you click on it, it takes you to some sketchy Telegram crypto scam.
How does that happen? Well, as Bleeping Computer notes, the link actually takes you to another site, “joinchannelnow” (not sharing the TLD). That site then checks your user-agent and determines where to send you. If it thinks you’re a human, it sends you to the crypto shit on Telegram. If it thinks you’re a bot from Twitter trying to figure out the ultimate destination to display… it sends you to a random Forbes article.
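For illustration, here’s a minimal sketch of that cloaking trick. The routes and destinations are hypothetical, but the core move is real: X’s link-preview crawler identifies itself as “Twitterbot” in its user-agent string, so the scammer’s server can serve it something different from what humans get:

```python
# A hypothetical sketch of user-agent cloaking, as described above.
from flask import Flask, redirect, request

app = Flask(__name__)

@app.route("/")
def cloak():
    ua = request.headers.get("User-Agent", "")
    if "Twitterbot" in ua:
        # The platform's crawler resolves the link to a reputable page...
        return redirect("https://www.forbes.com/")
    # ...while human visitors get bounced to the real destination.
    return redirect("https://t.me/hypothetical_scam_channel")
```

Since the platform only ever sees the Forbes redirect, that’s what it displays, and there’s nothing in the rendered post to warn the human who clicks.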
And, of course, this makes it all very, very ripe for scamming, whether phishing or otherwise. Much trust. Very safety.
Of course, since this is all entirely hypothetical, I’m just using it as an example of the kind of thing that a trust & safety team would likely red team and explore how such a system might be abused, and demonstrating how that role handles a hell of a lot of other things that have nothing to do with “political censorship.”
After all, we’re talking about a site where the “new owner” insisted he had to buy the place in order to stomp out scams. Given that, it would be absolutely ridiculous to fire all your trust & safety people. It would also be ridiculous to claim that they were just there for censorship, and you’re only enforcing the laws now… while then enabling scammers to take advantage of gullible people by (1) making it easy to let any old scam account get “verified,” (2) allowing them to post shit links to scam groups, and (3) enabling them to trick your system into telling people the link is to a more reputable site.
I mean, that would never happen, right? Not if you understood how trust & safety works. Especially when you have a (hypothetically) genius, visionary CEO who really is knocking down those censorship walls, and surely knows exactly what he’s doing.
And, really, if it did happen, I guess you’d deserve Lifehacker telling its users “the best way to stay safe on X is to stop using X.” But, of course, it wouldn’t happen. Because anyone with more than a few working brain cells would easily know that you need to actually have a trust & safety team that paid attention to this stuff. Right?
When Twitter first launched what it called “Birdwatch,” I was hopeful that it would turn into a useful alternative approach to helping with trust & safety/content moderation questions, but I noted that there were many open questions, in particular with how it would deal with malicious actors seeking to game the system. When Elon took over Twitter, he really seemed to embrace Birdwatch, though he changed the name to the pointlessly boring “Community Notes.”
I still think the concept is a good one, and think it’s one of Elon’s few good moves. I think other social media sites should experiment with some similar ideas as well.
The problem, though, is that Elon seems to think that Community Notes is an effective replacement for a comprehensive trust & safety program. At the heart of his decision to fire the vast majority of the company’s trust & safety staff was the belief that “Community Notes can handle it.”
As we’re in the midst of a series of major crises around the globe, where the flow of information has proven incredibly important, one thing we’re clearly learning is that Community Notes is not up to the task. Just to drive this point home, over the weekend Elon himself posted some fucking nonsense (as he’s prone to do) and many hours later Community Notes pointed out it was hogwash. Elon, as he’s done in the past when he’s been “Noted,” claimed he was happy it happened to him… before claiming that his post was “obviously a joke meme” and that “there is more than a grain of truth to it.”
So, first of all, there isn’t “more than a grain of truth to it.” The whole thing is simply false. But, more importantly, a look at the top replies to his “obviously a joke meme” suggests that Elon’s biggest fans did not, even remotely, think that this was “obviously a joke meme,” but rather took it entirely seriously, cheering him on for “telling the truth.” Here’s just one of the top replies to his original tweet:
Also, it took quite some time for the note to appear on Elon’s post. And, look, content moderation at scale is impossible to do well and all that, but Community Notes seems like the exact wrong approach in situations like this one. Especially at a time when the accounts pushing out the most viewed news these days seem to be made up of a combination of grifters and idiots:
Online we have seen many users of X describe their experience of this crisis as different. Some of that may result from the more ambiguous nature of the larger conflict, especially as the news cycle moves from the unambiguous horror of the initial attack to concerns about Israel’s response. However, our investigation here suggests an additional factor: in Musk’s short tenure as owner of the platform, a new set of news elites has emerged. These elites post frequently, many sharing unvetted content and emotionally charged media. While sharing no single political ideology, many embrace a similar culture of rapid production of unlinked or ambiguously sourced content, embracing a “firehose of media” ethos that places the onus of verification on the end-user. This occurs in an environment that has been shorn of many of the “credibility signals” that served to ground users in the past — checkmarks that indicated notability, fact-checks distributed through Twitter Trends, and Twitter/X-based labeling of deceptive content. Even fundamental affordances of the web — such as simple sourcing through links — have been devalued by the platform, and, perhaps as a result, by the new elites that now direct its users’ attention.
Leaving aside the significant concern of taking away professional, trained trust & safety employees, and replacing them with random (often hand-picked) untrained volunteers, there are serious concerns coming to light about how Community Notes actually works in practice.
Multiple reports have come out lately highlighting the limitations of Community Notes on important breaking news in the midst of various conflicts around the world, where you have malicious actors seeking to deliberately spread misinformation. A report at Wired found that Community Notes is actually making some of the problems worse, rather than better.
On Saturday, the company wrote on its own platform that “notes across the platform are now being seen tens of millions of times per day, generating north of 85 million impressions in the last week.” It added that thousands of new contributors had been enrolled in the system. However, a WIRED investigation found that Community Notes appears to be not functioning as designed, may be vulnerable to coordinated manipulation by outside groups, and lacks transparency about how notes are approved. Sources also claim that it is filled with in-fighting and disinformation, and there appears to be no real oversight from the company itself.
“I understand why they do it, but it doesn’t do anything like what they say it does,” one Community Notes contributor tells WIRED. “It is prone to manipulation, and it is far too slow and cumbersome. It serves no purpose as far as I can see. I think it’s probably making the disinformation worse, to be honest.”
The report isn’t just based on random Community Notes users; it also looks more closely at how the program works and at its ability to be gamed. Wired found that it wasn’t difficult for one person to set up multiple accounts that all had access to Community Notes, meaning that a small group of users controlling multiple accounts could manipulate support for a position.
It also points to earlier (pre-Elon) research showing that (then) Birdwatch wasn’t used nearly as much for standard fact-checking as it was in political debates, by users who politically disagreed with someone who had tweeted.
Back during the summer, the Poynter Institute had a good analysis of the limitations of Community Notes for dealing with real-time misinformation campaigns during crises. Specifically, the design of the current Community Notes has some, well, questionable assumptions built in. Apparently, it looks over your tweeting history and assigns you to a camp as being either “left” or “right” and then only allows a Community Note to go public if enough of the “left” people and the “right” people agree on a note.
“It has to have ideological consensus,” he said. “That means people on the left and people on the right have to agree that that note must be appended to that tweet.”
Essentially, it requires a “cross-ideological agreement on truth,” and in an increasingly partisan environment, achieving that consensus is almost impossible, he said.
Another complicating factor is the fact that a Twitter algorithm is looking at a user’s past behavior to determine their political leanings, Mahadevan said. Twitter waits until a similar number of people on the political right and left have agreed to attach a public Community Note to a tweet.
While that may work on issues where there isn’t any kind of culture war, it’s completely useless for culture war issues, where plenty of disinformation flows. Indeed, the Poynter report notes that a huge percentage of the highest rated Community Notes inside the Community Notes system are never seen by the public because they don’t have “cross-ideological agreement.”
The problem is that regular Twitter users might never see that note. Sixty percent of the most-rated notes are not public, meaning the Community Notes on “the tweets that most need a Community Note” aren’t public, Mahadevan said.
The setup with “cross-ideological” consensus basically seems almost perfectly designed to make sure that the absolute worst nonsense will never have Community Notes shown publicly.
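To see why that gate hides exactly the notes that matter most, here’s a toy sketch of the consensus check in Python. The real system uses a more sophisticated bridging algorithm (matrix factorization over rating data); this simplified version only captures the property Mahadevan describes:

```python
# A toy model of the "cross-ideological consensus" gate -- not the actual
# Community Notes algorithm, which factorizes a matrix of rating data.
from dataclasses import dataclass

@dataclass
class Rating:
    rater_camp: str  # "left" or "right", inferred from the rater's history
    helpful: bool

def note_goes_public(ratings: list[Rating], threshold: int = 5) -> bool:
    left_yes = sum(r.helpful for r in ratings if r.rater_camp == "left")
    right_yes = sum(r.helpful for r in ratings if r.rater_camp == "right")
    # The note only ships if enough raters on BOTH sides call it helpful.
    # On culture-war claims, one side rarely does, so the note stays hidden.
    return left_yes >= threshold and right_yes >= threshold
```

On an uncontroversial hoax, both camps rate the note helpful and it publishes. On anything contested along partisan lines, one of the two counts never reaches the threshold, and the note never appears.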
Meanwhile, a report from NBC News also highlights how even when Community Notes is able to help debunk false information, it often comes way too late.
NBC News focused on two prominent pieces of Israel-Hamas misinformation that have already been debunked: a fake White House news release that was posted to X claiming the Biden administration had granted Israel $8 billion in emergency aid and false reports that St. Porphyrius Orthodox Church in Gaza was destroyed.
Only 8% of 120 posts related to those stories had published community notes, while 26% had unpublished notes from volunteers that had yet to be approved. About two-thirds of the top posts NBC News reviewed had no proposed or published Community Notes on them.
The findings echo what a Community Notes volunteer said was X’s lack of response to efforts to debunk misleading posts.
“All weekend we were furiously vetting, writing, and approving Community Notes on hundreds of posts which were demonstrably fake news,” Kim Picazio, a Community Notes volunteer, wrote on Instagram’s Threads. “It took 2+ days for the backroom to press whatever button to finally make all our warnings publicly viewable. By that time… You know the rest of that sentence.”
And when the Community Notes don’t show up until much later, a ton of nonsense can spread:
A post about the debunked White House news release published by a verified account had nearly 500,000 views and no proposed or appended note Tuesday afternoon. The Community Notes system also showed that a user tried to submit a fact-check Sunday on another post including the same known misinformation but that it had yet to be approved, saying, “Needs more ratings.” The post had accrued 80,000 views since Sunday.
In a search for St. Porphyrius Orthodox Church in Gaza, only five Community Notes had been applied to the top 42 posts echoing the debunked misinformation. Several posts from verified users with no notes repeated the claim and got over 100,000 views, while 13 Community Notes had been proposed on posts of the debunked claims but had not yet been approved for publishing.
During the first 5 days of the conflict, just 438 Community Notes (attached to 309 posts from 223 unique accounts) earned a “HELPFUL” rating and ended up being displayed publicly to users. Although it’s impossible to know what percentage of content about the war this represents, the fact that trending topics related to the conflict have routinely involved hundreds of thousands or even millions of posts suggests that a few hundred posts is just a drop in the bucket. The visible notes were generally attached to popular posts — the 309 posts in question earned a combined total of 2,147,081 likes, an average of 6,948 likes per post. The majority of the posts that earned Community Notes (222 of 309 posts, 71.8%) came from paid X Premium/Twitter Blue subscribers, and the majority of the accounts posting them (147 of 223, 65.9%) are X Premium subscribers, who are potentially earning a share of X’s ad revenue based on the number of times their posts are seen and who therefore have a financial motive to never delete misleading content. (Overall, roughly 7% of posts that received Community Notes were deleted during the period studied, but there’s no reliable way of knowing how many of these posts were related to the Israel/Hamas war.)
Again, I really like the concept of Community Notes. I think it’s a very useful tool — and one example (of many) of trust & safety tools beyond simply “taking down” content. But it needs to be part of a wider strategy, not the only strategy. And the program can’t be set up with such a huge blind spot for culture war issues.
But, that’s exactly how things currently work, and it’s a shame, in part because I fear it’s going to discourage others from creating their own versions of Community Notes.
As you hopefully know by now, earlier this week we launched our new game, Trust & Safety Tycoon. It’s a free, browser-based game (playable on desktop or mobile, though we recommend desktop for the best experience) that puts you in the shoes of the head of trust and safety at a rapidly growing social media platform, where you must face all the difficult decisions that entails. As we did for our past games Moderator Mayhem and Startup Trail, Mike, myself, and our game design partner Randy Lubin of Leveraged Play sat down after the release to record a podcast episode all about the game, our design process, and the reaction from players so far.
Elon Musk keeps insisting that stopping spam bots is a huge priority. After all, he said he’d either stop them or die trying.
And, apparently, one way to try to stop spam is to stop allowing people to report spam. I’m guessing this is just a mistake on a site that has massive errors pretty much every day, but it’s still crazy. Apparently, ExTwitter no longer lets you actually report spam. If you try, it automatically misclassifies the report as something else, and then you get a notice that, because the content isn’t that something else, it remains online.
Spam problems solved!
I first spotted this over on Bluesky from user Q.H. Stone, who noted that if you now try to report spam on ExTwitter, rather than saying “we received your report for spam,” it instead says (incorrectly) “we received your report for sensitive media.”
Then, one of the three remaining trust & safety hamsters on the wheel looks at the spam, looks at the report, and says “this isn’t sensitive media,” so nothing to do here:
While I hid the user account in this case, I will note that (of course) it is paying for Twitter Blue, Twitter Verified, Verified, X Premium, so perhaps that’s why it’s not getting removed for spam. But, rest assured, all that account is doing is spamming people with crypto spam. But, apparently, it’s okay, because it’s not “sensitive media.”
Today we’re super excited to launch Trust & Safety Tycoon, a video game simulating what it’s like to run a trust & safety team at a fictitious, rapidly scaling social media company called Yapper*. If you’ve ever wanted to see how you’d do as the head of trust & safety, now is your chance.
Over the last few years we’ve spent a lot of time and effort trying to educate people on the concept of “trust & safety” at an internet platform, as it’s something that many people have strong opinions about, but very little direct experience with themselves. It’s why we ran the content moderation case study series for a while. It’s why we’ve written thousands of articles about content moderation and trust & safety here on Techdirt. And it’s why earlier this year, we created the Moderator Mayhem mobile game, to give people a taste of what it’s like to be a front line content moderator.
And now we’re launching Trust & Safety Tycoon. This browser-based game (which works on both mobile and desktop browsers) is a different kind of trust & safety simulator from the last game. While that game was about being a front line content moderator, this game is about actually running a trust & safety team for a rapidly scaling social media startup. You have to set policies, deal with various dilemmas, face internal and external pressures, weigh tradeoffs, determine resource allocation and more, all while trying to keep your website from descending into a cesspit of hate, driving away users and advertisers.
As with Moderator Mayhem, our goal with Trust & Safety Tycoon is to help more people better understand the kinds of dilemmas, nuances, and challenges of handling trust & safety these days. Too many people have very strong feelings about how it should work, despite having little experience in how things actually work.
Trust & Safety Tycoon gives people a chance to try out their own ideas on how to manage a trust & safety program at a startup and see how well it works. There is no “right” answer with any of this, but not all decisions will lead to positive outcomes for you, or the company you’re working for. How will you handle concerns from your team, the CEO, or the media? How will you handle a crisis when your team is already overwhelmed? Will you cause an international incident that gets you called to answer questions from Congress?
This game was developed with support from the Hewlett Foundation, and in association with the Atlantic Council’s recent Taskforce for a Trustworthy Future Web, which we spoke about earlier this year when its report on scaling trust came out. If you didn’t want to read that (excellent) 150 page report, think of this game as a somewhat more fun way of thinking through many of the same ideas.
The Director of the Democracy & Tech initiative at the Atlantic Council, Rose Jackson, told us, “As more people tune into questions around online harms, tech company decisionmaking, and how to keep the internet safe, this game is an invaluable resource to help people understand the tradeoffs at play. Anyone who cares about what they see online will benefit from this fun and thought-provoking experience.”
“We’re excited to support this innovative approach to helping people understand the hard trade-offs involved in trust and safety—not just content moderation, but also building tools, running global operations, and finding the diverse group of talented people needed to do all these things” said Eli Sugarman, director of the Hewlett Foundation’s Cyber Initiative. “Mike Masnick and his team, and the group of experts who advised them, have done a great job of creating a fun and engaging game about the challenges involved in keeping people safe online.”
As with our previous games, this game was a collaboration between our Copia Gaming effort and Randy Lubin at Leveraged Play.
* Any similarities to any real social media companies are purely coincidental entirely on purpose.
The trust and safety conversation tends to focus on the huge platforms, and the millions of smaller websites (some still quite big!) get ignored. But those websites have trust and safety needs too, and they use a lot of different tools to meet them. Most of these tools are proprietary, but there’s a growing push to build more open source tooling for the purpose, which was discussed by Derek Slater in a recent Atlantic Council report. This week, Derek joins us on the podcast to talk about the problems that open source trust and safety tools can solve.
I continue to be fascinated watching how the various decentralized protocol-based social media systems are evolving — in particular how they’re dealing with the challenges of content moderation. There was an interesting discussion recently on nostr over whether moderation is best handled by relays or clients*.
ActivityPub has, of course, continued to move forward with its systems of moderation handled at each instance level, combined with the threat of “defederation” being used to keep “bad” instances in line (or cut off from parts of the network). That’s worked surprisingly well in some cases, but is also facing a few challenges, as there have been complaints about some of the largest instances, and now that Meta is planning to release an ActivityPub-compatible offering, there’s a weird push to make some instances promise to defederate from any Meta offering immediately.
But, again, Bluesky may be where the most interesting discussions on decentralized trust & safety and moderation are happening. A few months ago, we wrote about their plans for decentralized composable moderation, and recently they released some thoughts on how you can handle moderation in a public commons.
The goal of Bluesky is to turn social media into a shared public commons. We don’t want to own people’s social graphs or communities. We want to be a tool that helps communities own and govern themselves.
The reason we focus on communities is that for an open commons to work, there needs to be some sort of structure that protects the people who participate. Safety can’t just be left up to each individual to deal with on their own. The burden this puts on people — especially those who are most vulnerable to online abuse and harassment — is too high. It also doesn’t mirror how things work in the real world: we form groups and communities so that we can help each other. The tooling we’re building for moderation tries to take into consideration how social spaces are formed and shaped through communities.
Somewhat importantly, they make it clear that they don’t have all the answers (no one does!), but it’s really interesting to see them discussing this openly, and publicly, and asking for thoughts and feedback as they move forward. To me, the thing that stands out is that the ideas that are presented obviously involved a lot of thought (to the point that I haven’t fully wrapped my head around some of the different proposals, some of which seem clever, while others may need a bit more baking before they fully make sense).
Historically, trust & safety and moderation decisions have come in two forms: handed down from on high in a centralized system, where little is discussed publicly and people are left trying to sort out what’s actually happening, or made in an entirely distributed manner, where things often spring up ad hoc out of need (see: Usenet killfiles) and often run into problems later on.
The Bluesky folks are trying to think through a more hybrid approach, in which the system itself is designed to enable communities to better manage things: not just one giant opaque centralized control bunker, but also not putting all the weight on users, which is unfair to many (especially the targets of abuse and harassment).
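As a rough sketch of what that could look like in practice, consider composable, label-based moderation: independent labeling services annotate content, and each user’s client decides what to do with those labels. The data structures and label values below are hypothetical, not the actual AT Protocol API:

```python
# A hypothetical sketch of composable moderation. Independent labelers
# annotate posts; the user's own preferences decide what happens next.
from dataclasses import dataclass, field

@dataclass
class Post:
    uri: str
    labels: set[str] = field(default_factory=set)  # merged from whichever
                                                   # labelers the user follows

@dataclass
class Preferences:
    hide: set[str]  # labels this user wants removed from their feed
    warn: set[str]  # labels this user wants shown behind a warning

def moderate(post: Post, prefs: Preferences) -> str:
    if post.labels & prefs.hide:
        return "hidden"
    if post.labels & prefs.warn:
        return "shown-with-warning"
    return "shown"
```

The point of the design is that two users can subscribe to the same labelers and still see different feeds, because the policy lives in each user’s preferences rather than in one central server.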
I think this kind of vision seems exactly the right one for an organization like Bluesky to have:
A company is an efficient structure for building out a cohesive vision of how things should work, but locking users into our systems would be antithetical to our mission. An open commons can’t be governed at the sole discretion of one global company. We offer services like professional moderators so that we can help protect people and provide a good experience, but we shouldn’t exert total control over everyone’s experience, for all time, with no alternative. Users should be able to walk away from us without walking away from their social lives.
The reason we’re building in decentralization is because we observed that business interests and the open web have a habit of coming into conflict. Third-party developers often get locked out. Moderation policies come into conflict with the diverse interests and needs of different groups of users. Ads push towards algorithms that optimize for engagement. It’s a systemic problem that keeps playing out as centralized social media companies rise and fall.
On Bluesky itself, the lead developer, Paul Frazee, noted that they view the future company as a potential adversary, and are designing accordingly. That, alone, is a fascinating perspective to have, and one that certainly makes sense in the age of enshittification. And, unlike the many companies that start on the open web and later come into conflict with it as they pull up the ladder behind them to protect a moat, Bluesky is trying to design its systems in a way that protects against its own future attempts at enshittification:
Even when things are working correctly on social platforms, there are weird dynamics caused by people’s relationships being mediated by a single company. The Internet is pretty obviously real life in the sense that its management has real-world consequences. When these places control our identities and our ability to connect and to make money, having no way out from the founding company is a precarious situation. The power difference is daunting.
The goal of Bluesky is to rebuild social networking so that there’s not a lock-in to the founding company, which is us. We can try to provide a cohesive, enjoyable experience, but there’s always an exit. Users can move their accounts to other providers. Developers can run their own connected infrastructure. Creators can keep access to their audiences. We hope this helps break the cycle of social media companies coming into conflict with the open web.
Now, some users point to the complex onboarding of Mastodon, or the “WTF how does any of this work?” nature of nostr, and worry that any decentralized/federated system has to be confusing. And that user unfriendliness, in some weird way, acts as a moderation tool in its own right, by keeping communities somewhat smaller. But it also keeps communities… smaller. So Bluesky has a different vision. A surprisingly refreshing and honest one:
A great experience should be simple to use. It shouldn’t be overly complex, and there should be sensible defaults and well-run entry points. If things are going well, the average user shouldn’t have to notice what parts are decentralized, or how many layers have come together to determine what they see. However, if conflict arises, there should be easy levers for individuals and communities to pull so that they can reconfigure their experience.
A great experience should recognize that toxicity is not driven only by bad actors. Good intentions can create runaway social behaviors that then create needless conflict. The network should include ways to downregulate behaviors – not just amplify them.
A great experience should respect the burden that community management can place on people. Someone who sets out to help protect others can quickly find themselves responsible for a number of difficult choices. The tooling that’s provided should take into account ways to help avoid burnout.
A great experience should find a balance between creating friendly spaces and over-policing each other. The impulse to protect can sometimes degrade into nitpicking. We should drive towards norms that feel natural and easy to observe.
A great experience should reflect the diversity of views within the network. Decisions that are subjective should be configurable. Moderation should not force the network into a monoculture.
Finally, a great experience should remember that social networking can be pleasant one day and harsh the next. There should be ways to react to sudden events or shifts in your mood. Sometimes you need a way to be online but not be 100% available.
There is no perfect content moderation solution out there. There is no whiz bang simple technical solution to the messiness that is human beings. As I’ve said many times, so many trust & safety dilemmas are really societal problems that we think are new or need to be solved by internet companies because they’re appearing through screens over the internet.
And, of course, nothing that Bluesky is working on may turn out to work, or matter. It’s still a small operation, and some of these ideas are completely untested. But, at the very least, it is presenting some pretty thoughtful ideas in an open way, and trying to think through the real consequences of what it’s creating here. And that, alone, is incredibly refreshing.
* The creator of nostr apparently does not believe moderation should happen at the client level, but when I asked him how relay operators could express their moderation rules, he suggested it didn’t matter since relays weren’t moderating anyway. Of course, since then I’ve noticed that nostr is being overrun with cryptocurrency spam, so at some point people there are going to realize that something needs to be done.