Fan Uses AI Software To Lipread What Actors Really Said In TV Series Before Chinese Authorities Censored Them

from the I-saw-what-you-said-there dept

It’s hardly news to Techdirt readers that China carries out censorship on a massive scale. What may be more surprising is that its censorship extends even to the most innocuous aspects of life. The ChinAI Newsletter, which provides Jeff Ding’s translations of interesting texts from the world of Chinese AI, flags up one such case. It concerns a Chinese online TV series called “The Bad Kids”. Here’s how the site Sixth Tone describes it:

Since its first episodes were released on China’s Netflix-like video platform iQiyi in mid-June, “The Bad Kids” has earned sweeping praise for its plot, cinematography, casting, dialogue, pacing, and soundtrack. It’s also generated wide-ranging online discussion on human nature due to the psychology and complex motivations of its characters.

However, as the Sixth Tone article points out, the authorities required “a lot of changes” for the series to be approved. One fan of “The Bad Kids”, Eury Chen, wanted to find out what exactly had been changed, and why that might be. In a blog post translated by ChinAI, Chen explained how he went about this:

Two days ago, I watched the TV series “The Bad Kids” in one go, and the plot was quite exciting. The disadvantage is that in order for the series to pass the review (of the National Radio and Television Administration), the edited sequences for episodes 11 and 12 were disrupted, even to the point that lines were modified, so that there are several places in the film where the actor’s mouth movements and lines are not matched, which makes the plot confusing to people. Therefore, I tried to restore the modified lines through artificial intelligence technology, thereby restoring some of the original plot, which contained a darker truth.

The AI technology involved Google’s Facemesh package, which can track key “landmarks” on faces in images and videos. By analyzing the lip movements frame by frame, it is possible to predict the sound of each Chinese syllable. However, one particular problem makes Chinese hard to lipread with AI: the language has many homophones (same sound, different meanings). To get around this, Chen explored the possible sequences of Chinese characters and picked the ones that best matched the plot at that point. As his blog post (and the ChinAI translation) explains, this allowed him to work out why certain lines were blocked by the Chinese authorities. It turns out it was for totally petty reasons.
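The two stages described above can be sketched in miniature. This is an illustrative sketch only, not Chen’s actual code: stage one reduces tracked lip landmarks (the kind of (x, y) points Facemesh produces) to a simple geometric feature, the mouth aspect ratio, which a viseme or syllable classifier could consume; stage two stands in for the human step of choosing among homophone candidates by scoring each candidate phrase against words from the surrounding plot context. The function names, landmark coordinates, and candidate phrases are all made up for the example.

```python
# Hedged sketch of the two-stage approach: geometric lip features,
# then context-based homophone disambiguation. All names and data
# here are illustrative, not taken from Chen's blog post.

from math import dist


def mouth_aspect_ratio(top, bottom, left, right):
    """Vertical mouth opening divided by mouth width, computed from
    four (x, y) lip landmarks such as those a face-mesh tracker
    provides. Larger values mean a more open mouth."""
    return dist(top, bottom) / dist(left, right)


def pick_homophone(candidates, context_words):
    """Choose the candidate phrase that shares the most context words
    with the surrounding plot (a crude stand-in for the human
    judgment Chen applied to the AI's homophone list)."""
    def score(phrase):
        return sum(word in phrase for word in context_words)
    return max(candidates, key=score)


# Two toy frames of lip landmarks: (top, bottom, left, right).
frames = [
    ((0.50, 0.40), (0.50, 0.46), (0.42, 0.43), (0.58, 0.43)),   # mouth open
    ((0.50, 0.42), (0.50, 0.43), (0.42, 0.43), (0.58, 0.43)),   # nearly closed
]
ratios = [mouth_aspect_ratio(*f) for f in frames]
print(ratios[0] > ratios[1])  # the open-mouth frame scores higher

# Toy homophone choice: two phrases with the same pronunciation
# ("ta zai nali", "where is he/she") but different characters.
print(pick_homophone(["他在哪里", "她在哪里"], ["她", "妈妈"]))
```

In Chen’s real workflow the feature stage fed a predictor over Chinese syllables and the disambiguation stage was done by a person weighing candidates against the plot, but the shape of the pipeline is the same: geometry in, ranked character sequences out.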

Perhaps more interesting than the details of this particular case is the fact that AI could carry out most of the lipreading, leaving human knowledge only to choose among the list of candidate Chinese phrases. Most languages don’t require that extra stage, since few have as many homophones as Chinese does. Indeed, for English phrases, researchers claimed as far back as 2016 that their AI-based LipNet achieved “95.2% accuracy in sentence-level, overlapped speaker split task, outperforming experienced human lipreaders”.

It’s clear that we are fast approaching a situation where AI is able to lipread a video in any language. That is obviously a boon for the deaf or hard of hearing, but there’s a serious downside. It means that soon all those millions of high-quality CCTV systems around the world will not only be able to use facial recognition software to work out who we are, but also run AI modules to lipread what we are saying.

Follow me @glynmoody on Twitter, Diaspora, or Mastodon.



Comments on “Fan Uses AI Software To Lipread What Actors Really Said In TV Series Before Chinese Authorities Censored Them”

30 Comments
Anonymous Coward says:

Re: Re: Re:

Why bother? If they dubbed out the prohibited lines as poorly as described, the logical next step is that next time they poorly dub out the lines and apply an equally crude visual censor to the actor’s mouth – the classic black bar used for nudity, or the hyper-pixelization used to obscure a face.

Scary Devil Monastery (profile) says:

Re: Re: Re: Re:

"Why bother? If they dubbed out the prohibited lines as poorly as described…"

With the Soviets you could sort of predict that the propaganda and censorship would be heavy-handed to the point of absurdity. With China it’s a bit more hit and miss – some of their PR stunts grow from the midden of the blunt and obvious old-school copypasta of political manifestos – and some of their more subtle PR is good enough gaslighting to convince even the more sceptical for a while.

Anonymous Coward says:

Re: Curiously petty censorship is a problem here in the states

Many of our departments lack consistent standards for declassification (and it follows, for classification).

Perhaps. It’s also quite possible they do have consistent standards, but like pretty much all other sets of standards for moderation, different people with access to the same training and information will still interpret the same set of standards in wildly different and yet still reasonable ways.


Rekrul says:

Most languages don’t require that extra stage, since they rarely have the same number of homophones that Chinese does. Indeed, for English phrases, researchers already claimed in 2016 that their AI-based LipNet achieved "95.2% accuracy in sentence-level, overlapped speaker split task, outperforming experienced human lipreaders".

So the AI can do what 90% of the people posting on the internet can’t?

Because it seems that absolutely nobody today knows the difference between "your" and "you’re", or "its" and "it’s". Hell, they don’t even know the difference between "there", "their" and "they’re" most of the time, or that "of" isn’t a replacement for "have" when it follows a word ending in "ould".

Uriel-238 (profile) says:

Re: Common homophones

Some advanced word-processing and copy editing software will scan for homophones and highlight the ones whose meanings seem to be inconsistent with context.

So yes, this is a thing AI can do better than humans, but still a far cry from perfect.

Maybe as we teach kids how to talk to their personal digital assistant software (e.g. Siri, Cortana, Google Assistant), the language will develop towards something computers can more easily understand.
