Clearview Celebrates 10 Billion Scraped Images Collected, Claims It Can Now Recognize Blurred, Masked Faces

from the getting-bigger-and-getting-worse dept

Clearview’s not going to let several months of bad press derail its plans to generate even more negative press. The facial recognition tech company that relies on billions of scraped images from the web to create its product is currently being sued in multiple states, has had its claims about investigative effectiveness repeatedly debunked, and, most recently, served (then rescinded) a subpoena to transparency advocacy group Open the Government demanding information on all of the group’s Clearview-related FOIA requests as well as its communications with journalists.

I don’t know what Clearview is doing now. Maybe it thinks it can still win hearts and minds by not only continuing to exist but also by getting progressively worse in terms of integrity and corporate responsibility. Whatever it is that Clearview’s doing to salvage its reputation looks to be, at best, counterproductive. I mean, the only way Clearview could get worse is by getting bigger, which is exactly what it’s done, according to this report by Will Knight for Wired.

The company’s cofounder and CEO, Hoan Ton-That, tells WIRED that Clearview has now collected more than 10 billion images from across the web—more than three times as many as has been previously reported.

Ton-That says the larger pool of photos means users, most often law enforcement, are more likely to find a match when searching for someone. He also claims the larger data set makes the company’s tool more accurate.

That’s one way of looking at it. Another way of looking at it — and by “it,” I mean Clearview’s unaudited, untested facial recognition AI — is that adding more hay increases the odds of someone grabbing some hay and thinking it’s actually a needle.
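
To put rough numbers on the haystack problem, here is a minimal sketch. It assumes (and this is an assumption, not anything Clearview has published) that every one-to-one comparison carries the same small false match rate p and that comparisons are independent. Under that assumption, the chance of at least one false match when searching a gallery of N faces is 1 - (1 - p)^N. The function name, the 1-in-10-billion rate, and the smaller 3 billion gallery are hypothetical values picked only for illustration; only the 10 billion figure comes from the reporting.

```python
import math


def prob_at_least_one_false_match(per_comparison_fmr: float, gallery_size: int) -> float:
    """Chance that a search returns at least one false match.

    Computes 1 - (1 - p)^N, assuming (hypothetically) that every comparison
    has the same false match rate p and that comparisons are independent.
    """
    # log1p/expm1 keep precision when p is tiny and N is enormous.
    return -math.expm1(gallery_size * math.log1p(-per_comparison_fmr))


# Hypothetical per-comparison false match rate: 1 in 10 billion.
ASSUMED_FMR = 1e-10

for gallery_size in (3_000_000_000, 10_000_000_000):
    chance = prob_at_least_one_false_match(ASSUMED_FMR, gallery_size)
    print(f"{gallery_size:>14,} images: {chance:.1%} chance of a false match")
```

Under these assumptions, a bigger gallery raises the odds that any given search surfaces a wrong face unless the per-comparison error rate falls at least as fast as the gallery grows. Real systems don't satisfy the independence assumption, so this is intuition, not a measurement of Clearview's product.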

Yet another way of looking at this is that Clearview’s mass scraping of every bit of publicly accessible web data it can reach may be legal, but it certainly isn’t morally acceptable. While people do largely understand that their public posts to sites can be accessed by nearly anyone, they certainly don’t expect someone to collect their photos and data in bulk, package it all up, and sell it to government agencies. And, in some states, this sort of activity may actually be illegal, hence the lawsuits being brought by government officials.

On top of the 10 billion images Clearview swears it will only sell to responsible adult government employees, the company is now claiming it can do some real CSI-type stuff with its tech.

Ton-That says it is developing new ways for police to find a person, including “deblur” and “mask removal” tools. The first takes a blurred image and sharpens it using machine learning to envision what a clearer picture would look like; the second tries to envision the covered part of a person’s face using machine learning models that fill in missing details of an image using a best guess based on statistical patterns found in other images.

If you feel selling government agencies a more efficient way to generate false positives and false negatives is the way to future profitability, this would be the route to take. Without a doubt, tech advances will eventually make this more accurate, but rolling out unproven machine learning on top of unproven AI is only going to compound errors. Then there’s the bias problem, which has been a problem for all facial recognition software. That includes those that have been independently tested and examined by the National Institute of Standards and Technology (NIST). Notably, Clearview has yet to subject its AI to outside testing.

Finally, there’s this statement from Clearview CEO Ton-That:

The company says it is not currently pitching the technology outside of the US or to private industry. “We’re focusing on the United States, because we want to get it right here,” Ton-That says. “We never want this to be abused in any way.”

Whether or not this statement about its current potential customer list is true remains to be seen. Clearview has already pitched its product to private companies and foreign governments. And it appears to have exited one foreign market solely because its product was declared illegal following a government investigation.

And claiming that the company does not want its product “abused in any way” directly contradicts the stuff it says to entities it wants to sell its product to. Emails from the company’s marketing staff encouraged potential law enforcement customers (as well as the occasional billionaire) to “experiment” with the software by running searches on friends, family members, and others who never consented to be part of multiple Clearview test drives.

Is Clearview the worst AI product out there? In terms of accuracy, who knows? It hasn’t been independently reviewed. In terms of everything else, there’s really nothing out there that competes with it. The company’s nonchalant conversion of the open web into a surveillance tool sets it apart from the competition. Its latest “advances” aren’t going to do anything to rehabilitate its reputation.

Companies: clearview


Comments on “Clearview Celebrates 10 Billion Scraped Images Collected, Claims It Can Now Recognize Blurred, Masked Faces”

26 Comments
Ian Williams says:

Odd thought: could their scraping websites for photos be a copyright infringement? Photos do have copyright attached, and while Facebook et al. have permissive licenses, they have not actually then sublicensed the scraped images to Clearview, who arguably then creates a commercial derivative work from them, in the form of their face maps.

scote says:

Machine learning makes s*** up.

You can’t use machine learning for legitimate forensic purposes.

Machine learning doesn’t recover detail, it doesn’t unblur faces, it doesn’t magically show the face beneath the mask. Instead, it invents plausible photorealistic detail. It’s literally a computer program faking evidence in an utterly and dangerously convincing way, and almost certainly doing so in a very biased way.

Scary Devil Monastery says:

Re: Machine learning makes s*** up.

"It’s literally a computer program faking evidence in an utterly and dangerously convincing way, and almost certainly doing so in a very biased way."

…and this is why, given sharp contrasts and shadows to work with in a controlled environment, a computer’s best guess can be close enough to identify a white person in good light but can’t tell the difference between Prince and Oprah Winfrey in any kind of light – let alone Mr. John Doe of African-American descent from a mugshot of Bin Laden.

Honestly, everywhere but the US, Clearview would, by now, have been asked to produce credible evidence for their claim or get hit by serious sanctions for false advertising. But American law does love to incentivize the snake oil salesman brand of conmanship…

Anonymous says:

The ONLY thing Clearview recognizes is planting fake evidence in their database because they want "black people" to pay for crimes even if there is no evidence other than "a gut feeling".

That’s how they claim they can essentially "see through" masks magically…

You can guarantee 100% of the time it’ll be "the black guy did it... they’re all criminals anyway"

Scary Devil Monastery says:

Re:

"They want a "recognition" database 1) for the money…"

Full stop, right there. There is money in peddling the snake oil of facial recognition as the next big brand line of forensic quackery. And the amount of money available grows in direct proportion to how little fact-checking of the method’s accuracy there is.

If Clearview thought they could get away with it, they’d make any pitch they thought could get any vested interest, from law enforcement to private kindergartens, to buy some Clearview miracle medicine.
