Unfortunately, impaired driving isn't going away. And neither are the tools cops like well enough to deploy in the field, but apparently not well enough to engage in routine maintenance or periodic quality control testing. This is already a problem for citizens, who can find themselves behind bars if the testing equipment is faulty. The problem is only going to get worse as marijuana legalization spreads to more states.
There's currently no field test equipment that detects marijuana impairment. A company in California thinks it has a solution.
By mid-2020, Hound Laboratories plans to begin selling what it says is the world’s first dual alcohol-marijuana breath analyzer, which founder Dr. Mike Lynn says can test whether a user has ingested THC of any kind in the past two to three hours.
“We’re allowed to have this in our bodies,” Lynn said of marijuana, which became legal to use recreationally in California in 2018. “But the tools to differentiate somebody who’s impaired from somebody who’s not don’t exist.”
We won't know if these claims are true until the testing equipment is deployed. And even then, we still won't know if the machines are accurate or the drivers they catch are actually impaired. Marijuana doesn't work like alcohol, so impairment levels vary from person to person. In addition, there's no baseline for impairment like there is for alcohol. That will have to be sorted out by state legislatures before officers can begin to claim someone is "impaired" just because the equipment has detected THC. At this point, the tech pitched by Hound Labs only provides a yes/no answer.
There's a very good chance this new tech will go live before the important details -- the ones safeguarding people's rights and freedoms -- are worked out. The founder of Hound Labs is also a reserve deputy for the Alameda County Sheriff's Office. And it's this agency that's been test driving the weedalyzer.
The Alameda County Sheriff's Office agreed to test the Hound Breathalyzer in the field.
"What we've seen trending with the addition of the legalization of cannabis in California is that we are coming across more and more marijuana-impaired drivers," said Alameda County Sheriff spokesperson Sgt. Ray Kelly.
"It's not hard to determine if there is THC on someone's breath if they have been smoking it," Kelly said. "It's when they're ingesting it through edibles, which have become much more popular. That's extremely valuable to law enforcement."
These tests are completely voluntary and drivers who submit to them won't be criminally charged even if the device says they're under the influence. But in a few months -- if everyone agrees they're good enough to be used on civilians -- the tests will no longer be voluntary and the consequences will be very real.
Impaired driving that doesn't involve alcohol is going to increase with the legalization of marijuana. But this new tech should be greeted with the proper amount of skepticism. Breathalyzers that detect alcohol have been around for decades and are still far from perfect. A new device that promises to detect recent marijuana use just because researchers say consumption can be detected for up to three hours shouldn't be treated as a solution.
The device is stepping into a legal and legislative void with no established baseline for marijuana "intoxication." It can only say it does or does not detect THC in a person's breath. It can't determine whether the amount is a little or a lot, and no one has any guidance stating how much of a THC concentration should be considered impairing or illegal. But it's pretty much a given these will hit the roads before the law is ready for them, and that should concern drivers in every state where marijuana is legal.
So Pichai's comments to CNN shouldn't be seen as controversial, so much as they are explaining how large numbers work:
"It's one of those things in which let's say we are getting it right over 99% of the time. You'll still be able to find examples. Our goal is to take that to a very, very small percentage, well below 1%," he added.
This shouldn't be that complex. YouTube's most recent stats say that over 500 hours of content are uploaded to YouTube every minute. Assuming, conservatively, that the average YouTube video is 5 minutes (Comscore recently put the number at 4.4 minutes per video) that means around 6,000 videos uploaded every minute. That means about 8.6 million videos per day. And somewhere in the range of 250 million new videos in a month. Now, let's say that Google is actually 99.99% "accurate" (again, a non-existent and impossible standard) in its content moderation efforts. That would still mean ~26,000 "mistakes" in a month. And, I'm sure, eventually some people could come along and find 100 to 200 of those mistakes and make a big story out of how "bad" Google/YouTube are at moderating. But, the issue is not so much the quality of moderation, but the large numbers.
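For anyone who wants to check the math, here's the back-of-the-envelope calculation. The five-minute average video length and the hypothetical 99.99% accuracy are the same assumptions used above, not real YouTube figures:

```python
# Back-of-the-envelope math on YouTube moderation "mistakes."
# Assumptions (from the paragraph above): 500 hours uploaded per minute,
# an average video length of 5 minutes, and a hypothetical 99.99% accuracy.

HOURS_UPLOADED_PER_MINUTE = 500
AVG_VIDEO_LENGTH_MINUTES = 5
ACCURACY = 0.9999  # deliberately generous; no real system hits this

videos_per_minute = (HOURS_UPLOADED_PER_MINUTE * 60) / AVG_VIDEO_LENGTH_MINUTES
videos_per_day = videos_per_minute * 60 * 24
videos_per_month = videos_per_day * 30

mistakes_per_month = videos_per_month * (1 - ACCURACY)

print(f"Videos per minute: {videos_per_minute:,.0f}")   # ~6,000
print(f"Videos per day:    {videos_per_day:,.0f}")      # ~8.6 million
print(f"Videos per month:  {videos_per_month:,.0f}")    # ~259 million
print(f"Mistakes per month at 99.99% accuracy: {mistakes_per_month:,.0f}")  # ~26,000
```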
Anyway, that all seems fairly straightforward, but of course, because it's Google, nothing is straightforward, and CNBC decided to take this story and spin it hyperbolically as "Google CEO Sundar Pichai: YouTube is too big to fix." That, of course, is not what he's saying at all. But, of course, it's already being picked up on by various folks to prove that Google is obviously too big and needs to be broken up.
Of course, what no one will actually discuss is how you would solve this problem of the law of large numbers. You can break up Google, sure, but unless you think that consumers will suddenly shift so that not too many of them use any particular video platform, whatever leading video platforms there are will always have this general challenge. The issue is not that YouTube is "too big to fix," but simply that any platform with that much content is going to make some moderation mistakes -- and, with so much content, in absolute terms, even if the moderation efforts are pretty "accurate" you'll still find a ton of those mistakes.
I've long argued that a better solution is for these companies to open up their platforms to allow user empowerment and competition at the filtering level, so that various 3rd parties could effectively "compete" to see who's better at moderating (and to allow end users to opt-in to what kind of moderation they want), but that's got nothing to do with a platform being "too big" or needing "fixing." It's a recognition that -- as stated at the outset -- there is no "right" way to moderate content, and no one will agree on what's proper. In such a world, having a single standard will never make sense, so we might as well have many competing ones. But it's hard to see how that's a problem of being "too big."
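To make the idea a bit more concrete, here's a minimal sketch of what user-selected, third-party moderation might look like. The filter names and interface here are entirely hypothetical, not anything any platform actually offers:

```python
# Hypothetical sketch: a platform exposes a hook where each user picks which
# third-party moderation filter (if any) gets applied to their own feed.
from typing import Callable, Dict, List

# A "filter" is just a function that decides whether to show a post.
ModerationFilter = Callable[[str], bool]

FILTER_REGISTRY: Dict[str, ModerationFilter] = {
    "none": lambda post: True,                                        # show everything
    "strict": lambda post: "slur" not in post.lower(),                # toy strictness rule
    "family": lambda post: not any(w in post.lower() for w in ("gore", "slur")),
}

def render_feed(posts: List[str], chosen_filter: str = "none") -> List[str]:
    """Apply whichever filter the *user* opted into, not a single global standard."""
    moderate = FILTER_REGISTRY[chosen_filter]
    return [p for p in posts if moderate(p)]

posts = ["cute dog video", "post containing a slur", "gore compilation"]
print(render_feed(posts, "none"))    # a user who wants everything
print(render_feed(posts, "family"))  # a user who opted into a stricter third-party filter
```

The point of the sketch is only that the moderation decision moves out of the platform's hands and into a layer users (and competing filter providers) control.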
Three Freedom of Information requests sent to the Metro Police last year returned documents showing its tech was reporting nothing but false positives. The first response reported a 98% failure rate. A follow-up request generated an admission of a 100% failure rate by the Metro's tech. Now another set of FOI requests has gathered more data from the Metro Police and it appears past reports of consistent failure were pretty indicative of future results.
Facial recognition technology used by London’s Metropolitan Police incorrectly identified members of the public in 96 per cent of matches made between 2016 and 2018.
Biometric photos of members of the public were wrongly identified as potential criminals during eight incidents across the two-year period, Freedom of Information (FoI) requests have revealed.
This may be a small sample size, but it was enough to subject a 14-year-old student to a police stop after the facial recognition software mistook him for a criminal.
The Metro Police are continuing to use the tech despite its relative uselessness. The Met does claim its deployments over the last couple of years have led to eight arrests, but it needs far more than that to offset the system's apparent desire to see the innocent punished.
As the Metro Police continues beta testing its tech on the general public, it's continuing to amass a collection of non-criminal faces in its facial recognition database. This has drawn some attention from Parliament members who have called this "unacceptable." There has been some improvement in one area since the last time the Metro Police were queried about its facial recognition tech. It used to hold onto all images for a year. Now, it only holds watchlist images for 30 days and deletes all non-hit images immediately.
Unfortunately, this spectacular run of failure hasn't moved Parliament to, you know, discourage use of the tech. And it appears those who publicly refuse the privilege of being misidentified as a criminal will have their complaints addressed by being turned into criminals.
In one incident, a 14 year-old black child in school uniform was stopped and fingerprinted by police after being misidentified by the technology, while a man was fined for objecting to his face being scanned on a separate occasion.
Problem solved. The system is only interested in criminals and only criminals would object to having their faces scanned by the Metro's faulty tech. Self-fulfilling prophecies are just another undocumented feature.
While most of these failures center around ContentID's inability to discern actual intellectual property infringement and its avenues for abuse, YouTube's algorithms can't even suss out more grave occurrences, such as child exploitation. This became apparent recently when multiple Pokemon streamers had their accounts nuked due to discussions about child pornography that never occurred.
A trio of popular Pokemon YouTubers were among the accounts wrongly banned by Google over the weekend for being involved in “activity that sexualises minors”.
As the BBC report, Mystic7, Trainer Tips and Marksman all found their accounts removed not long after uploading footage of themselves playing Pokemon GO.
It’s believed the error occurred thanks to their video’s continued use of the term “CP”, which in Pokemon GO refers to “Combat Points”, but which YouTube’s algorithm assumed was “Child Pornography”.
That's pretty stupid, and banning an entire Google account based on the use of an acronym certainly ought to have come with a review by an actual human being. That human would have immediately understood the context of the use of "CP" in a way the automated system apparently could not. And, to be clear, this wasn't a YouTube ban. It was the elimination of each streamer's entire Google account, email and all.
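As a toy illustration of the failure mode (the keyword rule below is hypothetical, not YouTube's actual system), a context-blind match on a bare acronym flags a Pokemon GO video just as readily as the content it's meant to catch, while even a crude context check would not:

```python
# Toy example of a context-blind keyword flag.
import re

def naive_flag(text: str) -> bool:
    # Flags any standalone "CP", regardless of what it means in context.
    return bool(re.search(r"\bCP\b", text))

def context_aware_flag(text: str) -> bool:
    # A human (or a slightly smarter system) would notice the gaming context.
    gaming_context = any(t in text.lower() for t in ("pokemon", "combat points", "raid"))
    return naive_flag(text) and not gaming_context

video_title = "Pokemon GO: hunting a 4000 CP Mewtwo raid!"
print(naive_flag(video_title))          # True  -> account nuked
print(context_aware_flag(video_title))  # False -> no action taken
```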
Now, once the backlash ensued, Google restored their accounts, but that simply isn't good enough. As there is more and more pressure to ramp up automated policing of the internet, at some point everyone pushing for those solutions needs to realize that the technology just isn't any good.
Using a baseline of 10 million pieces of content, an accuracy level far higher than reality (99.5%), and an assumption that 1 in 10,000 items are "bad" (i.e., "infringing"), you end up with a ton of legitimate content taken down to stop just a bit of infringement:
So basically in an effort to stop 1,000 pieces of infringing content, you'd end up pulling down 50,000 pieces of legitimate content. And that's with an incredible (and unbelievable) 99.5% accuracy rate. Drop the accuracy rate to a still optimistic 90%, and the results are even more stark:
Now we're talking about pulling down one million legitimate, non-infringing pieces of content in pursuit of just 1,000 infringing ones (many of which the system still misses).
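Here's the same calculation written out as a quick sketch, using the figures above (10 million items, a 1-in-10,000 infringement rate, and the two accuracy levels):

```python
# Sketch of the filter math described above: how many legitimate items get
# pulled down versus how many infringing items actually get caught.

def filter_outcomes(total_items, infringing_rate, accuracy):
    infringing = total_items * infringing_rate
    legitimate = total_items - infringing
    caught = infringing * accuracy                  # infringing items correctly removed
    missed = infringing * (1 - accuracy)            # infringing items that slip through
    false_positives = legitimate * (1 - accuracy)   # legitimate items wrongly removed
    return caught, missed, false_positives

TOTAL = 10_000_000

for accuracy in (0.995, 0.90):
    caught, missed, fps = filter_outcomes(TOTAL, 1 / 10_000, accuracy)
    print(f"accuracy={accuracy:.1%}: caught {caught:,.0f}, missed {missed:,.0f}, "
          f"wrongly removed {fps:,.0f} legitimate items")

# accuracy=99.5%: caught ~995, missed ~5, wrongly removed ~50,000 legitimate items
# accuracy=90.0%: caught ~900, missed ~100, wrongly removed ~1,000,000 legitimate items
```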
Of course, I can hear the howls from the usual crew, complaining that the 1 in 10,000 number is unrealistic (it's not). Lots of folks in the legacy copyright industries want to pretend that the only reason people use big platforms like YouTube and Facebook is to upload infringing material, but that's laughably wrong. It's actually a very, very small percentage of such content. And, remember, of course, Article 13 will apply to basically any platform that hosts content, even ones that are rarely used for infringement.
But, just to humor those who think infringement is a lot more widespread than it really is, Muffett also ran the emulator with a scenario in which 1 out of every 500 pieces of content are infringing and (a still impossible) 98.5% accuracy. It's still a disaster:
In that totally unrealistic scenario with a lot more infringement than is actually happening and with accuracy rates way above reality, you still end up pulling down 150,000 non-infringing items... just to stop less than 20,000 infringing pieces of content.
Indeed, Muffett then figures out that with a 98.5% accuracy rate, if a platform has 1 in 67 items as infringing, at that point you'll "break even" in terms of the numbers of non-infringing content (147,000) that is caught by the filter, to catch an equivalent amount of infringing content. But that still means censoring nearly 150,000 pieces of non-infringing content.
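The break-even point Muffett identifies falls out of the same arithmetic: false positives equal true catches roughly when the share of infringing content matches the filter's error rate. A quick check, under the same assumptions (10 million items, 98.5% accuracy):

```python
# Break-even check: with 98.5% accuracy, false positives roughly equal true
# catches when about 1 item in 67 is infringing (prevalence ~= the 1.5% error rate).

TOTAL = 10_000_000
ACCURACY = 0.985

prevalence = 1 / 67
infringing = TOTAL * prevalence
legitimate = TOTAL - infringing

caught = infringing * ACCURACY                  # ~147,000 infringing items caught
false_positives = legitimate * (1 - ACCURACY)   # ~148,000 legitimate items removed

print(f"caught {caught:,.0f} infringing vs. {false_positives:,.0f} legitimate items removed")

# Algebraically: break-even when p * a == (1 - p) * (1 - a), which simplifies to
# p = 1 - a. With a = 0.985 that's p = 0.015, i.e. about 1 item in 67.
```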
This is one of the major problems that people don't seem to comprehend when they talk about filtering (or even human moderating) content at scale. Even at impossibly high accuracy rates, a "small" percentage of false positives leads to a massive amount of non-infringing content being taken offline.
Perhaps some people feel that this is acceptable "collateral damage" to deal with the relatively small amount of infringement on various platforms, but to deny that it will create widespread censorship of legitimate and non-infringing content is to deny reality.
Alcotest, made by German medical tech company Draeger, is used by a large number of US law enforcement agencies. Challenges to test results led to Draeger turning code over to defense attorneys, who soon discovered a lot of variables affected breath tests -- many of which weren't addressed by the device's software or default settings used by officers. Zack Whittaker at ZDNet has the full report:
One attorney, who read the report, said they believed the report showed the breathalyzer "tipped the scales" in favor of prosecutors, and against drivers.
One section in the report raised issue with a lack of adjustment of a person's breath temperature.
Breath temperature can fluctuate throughout the day, but, according to the report, can also wildly change the results of an alcohol breath test. Without correction, a single degree over a normal breath temperature of 34 degrees centigrade can inflate the results by six percent -- enough to push a person over the limit.
The quadratic formula set by the Washington State Patrol should correct the breath temperature to prevent false results. The quadratic formula corrects warmer breath downward, said the report, but the code doesn't explain how the corrections are made. The corrections "may be insufficient" if the formula is faulty, the report added.
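To see how much a six percent inflation can matter in practice, here's a purely illustrative calculation -- the 0.08 figure is the common US per se limit, and the example reading is made up, not a value from the report:

```python
# Illustrative only: how an uncorrected 6%-per-degree inflation could push a
# driver who is genuinely under the limit over it.

LEGAL_LIMIT = 0.08
INFLATION_PER_DEGREE = 0.06   # six percent per degree above 34 C, per the report

true_bac = 0.077              # hypothetical reading at the assumed 34 C breath temperature
degrees_over = 1              # driver's breath is at 35 C instead

reported = true_bac * (1 + INFLATION_PER_DEGREE * degrees_over)
print(f"reported: {reported:.4f}, over the limit: {reported > LEGAL_LIMIT}")
# reported: 0.0816, over the limit: True
```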
The Washington State Patrol, whose device/software was being examined in this case, said it did not install the breath temp component. That eliminates one questionable variable in this case. Other law enforcement agencies may have installed the component without realizing it could result in false positives. But it's far from the only variable affecting test results the examination of Draeger's software uncovered. The Washington State Patrol also disabled another feature that might have prevented false positives.
The code is also meant to check to ensure the device is operating within a certain temperature range set by Draeger, because the device can produce incorrect results if it's too hot or too cold.
But the report said a check meant to measure the ambient temperature was disabled in the state configuration.
"The unit could record a result even when outside of its operational requirements," said the report. If the breathalyzer was too warm, the printed-out results would give no indication the test might be invalid, the report said.
The State Patrol was more equivocal in its repudiation of this finding. It said it had been "tested and validated in various ambient temperatures." Draeger itself insisted the unit will not produce readings if the device is operating outside of recommended temperature ranges.
The report also noted there appeared to be no steps taken to counteract normal wear-and-tear. The fuel cell used to measure alcohol levels decays over time -- decay that's accelerated by frequent use (sobriety checkpoints, for instance). This can also affect test results if the decay isn't factored in. Draeger says its devices should be re-calibrated every year. The Washington State Patrol requires only one recalibration, six months into the device's lifespan.
Challenges against the device's test results have occurred in other states. Massachusetts -- a state where substance abuse-related evidence has never been more unreliable -- hosted one legal battle over the devices' reliability. A ruling in 2014 declared test results obtained over the previous two years "presumptively unreliable" after it was discovered that only two of the state's 392 breathalyzers had ever been properly calibrated.
This battle between critics of the devices and their deployment methods (untested, uncalibrated) and a judicial system that still insists the devices are reliable enough has gone on for most of a decade. Added to the mix is Draeger's own legal action. This preliminary report, distributed to defense lawyers at a conference last year, was the subject of a cease-and-desist letter from Draeger, which claimed the report violated a protective order it had obtained from a US court prohibiting the distribution of its source code. But no source code was distributed, and the C&D appears to be Draeger attempting to prevent questions about its device's reliability from spreading further than a handful of court cases. And in those legal challenges, Draeger has been able to keep discussion of its devices and software under wraps via injunctions.
While the report's authors claim the report is still in its preliminary stages and should not be considered the final word on breathalyzer reliability, this initial examination doesn't suggest deeper digging will find a more reliable machine underneath the surface-layer flaws.
The latest example of widespread failure comes from the UK, where the government's embrace of surveillance equipment far exceeds that of the United States. Matt Burgess of Wired obtained documents detailing the South Wales Police's deployment of automated facial recognition software. What's shown in the FOI docs should worry everyone who isn't part of UK law enforcement. (It should worry law enforcement as well, but strangely does not seem to bother them.)
During the UEFA Champions League Final week in Wales last June, when the facial recognition cameras were used for the first time, there were 2,470 alerts of possible matches from the automated system. Of these 2,297 turned out to be false positives and 173 were correctly identified – 92 per cent of matches were incorrect.
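Those figures work out to an overwhelming false-positive share; here's the quick division, using only the numbers from the quote above:

```python
# False-positive math for the Champions League Final deployment.
alerts = 2470                       # "possible match" alerts from the automated system
correct = 173                       # matches that identified the right person
false_positives = alerts - correct  # 2,297

print(f"false positives: {false_positives:,}")
print(f"share of alerts that were wrong: {false_positives / alerts:.1%}")  # roughly nine in ten
print(f"share of alerts that were right: {correct / alerts:.1%}")          # about 7 per cent
```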
That's the most gaudy number returned in response to the records request. But the other numbers -- even though they contain smaller sample sets -- are just as terrible, as the deployment-by-deployment table in the South Wales Police FOI response [PDF] shows.
In all but three cases, false positives outnumbered correct hits. (And in one of those cases, it was a 0-0 tie.) The police blame the 2,300 false positives on garbage intake.
A spokesperson for the force blamed the low quality of images in its database and the fact that it was the first time the system had been used.
The company behind the tech insists this is an end user problem.
The company behind the facial recognition system, NEC, told ZDNet last year that large watchlists lead to a high number of false positives.
And it illustrates this with a highly-questionable analogy.
"We don't notice it, we don't see millions of people in one shot ... but how many times have people walked down the street following somebody that they thought was somebody they knew, only to find it isn't that person?" NEC Europe head of Global Face Recognition Solutions Chris de Silva told ZDNet in October.
I think most people who see someone they think they know might wave or say "Hi," but only the weirdest will follow them around attempting to determine if they are who they think they are. Even if everyone's a proto-stalker like NEC's front man seems to think, the worst that could happen is an awkward (and short) conversation. The worst case scenario for false positives triggered by law enforcement software is some time in jail and an arrest record. The personal stake for citizens wrongly identified is not even comparable using de Silva's analogy.
If large watchlists are the problem, UK law enforcement is actively seeking to make it worse. Wired reports the South Wales Police are looking forward to adding the Police National Database (19 million images) to its watchlist, along with others like drivers license data stores.
No matter what the real issue is here, the South Wales Police believe there are no adverse effects to rolling out facial recognition tech that's wrong far more often than it's right. It states it has yet to perform a false arrest based on bogus hits, but its privacy assessment shows it's not all that concerned about the people swept up by poorly-performing software.
South Wales Police, in its privacy assessment of the technology, says it is a "significant advantage" that no "co-operation" is required from a person.
Sure, it's an "advantage," but one that solely serves law enforcement. It allows them to gather garbage images and run them against watchlists while hoping the false hits won't result in the violation of an innocent person's rights. But that's all they have: hope. The tech isn't ready for deployment. But it has been deployed and UK citizens are the beta testing group.
So, it will come as an unpleasant non-surprise that Axon (Taser's body cam spinoff) is looking to add facial recognition tech to cameras officers are supposed to deploy only in certain circumstances. This addition will repurpose them into always-on surveillance devices, gathering up faces with the same efficiency as their automated license plate readers. False positives will continue to be a problem and deployment will scale far faster than tech advancements.
UPDATE: Axon apparently takes issue with the final paragraph of this post. It has demanded a correction to remove an unspecified "error" and to smooth the corners off some "bold claims." Here's Axon's full statement:
At this point in time, we are not working on facial recognition technology to be deployed on body cameras. While we do see the value in this future capability, we also appreciate the concerns around privacy rights and the risks associated with misidentification of individuals. Accordingly, we have chosen to first form an AI Ethics Board to help ensure we balance both the risks and the benefits of deploying this technology. At Axon we are committed to ensuring that the technology we develop makes the world a better, and a safer place.
If there's anything to be disputed in the last paragraph of the post, it might be "looking to add facial recognition tech to its cameras." But more than one source (including the one linked in the paragraph) makes the same claim about Axon looking at the possibility of adding this tech to its body camera line, so while Axon may not be currently working on it, it appears to be something it is considering. The addition of an ethics board is certainly the right way to approach this issue and its privacy concerns, but Axon's statement does not actually dispute the assertions I made in the post.
As for the rest of the paragraph, I will clarify that I did not mean Axon specifically will push for body cameras to become ALPRs, but for faces. Axon likely won't. But police departments will. If the tech is present, it will be used. And history shows the tech will be deployed aggressively under minimal oversight, with apologies and policies appearing only after some damage has been done. To be certain, accuracy will improve as time goes on. But as the UK law enforcement efforts show, deployment will far outpace tech advancements, increasing the probability of wrongful arrests and detentions.
He runs through various ways Facebook, Twitter and the rest might try to spot potential terrorists before they acted -- for example, by using keywords, lists of suspicious sites, social graphs etc. But one feature automated systems all share is that to avoid the risk of letting individuals slip through the net, the criteria for flagging up people have to be loose. And that, inevitably, means there will be false positives:
However sophisticated these systems are, they always produce false positives, so if you are unlucky enough to type oddly, or to say the wrong thing, you might end up in a dragnet.
Here's what that would mean in practice:
Data strategist Duncan Ross set out what would happen if someone could create an algorithm that correctly identified a terrorist from their communications 99.9% of the time -- far, far more accurate than any real algorithm -- with the assumption that there were 100 terrorists in the UK.
The algorithm would correctly identify the 100 terrorists. But it would also misidentify 0.1% of the UK's non-terrorists as terrorists: that's a further 60,000 people, leaving the authorities with a still-huge problem on their hands. Given that Facebook is not merely dealing with the UK's 60 million population, but rather a billion users sending 1.4bn messages, that's an Everest-sized haystack for security services to trawl.

Requiring social networks to bring in any kind of automated monitoring -- the only kind that is feasible given the huge volume of posts involved -- will simply cause the intelligence agencies to be swamped with a huge number of false leads that will make it impossible to pick out the real terrorists from among the data supplied. In other words, the UK government's plans, if implemented, will just make a bad situation much, much worse.
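The arithmetic behind Ross's numbers is simple to check directly (60 million people, 100 terrorists, and the impossibly generous 99.9% accuracy assumed above):

```python
# Base-rate problem for an "identify terrorists from their communications" algorithm.
POPULATION = 60_000_000
TERRORISTS = 100
ACCURACY = 0.999               # far better than any real classifier

true_positives = TERRORISTS * ACCURACY                        # ~100 flagged correctly
false_positives = (POPULATION - TERRORISTS) * (1 - ACCURACY)  # ~60,000 innocent people flagged

precision = true_positives / (true_positives + false_positives)
print(f"innocent people flagged: {false_positives:,.0f}")
print(f"chance a flagged person is actually a terrorist: {precision:.2%}")  # ~0.17%
```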
Among the sites TalkTalk blocked as "pornographic" was BishUK.com, an award-winning British sex education site, which receives more than a million visits each year.

TalkTalk also lists Edinburgh Women's Rape and Sexual Abuse Centre website as "pornographic."

The company also blocked a programme run by sex education experts, and taught to 81,000 American children, that has been in development for more than 20 years.

TalkTalk's filter is endorsed by Mr Cameron but it failed to block 7% of the 68 pornographic websites tested by Newsnight.

Meanwhile, blockades concerning copyright are wreaking similar havoc. Users of Sky Broadband recently discovered that the megapopular imgur image hosting site (which we use to host many of our images) was completely blocked in a moronic attempt to try to block access to a torrent site. Because both the torrent site and imgur used the same CDN (one of the most popular ones), Sky mistakenly blocked it all.

Sky employs an automated blocking system that polls torrent sites' DNS records in order to quickly re-block them in the event they switch servers or IP addresses.

"Sky regularly pull IP addresses listed on our DNS servers and adds them to their block list. This block list is then used by an advanced proxy system that redirects any requests to the blacklisted IP addresses to a webserver that the ISP owns which returns a blocked page message," YIFY explains.

Therefore, when YIFY began using CloudFlare servers in Australia, Sky pulled these IP addresses and blocked them in the mistaken belief that they were YIFY's. Since Imgur uses the same IP addresses, Sky's automated blocking took the site offline, to the huge disappointment of countless customers.

Of course, these obvious over-blockages are merely the tip of the iceberg of what people were talking about when they noted that site blocking would "break the internet." They never meant that the entire internet would shut down, but that certain basic functions of the internet would not work properly, including important security tools like DNSSec. But the fact that even beyond that, these attempts at blocking content at the ISP level are flubbing so badly seems like pretty clear evidence that blocking is not a solution, but rather an even bigger problem than expected.
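As one last illustration of why this kind of IP-level blocking is so blunt, here's a small simulated sketch (the hostnames and addresses below are made up): once two unrelated sites sit behind the same CDN address, a blocklist built from one site's DNS records takes the other down with it.

```python
# Simulated sketch of IP-based blocking colliding with a shared CDN.
# Hostnames and IP addresses are placeholders for illustration only.

FAKE_DNS = {
    "torrent-site.example":   "198.51.100.7",   # the site the ISP wants to block
    "image-host.example":     "198.51.100.7",   # unrelated site behind the same CDN IP
    "unrelated-blog.example": "203.0.113.42",
}

# The ISP builds its blocklist by pulling the target site's DNS records...
blocklist = {FAKE_DNS["torrent-site.example"]}

# ...then redirects any request whose resolved IP is on the list.
def is_blocked(hostname: str) -> bool:
    return FAKE_DNS[hostname] in blocklist

for host in FAKE_DNS:
    print(f"{host:24} blocked={is_blocked(host)}")
# torrent-site.example     blocked=True
# image-host.example       blocked=True   <- collateral damage
# unrelated-blog.example   blocked=False
```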