That Anonymous Coward (profile)

rubs between his eyes again
something something over & over pretending it has never been done before despite mountains of carved out Malibu Hills explaining how this will end poorly.

christenson

May 3, 2021 at 11:22 am

Mathematical distance theorems...

So three words isn’t enough to effectively hash the geography of the entire earth…

Can the researchers
a) give a precise definition for "effectively"?
b) determine how many (Latin, since I can’t do Arabic or Kanji!) words are needed to cover the earth?

Rocky

May 3, 2021 at 12:20 pm

Re: Mathematical distance theorems...

So three words isn’t enough to effectively hash the geography of the entire earth…

Why wouldn’t it? The English language has about 170,000 words in common use, which is more than adequate to hash the geography 28 times over if you use 3m²/10ft² squares. The number of words used by W3W should be ~56 000, which gives 175 616 000 000 000 unique 3m² location identifiers. The earth has a surface area of ~510 100 000 000 000m². You do the math.

Currently only English and Korean cover the entire globe with W3W’s system, all other languages only cover land.

What the researchers pointed out was that some locations use similar words, which can lead to confusion, and since the system is touted as a way to easily tell rescue services where you are, it’s a weakness that can have deadly consequences.

Christenson

May 3, 2021 at 3:16 pm

Re: Re: Mathematical distance theorems...

Make no mistake: This is a mathematical encoding system, and it should be discussed technically in terms of its formal properties. Hamming distance between valid symbols comes to mind.

That math says coverage is 1/3 of the 3m squares on the planet under optimum assumptions. Check with a street beggar in Mumbai on whether that’s enough for where she sleeps, then get back to me about your favorite tall building.

The code is not effective — assuming all of us will recognize the same 56000 words, (fact not in evidence — anecdotally, I think the literate recognize about 20K, but that’s kinda ableist), — because it’s still ambiguous in the presence of noise and common human and machine errors.

I’d expect loss of singulars/plurals/tense information, not to mention homonyms and even near homonyms (pinks/sphynx/lynx/punks) to be a problem in a stressed communication environment.

All of these end up being computer-translated, so I’d want some redundancy too…not all code points should be valid locations, unless you expect them to always be copy/pasted.

This is why I think in terms of 4 words.

Google ‘Plus codes’ (thanks Anonymous Coward!) have some of these properties — no 9/4 or digit 0 versus letter O or digit 1 versus lowercase l kind of issues.

Qwertygiy

May 5, 2021 at 6:59 pm

Re: Re: Re: Mathematical distance theorems...

Okay. Going way too deep on the math and linguistics involved in an esoteric geopolitical topic that will be of no consequence to me in my personal life.

My specialty!

Let’s start by putting down the numbers and facts:

W3W’s system divides the world into, officially, 57,000,000,000,000 squares. I don’t know whether this is exact, but at scales this large — 500 billion is less than 1% — a potential few billion one way or the other should not make any effective difference.
According to a 2016 study, the average young American adult has a reading vocabulary of 42,000 words. The speaking vocabulary is lower, in the range of 20 to 25 thousand.
The minimum number of words to reach 57 trillion, in combinations of 3, is 38,486.
The minimum number of words to reach 57 trillion, in combinations of 4, is the incredibly small figure of 2,748.
Learning English — the subset of English urged for use in Simple Wikipedia or for secondary English speakers — is merely 1,500 very common words.
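For anyone who wants to check the two minimum list sizes above, here is a minimal sketch of the arithmetic. The function name `min_words` is made up for illustration, and the 57 trillion figure is simply the official count quoted above, not an independently verified number.

```python
def min_words(squares: int, words_per_address: int) -> int:
    """Smallest list size n such that n ** words_per_address >= squares."""
    n = 1
    # Double until we reach or overshoot the target, then binary-search.
    while n ** words_per_address < squares:
        n *= 2
    lo, hi = n // 2, n
    while lo < hi:
        mid = (lo + hi) // 2
        if mid ** words_per_address >= squares:
            hi = mid
        else:
            lo = mid + 1
    return lo


SQUARES = 57_000_000_000_000  # official figure quoted above (assumed exact here)

for k in (3, 4):
    print(k, min_words(SQUARES, k))
# 3 38486
# 4 2748
```

Integer arithmetic is used on purpose: a floating-point cube root can land one off at this scale.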

Assumptions about the suitability of words, that may vary based on individual interpretations of common sense:

Homonyms (different words, spelled the same way — lead the horse to water, lead pipes) and homophones (different words, spelled differently, but commonly pronounced the same — she’s a witch, which is bad) are to be avoided when at all possible.
Words that are nearly homophones should be avoided as well. B, V, F, P, TH can all be very ambiguous depending on audio interference and speaker accent. TH, SH, S, Z, CH also represent a group of sounds that, while distinct in clear conditions, are similar enough to become indistinguishable at times.
Words should differ from any other word in the list by at least two letters. If I make one mistake and type "dout", what did I mean? Doubt? Donut? Dot? Dour? Rout? Don’t? A single insertion, removal, or typo could create "dout" out of any of those words. Fortunately, "dout" is not a word, and so the receiver would know to ask for a correction.

But what about "dour"? That’s a word, but it could be a similar typo for "door", "our", "pour", "odour", "sour", "four", "hour", "tour", or "your". If we want to be as clear and error-proof as possible, there can only be one three- or four-letter-word ending in "our". This obviously cramps the number of short words allowed, but that’s fine. Typically, whenever encountering familiar expressions, moderately intelligent observers overcome significant character quantities without considerable confusion; successfully understanding embarrassingly long-winded communications.

No proper nouns. No John, no Janet, no California, no Wilson, no Samsung. No words that should be capitalized mid-sentence according to the AP style guide. If you need a reason why, I suggest you stop by a Starbucks.
It should not be possible to include plural, possessive, or tense variations of a word in the same position. If it’s possible for word 1 to be "park", then it should not be possible for word 1 to be "parks", "parked", "parker", "parking", or, heaven forbid, the likes of "repark"…
Hell, let’s take that one a step further in regards to plurals. No words that end in "s" or "z", no matter how many things they’re talking about. Like in the problematic example, there’s no reason there should ever be any combination of "pink start", "pinks start", and "pinks tart". I don’t think this ban has to extend to the other similar sounds described earlier; it’s a lot harder to slur together "porch champ" than "ports stamp".
I don’t care that shit is a good, unequivocal-sounding word — I don’t want Karen trying to censor herself, little Billy being afraid of getting in trouble if he reads it out, or Officer Jones on the other end of the line deciding it’s a prank call when someone phones in that they’re stuck at humongous.dog.turd. Obviously, the definition of profanity is purely subjective, but we could cut out the vast majority of objectionable phrases by simply eliminating any words typically or commonly used to refer to bodily excretions (poop), genitals (wiener), sex and sexuality (gay), negatively referring to intelligence (stupid), ethnicity (latino), nationality (kraut), or disabilities (can you just imagine the crap that could fly if autistic or crippled were included? yikes).
Only standard modern English words. No new slang, no obsolete gibberish, no acronyms, no contractions or hyphenations, no foreign loanwords that have yet to declare independence from their mother tongue. It can be obscure, but it has to be in use somewhere by someone, and it has to be a single word.
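Two of the rules above (every word at least two edits away from every other word, and no words ending in "s" or "z") are mechanical enough to sketch in code. This is only an illustration under assumptions: `edit_distance` is a plain Levenshtein distance, `filter_wordlist` is a hypothetical greedy filter, and the candidate list is a toy. The homophone, profanity, and proper-noun rules would need a pronunciation dictionary and human review, which this does not attempt.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insert, delete, and substitute each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def filter_wordlist(candidates: list[str], min_distance: int = 2) -> list[str]:
    """Greedily keep words that do not end in s/z and that are at least
    min_distance edits away from every word already kept."""
    kept: list[str] = []
    for word in candidates:
        if word.endswith(("s", "z")):
            continue
        if all(edit_distance(word, other) >= min_distance for other in kept):
            kept.append(word)
    return kept


print(filter_wordlist(["dour", "door", "sour", "pinks", "donut", "doubt"]))
# ['dour', 'donut', 'doubt']
```

Note what a minimum distance of two buys: a single typo such as "dout" can never turn one listed word into another, so the error is detected, but "dout" is still one edit from "dour", "donut", and "doubt", so the receiver has to ask rather than guess. Automatic correction of single typos would need a minimum distance of three.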

That’s quite a few rules, huh? Especially considering that the average English speaker doesn’t share the same set of 42,000 words in their vocabulary, it doesn’t seem likely that all those rules can be followed and still have 38,000 useful words left over. So in all likelihood, you’d want to include a fourth word. Following those rules with a list of 3,000 would be way less of a headache.

And as Dropbox taught us, remembering four words is still super easy. As long as I live, even if a terrible dementia ravages through my frontal lobe until I’m incapable of remembering my age, or my name, or even the Alamo… I will never be free of correct horse battery staple.

Oh, and there’s one more rule I’d want to apply. It doesn’t eliminate any words outright, but it limits where they can be used.

Take the following examples:
"werewolf.package.end.alone"
"where.wolfpack.agenda.loan"
"where.wolfpack.urgent.alone"

"carton.sure.leaving.complete"
"cart.unsure.leaf.incomplete"
"carton.surely.fin.complete"
"car.tincture.leaf.incomplete"
"car.tincture.leaving.complete"

There’s an easy way to avoid this potential syllable uncertainty: mandate the length of components.
Say, the first word has to be one syllable. The second word and the final word have to be two syllables. The third word has to be at least two syllables, but it can be as many more as you want — it doesn’t matter, you know it’s whatever is left over.
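The reason a fixed syllable pattern removes the ambiguity is that the word boundaries can then be found by counting alone. A minimal sketch, assuming the listener already has the utterance as a sequence of syllables (hand-written here; a real system would need a pronunciation dictionary) and using the hypothetical helper name `split_by_pattern`:

```python
def split_by_pattern(syllables: list[str]) -> list[list[str]]:
    """Split a syllable stream under the pattern 1, 2, (2 or more), 2."""
    if len(syllables) < 7:
        raise ValueError("need at least 1 + 2 + 2 + 2 = 7 syllables")
    first = syllables[:1]
    second = syllables[1:3]
    third = syllables[3:-2]   # whatever is left over in the middle
    fourth = syllables[-2:]
    return [first, second, third, fourth]


heard = ["where", "wolf", "pack", "ur", "gent", "a", "lone"]
print(["".join(word) for word in split_by_pattern(heard)])
# ['where', 'wolfpack', 'urgent', 'alone']
```

Only one of the four slots is allowed to vary in length, which is why there is exactly one way to cut the stream.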

So in the above two examples, the only valid options would be:
"where.wolfpack.urgent.alone"
"car.tincture.leaving.complete"

Anyway, that’s my eighty-three cents on the subject. No, you don’t get a TL;DR on a grammar post. Put on your reading glasses and get over it. Or ignore my ramblings entirely; I’m just a nerd, not a cop.

Aaron Toponce

May 3, 2021 at 3:02 pm

Re: Mathematical distance theorems...

what3words maps 57 trillion squares to 3 word addresses by shipping a 40,000 word list. Due to the sheer size of the word list, it’s cluttered with problems:

Plurals
Homonyms
Compound words
Obscure words
Lexeme forms
Short edit distances
Unnecessarily long words

A smaller word list could easily address all these issues by adding one more word. Something like 4,096 words is more than enough to reach all 57 trillion 3m x 3m squares on the globe with 4 word addresses.

One of the problems is the lack of error detection. A more tightly curated list would greatly minimize that risk, but a "check word" could eliminate it. Simply hash the 4 words with SHA-256, read the 12 least or most significant bits to deterministically pick the 5th check word, similar to BIP39. The person on the phone would then communicate five words to emergency services, and the risk they’re sent to the wrong location is non-existent.
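A rough sketch of the check-word idea, under stated assumptions: the 4,096-entry `WORDLIST` is a placeholder of generated strings rather than a curated list, the four words are joined with dots before hashing, and the most significant 12 bits of the digest are used. None of these details come from what3words or BIP39; they are just one way to make the description concrete.

```python
import hashlib

# Placeholder list: a real one would hold 4,096 curated words.
WORDLIST = [f"word{i:04d}" for i in range(4096)]


def check_word(words: list[str]) -> str:
    """Pick a check word from the top 12 bits of SHA-256 over the address."""
    digest = hashlib.sha256(".".join(words).encode("utf-8")).digest()
    index = int.from_bytes(digest[:2], "big") >> 4  # 16 bits -> top 12 bits
    return WORDLIST[index]


def with_check(words: list[str]) -> list[str]:
    """Return the address with its check word appended."""
    return words + [check_word(words)]


def is_valid(address: list[str]) -> bool:
    """True if the last word matches the check word of the preceding ones."""
    *words, check = address
    return check_word(words) == check


address = with_check(["word0001", "word0002", "word0003", "word0004"])
print(len(address), is_valid(address))
# 5 True
```

With 12 bits there are 4,096 possible check words, so a randomly garbled address still passes about one time in 4,096. That makes a wrong dispatch very unlikely rather than strictly impossible.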

Granted, 5 words is more to manage than 3, I admit. But it should not be hard to build a list of 4,096 common words of say 3-8 characters. The what3words word list minimum length is 4 characters, and the maximum is 18 characters. This is actually the breakdown:

4 characters: 1342 words
5 characters: 2849 words
6 characters: 4799 words
7 characters: 6496 words
8 characters: 6842 words
9 characters: 6002 words
10 characters: 4735 words
11 characters: 3164 words
12 characters: 1962 words
13 characters: 1121 words
14 characters: 439 words
15 characters: 202 words
16 characters: 39 words
17 characters: 7 words
18 characters: 1 word

The average character count is unnecessarily heavy in its current implementation; fewer characters per word, with one or two more words, could on average be the same length to type.
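To put numbers on that, here is a small calculation using only the length breakdown listed above. The break-even figures assume one separator character between words, which is my assumption about how addresses are typed.

```python
# Word-length histogram copied from the breakdown above: length -> count.
LENGTH_COUNTS = {
    4: 1342, 5: 2849, 6: 4799, 7: 6496, 8: 6842, 9: 6002, 10: 4735,
    11: 3164, 12: 1962, 13: 1121, 14: 439, 15: 202, 16: 39, 17: 7, 18: 1,
}

total_words = sum(LENGTH_COUNTS.values())
total_chars = sum(length * count for length, count in LENGTH_COUNTS.items())
average = total_chars / total_words

# Average typed length of a 3-word address, including the 2 separators.
typed_three = 3 * average + 2


def break_even(word_count: int) -> float:
    """Average word length at which word_count words type as long as 3 do now."""
    return (typed_three - (word_count - 1)) / word_count


print(total_words, round(average, 2))
# 40000 8.32
print(round(break_even(4), 2), round(break_even(5), 2))
# 5.99 4.59
```

So four words averaging about six characters, or five words averaging about four and a half, would take roughly the same typing as a current three-word address.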

Anonymous Coward

May 3, 2021 at 3:42 pm

Re: Re: Mathematical distance theorems...

Couldn’t you do this with four words of slightly longer length, with the fourth word being the check word?

Aaron Toponce

May 3, 2021 at 4:06 pm

Re: Re: Re: Mathematical distance theorems...

So you actually don’t need 57 trillion squares. The surface area of the globe is (only) 510 billion m^2. Dividing that into 3m^2 blocks like what3words did, you end up with 170 billion squares, a far cry from 57 trillion. This means you only need 5,540 unique words to generate all 3 word address possibilities.

So yes, you could still stick with 3 word addresses with an optional 4th check word.

christenson

May 3, 2021 at 3:53 pm

Re: Re: Mathematical distance theorems...

THANKS. This is the sort of discussion I wanted.

I was wondering if we could get a comparison of properties with similar algorithms, such as the Google "Plus Codes" first mentioned by a helpful Anonymous Coward below? And how well does this stuff work with the 57th floor of the Empire State building?

It might also be worth looking into the use cases…in the dysfunctional US telecom market, it’s not too hard to get out of cell tower range in mountains.

nasch (profile)

May 4, 2021 at 9:24 am

Re: Re: Re: Mathematical distance theorems...

And how well does this stuff work with the 57th floor of the Empire State building?

No worse than latitude and longitude.

Anonymous Coward

May 3, 2021 at 12:39 pm

Since their whole sales pitch in favor of the word conversion over reading your GPS coordinates as numbers is to supposedly reduce errors from misheard digits, the existence of phonetic ambiguity in the wordlists just blows the whole purpose of the system out of the water. "pink.start" vs "pinks.start" is just unforgivable.

The whole concept seems to me more like a funny toy for kids to play with than a serious tool. If you were going to use an app to help people send their location to emergency responders, why should it produce a string of words you have to speak on the phone instead of transmitting the coordinates directly?

Anonymous Coward

May 3, 2021 at 3:52 pm

Re: Re:

I can answer part of this:

The entire database can fit on a device you carry with you, even if you don’t have network access.

This means that as long as you have the app on your phone, all you need is some means of receiving the three words, and you can look up the authoritative address.

So this isn’t for someone who’s out and has a heart attack and calls 9-1-1. This is for the 9-1-1 dispatcher to broadcast a call on the radio with the three words. That way, the nearest responder knows exactly where they’re going, even if they don’t have a reasonable address or description.

But doing such a scheme without a check word just shows that they didn’t have a mathematician involved in the design process.

Ironically, a mathematician just did a video on the logic behind such a system, as implemented in a card game:
https://www.youtube.com/watch?v=VTDKqW_GLkw

Applying this logic to GPS coordinates should make it possible to create a MUCH better system that uses four emojis instead of three words, with a guarantee that the combinations will be unique across any specific geography. And it can’t be patented because there’s over a century of prior art.

TheResidentSkeptic (profile)

May 3, 2021 at 1:16 pm

Here's 3 words for them.

Out. Of. Business.

Anonymous Coward

May 3, 2021 at 3:55 pm

Re: Here's 3 words for them.

Well, that doesn’t map to their system, but if you were reading the following over a noisy CB radio….

https://what3words.com/outs.soft.business

Darkness Of Course (profile)

May 3, 2021 at 1:26 pm

Turns out, code is free

And unconstrained. After all, what are a few gits among coders? Not much effort at all.

The researcher can rest easy. The work is available in America, where things are free unless they tick off the GQP.

Thad (profile)

May 3, 2021 at 3:08 pm

Re: Turns out, code is free

What the hell are you talking about?

Anonymous Coward

May 3, 2021 at 3:56 pm

Re: Re: Turns out, code is free

He’s stating that the code is in a git repo for anyone to grab. He’s intentionally not being any more direct than that so that he doesn’t get sued.

rorybaust (profile)

May 3, 2021 at 1:53 pm

I only got 3 letters

WTF

Thad (profile)

May 3, 2021 at 3:08 pm

Re: I only got 3 letters

…you know those letters stand for words, right?

Anonymous Coward

May 3, 2021 at 3:57 pm

Re: Re: I only got 3 letters

https://what3words.com/laugh.outs.loud

Anonymous Coward

May 3, 2021 at 2:09 pm

So… great way for identifying your location, only you aren’t allowed to use their word combinations to do that without being sued into the ground.

Seems like a winning business model.

That One Guy (profile)

May 3, 2021 at 2:34 pm

You can tell a lot about someone by their enemies

If a company treats security researchers as enemies, that says louder than words that they are not offering a secure product/service.

Anonymous Coward

May 3, 2021 at 2:35 pm

Opensource Alternatives

There are plenty of other solutions to the problem of locating people. Since you are more likely to have Google Maps on your phone than w3w, just use Google’s ‘Plus Codes’ alternative. It’s open source and on your phone already with Maps.

Serge Milosavljevic

May 12, 2021 at 2:47 pm

IKEA Invested $16 million in What3words

It’s probably them that came on so strong to “protect” their investment. Search for Ikea’s dark past or illegal tree logging.

Anonymous Coward

June 1, 2021 at 5:13 am

Mapcode is better anyway

Mapcode.com

What3Words Sends Ridiculous Legal Threat To Security Researcher Over Open Source Alternative

from the never-use-what3words dept

Comments on “What3Words Sends Ridiculous Legal Threat To Security Researcher Over Open Source Alternative”
