Microsoft Highlights Why Google's 'Cheater' Accusations Ring Hollow

from the good-for-them dept

We had a long discussion recently about Google’s response to discovering that Microsoft used clickstream data from users to help improve the relevance of their own search. Microsoft’s Yusuf Mehdi has now written up a much more detailed response from Microsoft’s point of view, in which it again clarifies that contrary to Google’s statements, Microsoft is not “copying” Google’s search results, but merely using clickstream data as one of many (Microsoft says approximately 1,000) variables in improving search relevance. Microsoft does take one cheap shot: noting that, technically, the “honeypot” trick that Google used to uncover this certainly appears to be a form of “clickfraud.” That is, it was a trick designed specifically to manipulate Bing’s search results.

But the key point is made towards the end:

We have brought a number of things to market that we are very proud of — our daily home page photos, infinite scroll in image search, great travel and shopping experiences, a new and more useful visual approach to search, and partnerships with key leaders like Facebook and Twitter. If you are keeping tabs, you will notice Google has “copied” a few of these. Whether they have done it well we leave to customers. But more importantly, we take no issue and are glad we could help move the industry to adopt some good ideas.

That’s the point that I tried to make in the original post. History has shown that innovation occurs via competition, and part of that competition often involves competitors building on each other’s work. A few months back, I wrote a review of the excellent book Copycats by Oded Shenkar, which makes this point very, very clear. Innovation happens when companies build on each other’s work. But, what you learn is that it’s not just about “copying,” it’s about all of the players learning, innovating and expanding the overall market. Just straight up copying rarely does enough to make a difference (in fact, we’ve discussed this problem in the form of cargo cult copying, where companies just copy some superficial aspect, and discover that it’s meaningless). That’s clearly not what Microsoft was doing here.

In the comments to our original post, someone defended Google by asking: if what Microsoft did was okay, couldn’t he just announce “I’ve got a billion dollar search engine idea!” and then copy Google’s results? But, of course, anyone who actually thinks this through will realize that copying Google’s search results is not a billion dollar search idea. If, tomorrow, we launched a “new search engine” that gave identical results to Google, almost no one would use it. Why would you? There’s no real advantage to doing so. And for people who already use Google, it’s probably much more integrated into their lives, with Gmail, Google Docs and more. The search results themselves are not the “billion dollar idea.” It’s the overall execution.

Hopefully Google learns from this and realizes that it has learned plenty from watching Microsoft as well, and complaining about Microsoft using clickstream data is a waste of time. Focus on continuing to innovate, Google, which’ll probably mean learning more things from Microsoft, in addition to what you’re doing yourself.

To be fair, Matt Cutts has also put together a decent response, where he points out that the real issue here may be disclosure, in that Microsoft did not clearly disclose that it was using clickstream data (and especially how it was using that data). That’s a perfectly reasonable point, but it was not the original point that Google raised. I agree that Microsoft could and should be much clearer in its disclosure, but that’s a totally separate issue. Cutts also explains why he thinks that Microsoft really is “copying,” but again, even if we grant that premise (which I don’t think is accurate), I still don’t see why that matters. Copying and improving is a part of the innovative process. Google should embrace it.

Companies: google, microsoft

Comments on “Microsoft Highlights Why Google's 'Cheater' Accusations Ring Hollow”

38 Comments
DenisVi says:

This word you keep using, Google does not think it means what you think it means

Copying, as defined by Microsoft folks over and over again these past few days, is taking an innovation from company A and re-building it.

Copying, as defined by Google folks over and over again these past few days, is taking results of an innovation and literally copying them.

While one is common in the tech world, and Bing staff makes a valid point of creating multiple innovations that Google has “adopted”, the second one is problematic. When Bing says that clickstream (with apparent tailoring for Google links) is only one of the signals, it basically means that when the other signals aren’t returning normal data, Bing relies *only* on Google. Moreover, it sometimes exploits Google’s proprietary autocorrect mechanisms to increase its relevance in that way. Meanwhile, Microsoft hasn’t indicated that Bing is learning something from it, just parroting, and if Google was to disappear one day, Bing that relies on Google so much would be effectively crippled.

Ed says:

Bing (Powered by Google)

I haven’t seen Google make any claims that what Bing has done is illegal or even immoral. All I have seen is Google pointing out that Bing is using Google. I don’t see how Google pointing this out is wrong.

Much like a songwriter who, after hearing his song played by a musician, comes out and says “I wrote that.” Not trying to stop the musician, just interested in claiming the credit.

I haven’t seen Google begin formal action, legal or otherwise. They only seem interested in pointing out the behavior to the press and embarrassing a competitor.

Zimzat (profile) says:

Two headed hydra that can't agree

I feel that there is a two-headed hydra at Techdirt and neither head agrees with the other.

One head posts about how businesses need to learn to innovate, to compete, and accept that marketplace instead of falling back to legal protections.

The other says that when there is something wrong with a copy we should leave it up to social shunning to make it right.

And yet, despite the fact that Google has not, so far at least, fallen back to legal protections, and is actually trying to leave it to social shunning, TechDirt posts are now trying to socially shun Google when they’re the ones that were copied.

Make up your mind.

aikiwolfie (profile) says:

It's just an excuse to kick Microsoft when it's down!

This whole “Microsoft copied our search results” attack from Google is an excuse to kick Microsoft when it’s down. I’d agree Google’s time would probably be better spent doing more creative things.

It makes sense Google would try to create bad news stories for Microsoft. This is a small piece of propaganda in a much bigger spat.

Microsoft is involved in several lawsuits against Android. Which Google can’t be happy about. And at the moment Microsoft looks particularly stale and weak.

Microsoft continues losing money in the search space, it continues to cut projects and product lines, it continues to lay off staffers and it’s also still losing top managers. And for the first time in a long time Windows desktop market share is threatening to drop below 90%.

And the recent financial results from Microsoft didn’t look good. Even after they tried to explain them away with their own special brand of accounting. The share price for Microsoft stock still fell.

Microsoft are trying to hurt Google at the moment. And Google smells blood. But not its own.

vivaelamor (profile) says:

This word you keep using, Google does not think it means what you think it means

“Microsoft hasn’t indicated that Bing is learning something from it, just parroting, and if Google was to disappear one day, Bing that relies on Google so much would be effectively crippled.”

Actually, they appear to record data from their Bing toolbar, which would still be in their system if Google were to disappear. Although their learning would stop if Google disappeared, what they have already learnt would still apply.

cc (profile) says:

This word you keep using, Google does not think it means what you think it means

That’s more or less what I was arguing in the comments of the previous article, even though my thoughts focused only on innovation in search quality and not on presentation.

The result of my discussion with Marcus Carab can be summarised thus: It depends on how much Microsoft is (indirectly) using Google’s results, and we won’t know that until more data is made available.

My hunch is, a lot. They must be getting massive amounts of Google data, seeing how many people use Google, and it’s all in pure query->document format, no less. In my view, instead of coming up with a better way to analyse the data it already has, Bing is trying to replicate Google’s existing semantic links* between terms, which is possibly the hardest thing to tweak when you’re making advanced document retrieval systems.

That they say they use “over 1000 variables” is irrelevant, because as any statistician will tell you it’s not the number of variables that counts but their weighting. If Bing is aiming to “become Google” because that’s the search engine people want, they’ll use the query->document data they get from Google to directly reinforce the query->document mappings in their system, which makes the other sources mostly irrelevant…
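The weighting point can be illustrated with a small sketch. It assumes a plain linear scoring model, and every number in it is invented for illustration; nothing here reflects how Bing actually weights its signals.

```python
# Toy linear ranking model: score = sum(weight_i * feature_i).
# All weights and feature values are made up for illustration only.

def score(weights, features):
    """Weighted sum of feature values."""
    return sum(w * x for w, x in zip(weights, features))

def contribution_share(weights, features, i):
    """Fraction of the total absolute contribution that comes from feature i."""
    total = sum(abs(w * x) for w, x in zip(weights, features))
    return abs(weights[i] * features[i]) / total if total else 0.0

N = 1000
features = [1.0] * N

# Case 1: all 1,000 variables weighted equally.
equal_weights = [1.0] * N
equal_share = contribution_share(equal_weights, features, N - 1)

# Case 2: 999 lightly weighted variables and one heavily weighted one
# (a hypothetical stand-in for a clickstream signal).
skewed_weights = [0.001] * (N - 1) + [10.0]
skewed_share = contribution_share(skewed_weights, features, N - 1)

print(f"equal weights:  last variable supplies {equal_share:.1%} of the score")
print(f"skewed weights: last variable supplies {skewed_share:.1%} of the score")
```

With equal weights the last variable supplies a thousandth of the score; with the skewed weights it supplies roughly nine tenths, even though the variable count is the same in both cases.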

And that’s why this is cheating, in my opinion. Perhaps that’s not necessarily a “bad” thing and their technology will eventually and inevitably catch up, but it leaves a bad taste in my mouth all the same.

* For instance, Google may have decided to use a thesaurus (or even automatically learned a thesaurus!) to create a link between the terms “cat” and “feline”, so when a user searches for cats, they also get documents about felines. This is not an obvious link for a computer, but it very likely improves retrieval performance. If Bing didn’t think to do the same, and they only start showing documents about felines because they saw Google do the same, then their technology is still inferior, so in my book this cannot possibly count as innovation or as science. They are giving the illusion that they are competing with Google, but they are simply giving a “counterfeited” version of their competitor’s results that they couldn’t recreate by their own means.
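For readers unfamiliar with the technique in that footnote, here is a minimal sketch of thesaurus-based query expansion. The documents, the synonym table and the function names are all hypothetical, and it says nothing about how Google or Bing actually implement this.

```python
from collections import defaultdict

# Hypothetical hand-written thesaurus; a real system might learn these links.
SYNONYMS = {"cat": {"feline"}, "feline": {"cat"}}

# Tiny made-up document collection.
DOCS = {
    "d1": "my cat sleeps all day",
    "d2": "feline nutrition guide",
    "d3": "dog training basics",
}

def build_index(docs):
    """Build an inverted index mapping each term to the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def expand(terms, synonyms):
    """Add every known synonym of each query term."""
    expanded = set(terms)
    for term in terms:
        expanded |= synonyms.get(term, set())
    return expanded

def search(query, index, synonyms=None):
    """Return sorted ids of documents matching any (optionally expanded) query term."""
    terms = set(query.lower().split())
    if synonyms:
        terms = expand(terms, synonyms)
    hits = set()
    for term in terms:
        hits |= index.get(term, set())
    return sorted(hits)

INDEX = build_index(DOCS)
print(search("cat", INDEX))            # exact term match only
print(search("cat", INDEX, SYNONYMS))  # also matches documents about "feline"
```

Without the thesaurus the query only finds the document that literally contains "cat"; with it, the "feline" document is returned as well.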

vivaelamor (profile) says:

Two headed hydra that can't agree

“And yet, despite the fact that Google has not, so far at least, fallen back to legal protections, and is actually trying to leave it to social shunning, TechDirt posts are now trying to socially shun Google when they’re the ones that were copied.”

Calling something cheating isn’t merely socially shunning, it’s implying that they broke the rules. The fact that they haven’t sued doesn’t mean that what they’ve said isn’t liable to backfire. It would appear that you would rather Techdirt supported Google making inaccurate statements than point out the truth.

That said, Microsoft may have fucked up by how they gathered the data. However, in that case they didn’t wrong Google or Google’s customers, only their own customers.

cc (profile) says:

This word you keep using, Google does not think it means what you think it means

Problem is, the internet is not a fixed document set. Yesterday’s most relevant result could be today’s least relevant result.

Google has shown that they have the right heuristics to keep their mappings updated, but I’m not sure if Bing can work well enough without “borrowing” Google’s…

Overcast (profile) says:

Innovation happens when companies build on each other’s work.

It has to – it’s not like Google wrote the software their services *need* to operate. Microsoft didn’t design the CPU that’s needed for their OS to operate..

If all innovation in an area was left to a single lateral patent/copyright – we’d still be riding horses if we couldn’t afford the buggy from the single producer.

Of course, Microsoft has a long history of just hi-jacking other people’s innovations and then boxing them with other software in a vain attempt to make it look like it’s ‘original’.

cc (profile) says:

This word you keep using, Google does not think it means what you think it means

And yes, “counterfeited” is a loaded word, but I can’t think of a word that applies to this situation.

It’s not copying in the traditional sense, it’s not counterfeiting and it’s definitely not stealing.. “Cheating” and “plagiarism” are the only words that I can think of that sound harmless enough to describe this, but even they are overkill.

Anonymous Coward says:

Hitting Microsoft because it isn’t transparent about its search is kind of a joke, don’t they all hide how they do this?

I am also sure Microsoft would be perfectly happy if Google just went away, even if it hurt Bing’s abilities.

I don’t have a problem with Microsoft using Google info to improve their search, I don’t have a problem with companies looking at products on the market and improving them. What I do believe is wrong is flat out copying content. That is what most musicians and artists have a problem with. It’s not taking something they have done and redoing it, it is taking a song and, just because it’s digital, thinking there is a right to distribute it.

Xander C (profile) says:

Missing Backstory

Mike, for your consideration:
http://www.npr.org/2011/02/02/133443201/Google-Bing-Tussle-Over-Search

The “Search Rip-off” came about as Google’s lead engineers started noticing Bing’s searches on misspelled words were getting identical fixes and results.

“LAURA SYDELL: When you type a search request into Google, say, Hosni Mubarak, and you’re a couple of letters off, Google can usually figure out what you mean.

Mr. AMIT SINGHAL (Software Engineer, Google): And getting these queries right is an incredibly hard task. It’s a very challenging algorithm.

SYDELL: That’s Amit Singhal. He’s the lead of the search team at Google. A few months back, they noticed something strange. A user searched for tarsorrhaphy.

Mr. SINGHAL: It was this real medical procedure that some users generally needed to know about.

SYDELL: The user misspelled it. But Google’s algorithms figured out what he needed. Singhal noticed that competitor Bing didn’t bring up any results until a few weeks later.

Mr. SINGHAL: Bing started showing the topmost relevant result for that spelling correction to their users.

SYDELL: Hmm.

Mr. SINGHAL: Now, we got suspicious. However, we said, maybe they came up with some clever algorithm and they did it.

SYDELL: But Singhal and his team decided to do a little experiment. They began to do searches for silly made-up words, and they created fake results unrelated to those words. A few weeks later…

Mr. SINGHAL: Microsoft’s Bing started showing the same artificial result for the same synthetic query. And this was just conclusive to us at that point.”

While Bing has offered great things to Searching, there was clearly a copy of services that could not be explained by just creating their own proper code. As noted, Bing was “learning” from people using Google through IE 7/8, sending over data as to what was being searched and what Google returned for those queries. That’s a level of shady we’ve come to expect from MS and needs to be called out.

LT BALL says:

Beware.. who you fear

Russians were convinced the Czar was bad for them. They got the murdering regimes of Lenin and Stalin.
Microsoft is big and pushy… Google is worse: they are arrogant and snoopy, with no interest in privacy rights etc. Google complaining that someone else is looking at their public data is rich after they have been caught looking at data on wifi networks.

Anonymous Coward says:

Say what you want

Microsoft’s Bing toolbar customers send them the search terms they use and the resulting links they click on. In this case their customers were Google engineers that submitted their results to Microsoft to help them make search better.

Google’s engineers shouldn’t be submitting their clickstream data to Microsoft if they don’t want them to use it to better their search results.

In this case Google engineers intentionally manipulated their search results (they said this is impossible during congressional hearings – obviously incorrect) and then intentionally and in an organized manner attempted to use clickstream data to influence Bing search results. That is one form of click fraud.

I get it that you like Google and don’t like Microsoft. However, these types of arguments don’t make any actual sense.

David Liu (profile) says:

Two headed hydra that can't agree

Yeah, I don’t quite get it either.

It’s not like Google really wasted a lot of time, money, and effort to catch Microsoft in the act, and once it did, it tossed up a blog post about it. Honestly to me, it seems like Google’s doing exactly what Techdirt says it should, by socially shunning Microsoft for “cheating”. Maybe it should’ve done it a little more snarky to come out quite a bit more ahead, but still, from what I’ve read, it’s good enough.

I don’t quite get why Mike says “Google complaining about Microsoft using clickstream data is a waste of time”. It isn’t. It puts Google in the better light socially, exactly what Mike has set forth in the past. It’s been really hard to read these Google vs Bing articles in the past couple days, since it’s a glaring hypocrisy in every one of the articles.

Avatar28 (profile) says:

This word you keep using, Google does not think it means what you think it means

Imitation?

In any case, I have to disagree. What you have described IS innovation. Take an idea that someone else had and improve on it. Based on your logic Google’s image search is inferior to Bing’s because MS had the idea for the infinitely scrolling search and then Google copied the idea.

That’s also not what I believe happened here. Rather, MS is looking at user behavior. User searches for a word or phrase in Google or any other search engine and then clicks on links A, B, and F (having decided that C, D, and E are just blog spam). When the search is done on Bing it takes into account that people were clicking on A, B, and F but only a few were clicking on C, D, and E and they didn’t stay if they did. When it ranks the results C, D, and E are ranked lower as a result.
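The behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the base scores, click counts, the `click_weight` parameter and the smoothing prior are all invented, and nothing here is taken from Bing's actual ranking code.

```python
from dataclasses import dataclass

@dataclass
class ClickStats:
    impressions: int    # times the link was shown for this query
    clicks: int         # times users clicked it
    quick_bounces: int  # clicks where the user left almost immediately

def click_signal(stats, prior_impressions=10, prior_rate=0.1):
    """Smoothed rate of 'satisfied' clicks (clicks that did not bounce).

    The prior keeps links with very little data from swinging the ranking.
    """
    satisfied = stats.clicks - stats.quick_bounces
    return (satisfied + prior_rate * prior_impressions) / (
        stats.impressions + prior_impressions
    )

def rerank(base_scores, stats_by_url, click_weight=0.5):
    """Order URLs by the engine's own score plus a weighted click signal."""
    def combined(url):
        stats = stats_by_url.get(url, ClickStats(0, 0, 0))
        return base_scores[url] + click_weight * click_signal(stats)
    return sorted(base_scores, key=combined, reverse=True)

# Hypothetical scores from the engine's own algorithm, all nearly tied.
BASE = {"A": 0.50, "B": 0.48, "C": 0.52, "D": 0.51, "E": 0.49, "F": 0.47}

# Invented clickstream: users click A, B and F; they rarely click C, D and E,
# and mostly bounce when they do.
STATS = {
    "A": ClickStats(100, 60, 5),
    "B": ClickStats(100, 60, 5),
    "F": ClickStats(100, 60, 5),
    "C": ClickStats(100, 5, 4),
    "D": ClickStats(100, 5, 4),
    "E": ClickStats(100, 5, 4),
}

RANKED = rerank(BASE, STATS)
print(RANKED)
```

In this toy data C, D and E start with slightly higher base scores, but the click signal pushes A, B and F above them. How much it matters depends entirely on `click_weight`, which is the open question in this thread.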

Basically, it brings humans into the ranking process to provide more useful results. Digital computers are not nearly as good at recognizing patterns (and thus filtering out junk sites) as the human brain. In some ways, it is sort of like what Yahoo did in its early days. Also bear in mind that even after Google engineers fed Bing lots of fake data and fake clickthroughs on nonsense words they still only managed to get Bing to show the site they wanted something like 6 times out of 100 attempts. In other words, using a bullshit scenario that would never happen in real life they were only able to trick Bing a whopping 6% of the time.

cc (profile) says:

This word you keep using, Google does not think it means what you think it means

You missed my point, I think. Perhaps you want to read the conversation between me and Marcus Carab in the previous related article.

“User searches for a word or phrase in Google or any other search engine and then clicks on links A, B, and F (having decided that C, D, and E are just blog spam).”

But what miraculous process put relevant results in positions A, B and F? Google’s algorithm, we can presume. If another search engine copies the results of the algorithm, it means they can fake improved search performance but don’t know how it was actually done. They improve their search, but contribute nothing to the users or to search engine technology — not innovation, in my opinion.

“using a bullshit scenario that would never happen in real life they were only able to trick Bing a whopping 6% of the time.”

Which means Bing couldn’t absorb all the data the 20 engineers were feeding it and nothing else. As to why, it’s anybody’s guess. My guesses are, it’s either to keep the sparse document vectors smaller (by ignoring rarer terms) and thus cut costs, or maybe they were clever enough to have a safeguard so spammers can’t exploit their exploit and Google-bomb them (literally) with fake/dangerous websites for common terms.

Mike Masnick (profile) says:

Missing Backstory

Mike, what will you consider to be wrong? Please give an example?

I would consider something to be wrong if I said something that was factually incorrect. I haven’t seen that in this story yet. There was nothing in that backstory that said anything I had said originally was wrong.

Say, if Bing returns the same Google page but with Bing logo, and Bing ads, etc. Still ok? Can we call this Bing-google-it-for-you innovation?

What do you mean by “ok”?

Anonymous Coward says:

I generally agree with 99.9% of what Mike says, but this is one of the exceptions.

Taking search result X from Google that happens in a Microsoft browser as a result of query Y, and then replicating it with their own search page certainly meets the definition of “copying”, and I personally really struggle to see it as “innovation”.

vivaelamor (profile) says:

Two headed hydra that can't agree

“Honestly to me, it seems like Google’s doing exactly what Techdirt says it should, by socially shunning Microsoft for “cheating”. Maybe it should’ve done it a little more snarky to come out quite a bit more ahead, but still, from what I’ve read, it’s good enough.”

The point is that Google are implying that Microsoft wronged them by cheating, which they technically don’t appear to have and thus stand to generate more bad publicity for crying wolf than if they’d just left out the accusation of cheating. Flaming Mike for saying so is OK. Flaming Mike for saying so and suggesting that he is somehow going against his opinions on shaming actual wrongdoings when he doesn’t think this is an actual wrongdoing is not OK.
