
I realize that not everyone understands how patents work but this is ridiculous. The patent claims don't mention matrices. Any implementation (like matrices) is merely an embodiment (you can implement the patent without matrices).

And even if that weren't true, the foundation of the patent system is applying existing techniques to new applications. The background section of the patent clearly details how this technique has been used in other applications.

I don't dispute that a lot of patents are trash but this is possibly the most important patent of the last 30 years. That doesn't mean it invented computers, mathematics and the internet, it just put some already good ideas together. That is what invention is.



I realize that not everyone understands how matrices work, but this is ridiculous. The abstract of the patent:

"... the rank of a document is calculated from a constant representing the probability that a browser through the database will randomly jump to the document."

And the specific claim is:

"Looked at another way, the importance of a page is directly related to the steady-state probability that a random web surfer ends up at the page after following a large number of links."

This _explicitly_ describes a Markov Chain, which is naturally represented by a matrix. A variety of versions of the linear equation are explicitly given in the patent.

To claim you can implement the patent without matrices is, for all intents and purposes, wrong. You can implement the same equations in a variety of ways, but they are still matrix equations.
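
To make the "still matrix equations" point concrete, here is a minimal sketch, not taken from the patent: the same random-surfer update computed once from adjacency lists with no matrix object in sight, and once as an explicit matrix-vector product. The four-page graph, the damping value of 0.85, and all function names are assumptions chosen for illustration.

```python
# Hypothetical 4-page link graph: page -> pages it links to.
LINKS = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}
N = len(LINKS)
DAMPING = 0.85  # assumed probability of following a link rather than jumping


def step_with_lists(rank):
    """One update using only adjacency lists (no matrix object)."""
    new = [(1.0 - DAMPING) / N] * N
    for page, targets in LINKS.items():
        share = DAMPING * rank[page] / len(targets)
        for target in targets:
            new[target] += share
    return new


def build_matrix():
    """Column-stochastic matrix M with M[i][j] = P(surfer moves j -> i)."""
    m = [[(1.0 - DAMPING) / N] * N for _ in range(N)]
    for page, targets in LINKS.items():
        for target in targets:
            m[target][page] += DAMPING / len(targets)
    return m


def step_with_matrix(matrix, rank):
    """One update written as the matrix-vector product M @ rank."""
    return [sum(matrix[i][j] * rank[j] for j in range(N)) for i in range(N)]


def iterate(step, iterations=100):
    """Apply the given update repeatedly, starting from a uniform vector."""
    rank = [1.0 / N] * N
    for _ in range(iterations):
        rank = step(rank)
    return rank


MATRIX = build_matrix()
ranks_lists = iterate(step_with_lists)
ranks_matrix = iterate(lambda r: step_with_matrix(MATRIX, r))
```

The two results agree to within floating-point error, since the adjacency-list loop is just a sparse evaluation of the same linear map.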

They patented the idea of applying random walks to ranking webpages. That's arguably reasonably novel, though Wikipedia lists a number of predecessors. But it was also an inevitable invention, because a large number of people are familiar with random walks and Markov processes, which are routinely taught to undergrads and used to model and analyse a vast number of processes [1].

[1] https://en.wikipedia.org/wiki/Markov_chain#Applications


What you quoted is not what the patent “claims”. The claims are the numbered points in the section “Claims”. They are the only legally enforced section of a patent and they are written in a very specific language. Everything else is background or embodiment and has little to no legal value.


> but this is possibly the most important patent of the last 30 years

It is an interesting struggle to figure out what my objection to that point is. I think it is that we know exactly how hard it is to apply linear algebra to a problem - not everyone's cup of tea, but easily 10% of software engineers would be able to do it.

The truly groundbreaking part of Google was never the indexing - that was a problem that was going to get solved one way or another. The groundbreaking part is figuring out that search + low latency + advertising is a money printing machine and that tech favours the winner.

The mechanism to achieve search + low latency + advertising is important but to some degree unimpressive. If the other search engines at the time had realised the payoffs and how important latency was they'd have gone short text-only ads too and put more engineering time into the problem - maybe someone other than Google would be the search engine of the day.

And even if PageRank was the difference between Google and a hypothetical runner up, the difference of a better algorithm would be marginal. Decisive, but ultimately marginal.


Back then Google loaded in a few seconds, while the competition could take over a minute for the 99% of users who were on 28.8k dial-up.

It's almost as if the people behind Lycos/Excite/AltaVista were all using the internet via a T1 connection from their unis.


I think a big factor is that Google didn’t have to make any money and could keep their homepage as simple as possible.


Altavista was actually just a tech demo of their servers :)


And which large organizations did Google have connections to from the start that gave them a competitive advantage in multiple areas?


Stationary distributions (which is what PageRank is) were used for relevancy of scientific references at least 20 years before the PageRank patent. I was sitting in a lecture in '95 or so about the Perron-Frobenius theorem, and this was given as an (old but not very old) example of an application at the time.


Sure, but the novelty of the patent is The Random Surfer Model. That is, applying that math to the ranking of web pages. The novelty is looking at the problem from the right perspective. After you have read the paper, and seen it demonstrated to work, the "invention" is very obvious. But before that, it really isn't.


That lecture I was in described "The random waterfall model", in which you find a scientific paper, randomly pick one of the references, go to it, and continue -- and IIRC, at a small percentage, jump to a random other paper. The professor was not describing his own work, but one that was published a decade or three before.
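
The walk described above can be sketched as a simulation. This is a minimal illustration under assumed values, not the published method the lecture referred to: the five-paper citation graph, the 10% jump probability, the step count, and the seed are all hypothetical. Treating a paper with no references as a forced random jump is also an assumption.

```python
import random

# Hypothetical citation graph: paper -> papers it references (older ones).
REFERENCES = {4: [3, 0], 3: [0, 1], 2: [0], 1: [0], 0: []}
PAPERS = sorted(REFERENCES)
JUMP_PROBABILITY = 0.10  # assumed "small percentage" of random jumps


def simulate_walk(steps=50_000, seed=0):
    """Return the fraction of steps the random reader spends on each paper."""
    rng = random.Random(seed)
    visits = {paper: 0 for paper in PAPERS}
    current = rng.choice(PAPERS)
    for _ in range(steps):
        visits[current] += 1
        refs = REFERENCES[current]
        # Jump when the coin says so, or when there is nothing left to follow.
        if not refs or rng.random() < JUMP_PROBABILITY:
            current = rng.choice(PAPERS)
        else:
            current = rng.choice(refs)
    return {paper: count / steps for paper, count in visits.items()}


frequencies = simulate_walk()
most_visited = max(frequencies, key=frequencies.get)
```

With enough steps, the visit frequencies approximate the stationary distribution of the walk, so the paper that every other paper cites ends up visited most often.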

As far as I can tell (and could tell the day I heard PageRank described a few years later), there was no difference between that and PageRank, although there is a huge practical difference in that scientific papers can only ever refer to those that were published before them (or at least were in preparation at the time), whereas web pages are edited and can point to any other web page.

The "reference rank" application is not a DAG because of the "in preparation" links, although it is not far from one - so the "jump to random paper" is much more important to produce a useful stationary distribution than in PageRank - but it is otherwise the same.

Page and Brin did a lot of interesting things, many of which weren't trivial, and were hugely rewarded for that by society. But PageRank was an application of an old idea to a new medium, not a new idea - in a way that (on its face) should not deserve patent protection.

I remember Google's first days - the main selling point for the majority of people I knew was not "it finds what I want when other search engines don't" - people had learned to direct AltaVista properly, more or less. The selling point was "It gives an answer in milliseconds instead of tens of seconds". In fact, I remember complaints because it lacked the "and/or/not/near" and other features that AltaVista and Lycos already had.


You have summarized everything superbly.


> The patent claims don’t mention matrices

What? That’s the entire idea of the patent: using repeated matrix multiplication to compute the relative “importance” of various nodes in a directed graph.

> you can implement the patent without matrices

How?


The idea sure was brilliant at the time. But I really doubt that we (as a society) have gained anything by allowing this to be protected. It might have prevented some healthy competition. Certainly anyone thinking about search engines (once this became a thing) would have thought of this. And I doubt that nobody would have bothered to create a search engine just because it couldn't be patented.



