Google Trust Rank, an update…
Posted by Rainer on November 16th, 2007 at 09:00pm
I have done a lot of research since I wrote my original article on Google Trust Rank. I was simply too curious not to track down this beast.
First of all, I have to admit that a good number of my technical assumptions and conclusions were simply wrong. My thoughts about this being rumor, however, survived the test. With my new knowledge, I think TrustRank really exists, and most probably has for a while already - but many folks (including me, up until recently) simply misunderstand it. And that creates rumors around TrustRank that are not true.
The picture began to clear up for me when I found a scientific paper on TrustRank. It is from Stanford University, but it is not Google-specific. In fact, it used AltaVista as its test bed. However, I am pretty sure that Google has paid attention to it, if they did not even develop something similar themselves in their lab.
What also helped to get the big picture was an interesting report about Google’s search labs in the New York Times. While it has no specific details on TrustRank, it has a lot of things that can be read between the lines.
I will try to sum up what I think is most important about this concept. It is my personal opinion - read the sources yourself, you may draw different conclusions. Keep in mind that Google’s TrustRank is probably different from what was in the paper. But I think it will share the basic ideas, otherwise it would probably be called something different (oh, I forgot that Google doesn’t call it anything after all… ;)).
Most importantly, TrustRank can be algorithmically computed. So my number one invalid assumption in the previous article was that TrustRank depends solely on human review. Quite the opposite is true, and it now fits much better into my overall picture of Google.
TrustRank (TR) is in many ways similar to PageRank (PR). Just the way it starts is different. Let’s ignore that for now. As with PageRank, TrustRank can (and will) be passed from one page to another. A link to a page is a vote for that page. Part of the linking page’s TR will be carried over to the linked page. How much depends on many factors and shall not be of interest to us here. The important fact is that TR calculation is pretty similar to PageRank (PR) calculation.
What is totally different is the way the initial ranks are calculated. With PR, every site’s (link) votes are equal. In (too) simple words, you crawl the web once, count how many links a page receives, and the most linked page has the highest PR. All fully automatic - and all subject to spam or SEO (to phrase it a little less upsettingly).
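To make the "(too) simple" view of PR concrete, here is a minimal Python sketch that just counts incoming links on a made-up toy web. The graph and the name `inlink_counts` are purely illustrative assumptions, and real PageRank does considerably more than counting.

```python
from collections import Counter

# Hypothetical toy web: page -> pages it links to.
links = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}

def inlink_counts(links):
    """Count how many links each page receives (the over-simplified PR view)."""
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in targets:
            counts[target] += 1
    return counts

counts = inlink_counts(links)
# The page with the most incoming links "wins" in this naive model.
most_linked = max(counts, key=counts.get)
```

Because every link counts the same here, anyone who can create many links can push a page up, which is exactly the weakness described above.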
TrustRank, on the other hand, requires manual labor to get started. Humans need to review sites and check how trustworthy they are. Are they spam? Do they have good information? Are they set up as a trap for the reviewer (e.g. have good information now but are scheduled to change after acquiring trust)? Is the site owner trustworthy? Just think about it: a government is probably more trustworthy than a private body, which in turn is more trustworthy than the average Joe (OK, some may argue about that, but I think you get the idea…). So even real-world, non-virtual trust plays a role in human review.
It is impractical to review all web sites. It is impractical to review a small fraction of the sites. And it is even impractical to review a fraction of this small fraction. Only a very, very small number of sites can actually undergo human review. So TR needs to be able to deliver good results based on a small, select set of sites. Let’s call these sites “seed sites”. As their number is small, their selection is very important. It, too, can be done automatically. For example, sites that rank high on the search engine result pages (SERPs) could be chosen, or those with many outgoing links.
The actual method to select them shall not be our concern here. For Google, it will remain a secret anyhow. What is important is that the seed sites get selected by some parameters that qualify them. This is (intentionally) very vague, but the point to note is that there must be a reason to be in that set. It does not happen just by accident.
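As an illustration of one of the automatic criteria mentioned above (many outgoing links), a seed pre-selection could be sketched like this in Python. The toy graph and the function name `pick_seed_candidates` are assumptions made up for this example, not anything a real search engine is known to use.

```python
# Hypothetical toy web: page -> pages it links to.
links = {
    "hub.example": ["a.example", "b.example", "c.example"],
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": [],
}

def pick_seed_candidates(links, k):
    """Return the k pages with the most distinct outgoing links.

    These are only candidates: humans would still have to review them.
    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(links, key=lambda page: (-len(set(links[page])), page))
    return ranked[:k]

candidates = pick_seed_candidates(links, 2)
```

The point of the sketch is only that membership in the seed set follows from a measurable property of the site, so there is a reason for being in it.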
In the case of a real-world search engine, I’d also say that the seed set is not fixed, but is being worked on all the time. So we do not have a static set, but one that evolves over time. Just think about the spam busters that each search engine employs. I guess any site detected to be spammy will also become part of the seed set for TrustRank - with a thumbs down vote. And while I am speculating: I’d assume that there also is a time value that comes with the human vote - a more recent review will count more than a review done months ago. But that is pure speculation. For a software developer like me, it just sounds like the right thing to do…
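Staying in speculation mode, one simple way a developer might implement such a time value is exponential decay. This Python sketch is purely hypothetical: the function `review_weight`, the +1/-1 vote encoding, and the 180-day half-life are all invented for illustration.

```python
def review_weight(vote, age_days, half_life_days=180.0):
    """Weight of a human review that fades with age.

    vote is +1 (thumbs up) or -1 (thumbs down); the weight halves
    every half_life_days. All numbers here are assumptions.
    """
    return vote * 0.5 ** (age_days / half_life_days)

fresh_up = review_weight(+1, 0)      # a review done today
old_down = review_weight(-1, 360)    # a thumbs down from about a year ago
```

With these assumed numbers, a fresh thumbs up counts four times as much as a thumbs down that is two half-lives old.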
The seed sites are reviewed to be either trustworthy or not. Note that a vote of “not trustworthy” takes some trust away from the sites the seed site links to. This is basically known from PR too - the old “do not go into a bad link neighborhood” paradigm.
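Putting the pieces together, trust flowing from reviewed seeds along links can be sketched as a PageRank-style iteration that restarts only at the trusted seeds. This is a minimal Python sketch under stated assumptions: the toy graph, the name `propagate_trust`, the damping value 0.85 and the iteration count are illustrative choices. For simplicity it models only thumbs-up seeds and leaves out the negative votes described above.

```python
# Hypothetical toy web: page -> pages it links to.
links = {
    "seed.example": ["a.example"],
    "a.example": ["b.example"],
    "spam.example": ["spam2.example"],
    "spam2.example": ["spam.example"],
}

def propagate_trust(links, good_seeds, damping=0.85, iterations=20):
    """Spread trust from human-approved seeds along outgoing links."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    # Only approved seeds start with (and keep receiving) trust.
    seed_share = {p: (1.0 / len(good_seeds) if p in good_seeds else 0.0)
                  for p in pages}
    trust = dict(seed_share)
    for _ in range(iterations):
        new_trust = {p: (1.0 - damping) * seed_share[p] for p in pages}
        for source, targets in links.items():
            if not targets:
                continue  # pages without outgoing links pass nothing on
            share = damping * trust[source] / len(targets)
            for target in targets:
                new_trust[target] += share
        trust = new_trust
    return trust

trust = propagate_trust(links, {"seed.example"})
```

In this toy graph the two spam pages link only to each other and receive no trust at all, while trust fades with every hop away from the seed.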
Based on the (ever changing) seed set and the (ever changing ;)) PageRank-like TrustRank algorithm, trust is assigned to each and every page. As with PageRank, the closer you are to a trusted site, the more trust you receive (or the more is taken away from you, if you are linked to from a bad page). The TR calculation itself is purely automatic, no human intervention required. The end result is a nice TR value for each page. That value will be ever-changing too, but for a given moment in time it has a specific value. Let’s freeze time now and think about what that value means…
… it is absolutely up to the search engine what it means! Of course, TR will be used to order pages in the SERPs. So it will be used to decide if your site will be shown on page 1 or 1,000. But TrustRank alone, IMHO, would be far too weak to be used as the sole, or major, source of search result page sort order. I guess that Google will use TR as one parameter it uses to compute the overall value that it assigns to a page with regard to a given search term. I don’t mean PageRank here, which I consider to be just another parameter. I am sure there are a myriad of other parameters. The NY Times report has quite a good explanation of what may be considered, so if you would like more ideas, go and read it.
The question is how much weight Google assigns to TR and PR. You’ll probably never find an official Google answer. And, to be honest, I don’t think one is even needed. It is obvious that Google will tweak that part of the algorithm the same way it tweaks other parts of it. So, for example, the weight may be a number x for a given search term and a value of y for another. And the very next day it may even be completely different, because the Google search team has had another bright idea.
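The idea of per-term weights can be shown with a tiny Python sketch of a weighted sum. Everything in it is an assumption for illustration: the signal names, the numbers, and the function `combined_score` are made up, and nothing is known about how Google actually combines its parameters.

```python
def combined_score(signals, weights):
    """Weighted sum of ranking signals; signals without a weight are ignored."""
    return sum(weights.get(name, 0.0) * value
               for name, value in signals.items())

# Two hypothetical pages: one highly trusted, one heavily linked.
page_a = {"trust_rank": 0.9, "page_rank": 0.2}
page_b = {"trust_rank": 0.2, "page_rank": 0.9}

# Hypothetical weights x and y for two different search terms.
weights_x = {"trust_rank": 0.7, "page_rank": 0.3}
weights_y = {"trust_rank": 0.2, "page_rank": 0.8}

a_wins_for_x = combined_score(page_a, weights_x) > combined_score(page_b, weights_x)
b_wins_for_y = combined_score(page_b, weights_y) > combined_score(page_a, weights_y)
```

The same two pages swap places depending on the weights, which is why a change in weights alone can look like a brand-new algorithm from the outside.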
Speculation again: what I think happened with the last ranking update is that Google probably changed the weights as well as some other parameters in its algorithm. I do not think they introduced TrustRank for the first time. It has been known for too long for Google to have adopted it only then. But they’ve probably given it a boost to combat what they consider spam.
So, what’s the lesson to learn? Unfortunately, I cannot (and will not) offer any black hat SEO here: nothing has really changed. Google likes sites that get links from authority sites. Google is probably making it harder to fake being an authority site. They don’t like it if you get your link from that poor and unmaintained, heavily spammed link directory of university department x. They like it, however, if you get that same link from the hard-to-obtain spot on that same university’s home page. The same applies to other authority sites. I guess the bar has been raised in this area.
For us webmasters, it means that it is even more important to try getting links from high-profile sites. Sounds surprising? I hope not… I know it is hard to do that, but I like the idea that there is a reward for high quality content. And, of course, black hats will sneak in and find their ways around the new algorithm. But that, too, will not last too long.
If you intend to build a long-lived site, there is no way around creating high quality, unique content. That will bring the best reward in the long term. And, after all, isn’t that why humans (aka visitors) like and visit web sites?