Panda, Penguin, Linking, Anchor Text: a detective story.
The FIRST thing we have to realize is that correlation is NOT causation.
The SEO industry is abuzz with Panda/Penguin theories claiming that using xx% of
"money words" as the anchor text in inbound links will trigger a penalty in
Google's search results.
I sincerely doubt that this is an issue because, in an ideal organic linking
event, the anchor text is not a factor under the control of the site
owner or its webmasters. Google understands this.
If Google actually had a penalty for too many repetitions of anchor text, it would
apply across the board to all kinds of anchor text, even naked links optimized
with the title of the article.
It doesn't. A 100-link sample for nbs-seo.com shows that 86% of the links have the
money words "nbs seo" in their anchor text, and a search for nbs seo still turns
my site up at #1.
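For what it's worth, this kind of anchor-text sampling is easy to reproduce. Here is a minimal Python sketch; the anchor list is invented purely for illustration and simply mirrors the 86-out-of-100 split described above:

```python
from collections import Counter

# Hypothetical sample of 100 inbound-link anchor texts (invented for
# illustration; a real audit would pull these from a backlink export).
anchors = ["nbs seo"] * 86 + ["click here"] * 8 + ["http://nbs-seo.com"] * 6

# Tally each distinct anchor text and report its share of the sample.
counts = Counter(anchors)
total = len(anchors)
for text, n in counts.most_common():
    print(f"{text!r}: {n / total:.0%}")
```

Running this over a real backlink export would show at a glance whether one "money word" dominates the profile.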
What is happening is that Google is looking at PATTERNS, not at the actual
anchor text. When a paid linking program is run, the organic linking pattern changes.
Daily and monthly link totals deviate from the norm, and it is these metrics that
Google monitors in an effort to find the sites that are trying to influence its results.
Let's step back and look at what Google is trying to accomplish: increasing
the real value of links and cutting down the amount of money it wastes
chasing links in phony link profiles.
The SEO industry is a morass of unethical link building.
"Unethical" in the sense that companies try to influence the results of Google's
algos by placing links primarily designed to influence search results and not to
just drive traffic.
These links are promoted as having an "organic profile" but are actually paid links
designed to manipulate Google's results.
Such a profile can never be properly organic.
Google has had problems with non-organic linking right from the start.
Over the years it has developed a long list of the basic spam and link types that it fights:
3 way linking
Buying expired domains
Using world-writable pages
Spam in blogs
Referrer log spamming
Linking only on high PR pages
Linking only on dofollow pages
Faked link wheels
Social networking spam
Hidden or invisible text
Press Release flooding
Drive-by spam linking
Alt text spam
Anchor text spam
Other types of spamdexing
These kinds of linking schemes developed as a result of
Google's use of link metrics to grade web pages in its PageRank equations.
Google brought in the nofollow tag (in 2005) to discourage the use of links as an
influence and to stop blog and comment spam.
Still, these changes did not stop the flow of links intended to skew
their search results.
If we look at the various updates, we see that Google came to understand that
it is difficult, if not impossible, to police what you cannot control.
PageRank was not, as intended, a citation-based calculation. As a trusted
influence on the search results, it was a failure.
A new plan was put into operation.
To regain control and make links serve as originally intended, Google had to remove user influence and base results on relevance.
This meant disabling links as an actionable metric, and basing both SERPs and
rankings on relevance instead.
Back after the very first unnamed Google update (the first documented update,
September 2002), a WebmasterWorld discussion saw the comment:
"I think Google has
gotten much more smarter in determining which links are there for the benefit of
page rank and rank manipulation, and which are there for honest reasons."
Let's look at some of the earlier updates that
specifically looked at linking:
“Boston” 2003 February – Sites that backlinked each other take a hit
while sites that constantly update and have many incoming links are promoted.
“Cassandra” 2003 April – Addressed hidden links and linking between sites
owned by the same owners.
“Dominic” 2003 May – Google did a deep crawl of the web, and reported
backlinks changed.
“Esmerelda” 2003 June – Key website phrases are used to rank pages.
“Brandy” 2004 February – Google begins looking at anchor text, link
neighborhoods and synonyms.
“Nofollow” 2005 January – Nofollow tag introduced.
“Allegra” 2005 February – Sites changed rankings but the specifics were
unclear. It was meant to remove high ranking spam from the index.
“Gilligan” 2005 September – An unclear change, but Google confirmed that the
index data was updated daily while Toolbar PR was only updated every three
months.
“Jagger” 2005 September-November – A series of updates targeting linking
practices; PageRank updates were also made public.
“Big Daddy” 2005 December-February 2006 – Cleaned up Google’s handling of URLs
and linking, though some websites suffered needlessly.
Then there were no announced link-based updates until a
Google VP warned us in 2008 that links do not count as much as before.
On April 17, 2008, Udi Manber, Google's
search boss, wrote:
We’re innovating, and concentrating just on the relevancy of
results. (As opposed to link influence.)
Later, in another post he told us that PageRank was now a part of a much
larger system (from its position as a highly influential factor).
On May 20, 2008, Manber again posted:
The most famous part of our ranking algorithm is
PageRank, an algorithm developed by Larry Page and Sergey Brin, who
founded Google. PageRank is still in use today, but it is now a part of a
much larger system. (My bold)
In the same post he expounded on his earlier comment on relevance:
Other parts include language models (the ability to handle phrases,
synonyms, diacritics, spelling mistakes, and so on), query models (it's not
just the language, it's how people use it today), time models (some queries
are best answered with a 30-minutes old page, and some are better answered
with a page that stood the test of time), and personalized models (not all
people want the same thing).
Demoting PageRank has come about because of the difficulty of separating the organic from
the marketing links designed to influence Google's results.
With the demotion, the onus was now on word recognition, text formatting,
position and relevance.
In an article on semantic search, it was stated: To illustrate the
point, around two years ago (ED: 2008) Google took a different approach to
teaching a computer how to understand languages, one more like the way
humans learn them. And while this is not conclusive proof, it does indicate
the direction that Google is taking in regards to word recognition.
We also get a glimpse into the internal workings with Manber's comment:
There is a whole team that concentrates on fighting
webspam and other types of abuse. That team works on variety of issues
from hidden text to off-topic pages stuffed with gibberish keywords, plus
many other schemes that people use in an attempt to rank higher in our
search results. The team spots new spam trends and works to counter those
trends in scalable ways; like all other teams, they do it internationally.
The webspam group works closely with the
Google Webmaster Central team, so they can share insights with everyone
and also listen to site owners.
As we move forward in time, we see that
Google removed PageRank from its Webmaster Tools in 2009.
Susan Moskwa, a Google Webmaster Trends analyst, said:
We've been telling people for a long time that they shouldn't focus on
PageRank so much....
Google removed PR from the Webmaster Tools, but not from the public's toolbar.
They had a way of grading pages which was flawed, but plans were afoot. (I
assume that being the founders' "baby" played a part in PageRank not being totally
scrapped.) Those plans showed in the "Mayday" update, when Google changed PR from a
strictly mathematical formula to one based on relevance.
On 27 May 2010, Google confirmed the "Mayday"
update, and Vanessa Fox, an ex-Google employee, said in an interview:
I asked Google for more specifics and they told me that it was a
rankings change, not a crawling or indexing change, (My bold).
Google also commented, when asked during Q&A: "This
is an algorithmic change in Google, looking for higher quality sites to
surface for long tail queries. It went through vigorous testing and isn’t
going to be rolled back."
If we look at this statement closely, it tells us that the Googlebot's
visits have not changed in frequency or depth (crawling).
It tells us that their index (SERPs) remained the same (indexing).
But the ranking was changed to something different (PageRank).
The ONLY interpretation that can be made of the triumvirate of
crawling/indexing/ranking is Spidering/SERPs/PageRank.
This has been borne out in several PageRank gradings on different sites.
In one instance a PR of 4 was gained by having 1xPR5, 1xPR3, and 113xPR0
RELEVANT links, something that would not have been possible before the
change.
Sometime in 2012, in an article about
Jaime Casap, Google's "Education
Evangelist," posted on the W.P. Carey School of Business
website, it was said:
What has changed is the way information is organized in the Google
search process: ad results (banner ads and Google ads), organic results,
video results, news results and real time results. (Most telling is the
part:) The company has trademarked and patented its PageRank system,
which ranks web pages by relevance.
This is nowhere near the original PageRank (PR)
formula, which was based on a strictly mathematical process: it
calculated the linked page's PR from the amount and quality of its links, with
quality being based on the PR of the linking pages.
PR(A) = (1-d) + d(PR(t1)/C(t1) + ... + PR(tn)/C(tn))
In this algorithm, there is no use of relevance.
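The original formula can be made concrete with a short sketch in Python. The three-page link graph below is invented for illustration, and d = 0.85 is the damping factor commonly cited from the original paper; notice that nothing in the calculation involves relevance, only link counts and the PR of the linking pages:

```python
# Minimal sketch of the original PageRank iteration:
#   PR(A) = (1-d) + d * sum(PR(t)/C(t)) over each page t linking to A,
# where C(t) is the number of outbound links on page t.

def pagerank(links, d=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = set(links) | {p for targets in links.values() for p in targets}
    pr = {p: 1.0 for p in pages}  # every page starts at PR 1.0
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Sum the contributions PR(t)/C(t) from every page t linking here.
            incoming = sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_pr[page] = (1 - d) + d * incoming
        pr = new_pr
    return pr

# Invented three-page graph: A links to B and C, B links to C, C links to A.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
# Nowhere above does topical relevance appear -- only the link structure.
```

Pages with more, and better-ranked, inbound links end up with higher PR; that is the entire original mechanism.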
Let's look at the Whiteboard Friday post on the Penguin update.
(PS: Look at the anchor text in the above link. If Google were penalizing sites
for excessive anchor text usage, this page would NEVER appear in the SERPs, yet
it is #1 for "the penguin update whiteboard friday".)
The presenter tells us how his company has changed the way it places paid
links, using more varied anchor text and fewer money words, and how that has had a
positive impact on its clients.
In other words, cheating gets results, (he thinks).
Here is the statement from their article:
"This week's Whiteboard Friday covers the recent Penguin Update, including what
to do and what not to do. I certainly wouldn't say that it's a comprehensive
guide, but it does discuss the issues and causes that I have witnessed.
Fortunately Ayima's campaigns have been unaffected (other than increases) by the
update."
If we go and look at their own site and their clients' sites, we find:
designbuzz.co.uk (Their design leg)
This doesn't look like increases, quite the opposite.
The cumulative effects of Panda, Penguin, and the rest of the changes that
Google made are producing an overall lowering of internet traffic.
Google has been working up to this for quite a while.