Page Cloaking - To Cloak or Not to Cloak
By Sumantra Roy
Page cloaking can broadly be
defined as a technique used to deliver different web pages under different circumstances.
There are two primary reasons that people use page cloaking:
i) It allows them to create a
separate optimized page for each search engine and another page which is aesthetically
pleasing and designed for their human visitors. When a search engine spider visits a site,
the page which has been optimized for that search engine is delivered to it. When a human
visits a site, the page which was designed for the human visitors is shown. The primary
benefit of doing this is that the human visitors don't need to be shown the pages which
have been optimized for the search engines, because the pages which are meant for the
search engines may not be aesthetically pleasing, and may contain an over-repetition of
keywords.
ii) It allows them to hide the
source code of the optimized pages that they have created, and hence prevents their
competitors from being able to copy the source code.
Page cloaking is implemented
by using some specialized cloaking scripts. A cloaking script is installed on the server,
which detects whether it is a search engine or a human being that is requesting a page. If
a search engine is requesting a page, the cloaking script delivers the page which has been
optimized for that search engine. If a human being is requesting the page, the cloaking
script delivers the page which has been designed for humans.
There are two primary ways by
which the cloaking script can detect whether a search engine or a human being is visiting
a site:
i) The first and simplest way
is by checking the User-Agent variable. Each time a client (be it a search engine spider or
a browser being operated by a human) requests a page from a site, it reports a User-Agent
name to the site. Generally, if a search engine spider requests a page, the User-Agent
variable contains the name of the search engine. Hence, if the cloaking script detects
that the User-Agent variable contains a name of a search engine, it delivers the page
which has been optimized for that search engine. If the cloaking script does not detect
the name of a search engine in the User-Agent variable, it assumes that the request has
been made by a human being and delivers the page which was designed for human beings.
However, while this is the
simplest way to implement a cloaking script, it is also the least safe. It is pretty easy
to fake the User-Agent variable, and hence, someone who wants to see the optimized pages
that are being delivered to different search engines can easily do so.
ii) The second and more
complicated way is to use I.P. (Internet Protocol) based cloaking. This involves the use
of an I.P. database which contains a list of the I.P. addresses of all known search engine
spiders. When a visitor (a search engine or a human) requests a page, the cloaking script
checks the I.P. address of the visitor. If the I.P. address is present in the I.P.
database, the cloaking script knows that the visitor is a search engine and delivers the
page optimized for that search engine. If the I.P. address is not present in the I.P.
database, the cloaking script assumes that a human has requested the page, and delivers
the page which is meant for human visitors.
Although more complicated than
User-Agent based cloaking, I.P. based cloaking is more reliable and safer because it is
very difficult to fake the I.P. address from which a request is made.
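To make the two detection methods described above concrete, here is a minimal sketch in Python. It is only an illustration of the logic, not the code of any actual cloaking script. The spider names, the I.P. range (a reserved documentation range) and the function names are all hypothetical assumptions.

```python
import ipaddress

# Hypothetical example data; a real script would rely on a maintained database.
SPIDER_NAMES = ("examplebot", "samplecrawler")
SPIDER_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]  # documentation range


def is_spider_by_user_agent(user_agent: str) -> bool:
    """Return True if the User-Agent string contains a known spider name."""
    ua = user_agent.lower()
    return any(name in ua for name in SPIDER_NAMES)


def is_spider_by_ip(remote_addr: str) -> bool:
    """Return True if the requesting address is listed in the I.P. database."""
    addr = ipaddress.ip_address(remote_addr)
    return any(addr in network for network in SPIDER_NETWORKS)


def choose_page(user_agent: str, remote_addr: str, use_ip: bool = True) -> str:
    """Pick which version of the page the request would receive."""
    if use_ip:
        spider = is_spider_by_ip(remote_addr)
    else:
        spider = is_spider_by_user_agent(user_agent)
    return "optimized" if spider else "human"
```

Note that a request which claims a spider's name but arrives from an unlisted address gets the optimized page under the User-Agent check and the human page under the I.P. check. This is why the User-Agent method is the easier one to fool.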
Now that you have an idea of
what cloaking is all about and how it is implemented, the question arises as to whether
you should use page cloaking. The one-word answer is "NO". The reason is simple:
the search engines don't like it, and will probably ban your site from their index if they
find out that your site uses cloaking. The reason that the search engines don't like page
cloaking is that it prevents them from being able to spider the same page that their
visitors are going to see. And if the search engines are prevented from doing so, they
cannot be confident of delivering relevant results to their users. In the past, many
people have created optimized pages for some highly popular keywords and then used page
cloaking to take people to their real sites which had nothing to do with those keywords.
If the search engines allowed this to happen, they would suffer because their users would
abandon them and go to another search engine which produced more relevant results.
Of course, a question arises
as to how a search engine can detect whether or not a site uses page cloaking. There are
three ways by which it can do so:
i) If the site uses User-Agent
cloaking, the search engines can simply send the site a spider which does not report the
name of the search engine in the User-Agent variable. If the search engine sees that the
page delivered to this spider is different from the page which is delivered to a spider
which reports the name of the search engine in the User-Agent variable, it knows that the
site has used page cloaking.
ii) If the site uses I.P.
based cloaking, the search engines can send a spider from an I.P. address which they
have not used previously. Since this is a new I.P. address, the I.P.
database that is used for cloaking will not contain this address. If the search engine
detects that the page delivered to the spider with the new I.P. address is different from
the page that is delivered to a spider with a known I.P. address, it knows that the site
has used page cloaking.
iii) A human representative
from a search engine may visit a site to see whether it uses cloaking. If she sees that
the page which is delivered to her is different from the one being delivered to the search
engine spider, she knows that the site uses cloaking.
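The first of these checks is easy to sketch in Python: request the same page twice, once reporting a spider's name in the User-Agent and once reporting a browser's, and compare what comes back. This is only an illustration under stated assumptions. The function names and the default User-Agent strings are hypothetical, and nothing here is claimed to be how any particular search engine actually does it.

```python
import hashlib
import urllib.request
from typing import Callable

# A fetcher takes (url, user_agent) and returns the raw page body.
Fetch = Callable[[str, str], bytes]


def http_fetch(url: str, user_agent: str) -> bytes:
    """Fetch a page over HTTP while reporting the given User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()


def looks_cloaked(
    url: str,
    fetch: Fetch = http_fetch,
    spider_agent: str = "ExampleBot/1.0",  # hypothetical spider name
    browser_agent: str = "Mozilla/5.0",
) -> bool:
    """Return True if the two User-Agents receive different page bodies."""
    as_spider = hashlib.sha256(fetch(url, spider_agent)).hexdigest()
    as_browser = hashlib.sha256(fetch(url, browser_agent)).hexdigest()
    return as_spider != as_browser
```

A byte-for-byte comparison like this is deliberately crude: pages with rotating banners or timestamps differ from one request to the next without any cloaking, so a real check would presumably need a more tolerant comparison. The I.P. based check works the same way, except that the second request is made from an address the site has not seen before rather than with a different User-Agent.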
Hence, when it comes to page
cloaking, my advice is simple: don't even think about using it.

Sumantra Roy is
a search engine positioning specialist on the Internet. For more articles on search engine placement, subscribe to
his 1st Search Ranking Newsletter by sending a blank email to mailto:1stSearchRanking.999.99@optinpro.com or by going to http://www.1stSearchRanking.com