How To Stop The Site Scrapers

One problem that a lot of webmasters with good content have is “scrapers” stealing their content and using it on other sites. Aside from being more than a little bit cheeky, this can also harm the original webmaster's Google rankings, because the copies duplicate the content and so, in Google’s eyes, devalue it. So how do you prevent this? Read on…

One way of tackling “manual” scraping is to use the “anti-select” method: a piece of JavaScript that detects someone either highlighting text on your page or right-clicking to “view source”. But this can hinder genuine attempts to quote you, and it doesn’t stop automated scrapers, which make up the bulk of the thieves.
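As a rough illustration of the anti-select idea, here is a minimal sketch. The function name blockSelection and the choice of events are my own assumptions, not a standard script, and anyone who disables JavaScript will sail straight past it.

```javascript
// Minimal sketch of the "anti-select" idea: cancel the browser events
// behind text selection, the right-click menu and copying.
// `blockSelection` is a hypothetical helper name; `target` is any object
// with an addEventListener method, such as `document` in a browser.
function blockSelection(target) {
  const blocked = ["selectstart", "contextmenu", "copy"];
  const handler = (event) => event.preventDefault();
  for (const name of blocked) {
    target.addEventListener(name, handler);
  }
  return blocked; // returned so the caller can see which events were hooked
}

// Only hook the page when running in a browser.
if (typeof document !== "undefined") {
  blockSelection(document);
}
```

Passing the target in as a parameter means you could apply it to a single article element rather than the whole page, which leaves the rest of the site usable.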

A second and more effective way is to use Copyscape (or their automated “Copysentry” service), which keeps track of your pages and looks around the web for duplicates or partial duplicates. This is a good and popular service that seems to work well, and Copysentry will notify you when dupes are found.

Another method that I have devised is to insert a link and credit at a strategic point in an article, using CSS to make the text the same colour as the background and using HTML character codes (&lt; and &gt;) to render the < and > symbols in the link. To see what I mean, select the text on this page and copy it into a text editor. You will see a Copyright link that you won’t see by just viewing this page!
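To make the hidden-link trick concrete, here is a small sketch that builds the snippet to paste into an article. The helper names escapeHtml and hiddenCredit are made up for illustration, and the white background colour in the example is an assumption; use whatever your page background really is.

```javascript
// Replace the characters that HTML treats specially with their entity codes,
// so the link shows up as literal text rather than a working link.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build a span whose text colour matches the page background.
// `backgroundColour` is an assumption supplied by the caller.
function hiddenCredit(url, backgroundColour) {
  const link = `<a href="${url}">Copyright ${url}</a>`;
  return `<span style="color: ${backgroundColour};">${escapeHtml(link)}</span>`;
}

// Example: a credit for a page with a white background.
console.log(hiddenCredit("http://www.iansims.com", "#ffffff"));
```

On the page the span is invisible, but anyone who selects and copies the text picks up the link markup along with the article.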

The hidden link has the advantage that manual scrapers will unknowingly take your link and link to you. The intelligent ones might remove the link, but will probably still leave the anchor text in place, making it easier for you to track who’s copying you. Do a manual check every now and again for copies, and change the text of the article if you think you are being significantly duplicated. But a warning if you use this method…

Use it in moderation. Firstly, there is little point in using it unless you have a high chance of being scraped, which in practice means having good rankings. Also, don’t put in lots of links or hidden text, as Google can view significant content that is hidden from view as spammy and you could get penalised in the rankings.
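The occasional check for copies can be partly scripted. The sketch below assumes you have already saved the HTML of a suspect page as a string; the function name classifyCopy and its three result labels are my own invention, and it only looks for the credit, so it says nothing about how much of the article was copied.

```javascript
// Classify a suspect page by how much of the hidden credit survived.
// `pageHtml` is the suspect page's HTML as a string (fetching it is up to you).
// `classifyCopy` and its return labels are hypothetical, for illustration only.
function classifyCopy(pageHtml, siteUrl, anchorText) {
  const html = pageHtml.toLowerCase();
  const url = siteUrl.toLowerCase();
  // A working link back to you, written with double or single quotes.
  const hasLink =
    html.includes(`href="${url}"`) || html.includes(`href='${url}'`);
  // The credit text left behind after the link itself was stripped.
  const hasAnchor = html.includes(anchorText.toLowerCase());
  if (hasLink) return "link intact";
  if (hasAnchor) return "anchor text only";
  return "no credit found";
}

// Example: the scraper removed the link but left the credit text behind.
const suspect = "<p>Great article. Copyright http://www.iansims.com</p>";
console.log(
  classifyCopy(suspect, "http://www.iansims.com", "Copyright http://www.iansims.com")
);
```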

<a href="http://www.iansims.com">Copyright http://www.iansims.com</a>

So there you have it. Copyscape is the best solution in my opinion, but not everyone has the time to monitor it. Good luck.
