How to Compare Pages for Duplicate Content

Search engines now use duplicate content filters to reduce spam by filtering out duplicate content, whether it appears across multiple websites or within a single website. Duplicate content can be a deliberate attempt to trick a search engine into ranking a site higher, or it can be created unknowingly.

When a search engine spider crawls a website, it “reads” the pages, stores the information, then compares that information to what is already stored in its database. From there it determines what it perceives as duplicate content and filters that content out. Pages that contain enough duplicate content to set off the filter are considered spam, even if they were never intended to be.
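Search engines do not publish how their filters compare pages, so the following is only a minimal sketch of one common textbook approach: split each page's text into overlapping word sequences ("shingles") and measure how many the two pages share (Jaccard similarity). The function names `shingles` and `jaccard_similarity` and the shingle size of 3 are assumptions made for illustration, not part of any search engine's documented behavior.

```python
import re

def shingles(text, size=3):
    """Return the set of overlapping word sequences (shingles) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard_similarity(text_a, text_b, size=3):
    """Shared shingles divided by all distinct shingles, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 1.0  # Assumption: two empty texts count as identical.
    return len(a & b) / len(a | b)

page_one = "the quick brown fox jumps"
page_two = "the quick brown fox sleeps"
print(jaccard_similarity(page_one, page_two))
```

A score near 1.0 means the two texts are almost the same, while a score near 0.0 means they share very little wording. Where a real filter draws the line is not known.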

For example, if you run an eCommerce site and use the manufacturer’s product descriptions, and your competitors selling the same products use those descriptions too, that duplicate content is considered spam. Likewise, if you publish an article that is then republished by many other sites, it can be seen as spam and filtered out.

Comparing specific pages with a similarity checker can help you avoid being penalized for what looks like a black hat SEO tactic.

The Similar Page Checker tool from webconfs.com allows you to compare the content of two different URLs to determine what percentage of the content is similar. Though the exact percentage of similar content that triggers the filter isn’t known, and it varies among the search engines, it is wise to keep the similarity as low as you can. This tool can be very helpful in showing you where the duplicate content is so you can make alterations to keep each page unique.
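If you want a rough check of your own, the sketch below uses Python's standard `difflib` module to estimate a similarity percentage for two pieces of page text and to list the passages they share. It does not reproduce how the Similar Page Checker works, which is not described here. The function names, the `min_words` cutoff, and the sample product descriptions are all hypothetical, and the code assumes you have already extracted the visible text from each page.

```python
from difflib import SequenceMatcher

def _matcher(text_a, text_b):
    """Build a word-level matcher for two texts, ignoring letter case."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    return words_a, SequenceMatcher(None, words_a, words_b, autojunk=False)

def percent_similar(text_a, text_b):
    """Estimate how similar two texts are, as a percentage from 0 to 100."""
    _, matcher = _matcher(text_a, text_b)
    return round(100 * matcher.ratio(), 1)

def shared_passages(text_a, text_b, min_words=5):
    """List runs of at least min_words consecutive words found in both texts."""
    words_a, matcher = _matcher(text_a, text_b)
    return [
        " ".join(words_a[block.a:block.a + block.size])
        for block in matcher.get_matching_blocks()
        if block.size >= min_words
    ]

# Hypothetical product descriptions from two different pages.
mine = "our shop sells the best blue widgets in town today"
theirs = "we think the best blue widgets in town are here"
print(percent_similar(mine, theirs))
print(shared_passages(mine, theirs))
```

The passages this returns are the ones to rewrite first, since they are the wording both pages have in common.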