Everyone dealing with search engine optimization (SEO) has heard about the need to avoid duplicate content. Despite this, nearly three in 10 sites have duplicate content issues, making this the fifth most common on-page SEO problem, a Raven Tools study found.
Part of the problem is that many SEO professionals are confused about what duplicate content is, when it’s a problem, and what to do about it. Some myths about duplicate content have circulated despite Google’s efforts to correct them, amplifying the confusion.
Managing duplicate content can get especially challenging if your brand has product pages and other content that span multiple platforms. If you’re a CMO trying to decide how to manage your brand’s duplicate content issues, here are some background and guidelines to assist you.
What Is Duplicate Content?
Google’s official guidelines on duplicate content define it as substantial content blocks on a website or across multiple sites that either entirely replicate other content or are significantly similar. Google distinguishes between malicious and non-malicious duplicate content.
Malicious duplicate content is content that tries to deceptively manipulate search engine results or drive more traffic by placing the same content on multiple sites. Google’s webmaster guidelines elaborate on what is considered malicious.
Malicious content is characterized by tricks that create low-quality content with little value for the end user. Examples include auto-generated content, scraped content, duplicates of articles that have already been published elsewhere, pages with minimal original content, irrelevant keywords, hidden text and links, cloaking, sneaky redirects, and link schemes.
Google also recognizes non-malicious forms of duplicate content. For instance, ecommerce sites may have the same product page shown or linked from distinct URLs. Many online articles have a printer-only version which provides the same content in printable form for user convenience. Discussion forums can have distinct versions for desktop and mobile users.
How Do Search Engines Handle Duplicate Content?
Google and other major search engines such as Yahoo and Bing discourage certain types of duplicate content because it’s bad for business. Search engines build their reputations on their ability to deliver quality, unique search results. If a search turns up 10 pages of listings that all show the same content, the search is effectively useless to users, which lowers the value of the search engine that generated the results.
However, as noted, not all duplicate content is malicious, so search engines handle malicious and non-malicious content differently. If Google notices a site using malicious duplicate content techniques, or receives a report complaining about a site, the site may be subject to a manual review. Reviews are prioritized based on user impact.
Depending on what the review determines, the site may be demoted in Google’s listings, removed from Google’s listings, or subject to other actions. Some actions do not have an effect that is immediately obvious. When Google performs a manual action on a site, the webmaster receives a notice. The webmaster may then correct the problem and submit a request for reconsideration by Google.
Google handles non-malicious duplicate content differently. Approximately 25 to 30 percent of online content is duplicate, says Google head of search spam Matt Cutts. Google regards most of this as normal and does not penalize it.
However, if there are multiple versions of your content online, Google still must decide which version to list in search engine results. It does this through a filtering process that compares versions, determines which one is the original version that was first crawled by Google, indexes that version, and filters the other versions out of its listings. Yahoo and Bing follow similar procedures.
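To make the filtering idea concrete, here is a minimal sketch in Python. It is not Google’s actual algorithm, which is not public. It only illustrates the steps described above under some loud assumptions: pages count as duplicates when their text matches after lowercasing and collapsing whitespace, and each page carries a hypothetical first_crawled value. The Page class, the function names, and the example.com URLs are all invented for illustration.

```python
import hashlib
import re
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    text: str
    first_crawled: int  # hypothetical crawl order; lower means crawled earlier


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing and collapsing whitespace."""
    normalized = re.sub(r"\s+", " ", text.strip().lower())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def pick_listed_versions(pages: list[Page]) -> list[Page]:
    """Keep only the earliest-crawled page for each distinct fingerprint."""
    earliest: dict[str, Page] = {}
    for page in pages:
        key = fingerprint(page.text)
        current = earliest.get(key)
        if current is None or page.first_crawled < current.first_crawled:
            earliest[key] = page
    # Return the surviving pages in crawl order.
    return sorted(earliest.values(), key=lambda p: p.first_crawled)


pages = [
    Page("https://example.com/print/widget", "Blue  Widget\nspecs", 5),
    Page("https://example.com/widget", "blue widget specs", 2),
    Page("https://example.com/about", "About us", 3),
]
kept = pick_listed_versions(pages)
```

In this toy run, the printer version and the regular product page share a fingerprint, so only the earlier-crawled product page stays in the list alongside the unrelated page. Real search engines also catch near-duplicates, which exact hashing like this would miss.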
How Do I Avoid Duplicate Content Issues?
To avoid having search engines treat your content as duplicate, there are some pre-emptive steps you can take. First and foremost, avoid any SEO tactics that could be interpreted as malicious. If you do get a notice that your content has been flagged for manual action by Google, have your webmaster take corrective measures promptly.
To avoid non-malicious duplicate content issues, there are also steps you should take. Develop unique content for your own site and off-site distribution channels whenever possible.
If you use content from other sources, such as articles that have been previously published, rewrite them or add something unique, such as your own commentary. For product descriptions, many sites borrow descriptions from product manufacturers verbatim; you can make descriptions unique through rewording.
For content that has multiple versions on your site, use your robots.txt file to block versions you don’t want indexed. If your site has a blog, adjust your settings so that your date and category archives don’t get indexed. Use 301 redirects to direct search engines to the proper version of pages that have been moved.
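As a rough illustration of both ideas, the Python sketch below checks a sample robots.txt with the standard library’s urllib.robotparser and models a 301 redirect as a simple lookup. The /print/ and /archive/ paths, the /old-widget mapping, and the helper names is_crawlable and redirect_for are assumptions made up for this example; your own site structure and server setup will differ.

```python
from http import HTTPStatus
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block printer versions and blog archives.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
Disallow: /archive/
"""

# Hypothetical map of moved pages to their new locations.
REDIRECTS = {"/old-widget": "/widget"}


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def redirect_for(path: str):
    """Return (status, headers) for a moved page, or None if it has not moved."""
    target = REDIRECTS.get(path)
    if target is None:
        return None
    return int(HTTPStatus.MOVED_PERMANENTLY), {"Location": target}


print(is_crawlable(ROBOTS_TXT, "https://example.com/widget"))
print(is_crawlable(ROBOTS_TXT, "https://example.com/print/widget"))
print(redirect_for("/old-widget"))
```

A check like this can be run before deploying a new robots.txt to confirm that the pages you want indexed are still reachable. In production, the redirect itself would normally be configured in your web server or CMS rather than in a script.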
If you syndicate content to other sites, make sure you publish it first on your own site or social profile. The first version published will typically be treated as original by search engines.
How Do I Manage Duplicate Content Issues Across Multiple Platforms?
If you have to manage similar content across multiple platforms, such as product pages featured on different sites, your task can be more complicated.
If you have the same content on multiple platforms, Google will not penalize you, but it does tend to prioritize one version of your content. This can dilute the authority you accumulate by distributing it across your platforms, says Google’s Greg Grothaus.
When this happens, Google may send you a message via Google Search Console to let you know which version has been prioritized.
Google provides a few strategies you can use to address this. First, don’t expect every version of your content to be indexed; be prepared to prioritize. You can use cross-domain rel="canonical" link elements and 301 redirects to point search engines toward the version you want to prioritize. You can also edit duplicate versions to make them more distinct.
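One way to audit this is to check that each duplicate page actually declares the version you prefer. The Python sketch below uses the standard library’s html.parser to read the rel="canonical" link from a page’s HTML. The sample markup, the example.com URL, and the helper names find_canonical and points_to_preferred are assumptions for illustration, not part of any Google tool.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in html, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def points_to_preferred(html: str, preferred_url: str) -> bool:
    """Check that a duplicate page points search engines at the preferred URL."""
    return find_canonical(html) == preferred_url


# Hypothetical product page hosted on a retailer's site.
retailer_page = (
    '<html><head><link rel="canonical" '
    'href="https://www.example.com/products/blue-widget"></head>'
    "<body>Blue widget</body></html>"
)
print(find_canonical(retailer_page))
```

Running a check like this across every platform that hosts your product pages gives you a quick list of duplicates that are missing a canonical link or pointing to the wrong version.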
In order to effectively edit duplicate content hosted across multiple platforms, you need to implement ecommerce product page best practices. Content Analytics’ platform, for example, allows you to update all your content and do product page optimization from a single location and syndicate your product content to all your major retail channels.
An ecommerce optimization platform also allows you to track your analytics for all your online properties from one location so that you can see if some versions of your content are losing traffic due to duplicate content issues, which allows you to more rapidly make corrective adjustments.
Duplicate content doesn’t have to be an issue if you understand how search engines handle it and how you can adapt to follow the rules. Avoiding malicious content tactics, keeping content unique, and letting search engines know which versions of content you want prioritized will help you minimize potential problems.
If you need to manage similar content across multiple platforms, using cross-domain canonicalization and redirects and an ecommerce optimization platform can help make your job easier.
Over to You
How do you handle duplicate content issues?