
How Do You Deal with Duplicate Content in Drupal? 4 Modules to Get this Issue Fixed


by Adriana Cacoveanu on Jan 16 2019

Accidentally creating duplicate content in Drupal is like... a cold: 

Catching it is as easy as falling off a log.

All it takes is to:

 

  • submit your valuable content to other websites as well, thus confronting Google with 2 or more identical pieces of content
  • move your website from HTTP to HTTPS but skip some key steps in the process, so that the HTTP version of your Drupal site is still there, “lurking in the dark”
  • have printer-friendly versions of your Drupal pages and thus dare Google to face another duplicate content “dilemma”

     

So, what are the “lifebelts” or prevention tools that Drupal “arms” you with for handling this thorny issue?

Here are the 4 modules to use for boosting your site's immune system against duplicate content.

And for getting it fixed, once the harm has already been done:

 

1. But How Does It Crawl into Your Website? Main Sources of Duplicate Content 

Let's get down to the nitty-gritty of how Drupal 8 duplicate content “infiltrates” your website.

But first, here are the 2 major categories that these sources fall into:

 

  • malicious
  • non-malicious

     

The first category includes all those scenarios where spammers republish content from your website without your consent.

The non-malicious duplicate content can come from:

 

  • discussion forums that create both standard and stripped-down pages (for mobile devices)
  • printer-only web page versions, as already mentioned
  • items displayed on multiple pages of the same e-commerce site

     

Also, duplicate content in Drupal can be either:

 

  • identical
  • or similar

And since it comes in “many stripes and colors”, here are the 7 most common types of duplicate content:

 

1.1. Scraped Content

Has someone copied content from your website and further published it? Do not expect Google to distinguish the copy from its source.

That said, it's your job and yours only to stay diligent and protect the content on your Drupal site from scrapers.

 

1.2. WWW and non-WWW Versions of Your Website

Are there 2 identical versions of your Drupal website available? A www and a non-www one?

Now, that's enough to ring Google's “duplicate content in Drupal” alarm.
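To make the fix concrete, here is a minimal Python sketch of the idea: pick one preferred host and map every URL on the alternate host to it. The host names are hypothetical placeholders, and this only illustrates the mapping. It is not how Drupal or any module implements it (on a live site this is normally a server-level or module-level 301 redirect).

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the www host is the preferred one; both names are placeholders.
PREFERRED_HOST = "www.example.com"
ALTERNATE_HOSTS = {"example.com"}


def canonical_host_url(url: str) -> str:
    """Rewrite a URL on an alternate host to the preferred host.

    URLs on any other host are returned unchanged. Note that replacing
    the netloc drops an explicit port, which is fine for this sketch.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in ALTERNATE_HOSTS:
        parts = parts._replace(netloc=PREFERRED_HOST)
    return urlunsplit(parts)


print(canonical_host_url("https://example.com/blog/my-post?page=2"))
```

Both versions of an address now resolve to a single URL, which is the one you would want search engines to index.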

 

1.3. Widely Syndicated Content 

So, you've painstakingly put together a list of article submission sites to give your valuable content (blog post, video, article etc.) more exposure.

And now what? Should you just cancel promoting it?

Not at all! Widely syndicated content risks landing on Google's “Drupal 8 duplicate content” radar only if you set no guidelines for those third-party websites.

That is, when these publishers don't place a canonical tag on the republished content pointing back to its original source.

What happens when you overlook such a content syndication agreement? You leave it entirely to Google to track down the source, scanning through all the websites and blogs that republish your piece of content.

And oftentimes it fails to tell the original from its copy.
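If you want to verify that a partner site honors the agreement, a small check like the one below can help. It is a sketch using only Python's standard library: it parses the HTML of a republished page and reports whether a canonical link points at your original URL. The sample HTML and URLs are made up for illustration, and fetching the page is left out on purpose.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated values, or be missing.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def points_to_original(html: str, original_url: str) -> bool:
    """Return True if the page declares original_url as its canonical URL."""
    finder = CanonicalFinder()
    finder.feed(html)
    return original_url in finder.canonicals


# Hypothetical republished page.
republished = (
    '<html><head><link rel="canonical" '
    'href="https://www.example.com/blog/my-post"></head>'
    "<body>Republished copy</body></html>"
)
print(points_to_original(republished, "https://www.example.com/blog/my-post"))
```

The comparison is an exact string match, so a trailing slash or a different protocol counts as a mismatch. That strictness is a deliberate choice here: a canonical tag that points at a slightly different URL is worth a second look anyway.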

 

1.4. Printer-Friendly Versions

This is probably one of the sources of duplicate content in Drupal that seems most... harmless to you, right?

And yet, for search engines, multiple printer-friendly versions of the same content translate as: duplicate pages.

 

1.5. HTTP and HTTPS Pages

Have you made the switch from HTTP to HTTPS?

Entirely?

Or are there:

 

  • backlinks from other websites still leading to the HTTP version of your website?
  • internal links on your current HTTPS website still carrying the old protocol?

     

Make sure you detect all these less obvious sources of duplicate URLs on your Drupal website.
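Internal links that still carry the old protocol are the part you can audit yourself. The sketch below scans a page's HTML for anchors that point to your own hosts over plain HTTP. It relies only on Python's standard library. The host names and the sample markup are hypothetical, and a real audit would run this over every page of the site.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Assumption: these are the hosts that count as "internal" (placeholders).
SITE_HOSTS = {"example.com", "www.example.com"}


class InsecureLinkFinder(HTMLParser):
    """Collect internal <a href> values that still use http://."""

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        parts = urlsplit(href)
        host = (parts.hostname or "").lower()
        # Relative links have no scheme and inherit the page's protocol.
        if parts.scheme == "http" and host in SITE_HOSTS:
            self.insecure.append(href)


def find_insecure_internal_links(html: str) -> list:
    finder = InsecureLinkFinder()
    finder.feed(html)
    return finder.insecure


page = (
    '<a href="http://www.example.com/about">About</a>'
    '<a href="https://www.example.com/contact">Contact</a>'
    '<a href="http://other.org/">Elsewhere</a>'
    '<a href="/relative">Relative</a>'
)
print(find_insecure_internal_links(page))
```

Backlinks from other websites are outside your control, which is why a sitewide HTTP to HTTPS redirect is still needed on top of cleaning up your own links.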

 

1.6. Appreciably Similar Content 

Your site's vulnerable to this type of duplicate content “threat” particularly if it's an e-commerce one.

Just think of all those all-too-common scenarios where you display highly similar product descriptions on several different pages of your eStore.
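How similar is “appreciably similar”? Search engines don't publish their thresholds, but you can get a rough feel for it with a word-shingle comparison. The sketch below is one common text-similarity technique (Jaccard similarity over 3-word shingles), chosen here for illustration. It makes no claim to be the method any search engine uses, and the product descriptions are invented.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences of the given size."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        # Very short texts become a single shingle (or nothing at all).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(a: str, b: str, size: int = 3) -> float:
    """Share of shingles the two texts have in common, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)


red = "Red cotton t-shirt with a round neck and short sleeves"
blue = "Blue cotton t-shirt with a round neck and short sleeves"
print(jaccard_similarity(red, blue))
```

Two descriptions that differ only by the color word score very high, which is exactly the kind of pair worth rewriting or consolidating under one canonical product page.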

 

1.7. User Session IDs 

Users themselves can unintentionally generate duplicate content on your Drupal site. 

How? Each of them might get a different session ID attached to the URL, which generates new URL after new URL for the very same page.
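One way to see the problem (and the cure) is to normalize such URLs by dropping the session parameter. The Python sketch below does that with the standard library. The parameter names in SESSION_PARAMS are assumptions for illustration, so check which ones your own site actually emits.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: hypothetical names of query parameters that carry a session ID.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}


def strip_session_params(url: str) -> str:
    """Return the URL without session-ID query parameters."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


first_visit = "https://www.example.com/shop?item=42&sid=abc123"
second_visit = "https://www.example.com/shop?item=42&sid=xyz789"
print(strip_session_params(first_visit))
print(strip_session_params(first_visit) == strip_session_params(second_visit))
```

Once the session parameter is gone, both visits collapse into one address, which is the URL a canonical tag or redirect should point to.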

2. 4 Modules at Hand to Identify and Fix Duplicate Content in Drupal

What are the tools that Drupal puts at your disposal to detect and eliminate all duplicate content?

 

2.1. Redirect Module

Imagine all the functionality of the former Global Redirect module (Drupal 7) “injected” into this Drupal 8 module!

In fact, you can still define your Global Redirect features by just:

 

  1. accessing the Redirect module's configuration page
  2. clicking on “URL redirects” 

     

How to Deal with Duplicate Content in Drupal: Global Redirect features

Image Source: WEBWASH.net

What this SEO-friendly module does is provide you with a user-friendly interface for managing your URL path redirects:

 

  • create new redirects
  • identify broken URL paths (you'll need to enable the “Redirect 404” sub-module for that)
  • set up domain level redirects (use the “Redirect Domain” sub-module)
  • import redirects

     

Summing up: when it comes to handling duplicate content in Drupal, this module helps you redirect all your old URLs to the new paths that you've set up.

This way, you avoid the risk of having the very same content displayed on multiple URL paths.
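Conceptually, a redirect table is just a mapping from old paths to new ones. The Python sketch below illustrates that idea, including two things worth guarding against: chains of redirects and loops. It is not the Redirect module's code or API. The paths, the function name and the hop limit are all made up for the example.

```python
# Assumption: a hypothetical redirect table, old path -> new path.
REDIRECTS = {
    "/old-blog/my-post": "/blog/my-post",
    "/blog/my-post": "/articles/my-post",
}


def resolve_redirect(path, redirects=REDIRECTS, max_hops=10):
    """Follow redirects starting at path.

    Returns (final_path, status), where status is 301 if at least one
    redirect applied and None if the path is not redirected at all.
    Following stops after max_hops redirects at the latest.
    """
    seen = {path}
    current = path
    for _ in range(max_hops):
        target = redirects.get(current)
        if target is None:
            break
        if target in seen:
            raise ValueError(f"redirect loop detected at {target}")
        seen.add(target)
        current = target
    status = 301 if current != path else None
    return current, status


print(resolve_redirect("/old-blog/my-post"))
print(resolve_redirect("/articles/my-post"))
```

Resolving a chain down to its final target is a design choice in this sketch: pointing the old URL straight at the final path spares visitors and crawlers the intermediate hops.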

 

2.2. Taxonomy Unique Module  

How about “fighting” duplicate content on your website at a vocabulary level?

In this respect, this Drupal 8 module:

 

  • prevents you from saving a taxonomy term that already exists in that vocabulary
  • is configurable for every vocabulary on your Drupal site
  • allows you to set custom error messages that would pop up whenever a duplicate taxonomy term is detected in the same vocabulary
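The behavior in the list above can be pictured with a few lines of Python. This is only a sketch of the concept, not the module's implementation: the class names are invented, and treating terms as duplicates regardless of letter case and extra spaces is an assumption made for the example.

```python
class DuplicateTermError(ValueError):
    """Raised when a term already exists in the vocabulary."""


class Vocabulary:
    """A named set of terms that refuses duplicates (hypothetical class)."""

    def __init__(self, name,
                 error_message="Term '{term}' already exists in '{vocabulary}'."):
        self.name = name
        self.error_message = error_message  # configurable per vocabulary
        self._terms = {}  # normalized key -> term as first entered

    def add_term(self, term):
        # Assumption: compare terms ignoring case and surrounding/extra spaces.
        key = " ".join(term.split()).casefold()
        if key in self._terms:
            raise DuplicateTermError(
                self.error_message.format(term=term, vocabulary=self.name)
            )
        self._terms[key] = term

    def terms(self):
        return list(self._terms.values())


tags = Vocabulary("Tags", error_message="'{term}' is already a tag.")
tags.add_term("Drupal")
try:
    tags.add_term(" drupal ")
except DuplicateTermError as error:
    print(error)
```

Each vocabulary keeps its own terms and its own message, so the same term can still live in two different vocabularies.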

     

2.3. PathAuto Module  

Just admit it now:

How much do you hate the /node/125 type of URL paths?

They're anything but user-friendly.

And this is precisely the role that Pathauto's been invested with:

To automatically generate user-friendly path aliases (e.g. /blog/my-node-title) for a whole variety of content.

Let's say that you want to modify the current “path scheme” on your website with no impact on the existing URLs (you don't want the change to affect users' bookmarks or to “intrigue” the search engines).

Paired with the Redirect module, Pathauto can have the old URLs automatically redirected to the new paths, using the HTTP redirect status of your choice.
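To picture what “automatically generated” means, here is a rough Python sketch that turns a title into an alias like /blog/my-node-title and keeps aliases unique. It illustrates the general technique only. Pathauto's real patterns, transliteration rules and suffixing are configurable and differ from this, and the function names are hypothetical.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Lowercase the title, drop accents, and join words with hyphens."""
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")


def build_alias(prefix: str, title: str, existing: set) -> str:
    """Build /prefix/slug, adding a numeric suffix if the alias is taken."""
    base = f"/{prefix}/{slugify(title)}"
    alias, counter = base, 0
    while alias in existing:
        counter += 1
        alias = f"{base}-{counter}"
    return alias


taken = {"/blog/my-node-title"}
print(build_alias("blog", "My Node Title!", set()))
print(build_alias("blog", "My Node Title!", taken))
```

The uniqueness check matters for the topic at hand: two nodes with the same title get two distinct aliases instead of competing for one URL.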

 

2.4. Intelligent Content Tools      

Personalization is key when you strive to prevent duplicate content in Drupal, right? 

And this is precisely what this module here does: it helps you personalize content on your website.

How? Through its 3 main functionalities delivered to you as sub-modules:

 

  • auto tagging
  • text summarizing 
  • detecting plagiarized content 

     

Leveraging Natural Language Processing, this last sub-module scans content on your website and alerts you to any signs of duplication detected.

Word of caution: keep in mind that the module is not yet covered by Drupal's security advisory policy!

 

3. To Sum Up

Setting a goal to ensure 100% unique content on your website is as realistic as... learning a new language in a week. 

Instead, you should consider setting up a solid strategy “fueled” by (at least) the 4 modules “exposed” here. One that would help you avoid specific scenarios where entire pages or clusters of pages get duplicated.

Now, that's a far less utopian goal to set, don't you think?
