Disclosure: Please note that this article may contain affiliate links. You can read my full disclosure here. Need help with the tech side of blogging? Join my Facebook Group and get some answers.
Taking blog posts word for word, without your permission, is usually called “scraping”, and it is copyright infringement. Google is aware of the problem and, from what I have read online, does not penalize you for duplicate content when spammers steal your posts. In our case the scrapers even left the links back to our site for our related posts, among other things.
Hey, I just updated this post with a video from Income School (see below). They found an ingenious method to deal with this, especially when it is difficult to prove the other post is stolen.
How do you deal with Content Scrapers Stealing Blog Posts?
One option is to look up their hosting provider by running a WHOIS search on their domain (https://ca.hostadvice.com/tools/whois/) and then file a DMCA notice with that host. In some cases it is very difficult to locate the host to contact them, so you will need to decide for yourself how far you want to take this violation.
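A WHOIS search on the domain name often shows only the registrar, so it can help to look up the IP address the site resolves to as well. Here is a minimal Python sketch of that idea. It assumes whois.arin.net as the starting server (IP addresses outside North America are handled by other registries), and the function names are my own, made up for illustration.

```python
import socket


def whois_query(query: str, server: str = "whois.arin.net", timeout: float = 10.0) -> str:
    """Send one WHOIS query (plain text over TCP port 43) and return the reply."""
    with socket.create_connection((server, 43), timeout=timeout) as conn:
        conn.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def host_contact_fields(whois_text: str) -> dict:
    """Keep only the 'Key: value' lines that name the network owner or an abuse contact."""
    wanted = ("orgname", "org-name", "netname", "abuse")
    found = {}
    for line in whois_text.splitlines():
        # Skip comment lines and anything that is not a key/value pair.
        if line.startswith(("#", "%")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if value and any(word in key.lower() for word in wanted):
            found.setdefault(key, value)  # keep the first value seen for each key
    return found


def lookup_host(domain: str) -> dict:
    """Resolve the domain to an IP address, then ask WHOIS who owns that address."""
    ip_address = socket.gethostbyname(domain)
    return host_contact_fields(whois_query(ip_address))
```

Calling lookup_host with the scraper's domain returns whatever owner and abuse-contact lines the registry publishes. If the site sits behind a CDN or proxy service, the result will name that service instead of the real host, which is one reason the host can be hard to track down.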
In our case, the perpetrators were hosted with Interserver.net, so we looked at their site and found the steps to file a DMCA notice. Some hosts have an online form to fill out, but in this case we needed to send an email. https://www.interserver.net/tips/kb/dcma/
The format we found online for this kind of email asks for the same information you would need to send Pinterest for a DMCA request by email.
Here is a sample version of the email we sent the host.
My name is *insert name* and I am the Blogger at example.com. A website that your company hosts (according to WHOIS information) is infringing on at least one copyright owned by example.com.
An article was copied onto your servers without permission. The original ARTICLE/PHOTO, to which we own the exclusive copyrights, can be found at:
The unauthorized and infringing copy can be found at:
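This letter is official notification under Section 512(c) of the Digital Millennium Copyright Act (DMCA), and I request the removal of the infringing material from your servers.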
Please also be advised that law requires you, as a service provider, to remove or disable access to the infringing materials upon receiving this notice. Under US law a service provider, such as yourself, enjoys immunity from a copyright lawsuit provided that you act with deliberate speed to investigate and rectify ongoing copyright infringement. If service providers do not investigate and remove or disable the infringing material this immunity is lost. Therefore, in order for you to remain immune from a copyright infringement action you will need to investigate and ultimately remove or otherwise disable the infringing material from your servers with all due speed should the direct infringer, your client, not comply immediately.
I am providing this notice in good faith and with the reasonable belief that rights example.com owns are being infringed. Under penalty of perjury I certify that the information contained in the notification is both true and accurate, and I have the authority to act on behalf of the owner of the copyright(s) involved.
Should you wish to discuss this with me please contact me directly.
/s/First Last name.
PO BOX 123,
Some Town, AA, Zip or Postal Code
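If you end up sending more than one of these, a small script can fill in the blanks so nothing gets left out. This is only a sketch: the template below is abridged from the sample email above (paste in the full wording before sending), and the function and field names are hypothetical.

```python
from string import Template

# Abridged from the sample email above; the middle paragraphs are left out here.
NOTICE = Template(
    "My name is $name and I am the Blogger at $site. A website that your "
    "company hosts (according to WHOIS information) is infringing on at "
    "least one copyright owned by $site.\n\n"
    "The original ARTICLE/PHOTO, to which we own the exclusive copyrights, "
    "can be found at:\n$original_url\n\n"
    "The unauthorized and infringing copy can be found at:\n$infringing_url\n\n"
    "/s/$name\n$address\n"
)


def build_notice(name, site, original_url, infringing_url, address):
    """Fill in the notice and refuse to produce one with a blank field."""
    fields = {
        "name": name,
        "site": site,
        "original_url": original_url,
        "infringing_url": infringing_url,
        "address": address,
    }
    missing = [key for key, value in fields.items() if not value.strip()]
    if missing:
        raise ValueError("Missing fields: " + ", ".join(missing))
    return NOTICE.substitute(fields)
```

For example, build_notice("First Last", "example.com", the URL of your original post, the URL of the copy, and your mailing address) returns the text ready to paste into an email. The check for blank fields is there because a notice without both URLs gives the host nothing to act on.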
A post on WP Beginner discusses three options for dealing with content scraping: do nothing, block everything, or use it to your advantage. You can read it here: https://www.wpbeginner.com/beginners-guide/beginners-guide-to-preventing-blog-content-scraping-in-wordpress/
What I found most interesting was the suggestion of redirecting the scraper back to his or her own host, creating an endless loop and crashing their server. Since they are likely on a shared server, that would probably also affect other sites unlucky enough to be hosted with them, so we won’t go there in this post.
Block Scrapers from stealing your content with Wordfence
If you are using Wordfence, you can block scrapers manually. There is no reason for anyone to visit your site from another hosting provider's servers, or with a Python script. After your changes, test with the Facebook Debugger, Pinterest Pin Validator, Twitter Card Validator, etc. to confirm that the legitimate scrapers you rely on to share your content still work.
Go into the Wordfence plugin (the free version will work; additional security is available in the Premium version). Select Firewall, then Blocking. From here, enter a custom pattern. For example, this is what we entered to block all user agents that are using Python scripts.
Block specific hosting providers:
I haven’t tried blocking my own hosting provider. That is up to you, but until I test it, I will not block my own host in the firewall with this method.
My thinking with this method is that someone installed WordPress or another content management system (CMS) along with a plugin to scrape content. I did find some bot visits to our main site from these hosts. I’d rather stop these bots that have no reason to be on my site than allow them and keep the door open to stolen content. If you disagree or know a valid reason for this traffic, leave a comment and we can discuss it.
*.healthystyle.* (this one seems to be the host listed for sites on interserver)
Add as many as you would like; this is not an exhaustive list. I went through my live traffic log in Wordfence and found several others from outside the country as well. Most were using Python to scrape our website or run other scripts without our knowledge.
Be very careful not to add internet service providers, because this is where your real traffic is coming from.
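Before adding a pattern to the firewall, you can try it out against entries copied from your live traffic log. The Python sketch below assumes the pattern works as a simple, case-insensitive * wildcard, which is my assumption about how Wordfence matches and not something taken from its code. The user-agent pattern `*python*` and the sample hostnames are hypothetical; only `*.healthystyle.*` comes from the list above.

```python
from fnmatch import fnmatchcase


def matches(pattern: str, value: str) -> bool:
    """Case-insensitive wildcard match (assumed to behave like the firewall's * patterns).

    Note: fnmatch also treats ? and [ ] as special, so stick to * in patterns.
    """
    return fnmatchcase(value.lower(), pattern.lower())


def check_pattern(pattern, should_block, must_allow):
    """Return (missed, collateral).

    missed:     entries you wanted blocked that the pattern does not catch
    collateral: entries you need to keep that the pattern would block
    """
    missed = [value for value in should_block if not matches(pattern, value)]
    collateral = [value for value in must_allow if matches(pattern, value)]
    return missed, collateral


# Illustrative user agents: two script clients and two sharing crawlers.
script_agents = ["python-requests/2.31.0", "Python-urllib/3.11"]
sharing_agents = ["facebookexternalhit/1.1", "Twitterbot/1.0"]

missed, collateral = check_pattern("*python*", script_agents, sharing_agents)
print("missed:", missed)
print("collateral:", collateral)
```

Two empty lists mean the pattern catches the scripts and leaves the sharing crawlers alone. A broader pattern such as `*bot*` would list Twitterbot under collateral, which is the kind of mistake worth catching before real visitors or your own shares get blocked. The same check works for hostname patterns: put a few of your real visitors' hostnames in must_allow.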
Love these guys. Here is how Income School dealt with stolen blog posts.
Have you dealt with stolen blog posts or content? Which option did you choose? Let’s discuss in the comments.