Created June 23, 2024 18:46
facebookexternalhit/1.1 is the user agent Facebook uses to crawl web pages for its various services, such as Facebook, Instagram, and WhatsApp. The crawler retrieves content, images, and other metadata, which are used to build the link previews shown when a URL is shared. However, facebookexternalhit can request pages aggressively enough to generate excessive traffic, which is why some site owners choose to block it.
# BLOCK Facebook Crawler
# https://endurtech.com/block-facebook-crawler-facebookexternalhit/
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^facebookexternalhit/1\.1 [NC]
RewriteRule ^ - [F,L]
# BLOCK Facebook Crawler END
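One way to confirm the rule works is to send a request that spoofs the crawler's user agent and check for a 403 response. Below is a minimal sketch in Python (https://example.com/ is a placeholder for your own site):

import urllib.request
import urllib.error

# Spoof the Facebook crawler's user agent; with the rule above active,
# the server should refuse the request with 403 Forbidden.
req = urllib.request.Request(
    "https://example.com/",  # placeholder: use your own site
    headers={"User-Agent": "facebookexternalhit/1.1"},
)
try:
    with urllib.request.urlopen(req) as resp:
        print("Status:", resp.status)  # 200 means the rule is not active
except urllib.error.HTTPError as err:
    print("Status:", err.code)         # expect 403 when blocked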
Hi, I was wondering, can it be limited rather than entirely blocked?

Author
To the best of my knowledge, there isn't a reliable way to rate-limit facebookexternalhit; in most cases we're fortunate if it even respects robots.txt. That said, some developers have successfully implemented throttle scripts to reduce how frequently Facebook's bot hits their sites: the script tracks the time between consecutive requests and denies access when they arrive too quickly. Read more here: https://stackoverflow.com/questions/11521798/excessive-traffic-from-facebookexternalhit-bot
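For readers who want throttling rather than a hard block, here is a minimal sketch of that timing check in Python (the Stack Overflow thread mostly discusses PHP solutions; the MIN_INTERVAL value, function name, and in-memory store are illustrative assumptions, not a production design):

import time

# Minimal in-memory throttle: reject a facebookexternalhit request that
# arrives less than MIN_INTERVAL seconds after the previous one.
MIN_INTERVAL = 2.0        # hypothetical threshold; tune per site
_last_hit = {"t": 0.0}    # timestamp of the most recent crawler hit

def should_block(user_agent: str) -> bool:
    """Return True if this request should be rejected (e.g., with HTTP 429)."""
    if not user_agent.lower().startswith("facebookexternalhit"):
        return False      # throttle only the Facebook crawler
    now = time.monotonic()
    too_fast = (now - _last_hit["t"]) < MIN_INTERVAL
    _last_hit["t"] = now
    return too_fast

In a real deployment the last-hit timestamp would need to live in shared storage (a cache or database), since web servers typically run many worker processes.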