I understand that if I put an entry in robots.txt such as:
Disallow: /tmp/
then spiders will not crawl that directory. In older versions of Joomla, Disallow: /components/ was a standard line. I noticed (though I'm not sure when it changed) that this entry no longer appears in the robots.txt of a vanilla Joomla install, and neither do /modules/ or /plugins/. Why were these removed?
The issue I'm having is that Google is now sending me emails reporting various component URLs as 404s, such as https://www.domainname.com/component/convertforms. Since discovering this, I have been manually adding the three "missing" entries mentioned above.
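For reference, here is a minimal sketch of how a set of Disallow rules can be checked locally, using Python's standard urllib.robotparser module. The rule set, the helper name is_blocked, and the example paths are assumptions for illustration, not the actual Joomla file, and the standard-library parser may not match Google's own matching behaviour exactly.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set: the /tmp/ rule plus the three entries added by hand.
RULES = """\
User-agent: *
Disallow: /tmp/
Disallow: /components/
Disallow: /modules/
Disallow: /plugins/
"""


def is_blocked(url, rules=RULES, agent="Googlebot"):
    """Return True if the given rules disallow `agent` from fetching `url`."""
    parser = RobotFileParser()
    # parse() takes the robots.txt content as an iterable of lines.
    parser.parse(rules.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Illustrative URLs only; substitute the ones reported for your own site.
    for url in (
        "https://www.domainname.com/tmp/file.txt",
        "https://www.domainname.com/components/com_content/",
        "https://www.domainname.com/index.php",
    ):
        print(url, "->", "blocked" if is_blocked(url) else "allowed")
```

Running this against the URLs a crawler reports should show which ones the rules actually cover.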
Today, Google sent me an email telling me these URLs are blocked by robots.txt:
https://www.domainname.com/component/convertforms
https://www.domainname.com/component/ajax/?format=json
Again, I know this is Google trying to be "helpful", but I already know what I've added to robots.txt and it's just an annoyance.
I doubt there's a way to stop Google from reporting "Blocked by robots.txt" (if there is, I'd love to know), but can anyone explain why /components/, /modules/, and /plugins/ were removed from a vanilla Joomla install?
Statistics: Posted by trogladyte — Fri Oct 11, 2024 5:35 pm