robots-txt

#robots-txt

Each AI agent crawls websites completely differently. Here's what 3 months of 11 million event logs actually show.

Reddit r/AI_Agents · 7h ago

An analysis of 11 million crawler logs across 34 websites reveals distinct behaviors: GPTBot crawls relentlessly, ignoring robots.txt; Google's bot checks rules frequently; ClaudeBot's crawling is accelerating rapidly; and Bytespider is the heaviest crawler. The findings suggest a shift from Google-centric SEO toward optimizing for how AI agents select pages.


AI Makes Large-Scale Web Scraping Accessible. Is That a Problem?

Reddit r/ArtificialInteligence · 2026-06-02

The article discusses how AI coding assistants make large-scale web scraping accessible to ordinary people. It raises ethical concerns about scrapers that ignore robots.txt and rate limits, and questions how much responsibility AI providers bear.


How does AI follow ethical guidelines in Data Collection?

Reddit r/artificial · 2026-06-02

A commentary on the ethical challenges that arise when AI agents generate scrapers that ignore website rules such as robots.txt, and on the responsibility of AI providers to implement guardrails without hindering product usability.


Spent an afternoon making my site more AI friendly. The next day AI traffic went 12x

Reddit r/AI_Agents · 2026-05-19

A developer optimized their website for AI bots by fixing robots.txt, adding llms.txt, improving semantic HTML, and more, resulting in a 12x increase in AI traffic the next day.


Amazonbot is finally respecting robots.txt

Hacker News Top · 2026-05-14

Amazonbot, Amazon's web crawler, now respects robots.txt directives, a change from its previous behavior.
