Analyzing One Million robots.txt Files (intoli.com)
This is a delightfully fun read that analyzes the robots.txt files of the top million websites. It includes some interesting history on the origins of the robots specification and the fact that it was never standardized. There isn’t even an RFC for it! As a bonus, the code excerpts showing the analysis are all in Python, and this was the first time I’ve seen the collections module.
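For anyone else who hasn't met the collections module, here is a minimal sketch of the kind of tallying it makes easy. This is not the article's code: the sample robots.txt bodies and the count_user_agents helper are made up for illustration, and the parsing is deliberately simplistic.

```python
from collections import Counter

# Hypothetical sample data: two tiny robots.txt bodies, not real crawl results.
SAMPLE_FILES = [
    "User-agent: *\nDisallow: /admin\n",
    "User-agent: Googlebot\nAllow: /\n\nUser-agent: *\nDisallow: /private\n# a comment\n",
]


def count_user_agents(files):
    """Tally how often each User-agent value appears across robots.txt bodies."""
    counts = Counter()
    for body in files:
        for line in body.splitlines():
            # Drop trailing comments and surrounding whitespace.
            line = line.split("#", 1)[0].strip()
            if ":" not in line:
                continue
            field, value = line.split(":", 1)
            # Field names are matched case-insensitively.
            if field.strip().lower() == "user-agent":
                counts[value.strip()] += 1
    return counts


counts = count_user_agents(SAMPLE_FILES)
print(counts.most_common(2))  # [('*', 2), ('Googlebot', 1)]
```

Counter is a dict subclass that defaults missing keys to zero, so there is no need to check whether a key exists before incrementing it, and most_common returns the entries sorted by count.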