@fgtech The problem is that having your site indexed by a search engine and having it pulled by a dozen outfits a day collecting training data for their ML models are qualitatively different things, but are both affected by robots.txt.
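To make the "both affected" part concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The user-agent tokens (`Googlebot` for a search indexer, `GPTBot` for a training-data collector) and the `example.com` URL are illustrative assumptions, not something from the post, and `can_fetch_by_agent` is a hypothetical helper. It only shows that one blanket rule makes no distinction between the two kinds of visitor.

```python
from urllib.robotparser import RobotFileParser

# A blanket policy: one rule for every crawler, whatever its purpose.
BLANKET_RULES = """\
User-agent: *
Disallow: /
"""


def can_fetch_by_agent(robots_txt, agents, url):
    """Return {agent: allowed?} for the given robots.txt text (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Assumed tokens: a search indexer and an ML training-data crawler.
result = can_fetch_by_agent(
    BLANKET_RULES,
    ["Googlebot", "GPTBot"],
    "https://example.com/notes/",
)
print(result)
```

Both crawlers get the same answer here, because the rule is keyed on nothing but the wildcard.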
... works as a web developer in Hveragerði, Iceland, and writes about the web, digital publishing, and web/product development
These are his notes