@odd Doesn't look like it. They don't mention copyright or trademarks at all. Seems to be mostly fraud, consumer protection, and discrimination stuff that they're investigating. Must mean there's a lot of AI grift, fraud, and nonsense going on for them to make a statement like this.

@fgtech Does robots.txt let you block by use case? Otherwise you’d have to preemptively block a potentially infinite list of user agents.

Also, inclusion in an ML training data set should be opt-in, especially if the download utility in question wants to comply with the GDPR.

@fgtech The problem is that having your site indexed by a search engine and having it pulled by a dozen outfits a day collecting training data for their ML models are qualitatively different things, but are both affected by robots.txt.
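
For anyone wondering what that looks like in practice: robots.txt rules are grouped by user agent, not by purpose, so a crawler can only be blocked by name. Below is a minimal sketch using Python's standard `urllib.robotparser`. The bot names (`ExampleTrainingBot`, `SomeSearchBot`, `UnlistedTrainingBot`) and the URL are made up for illustration, and real crawlers may or may not honor the file at all.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is blocked, everyone else is allowed.
# There is no "purpose" field, only User-agent groups.
ROBOTS_TXT = """\
User-agent: ExampleTrainingBot
Disallow: /

User-agent: *
Allow: /
"""


def allowed(user_agent, url="https://example.com/post/1"):
    """Return True if this robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleTrainingBot", "SomeSearchBot", "UnlistedTrainingBot"):
        print(agent, allowed(agent))
```

The named bot is refused, but a training-data crawler that was not listed falls through to the `*` group and is treated exactly like a search engine, which is the gap described above.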

@odd That’s quite different from a random newspaper using it to create an outright fake interview without permission.

@ChrisJWilson It’s the big tech and AI companies that are glossing over these issues because ethics get in the way of making money. See Microsoft’s disbanding of their AI ethics team and Google firing Timnit Gebru. The people who know about, and in many cases funded, the ethics research are the ones downplaying it.

@mbkriegh yup. And since the legal definition of plagiarism is a bit broader than statistical similarity, it seems very likely that more than 2% of the images generated by Stable Diffusion would be infringing.

