We have paused all crawling as of Feb 6th, 2025 until we implement robots.txt support. Stats will not update during this period.

  • jmcs@discuss.tchncs.de · 22 hours ago

    You can consent to a federation interface without consenting to having a bot crawl all your endpoints.

    Just because something is available on the internet doesn’t mean all uses are legitimate - this is effectively the same problem as AI training on stolen content.

    • hendrik@palaver.p3x.de · edited 22 hours ago

      Yes. I wholeheartedly agree. Not every use is legitimate. But I’d really need to know what exactly happened and the whole story to judge here. I’d say if it were a proper crawler, it would need to read the robots.txt. That’s accepted consensus. But is that what happened here?
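
      For reference, “reading the robots.txt” is a small amount of work for a crawler. Here is a minimal Python sketch using the standard library’s urllib.robotparser (the instance URL, user-agent string, and endpoint paths are made up for illustration):

      ```python
      # Check an instance's robots.txt before fetching any of its endpoints.
      from urllib import robotparser

      BASE_URL = "https://discuss.tchncs.de"   # hypothetical target instance
      USER_AGENT = "fediverse-stats-crawler"   # hypothetical crawler name

      rp = robotparser.RobotFileParser()
      rp.set_url(f"{BASE_URL}/robots.txt")
      rp.read()  # fetch and parse the instance's robots.txt

      # Only hit an endpoint if the rules allow it for this user agent.
      for path in ("/api/v3/site", "/nodeinfo/2.0"):
          if rp.can_fetch(USER_AGENT, f"{BASE_URL}{path}"):
              print(f"allowed: {path}")
          else:
              print(f"skipping {path}: disallowed by robots.txt")
      ```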

      And I mean the whole thing with consent and arbitrary use cases is just complicated. I have a website, and a Fediverse instance. Now you visit it. Is this legitimate? We’d need to factor in why I put it there, and what you’re doing with that information. If it’s my blog, it’s obviously there for you to read… Or is it…!? Would you call me and ask for permission before reading it? …That is implied consent. I’d argue this is how the internet works, at least generally speaking. And most of the time it’s super easy to tell what’s right and what’s wrong. But sometimes it isn’t.