We have paused all crawling as of Feb 6th, 2025 until we implement robots.txt support. Stats will not update during this period.

  • jmcs@discuss.tchncs.de · 22 hours ago

    You can consent to a federation interface without consenting to having a bot crawl all your endpoints.

    Just because something is available on the internet doesn’t mean all uses are legitimate - this is effectively the same problem as AI training on stolen content.

    • hendrik@palaver.p3x.de · edited 22 hours ago

      Yes. I wholeheartedly agree. Not every use is legitimate. But I’d really need to know what exactly happened and the whole story to judge here. I’d say if it were a proper crawler, it would need to read the robots.txt. That’s accepted consensus. But is that what happened here?
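
      For reference, “reading the robots.txt” is a small amount of work for a crawler. Here is a minimal Python sketch using the standard library’s urllib.robotparser (the instance URL, user-agent string, and endpoint paths are made up for illustration):

      ```python
      # Check an instance's robots.txt before fetching any of its endpoints.
      from urllib import robotparser

      BASE_URL = "https://discuss.tchncs.de"   # hypothetical target instance
      USER_AGENT = "fediverse-stats-crawler"   # hypothetical crawler name

      rp = robotparser.RobotFileParser()
      rp.set_url(f"{BASE_URL}/robots.txt")
      rp.read()  # fetch and parse the instance's robots.txt

      # Only hit an endpoint if the rules allow it for this user agent.
      for path in ("/api/v3/site", "/nodeinfo/2.0"):
          if rp.can_fetch(USER_AGENT, f"{BASE_URL}{path}"):
              print(f"allowed: {path}")
          else:
              print(f"skipping {path}: disallowed by robots.txt")
      ```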

      And I mean the whole thing with consent and arbitrary use cases is just complicated. I have a website, and a Fediverse instance. Now you visit it. Is this legitimate? We’d need to factor in why I put it there, and what you’re doing with that information. If it’s my blog, it’s obviously there for you to read… Or is it…!? Would you call me and ask for permission before reading it? …That is implied consent. I’d argue this is how the internet works, at least generally speaking. And most of the time it’s super easy to tell what’s right and what’s wrong. But sometimes it isn’t.