[-] lily33@lemmy.world -4 points 1 year ago

If you give me several paragraphs instead of a single sentence, do you still think it's impossible to tell?

[-] lily33@lemmy.world 1 points 1 year ago* (last edited 1 year ago)

I don't see how that affects my point.

  • Today's AI detectors can't distinguish the output of today's LLMs from human-written text.
  • Future AI detectors WILL be able to distinguish the output of today's LLMs.
  • Of course, future AI detectors won't be able to distinguish the output of future LLMs.

So at any point in time, only recent text could be "contaminated". The claim that "all text after 2023 is forever contaminated" just isn't true. Researchers would simply have to be a bit more careful about including it.

[-] lily33@lemmy.world 0 points 1 year ago

They don't redistribute. They learn information about the material they've been trained on - not the material itself* - and can use it to generate material they've never seen.

  • Bigger models seem to memorize some of the material and can infringe, but that's not really the goal.

[-] lily33@lemmy.world 2 points 1 year ago* (last edited 1 year ago)

It's specifically distribution of the work or derivatives that copyright prevents.

So you could make an argument that an LLM that's memorized the book and can reproduce (parts of) it upon request is infringing. But one that's merely trained on the book, but hasn't memorized it, should be fine.

[-] lily33@lemmy.world -5 points 1 year ago* (last edited 1 year ago)

Why should such a thing be assumed????

[-] lily33@lemmy.world 1 points 1 year ago* (last edited 1 year ago)

It's actually a real problem on reddit where people spin up fake users to manipulate votes. Reddit hasn't published exactly how they detect that, but one way to do it is to look for bad voting patterns, like one account systematically upvoting/downvoting another. But you pretty much can't do that without knowing the votes.
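
To make that concrete, here's a rough sketch of what such a check might look like. It is not Reddit's method (that isn't public). It assumes you have a vote log of (voter, author, value) tuples; the function name `suspicious_pairs` and both thresholds are made up for illustration.

```python
from collections import defaultdict

def suspicious_pairs(votes, min_votes=5, min_share=0.8):
    """Flag (voter, author) pairs where one account votes on another
    often and almost always in the same direction.

    votes: iterable of (voter, author, value), value is +1 or -1.
    min_votes / min_share are arbitrary illustrative thresholds.
    """
    # Tally [upvotes, downvotes] for each (voter, author) pair.
    counts = defaultdict(lambda: [0, 0])
    for voter, author, value in votes:
        if voter == author:
            continue  # ignore self-votes
        counts[(voter, author)][0 if value > 0 else 1] += 1

    flagged = []
    for (voter, author), (ups, downs) in counts.items():
        total = ups + downs
        # Share of votes that go in the dominant direction.
        if total >= min_votes and max(ups, downs) / total >= min_share:
            direction = "up" if ups >= downs else "down"
            flagged.append((voter, author, total, direction))
    return sorted(flagged)
```

Note that the input is the per-account vote log itself, which is the point: without access to who voted on what, there's nothing to run this on.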

[-] lily33@lemmy.world 1 points 1 year ago

HuggingFace looks to me like it's a corporation. Like, when I click on "about > join us", I'm sent to their job offer page.

lily33

joined 1 year ago