NY Times copyright suit wants OpenAI to delete all GPT instances (arstechnica.com)

submitted 10 months ago by BrikoX@lemmy.zip to c/technology@lemmy.zip

[-] charonn0@startrek.website 10 points 10 months ago

I'm sympathetic to the NYT, even if it's not reproducing their IP verbatim.

AI companies need to acknowledge that their LLMs would be worthless without training data and compensate/credit the sources appropriately.

[+] j4k3@lemmy.world 5 points 10 months ago* (last edited 10 months ago)

[deleted]

[-] SexyVetra@lemmy.world 8 points 10 months ago

Thanks for reading the article and addressing the claims instead of making up stuff to be mad about...

Oh wait... 🙄

[-] urist@lemmy.blahaj.zone 6 points 10 months ago

Large language models are just like humans.

…humans don’t accidentally plagiarize whole articles. They also understand the difference between theft and fair use, and AI has been shown to not respect that distinction multiple times. You can also sue humans for damages when they steal from you. Apparently LLMs are immune to legal liability because oopsie poopsie mistakes happen uwu.

LLMs are cool and useful, but if they’re harming the data sources they wouldn’t exist without, shouldn’t we do something?

[-] Axiochus@lemmy.world 4 points 10 months ago

I've been teaching academic writing for the last ten years and would strongly object to your first two assertions 😄

[-] urist@lemmy.blahaj.zone 1 points 10 months ago* (last edited 10 months ago)

Lmao yeah, fair enough.

Edit: I think the important word is “accidentally” on that first point. 😉

[-] givesomefucks@lemmy.world 5 points 10 months ago

It's not just that it circumvents the paywall, it makes up random nonsense and then claims the NYT said it.

I've never got why people don't see this about AI. When it "works" it's just spitting out what a human was paid (a very low wage) to write; when it has to come up with something that hasn't been written, it just slaps nonsense together.

It's not real AI, it's just a next-generation search engine that gives unreliable results.

You just don't notice if you don't already know what you're asking.

[-] Hotzilla@sopuli.xyz 3 points 10 months ago

Even though these LLMs work by just figuring out the next word (token) that makes sense, they are still able to generate things that no human has ever written before. They aren't just copy-pasting stuff together.

I use GPT-4 on a daily basis for coding, and the way it spits out complex code templates/snippets that are unique to the problem is just not possible without the model having some level of intelligence. Of course it hallucinates now and then, but so do most coders.

[-] mindbleach@sh.itjust.works 4 points 10 months ago

Never gonna happen.

The NYT might win some money based on what Microsoft published, but only to the same extent as if a human wrote that and Microsoft published it. Copyright will never be an issue for training data because training is just scanning text and guessing the next letter. Consuming an entire library to make up anything you ask for is pretty goddamn transformative.

Oh, does the model know the names of characters in a popular book? So do Google and Wikipedia. Try framing a law that's cool with Google having a whole searchable plain-text copy of a book, so it can go 'this book?' when you search for a quote, but forbids OpenAI from having the essence of that book distilled somewhere in its terabyte of inscrutable numbers.

This fight is over.

[-] Nakoichi@hexbear.net -1 points 10 months ago

yes-hahaha-yes-l
sicko-crowd

this post was submitted on 27 Dec 2023

42 points (92.0% liked)
