Let's remove the context of AI altogether.
Say, for instance, you were to check out and read a book from a free public library. You then go on to use some of the book's content as the basis of your opinions. Moreover, you absorb some of the common language structures used in that book and unwittingly use them yourself when you speak or write.
Are you infringing on copyright by adopting the book's views and using some of the sentence structures its author employed? At what point can we say that an author owns the language in their work? Who owns language, in general?
Assuming that a GPT model cannot regurgitate verbatim the contents of its training dataset, how is copyright applicable to it?
Edit: I would also imagine that if we were discussing an open source LLM instead of GPT-4 or GPT-3.5, the sentiment here would be different. Moreover, I imagine that some of the ire here stems from a misunderstanding of how transformer models are trained and how they function.
Yeah, sure, if you do that then you can say anything. But the context is crucial. Imagine that you could prove in court that I went down to the public library with a list that read "Books I want to read for the express purpose of mimicking, and that I get nothing else out of", and on that list was your book. Imagine you had me on tape saying that, for me, writing is not a creative expression of myself, but rather that I am always trying to find the word that the authors I have studied would use. Now that's getting closer to the context of AI. I don't know why you think you would need me to sell verbatim copies of your book to have a good case against me. Just a few passages should suffice, given my shady and well-documented intentions.
Well, that's basically what LLMs look like to me.