submitted 10 months ago by BrikoX@lemmy.zip to c/technology@lemmy.zip
[-] charonn0@startrek.website 10 points 10 months ago

I'm sympathetic to the NYT, even if the model isn't reproducing their IP verbatim.

AI companies need to acknowledge that their LLMs would be worthless without training data and compensate/credit the sources appropriately.

[-] givesomefucks@lemmy.world 5 points 10 months ago

It's not just that it circumvents the paywall, it makes up random nonsense and then claims the NYT said it.

I've never got why people don't see this about AI. When it "works" it's just spitting out what a human was paid (a very low wage) to write; when it has to come up with something that hasn't been written, it just slaps nonsense together.

It's not real AI, it's just a next-generation search engine that gives unreliable results.

You just don't notice if you don't already know what you're asking.

[-] Hotzilla@sopuli.xyz 3 points 10 months ago

Even though these LLMs work by just figuring out the next word (token) that makes sense, they are still able to generate things that no human has ever written before. They aren't just copy-pasting stuff together.

I use GPT-4 daily for coding, and the way it spits out complex code templates/snippets that are unique to the problem is just not possible without the model having some level of intelligence. Of course it hallucinates now and then, but so do most coders.

[-] mindbleach@sh.itjust.works 4 points 10 months ago

Never gonna happen.

The NYT might win some money based on what Microsoft published, but only to the same extent as if a human wrote that and Microsoft published it. Copyright will never be an issue for training data because training is just scanning text and guessing the next letter. Consuming an entire library to make up anything you ask for is pretty goddamn transformative.

Oh, does the model know the names of characters in a popular book? So do Google and Wikipedia. Try framing a law that's cool with Google having a whole searchable plain-text copy of a book, so it can go 'this book?' when you search for a quote, but forbids OpenAI from having the essence of that book distilled somewhere in its terabyte of inscrutable numbers.

This fight is over.

[-] Nakoichi@hexbear.net -1 points 10 months ago
this post was submitted on 27 Dec 2023
42 points (92.0% liked)
