this post was submitted on 01 Aug 2023

18 points (90.9% liked)

LocalLLaMA

Community to discuss LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.

(Deleted for not relevant anymore) (piped.video)

submitted 1 year ago* (last edited 3 weeks ago) by cll7793@lemmy.world to c/localllama@sh.itjust.works

[–] paysrenttobirds@sh.itjust.works 3 points 1 year ago (2 children)

What do they mean by watermarks? Why is it a bad idea to know which, if any, AI has produced something?

Thanks for the post

[–] cll7793@lemmy.world 3 points 1 year ago* (last edited 1 year ago) (2 children)

They are requesting something beyond watermarking. Yes, it is good to have a robot tell you when it is making a film. What is particularly concerning is that the witnesses want the government to keep track of every prompt and output ever made, so that any output can eventually be traced to its origin. So all open source models would somehow have to encode some form of signature, much like the hidden yellow dots printers produce on every sheet.

There is a huge difference between a watermark stating that "this is AI-generated" and hidden encodings, much like a backdoor, that let them trace any publicly released AI image, video, and perhaps even text output to some specific model, or worse, DRM-required "yellow dot" injection.

I know researchers have already looked into encoding hidden, undetectable patterns in text output, so expecting an extension to everything else is not unjustified.

Also, if the encodings are not detectable by humans, then they have failed the original purpose of making AI-generated content known.

[–] h3ndrik@feddit.de 2 points 1 year ago* (last edited 1 year ago) (1 children)

I think that argument commits several logical fallacies at once. And it's not either/or.

I don't see a reason why OpenAI and the other big companies shouldn't have incorporated watermarks from the beginning and voluntarily. The science is out there and it's really simple to do. And it solves a few valid problems.

I think valid uses are finding out whether your pupils did their homework themselves, and fighting spam and misinformation. There is no need to pack all kinds of data into the watermark to live out your surveillance fantasies. On the other hand, it's stupid to say "but it can be circumvented" or "it doesn't work in edge cases" and then not do it at all. That's not a valid argument. You could say it disadvantages me if I have to do it but my competitors don't... But that's hardly the case unless you're advertising to criminals.

On a broader level, transparency is a good thing, if done right. I wouldn't like some AI-driven dystopian future with opaque social scores, credit scores and my CV being declined before some human reads it. However, we need to be able to use AI as a tool. Even for use cases like that. Transparency is the first step.

[–] Meowoem@sh.itjust.works 0 points 1 year ago (1 children)

There's no technology that can embed a watermark into a paragraph of text without being obviously removable and degrading the quality of the text - it's also pointless.

The solution for schools worried about essays being written by ChatGPT is to teach the kids how to use ChatGPT and to integrate its use into essay writing - have you ever heard a teacher complain that their students might be getting answers from an encyclopedia? Or that they used a calculator to help solve their algebra homework? Do art teachers complain about rulers and reference images? No, they teach the skills using available technology. You'd laugh and get angry if someone wanted to install spyware to make sure you haven't used spell check when writing your thesis, and this is the same thing.

Yes, this means teachers can't set the same essay question and mark scheme they've used for a decade or four and will have to come up with something that takes account of the new technology, that allows the student to show understanding of the subject and the tools available, and that can be marked based on the student's ability rather than their choice of tool.

[–] h3ndrik@feddit.de 2 points 1 year ago* (last edited 1 year ago) (1 children)

> There’s no technology that can embed a watermark into a paragraph of text without being obviously removable

My point was exactly that this is not a valid argument. It should not stop us from doing the right thing in 95% of cases and in the large commercial deployments that most people use.

> and degrading the quality of the text

The paper "A Watermark for Large Language Models" says it has "negligible impact on text quality".
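
For anyone wondering what such a scheme roughly looks like, here is a toy sketch of the "green list" idea as I understand it from that paper: the previous token seeds a pseudo-random split of the vocabulary, generation favours the "green" half, and a detector counts green tokens and runs a z-test. The vocabulary size, the function names and the stand-in "model" (which just picks random green tokens, i.e. the hard variant of the rule) are my own assumptions for illustration, not the paper's code.

```python
import hashlib
import math
import random

VOCAB_SIZE = 1000  # hypothetical toy vocabulary of integer token IDs
GAMMA = 0.5        # fraction of the vocabulary placed on the green list


def green_list(prev_token: int) -> set:
    """Pseudo-randomly choose the green tokens, seeded by the previous token."""
    digest = hashlib.sha256(str(prev_token).encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return set(rng.sample(range(VOCAB_SIZE), int(GAMMA * VOCAB_SIZE)))


def z_score(tokens: list) -> float:
    """One-proportion z-test: how far the green-token count sits above chance."""
    scored = len(tokens) - 1  # the first token has no predecessor to seed from
    hits = sum(tok in green_list(prev) for prev, tok in zip(tokens, tokens[1:]))
    return (hits - GAMMA * scored) / math.sqrt(scored * GAMMA * (1 - GAMMA))


def generate_watermarked(length: int, rng: random.Random) -> list:
    """Stand-in for a model: only ever emits tokens from the current green list."""
    tokens = [rng.randrange(VOCAB_SIZE)]
    while len(tokens) < length:
        tokens.append(rng.choice(sorted(green_list(tokens[-1]))))
    return tokens


def generate_plain(length: int, rng: random.Random) -> list:
    """Unwatermarked baseline: tokens drawn uniformly at random."""
    return [rng.randrange(VOCAB_SIZE) for _ in range(length)]


if __name__ == "__main__":
    rng = random.Random(0)
    print("watermarked z:", z_score(generate_watermarked(200, rng)))
    print("plain z:      ", z_score(generate_plain(200, rng)))
```

The watermarked sequence should score far above any sensible threshold, while the plain one should hover around zero. Note that the detector only needs the hashing rule, not the model itself, and that nothing in this sketch says anything about how well the mark survives paraphrasing.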

> have you ever heard a teacher complain that their students might be getting answers from an encyclopedia?

That was the time I went to school. For a while we could just print Wikipedia articles and be done with our presentations. It worked for a while, especially with the older teachers who weren't yet aware of Wikipedia. Fun times, homework oftentimes done in 5 minutes.

> Or that they used a calculator

I'm starting to believe we grew up in different times/cultures. We were allowed to buy a calculator in (I think) grade 11. But our teacher did not allow us to use it during tests for (I think) another year. And during that time you'd better keep practicing calculating with your brain only, or you'd be fucked getting everything done in time during the exams. I think I was able to use that calculator for about 1.5 to 2 years in school. And then of course in uni in most courses.

The thing is... When learning things: You need to learn the basics first. You need to grow an understanding of why something works. What happens in the background. What your tools actually do. If you give people powerful tools too early, they won't learn the concepts behind what they're doing. The tool will do that for them and they will only learn how to operate that specific tool.

Edit: And that's the right thing to do. It's the difference between a monkey pushing buttons and someone with a profound understanding of a topic. You want a proper education, otherwise you're obsolete the moment someone invents a new tool that doesn't work like the one you're used to. Or when you want to explore something new and no one has written an encyclopedia article for you.

[–] Meowoem@sh.itjust.works -1 points 1 year ago (1 children)

Well, Wikipedia wasn't invented when I went to school, and our teachers were very keen on us using encyclopedias and calculators - maybe your tech-regressive attitude is cultural.

If you think using a calculator is just pressing buttons and cheating, then you must have stopped at a very basic level, not much more advanced than basic multiplication and division. Likewise with GPT: if you think it's just pressing a button and getting the answer, then you're using it for very simple tasks, certainly not scratching the surface of its or your own capabilities. A skilled and engaged user doing extra steps gets very different results from what you describe; this is the difference between a low-graded homework and an A+ piece of work.

[–] h3ndrik@feddit.de 1 points 1 year ago* (last edited 1 year ago) (1 children)

I think we're talking past each other here.

Prompt engineering language models and knowing how the arc of suspense is supposed to work in a novella are two entirely different things and skill sets. It kind of depends what you're trying to teach.

Are you able to calculate if a large pizza is more expensive or cheaper than two small pizzas just with a calculator, without storing basic concepts about how circles work inside of your brain?

Having knowledge about concepts, being literate and able to connect thoughts is what makes you smart. And things add up once problems become more difficult than mere examples. Try to be a philosopher without reading anything about Adorno, Kant and the ancient Greeks because you "can look it up"... Using a calculator or encyclopedia and modern computer tools is the 5% on top that makes you fast and lets you excel at things. 95% is hard work. And that is why I think focusing on teaching it that way is the right thing to do. And then add the 5% on top. Just don't skip that like my teachers sometimes did. Background knowledge is important to have. So are applied skills and knowing how to use your tool kit.

[–] Meowoem@sh.itjust.works 0 points 1 year ago

Exactly, and using modern tools like calculators is what allows people to focus on learning background knowledge and theory. You can easily use a computer to determine the area of two circles and the price of each per square centimetre without having to remember the formula or do mental arithmetic. Someone who does it by hand is going to take longer, which gives the other person the advantage of being able to tackle far more complex questions in the same amount of time - like comparing the calorific intake from pizzas of various sizes and toppings and constructing a nice graph or table to show the results.
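
As a rough worked example of the pizza comparison, here is a minimal sketch. The diameters and prices are made up for illustration, and the helper names are my own. It does show that the one thing you still have to bring yourself is that area grows with the square of the radius.

```python
import math


def circle_area(diameter_cm: float) -> float:
    """Area of a round pizza in square centimetres: pi * r^2."""
    return math.pi * (diameter_cm / 2) ** 2


def price_per_cm2(diameter_cm: float, price: float) -> float:
    """Price divided by area, so pizzas of different sizes can be compared."""
    return price / circle_area(diameter_cm)


if __name__ == "__main__":
    # Hypothetical menu: one 40 cm large for 14.00 versus two 28 cm smalls for 8.00 each.
    large_area = circle_area(40)
    two_small_area = 2 * circle_area(28)
    print(f"large: {large_area:.0f} cm2 at {price_per_cm2(40, 14.0):.4f} per cm2")
    print(f"two small: {two_small_area:.0f} cm2 at {price_per_cm2(28, 8.0):.4f} per cm2")
```

With these made-up numbers the single large pizza has slightly more area than the two small ones combined and costs less per square centimetre, which is not obvious from the diameters alone.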

The truth is we currently accept very low quality in everything, from kids' homework to media reporting on politics. When we adapt to using AI tools to help construct articles, we'll start seeing much better-made arguments and much better analysis - things like actual fact checking will become the norm instead of a six-month project that blows everyone's minds but then gets forgotten.

Imagine a world where a journalist's job isn't to string pretty words together but to get stories and give them context; where experts' opinions get included rather than overlooked because the person writing the article simply has no idea there's a whole scientific body that studies the field and instead blindly trusts some corporate spokesperson's press release; where a journalist doesn't have to spend thirty hours reading through archives trying to determine whether the subject of his story has related history but can simply say 'give a detailed breakdown of accusations of safety violations from DuPont'.

Of course there will still be a lot of writing to do, not in the key-pressing way where you waste an hour trying to think of a good word to describe butter beans, but looking at paragraphs and saying 'that's a bit dense, split the bit about Shell poisoning the Niger Delta into its own paragraph, then add a short summary of the economic cost from the data we were looking at in section 1'.

Being able to focus on the important things will let us produce better stuff - schools that teach how to use AI are going to produce students who are able to compete and contribute in the modern world, while schools that try to force their students to live in 1990 are going to produce kids who've already been left behind.

[–] paysrenttobirds@sh.itjust.works 2 points 1 year ago (1 children)

Thanks for the details. I guess the next step is to contact my congresspeople :)

[–] cll7793@lemmy.world 1 points 1 year ago

I will do the same. No problem! I'm very happy that my post was heard! Thank you!!!

[–] cll7793@lemmy.world 1 points 1 year ago

Also no problem! I feel like I had to share this one.