The doctors were worse, not just taking longer. So this would be more like people unlearning division.
While people who rely on calculators may gradually unlearn division, that seems less problematic than doctors unlearning how to spot cancer on their own (at which point, I'm guessing, you lose your source of AI training data, since apparently AI can't feed on AI output without collapsing) or software engineers unlearning how to write correct code.
You also wouldn't want a mathematician unlearning how to do division.
The doctors were better, until someone yanked the tool away. That's how every tool works! Even going from a handsaw to a table saw and back will cost you some skill with the handsaw, because in the meantime your brain was focused on higher-level goals and finer motions. That's not proof a table saw is bad for woodworking. The problem is "and back."
Have you checked on that narrative? It's been a while. Generated images stopped getting yellower, and improvements continued.
> Have you checked on that narrative?
The only workaround known so far seems to be making sure enough of the training data is fresh: https://www.inria.fr/en/collapse-ia-generatives https://en.wikipedia.org/wiki/Model_collapse But read it for yourself.
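You can see the dynamic in a toy model. This is a sketch, not the linked studies' setup: fit a Gaussian to some data, sample from it, refit on the samples, repeat. Pure self-training collapses the variance; mixing a fraction of fresh real data into each generation keeps it anchored.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit(samples):
    # "Train" a Gaussian by maximum likelihood: sample mean and std.
    return samples.mean(), samples.std()

def run(fresh_fraction, generations=2000, n=50):
    mu, sigma = 0.0, 1.0                       # start at the real distribution N(0, 1)
    for _ in range(generations):
        data = rng.normal(mu, sigma, n)        # sample from the current model
        k = int(fresh_fraction * n)
        if k:
            fresh = rng.normal(0.0, 1.0, k)    # fresh real data each generation
            data = np.concatenate([data[:n - k], fresh])
        mu, sigma = fit(data)                  # retrain on the mix
    return sigma

print(run(0.0))   # pure self-training: std collapses toward 0
print(run(0.2))   # 20% fresh data per generation: std stays near 1
```

The Gaussian and the 20% fraction are stand-ins, of course; the linked papers study the same feedback loop in real generative models.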
Yes, and with any tool that automates too much of a task, you'll unlearn how to do that task yourself. That can be acceptable sometimes. For writing sane software, I'd say not so much.
That's a lot of "could" and "will" from an article a year old, primarily about concerns from two years ago, while image models today keep getting smaller and better. They didn't find a second internet's worth of JPEGs. Better training on the same data, or even better labels on less data, beats a simple obsession with scale.
Yes, photocopying a photocopy degrades it, but diffusion is a denoising algorithm: un-degrading an image is its central function. And "make it look less AI" is how you get generative adversarial networks.
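To make "diffusion is denoising" concrete, here's a minimal numpy sketch of the DDPM reverse process. The eps_theta stub is a placeholder for the trained noise-prediction network (a U-Net in real systems); the rest is the standard sampling update.

```python
import numpy as np

rng = np.random.default_rng(0)

# Standard DDPM linear noise schedule.
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def eps_theta(x_t, t):
    # Stand-in for the learned noise predictor. A real model returns its
    # estimate of the noise that was added to produce x_t.
    return np.zeros_like(x_t)

def reverse_step(x_t, t):
    # One denoising step: subtract the predicted noise, then re-add a
    # smaller amount of fresh noise (except at the final step).
    eps = eps_theta(x_t, t)
    mean = (x_t - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
    if t == 0:
        return mean
    return mean + np.sqrt(betas[t]) * rng.standard_normal(x_t.shape)

# Sampling is iterated denoising: start from pure noise, walk back to an image.
x = rng.standard_normal((8, 8))
for t in reversed(range(T)):
    x = reverse_step(x, t)
```

Each reverse_step removes a little of the predicted degradation; generating an image is nothing but running that denoiser from pure noise all the way back.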
Anyway, the grim truth is that the central concern is mistaken: training data for cancer screening does not require that the patient survived.
The article links a study. Where's your study showing that collapse isn't a concern?
For what it's worth, my worry was never specifically about cancer; these doctors were just one measured example of what is likely a universal unlearning effect.
I again submit the last two years, in which model collapse did not happen. The doom-and-gloom predictions (some rather gleeful) plainly missed the mark. The proliferation of generated content has not, in fact, ruined the content generators, and it's certainly not because we're any good at marking generated content. Early symptoms went away entirely, and the problem has been addressed in practice.
As for "unlearning," universality is why it's a made-up problem. Nobody loudly complains that x-rays make doctors worse at feeling around for lumps.
Others seem to disagree: https://cacm.acm.org/blogcacm/model-collapse-is-already-happening-we-just-pretend-it-isnt/