this post was submitted on 28 Jan 2025
954 points (98.0% liked)

Microblog Memes

[–] Ilovethebomb@lemm.ee 0 points 2 days ago (3 children)

I do feel deeply suspicious about this supposedly miraculous AI, to be fair. It just seems too amazing to be true.

[–] HK65@sopuli.xyz 23 points 2 days ago (1 children)

You can run it yourself, which rules out it secretly being human workers behind the scenes, the way Amazon's checkout-free stores turned out to be.

Other than that, yeah, stay suspicious, but OpenAI's models have far more weirdness around them than this company's.

I suspect that OpenAI and the rest just weren't researching cheaper training because it made no financial sense for them. It's not that this is a better model; it's just easier to run, which makes it easier to catch up.

[–] Ilovethebomb@lemm.ee 1 points 2 days ago (2 children)

Mostly, I'm suspicious about how honest the company is being about the cost of training the model; that's one thing that is very difficult to verify.

[–] HK65@sopuli.xyz 8 points 2 days ago (1 children)

Does it matter, though? It's not as if you'll train it yourself, and US companies are also still in the dumping stage.

[–] Ilovethebomb@lemm.ee 2 points 2 days ago

It does, because the reason the US stock market has lost a billion dollars in value is that this company can supposedly train an AI for cents on the dollar compared to what a US company can.

It seems to me that understating the cost and complexity of training would cause a lot of problems for the States.

[–] Binette@lemmy.ml 1 points 1 day ago

They explained how you can train the model to create a similar AI in their paper: https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf

[–] nova_ad_vitum@lemmy.ca 12 points 2 days ago* (last edited 2 days ago) (1 children)

It's open source, and people are literally self-hosting it for fun right now. The current consensus appears to be that it's not as good as ChatGPT for many things. I haven't personally tried it yet. But either way, there's little to be "suspicious" about, since it's self-hostable and you don't have to give it internet access at all, so it can't call home.

https://www.reddit.com/r/selfhosted/comments/1ic8zil/yes_you_can_run_deepseekr1_locally_on_your_device/
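As a sketch of what "self-hosting" looks like in practice: distilled DeepSeek-R1 variants are published on Ollama's model registry, so (assuming Ollama is installed and the machine has enough RAM/VRAM for the chosen size — the `8b` tag here is just an illustrative choice) running one locally is roughly:

```shell
# Pull and run a distilled DeepSeek-R1 variant locally with Ollama.
# Model tag/size is illustrative; larger variants need more memory.
ollama pull deepseek-r1:8b   # downloads the weights once
ollama run deepseek-r1:8b    # interactive chat, entirely on-device

# After the initial pull, inference needs no network access at all,
# so you can firewall the process or disconnect to be sure it can't call home.
```
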

[–] Ilovethebomb@lemm.ee 2 points 2 days ago (1 children)

Is there any way to verify the computing cost of generating the model, though? That's the most shocking claim they've made, and I'm not sure how you could verify it.

[–] Zos_Kia@lemmynsfw.com 3 points 1 day ago

If you take into account the optimizations described in the paper, the cost they announce is in line with the rest of the world's research into sparse models.

Of course, the training cost is not the whole picture, which the DeepSeek paper readily acknowledges: before arriving at one successful model, you have to train and throw away n unsuccessful attempts. That's equally true of any other LLM provider, though. The stated training cost is used to compare technical trade-offs that affect training efficiency, not business models.
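For context on where the headline number comes from: the announced figure is just reported GPU-hours multiplied by an assumed rental price. The inputs below are DeepSeek's own claimed figures for the final DeepSeek-V3 training run (the base model R1 was built on), not independently verified numbers:

```python
# Back-of-envelope reconstruction of the announced training cost:
# claimed GPU-hours x assumed GPU rental rate. Inputs are the
# company's own reported figures, not independently verified.
gpu_hours = 2_788_000      # claimed H800 GPU-hours for the final V3 run
usd_per_gpu_hour = 2.0     # rental rate assumed in the report

claimed_cost = gpu_hours * usd_per_gpu_hour
print(f"claimed cost: ${claimed_cost:,.0f}")  # claimed cost: $5,576,000
```

Note that this covers only the single successful run; failed experiments, researcher salaries, and the hardware itself are outside that figure.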

[–] ItJustDonn@slrpnk.net 6 points 2 days ago (1 children)

Open source means it can be publicly audited, which should help soothe suspicion, right? I imagine that would take time, though, if it's incredibly complex.

[–] stsquad@lemmy.ml 14 points 2 days ago (1 children)

"Open source" is a very loose term when it comes to GenAI. As with Llama, the weights are available with few restrictions, but importantly, how it was trained is still secret. Not being reproducible doesn't seem very open to me.

[–] noscere@sh.itjust.works 4 points 2 days ago (1 children)

True, but in this case I believe they also open-sourced the training data and the training process.

[–] stsquad@lemmy.ml 5 points 2 days ago

Their paper outlines the training process but doesn't supply the actual data or training code. There is a project on Hugging Face, https://huggingface.co/blog/open-r1, that is attempting a fully open re-creation based on what is public.