AtmosphericRiversCuomo

joined 9 months ago

The paper goes into more detail about how they got these things to train each other, so in theory you'd train these tiny 1.5b or small 8b models on specific tasks.

[–] AtmosphericRiversCuomo@hexbear.net 5 points 6 months ago* (last edited 6 months ago) (2 children)

Cost means energy use in addition to hardware, so it should matter to climate-concerned folks, not just investors.

Also, these reasoning models are way less likely to hallucinate when given time to think (test-time compute). Think applications where that matters, like augmenting or replacing existing labor, which will start to happen more.

[–] AtmosphericRiversCuomo@hexbear.net 3 points 6 months ago (2 children)

That's mind-boggling output for an 8b model compared to last year. It's funny, you can def see how that first answer is a distillation of r1's answer here:

spoiler

Okay, so the user is asking how dumb it is to buy $500 billion in Nvidia stock. Let me break this down. First, $500 billion is an astronomical amount of money. Nvidia's current market cap is around $2.2 trillion as of early 2023. If someone tried to buy $500 billion worth of their stock, that's like almost a quarter of the entire company. But wait, market cap is the total value of all outstanding shares. So buying $500 billion would require that many shares being available, which they might not be.

Liquidity is a big factor here. Even if the shares are available, purchasing such a huge volume would drive the price up rapidly because of the increased demand. This is known as slippage. The more you buy, the higher the price goes, so the actual cost could be way more than $500 billion. Plus, once the market notices such a massive buy, other investors might react, maybe causing a short-term spike but also potential volatility afterward.

Then there's diversification. Putting all that money into a single stock is super risky. Even if Nvidia is a strong company, any company-specific issue—like a product failure, regulatory problem, or market shift—could tank the stock. Diversification helps mitigate that risk, but this move is the opposite of diversified.

It means we're fucking lucky that the people who would probably enslave us if they could don't have any secret technological advantage thanks to these open source chads.

[–] AtmosphericRiversCuomo@hexbear.net 4 points 6 months ago (4 children)

Yes, comparable performance to a multi-billion-dollar company for a fraction of the cost of training and inference. It cost $5 million to train their prior model. Llama 3 took an estimated $750 million to $1 billion to train and it's not as good.

[–] AtmosphericRiversCuomo@hexbear.net 6 points 6 months ago (2 children)

I don't disagree at all that these companies will produce few results besides grift, but that's because they don't have a moat of any kind and they've admitted as much.

[–] AtmosphericRiversCuomo@hexbear.net 4 points 6 months ago (4 children)

The announcement from DeepSeek was sucking the oxygen out of this entire dumb Stargate announcement imo. It's like they're already wrong but committed. The amount, I think, demonstrates that they feel they MUST win this race at any cost, even if it will mostly just end up being a grift.

[–] AtmosphericRiversCuomo@hexbear.net 11 points 6 months ago (17 children)

60 year old dipshits are impressed by chat gpt talking cause that's what they do for work, they immediately think they have solved labor

I'm sorry, I know this is a popular line of thinking around here, but this is off base.

What's actually happening is that humanity may very well have stumbled into discovering machine reasoning on a level that we haven't even realized yet.

The SOTA in this field as of a couple days ago is a Chinese open source project called DeepSeek that just upended the entire industry and pantsed Sam Altman and Silicon Valley in front of the world. They're giving away the secret to their massively efficient results that are besting every benchmark, and they've cracked a breakthrough unsupervised learning technique that is allowing these models to demonstrate powerful reasoning capabilities. This stuff isn't Google's botched AI telling you to smoke while pregnant, this is post-graduate mathematics.

This has been great for a couple reasons: 1. we get a lot more performance for less cost and power usage, which is seriously bloomer news, and 2. it's been hilarious watching western chauvinists melt down over this massive flex out of China.

Read a good synopsis of the paper that just dropped if you're still thinking it's all a grift. I also hate a lot of the practitioners of this stuff as well as the additional strain on the grid and climate, but I'm telling you folks, this tech is legit and currently improving at an alarming rate!

[–] AtmosphericRiversCuomo@hexbear.net 30 points 6 months ago* (last edited 6 months ago)

There isn't a militant communist movement to crush though, so what are they going to do? Arrest every radlib with a hammer and sickle in their X bio?

I don't think it will break along traditionally leftist/fascist lines in the US anyway. It'll just end up being a violent and chaotic mass insurrection at some point like BLM in 2020 x 10.

e: I should say I don't disagree with you that we're headed for even more fascism, but it'll be different than we expect.

[–] AtmosphericRiversCuomo@hexbear.net 34 points 6 months ago (6 children)

One positive note is that there will almost certainly be more leftists when all is said and done. Leftists are forged in a kiln of material hardship that these idiots will unintentionally cause.

Because selective enforcement of the law alone is massively powerful.

[–] AtmosphericRiversCuomo@hexbear.net 1 points 6 months ago (2 children)

They also have complete control of the towns they inhabit, including the police and judiciary, to shield them from state efforts to stop them.

Wish that was us
