this post was submitted on 16 Feb 2026
technology

[–] Infamousblt@hexbear.net 24 points 1 day ago* (last edited 1 day ago) (2 children)

This sounds damning but also doesn't mean a lot. Many companies spend many years building things before seeing a "return" on them. They make money, but less than they're burning in VC funds, and as they approach the break-even point, they use that to raise more VC funds to burn on further expansion. This is largely how the tech industry works. Very few companies are cash-flow positive during their growth phases.

It does mean there's risk in this investment, because from a business perspective AI hasn't been proven a valuable investment yet. But it still might be, for at least some of these companies, and unless we really do run into physical limits on data center capacity and build rate, it could take a decade or more for this "we aren't seeing a return yet" problem to matter.

[–] yogthos@lemmygrad.ml 16 points 22 hours ago (1 children)

I agree, it's basically a completely new tool looking for product-market fit. It's also worth noting that these companies are looking for one stellar application. If they hit on something that works really well, that's gonna be the business model. So they're perfectly fine with most of the pilots failing if they can find one that works well.

That said, I do think there is a bubble where a lot of companies are implementing these tools without having a good fit for them, and a ton of money is being wasted in the process. It's kind of the same thing we saw with the dot-com bubble: when it popped, there was an extinction event where most companies went belly up, but we got a ton of useful tech out of it that underpins the internet today.

I expect we'll see a similar thing happen with AI. Except this time around there's another factor: direct competition from China. My prediction is that Chinese models will win in the end, because Chinese companies aren't looking for direct monetization; they're treating models as infrastructure, sort of like what we see with Linux. Most companies don't try to monetize it directly; they build stuff like AWS on top of it, and that becomes the product.

I expect American companies are just going to run out of runway in the near future, and they're also getting squeezed by cheap Chinese models that are also open source. Big companies prefer running stuff on-prem because they can keep their data private that way and tune the models any way they want. Meanwhile, stuff like DeepSeek is orders of magnitude cheaper than Claude for individual use. So I just don't see a long-term business model for models as a service, especially not at Anthropic's or even Google's pricing. The vast majority of people aren't gonna pay 20 bucks a month for this stuff, let alone 100.

[–] darkmode@hexbear.net 9 points 21 hours ago* (last edited 21 hours ago) (1 children)

> Most companies don't try to monetize it directly; they build stuff like AWS on top of it, and that becomes the product.

My company has an idea like this going right now, but it still charges per token because it uses the VC-funded companies' services instead of having its own models

[–] yogthos@lemmygrad.ml 5 points 14 hours ago (1 children)

For business customers, per-token costs might not be a deal breaker, but for anything consumer facing it's a really tough sell in my opinion. I do expect the cost of running models to come down significantly in the near future, though. There's a whole bunch of recent research that identifies key optimizations that can be made. Here are some of the ones I've found particularly interesting:

Once these ideas start getting integrated, I expect that we'll see much more capable models that can run on fairly cheap hardware. Even local models will likely be quite capable for a lot of tasks. And at that point running a model as a service and charging per token is going to be a dead end.

[–] darkmode@hexbear.net 2 points 9 hours ago (1 children)

This is an incredible list of research, TYSM! In spare work time I've been building a small tool that tries to accomplish what #2 describes. I haven't clicked the link and read it yet, but now I will read everything.

[–] yogthos@lemmygrad.ml 3 points 8 hours ago (1 children)

I played around with implementing the recursive language model paper, and that actually turned out pretty well: https://git.sr.ht/~yogthos/matryoshka

Basically, I spin up a JS REPL in a sandbox, and the agent can feed files into it and run commands against them. Normally the agent has to ingest a whole file into its context, but now it can just shove files into the REPL and do operations on them, akin to a DB. It can also create variables: for example, if it searches for something in a file, it can bind the result to a variable and keep track of it. If it needs to filter the search later, it can just reference the variable it already made. This saves a huge amount of token use and also helps the model stay more focused.
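
To make that concrete, here's a minimal sketch of the kind of session the agent might drive inside the sandboxed REPL. The file name and helper code are hypothetical illustrations, not taken from the matryoshka repo:

```typescript
// Hypothetical REPL session driven by the agent. The agent never ingests the
// raw file; it only sees the small summaries it chooses to print.
import { readFileSync } from "node:fs";

// Shove the file into the REPL instead of the model's context window.
const log = readFileSync("server.log", "utf8").split("\n");

// First search: bind the result to a variable so it can be reused later.
const errors = log.filter((line) => line.includes("ERROR"));
console.log(errors.length); // the agent only sees this one number

// Later refinement: filter the existing binding instead of re-reading the
// whole file, which is where the token savings come from.
const dbErrors = errors.filter((line) => line.includes("database"));
console.log(dbErrors.slice(0, 5).join("\n"));
```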

[–] darkmode@hexbear.net 1 points 2 hours ago (1 children)

About how large are the codebases you've used this RLM with?

[–] yogthos@lemmygrad.ml 2 points 19 minutes ago

Around 10k lines or so. I use it as an MCP tool that the agent calls when it decides it needs to. The whole codebase doesn't get loaded into the REPL, just individual files as it searches through them.
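
For anyone curious what wiring that up looks like, here's a rough sketch of exposing a REPL as an MCP tool with the official TypeScript SDK. The tool name, schema, and bare-eval stand-in are my assumptions, not necessarily how matryoshka does it:

```typescript
// Hypothetical sketch: exposing a sandboxed JS REPL as an MCP tool that the
// agent calls only when it decides it needs to.
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "repl-sandbox", version: "0.1.0" });

// One tool: evaluate a snippet of JS against state that persists across
// calls, so earlier bindings (files, search results) stay available.
server.tool(
  "eval_js",
  { code: z.string().describe("JavaScript to evaluate in the sandbox") },
  async ({ code }) => {
    // A real implementation would run this in an isolated sandbox (e.g. a
    // separate process or vm context); bare eval is just a stand-in here.
    const result = eval(code);
    return { content: [{ type: "text", text: String(result) }] };
  }
);

// Serve over stdio so any MCP-capable agent can attach to it.
await server.connect(new StdioServerTransport());
```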

[–] GoodGuyWithACat@hexbear.net 11 points 22 hours ago

But it isn't a niche area; it's been the major focus of corporate investment for the last year or two.