scruiser

[–] scruiser@awful.systems 4 points 4 months ago* (last edited 4 months ago) (1 children)

There are techniques for caching some of the steps involved with LLMs. Like I think you can cache the tokenization and maybe some of the work the attention heads are doing if you have a static, known prompt? But I don't see why you couldn't just do that caching separately for each model your model router might direct things to? And if you have multiple prompts you just keep a separate cache for each one? This creates a lot of memory usage overhead, but not excessively more computation... well, you do need to do the computation to generate each cache. I don't find it implausible that OpenAI managed to screw all this up somehow, but I'm not quite sure the exact explanation of the problem Zitron has given fits together.

(The order of the prompts vs. user interactions does matter, especially for caching... but I think you could just cut and paste the user interactions to separate them from the old prompt and stick a new prompt on them in whatever order works best? You would get wildly varying quality in the output generated as it switches between models and prompts, but this wouldn't add in more computation...)
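A minimal sketch of the kind of per-model, per-prompt prefix caching I'm imagining; the class and function names here are made up for illustration and have nothing to do with OpenAI's actual serving stack:

```python
# Hypothetical sketch: keep a separate cached prefix state per (model, system prompt).
# encode_prefix() stands in for whatever expensive work (tokenization, attention over
# the static prompt, the KV cache) a real serving stack would actually reuse.

class PrefixCache:
    def __init__(self):
        self._cache = {}  # (model_name, prompt_text) -> precomputed prefix state

    def get_prefix_state(self, model_name: str, prompt_text: str):
        key = (model_name, prompt_text)
        if key not in self._cache:
            # Pay the compute cost once per model/prompt pair...
            self._cache[key] = self.encode_prefix(model_name, prompt_text)
        # ...then every later request routed to that model reuses it.
        return self._cache[key]

    def encode_prefix(self, model_name: str, prompt_text: str):
        # Placeholder for the real work; here it just splits the prompt into
        # whitespace "tokens" as a stand-in for tokenization + prefill.
        return {"model": model_name, "prompt_tokens": prompt_text.split()}


# A router directing the same user message to different models just looks up
# (or builds) the cache entry for whichever model it picked. Model names are
# placeholders for whatever variants the router chooses between.
cache = PrefixCache()
for model in ["gpt-5-main", "gpt-5-thinking"]:
    state = cache.get_prefix_state(model, "You are a helpful assistant.")
    print(model, len(state["prompt_tokens"]), "cached prompt tokens")
```

The overhead is one cached prefix per (model, prompt) pair, which is exactly the "lots of memory, not much extra compute" tradeoff above.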

Zitron mentioned a scoop, so I hope/assume someone did some prompt hacking to get GPT-5 to spit out some of its behind-the-scenes prompts and he has solid proof of what he is saying. I wouldn't put anything past OpenAI, for certain.

[–] scruiser@awful.systems 6 points 4 months ago

If they got a lot of usage out of a model, this fixed cost would contribute little to the cost of each model in the long run... but considering they currently replace/retrain models every 6 months to a year, yeah, this cost should be factored in as well.

Also, training compute grows quadratically with model size, because it is the product of the amount of training data (which grows linearly with model size) and the model size itself.
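As a rough sketch, using the common back-of-envelope approximation that training compute is about six times parameters times training tokens, and assuming tokens are scaled linearly with parameters:

```latex
C \approx 6\,N\,D, \qquad D = k\,N
\;\;\Rightarrow\;\; C \approx 6\,k\,N^{2} \;\propto\; N^{2}
```

where N is the parameter count, D is the number of training tokens, and k is the tokens-per-parameter ratio.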

[–] scruiser@awful.systems 4 points 4 months ago

Even bigger picture... some standardized way of regularly handling possible combinations of letters and numbers that you could use across multiple languages. Like it handles them as expressions?
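A minimal sketch of what that might look like, using Python's built-in re module (the same pattern syntax carries over to most languages):

```python
import re

# A "regular" expression describing combinations of letters and numbers,
# e.g. alphanumeric identifiers starting with a letter.
pattern = re.compile(r"[A-Za-z][A-Za-z0-9]*")

print(pattern.findall("GPT5 routed to o3mini via model_router v2"))
# ['GPT5', 'routed', 'to', 'o3mini', 'via', 'model', 'router', 'v2']
```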

[–] scruiser@awful.systems 6 points 4 months ago (4 children)

I know like half the facts I would need to estimate it... if you know the GPU VRAM required for the video generation, and how long it takes, then assuming no latency, you could get a ballpark number by looking at Nvidia GPU specs for power usage. For instance, if a short clip of video generation needs 90 GB of VRAM, then maybe they are using an RTX 6000 Pro... https://www.nvidia.com/en-us/products/workstations/professional-desktop-gpus/ , take the amount of time it takes during off hours, which shouldn't have a queue time... and you can guesstimate a number of watt-hours? Like if it takes 20 minutes to generate, then at 300-600 watts of power usage that would be 100-200 watt-hours. I can find an estimate of $0.33 per kWh (https://www.energysage.com/local-data/electricity-cost/ca/san-francisco-county/san-francisco/ ), so it would only be costing $0.03 to $0.06.
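Here's that guesstimate as a quick script; all of the inputs (20 minutes of generation on a single card drawing 300-600 W, $0.33/kWh) are assumptions, not measurements:

```python
# Back-of-envelope electricity cost for one video generation.
# Every number below is an assumption for illustration.

generation_minutes = 20          # assumed wall-clock time per clip
power_draw_watts = (300, 600)    # assumed GPU power draw range (single card)
price_per_kwh = 0.33             # rough San Francisco electricity rate

for watts in power_draw_watts:
    energy_kwh = watts * (generation_minutes / 60) / 1000
    cost = energy_kwh * price_per_kwh
    print(f"{watts} W for {generation_minutes} min -> {energy_kwh:.2f} kWh, ~${cost:.3f}")

# 300 W for 20 min -> 0.10 kWh, ~$0.033
# 600 W for 20 min -> 0.20 kWh, ~$0.066
```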

IDK how much GPU time you actually need though, I'm just wildly guessing. Like if they use many server-grade GPUs in parallel, that would multiply the cost up even if it only takes them minutes per video generation.

[–] scruiser@awful.systems 5 points 4 months ago

promptfarmers, for the "researchers" trying to grow bigger and bigger models.

/r/singularity redditors that have gotten fed up with Sam Altman's bs often use Scam Altman.

I've seen some name calling using drug analogies: model pushers, prompt pushers, just one more training run bro (for the researchers); just one more prompt (for the users), etc.

[–] scruiser@awful.systems 8 points 4 months ago

I could imagine a lesswronger being delusional/optimistic enough to assume their lesswrong jargon concepts have more academic citations than a handful of arXiv preprints... but in this case they just admitted their only sources are lesswrong and arXiv. Also, if they know Wikipedia's policies, they should know the No Original Research rule would block their idea, even overlooking the single-source and conflict-of-interest problems.

[–] scruiser@awful.systems 8 points 4 months ago (1 children)

Yeah, that article was one of the things I had in mind. It's the peak of centrist liberalism where EAs and lesswrongers can think these people are literally going to cause mankind's extinction (or worse) and they can't even bring themselves to be rude to them. OTOH, if they actually acted coherently on their nominal doomer beliefs, they would be carrying out terrorism on a far greater scale than the Zizians, so maybe it is for the best that they are ideologically incapable of direct action.

[–] scruiser@awful.systems 13 points 4 months ago (7 children)

Y'all ready for another round of LessWrong edit wars on Wikipedia? This time with a wider list of topics!

https://www.lesswrong.com/posts/g6rpo6hshodRaaZF3/mech-interp-wiki-page-and-why-you-should-edit-wikipedia-1

On the very slightly merciful upside... the lesswronger recommends "If you want to work on a new page, discuss with the community first by going to the talk page of a related topic or meta-page." and "In general, you shouldn't post before you understand Wikipedia rules, norms, and guidelines." so they are ahead of the previous calls made on Lesswrong for Wikipedia edit-wars.

On the downside, they've got a laundry list of lesswrong jargon they want Wikipedia articles for. Even one of the lesswrongers responding to them points out these terms are a bit on the under-defined side:

Speaking as a self-identified agent foundations researcher, I don't think agent foundations can be said to exist yet. It's more of an aspiration than a field. If someone wrote a wikipedia page for it, it would just be that person's opinion on what agent foundations should look like.

[–] scruiser@awful.systems 6 points 4 months ago (6 children)

They’re cosplaying as activists, have no ideas about how to move the public image needle other than weird movie ideas and hope, and are literally marinated in SV technolibertarianism which sees government regulation as Evil.

It is kind of sad. They are missing the ideological pieces that would let them carry out activism effectually so instead they've gotten used as a free source of crit-hype in the LLM bubble. ...except not that sad because they would ignore real AI dangers in favor of their sci-fi scenarios, so I don't feel too bad for them.

[–] scruiser@awful.systems 6 points 4 months ago (1 children)

And why would a rich guy be against a “we are trying to convince rich guys to spend their money differently” organization.

Well when they are just passively trying to convince the rich guys, they can use the organization to launder reputation or boost ideologies they are in favor of. When the organization actually tries to get regulations passed, even ineffectually, well, that is a threat to the likes of Thiel.

[–] scruiser@awful.systems 7 points 4 months ago* (last edited 4 months ago)

The quirky eschatologist that you’re looking for is René Girard, who he personally met at some point. For more details, check out the Behind the Bastards on him.

Thanks for the references. The quirky theology was so outside the range of even the weirder Fundamentalist Christian stuff that I didn't recognize it as such. (And I didn't trust the EA summary because they try so hard to charitably make sense of Thiel.)

In this context, Thiel fears the spectre of AGI because it can’t be influenced by his normal approach to power, which is to hide anything that can be hidden and outspend everybody else talking in the open.

Except the EAs are, on net, opposed to the creation of AGI (albeit ineffectually so). So going after the EAs doesn't make sense if Thiel genuinely doesn't want AGI invented faster. So I still think Thiel is just going after the EAs because he's a libertarian and EA has shifted in the direction of trying to get more government regulation (as opposed to him having some coherent theological goal beyond libertarianism). I'll check out the BtB podcast and see if it changes my mind as to his exact flavor of insanity.
