[-] Architeuthis@awful.systems 8 points 21 hours ago* (last edited 21 hours ago)

I could go over Wolfram's discussion of biological pattern formation, gravity, etc., etc., and give plenty of references to people who've had these ideas earlier. They have also had them better, in that they have been serious enough to work out their consequences, grasp their strengths and weaknesses, and refine or in some cases abandon them. That is, they have done science, where Wolfram has merely thought.

Huh, it looks like Wolfram also pioneered rationalism.

Scott Aaronson also turns up later for having written a paper that refutes a specific Wolfram claim on quantum mechanics, reminding us once again that very smart dumb people are actually a thing.

As a sidenote, if anyone else is finding the plain-text-disguised-as-an-html-document format of this article a tad grating, your browser probably has a reader mode that will make it way more presentable, it's F9 on firefox.

[-] Architeuthis@awful.systems 17 points 1 month ago

It had dumb scientists, a weird love conquers all theme, a bathetic climax that was also on the wrong side of believable and an extremely tacked on epilogue.

Wouldn't say that I hated it, but it was pretty flawed for what it was. magnificent black hole cgi notwithstanding.

[-] Architeuthis@awful.systems 16 points 1 month ago

Summizing Emails is a valid purpose.

Or it would have been if LLMs were sufficiently dependable anyway.

[-] Architeuthis@awful.systems 20 points 1 month ago* (last edited 1 month ago)

On each step, one part of the model applies reinforcement learning, with the other one (the model outputting stuff) “rewarded” or “punished” based on the perceived correctness of their progress (the steps in its “reasoning”), and altering its strategies when punished. This is different to how other Large Language Models work in the sense that the model is generating outputs then looking back at them, then ignoring or approving “good” steps to get to an answer, rather than just generating one and saying “here ya go.”

Every time I've read how chain-of-thought works in o1 it's been completely different, and I'm still not sure I understand what's supposed to be going on. Apparently you get a strike notice if you try too hard to find out how the chain-of-thinking process goes, so one might be tempted to assume it's something that's readily replicable by the competition (and they need to prevent that as long as they can) instead of any sort of notably important breakthrough.

From the detailed o1 system card pdf linked in the article:

According to these evaluations, o1-preview hallucinates less frequently than GPT-4o, and o1-mini hallucinates less frequently than GPT-4o-mini. However, we have received anecdotal feedback that o1-preview and o1-mini tend to hallucinate more than GPT-4o and GPT-4o-mini. More work is needed to understand hallucinations holistically, particularly in domains not covered by our evaluations (e.g., chemistry). Additionally, red teamers have noted that o1-preview is more convincing in certain domains than GPT-4o given that it generates more detailed answers. This potentially increases the risk of people trusting and relying more on hallucinated generation.

Ballsy to just admit your hallucination benchmarks might be worthless.

The newsletter also mentions that the price for output tokens has quadrupled compared to the previous newest model, but the awesome part is, remember all that behind-the-scenes self-prompting that's going on while it arrives to an answer? Even though you're not allowed to see them, according to Ed Zitron you sure as hell are paying for them (i.e. they spend output tokens) which is hilarious if true.

[-] Architeuthis@awful.systems 20 points 1 month ago

"When asked about buggy AI [code], a common refrain is ‘it is not my code,’ meaning they feel less accountable because they didn’t write it.”

Strong they cut all my deadlines in half and gave me an OpenAI API key, so fuck it energy.

He stressed that this is not from want of care on the developer’s part but rather a lack of interest in “copy-editing code” on top of quality control processes being unprepared for the speed of AI adoption.

You don't say.

[-] Architeuthis@awful.systems 18 points 1 month ago* (last edited 1 month ago)

Stephanie Sterling of the Jimquisition outlines the thinking involved here. Well, she swears at everyone involved for twenty minutes. So, Steph.

She seems to think the AI generates .WAD files.

I guess they fell victim to one of the classic blunders: never assume that it can't be that stupid, and someone must be explaining it wrong.

[-] Architeuthis@awful.systems 27 points 2 months ago

Archive the weights of the models we build today, so we can rebuild them in the future if we need to recompense them for moral harms.

To be clear, this means that if you treat someone like shit all their life, saying you're sorry to their Sufficiently Similar Simulation™ like a hundred years after they are dead makes it ok.

This must be one of the most blatantly supernatural rationalist Accepted Truths, that if your simulation is of sufficiently high fidelity you will share some ontology of self with it, which by the way is how the basilisk can torture you even if you've been dead for centuries.

[-] Architeuthis@awful.systems 17 points 3 months ago* (last edited 3 months ago)

Ah yes, Alexander's unnumbered hordes, that endless torrent of humanity that is all but certain to have made a lasting impact on the sparsely populated subcontinent's collective DNA.

edit: Also, the absolute brain on someone who would think that before entertaining a random recent western ancestor like a grandfather or whateverthefuckjesus.

[-] Architeuthis@awful.systems 18 points 3 months ago

IKR like good job making @dgerard look like King Mob from the Invisibles in your header image.

If the article was about me I'd be making Colin Robinson feeding noises all the way through.

edit: Obligatory only 1 hour 43 minutes of reading to go then

[-] Architeuthis@awful.systems 18 points 4 months ago

It hasn't worked 'well' for computers since like the pentium, what are you talking about?

The premise was pretty dumb too, as in, if you notice that a (very reductive) technological metric has been rising sort of exponentially, you should probably assume something along the lines of we're probably still at the low hanging fruit stage of R&D, it'll stabilize as it matures, instead of proudly proclaiming that surely it'll approach infinity and break reality.

There's nothing smart or insightful about seeing a line in a graph trending upwards and assuming it's gonna keep doing that no matter what. Not to mention that type of decontextualized wishful thinking is emblematic of the TREACLES mindset mentioned in the community's blurb that you should check out.

So yeah, he thought up the Singularity which is little more than a metaphysical excuse to ignore regulations and negative externalities because with tech rupture around the corner any catastrophic mess we make getting there won't matter. See also: the whole current AI debacle.

[-] Architeuthis@awful.systems 31 points 4 months ago

I'm not spending the additional 34min apparently required to find out what in the world they think neural network training actually is that it could ever possibly involve strategy on the part of the network, but I'm willing to bet it's extremely dumb.

I'm almost certain I've seen EY catch shit on twitter (from actual ml researchers no less) for insinuating something very similar.

[-] Architeuthis@awful.systems 27 points 4 months ago* (last edited 4 months ago)

It's a sad fate that sometimes befalls engineers who are good at talking to audiences, and who work for a big enough company that can afford to have that be their primary role.

edit: I love that he's chief evangelist though, like he has a bunch of little google cloud clerics running around doing chores for him.

76

AI Work Assistants Need a Lot of Handholding

Getting full value out of AI workplace assistants is turning out to require a heavy lift from enterprises. ‘It has been more work than anticipated,’ says one CIO.

aka we are currently in the process of realizing we are paying for the privilege of being the first to test an incomplete product.

Mandell said if she asks a question related to 2024 data, the AI tool might deliver an answer based on 2023 data. At Cargill, an AI tool failed to correctly answer a straightforward question about who is on the company’s executive team, the agricultural giant said. At Eli Lilly, a tool gave incorrect answers to questions about expense policies, said Diogo Rau, the pharmaceutical firm’s chief information and digital officer.

I mean, imagine all the non-obvious stuff it must be getting wrong at the same time.

He said the company is regularly updating and refining its data to ensure accurate results from AI tools accessing it. That process includes the organization’s data engineers validating and cleaning up incoming data, and curating it into a “golden record,” with no contradictory or duplicate information.

Please stop feeding the thing too much information, you're making it confused.

Some of the challenges with Copilot are related to the complicated art of prompting, Spataro said. Users might not understand how much context they actually need to give Copilot to get the right answer, he said, but he added that Copilot itself could also get better at asking for more context when it needs it.

Yeah, exactly like all the tech demos showed -- wait a minute!

[Google Cloud Chief Evangelist Richard Seroter said] “If you don’t have your data house in order, AI is going to be less valuable than it would be if it was,” he said. “You can’t just buy six units of AI and then magically change your business.”

Nevermind that that's exactly how we've been marketing it.

Oh well, I guess you'll just have to wait for chatgpt-6.66 that will surely fix everything, while voiced by charlize theron's non-union equivalent.

1

An AI company has been generating porn with gamers' idle GPU time in exchange for Fortnite skins and Roblox gift cards

"some workloads may generate images, text or video of a mature nature", and that any adult content generated is wiped from a users system as soon as the workload is completed.

However, one of Salad's clients is CivitAi, a platform for sharing AI generated images which has previously been investigated by 404 media. It found that the service hosts image generating AI models of specific people, whose image can then be combined with pornographic AI models to generate non-consensual sexual images.

Investigation link: https://www.404media.co/inside-the-ai-porn-marketplace-where-everything-and-everyone-is-for-sale/

0
submitted 7 months ago* (last edited 7 months ago) by Architeuthis@awful.systems to c/sneerclub@awful.systems

For thursday's sentencing the us government indicated they would be happy with a 40-50 prison sentence, and in the list of reasons they cite there's this gem:

  1. Bankman-Fried's effective altruism and own statements about risk suggest he would be likely to commit another fraud if he determined it had high enough "expected value". They point to Caroline Ellison's testimony in which she said that Bankman-Fried had expressed to her that he would "be happy to flip a coin, if it came up tails and the world was destroyed, as long as if it came up heads the world would be like more than twice as good". They also point to Bankman-Fried's "own 'calculations'" described in his sentencing memo, in which he says his life now has negative expected value. "Such a calculus will inevitably lead him to trying again," they write.

Turns out making it a point of pride that you have the morality of an anime villain does not endear you to prosecutors, who knew.

Bonus: SBF's lawyers' list of assertions for asking for a shorter sentence includes this hilarious bit reasoning:

They argue that Bankman-Fried would not reoffend, for reasons including that "he would sooner suffer than bring disrepute to any philanthropic movement."

0
submitted 8 months ago* (last edited 8 months ago) by Architeuthis@awful.systems to c/sneerclub@awful.systems

rootclaim appears to be yet another group of people who, having stumbled upon the idea of the Bayes rule as a good enough alternative to critical thinking, decided to try their luck in becoming a Serious and Important Arbiter of Truth in a Post-Mainstream-Journalism World.

This includes a randiesque challenge that they'll take a $100K bet that you can't prove them wrong on a select group of topics they've done deep dives on, like if the 2020 election was stolen (91% nay) or if covid was man-made and leaked from a lab (89% yay).

Also their methodology yields results like 95% certainty on Usain Bolt never having used PEDs, so it's not entirely surprising that the first person to take their challenge appears to have wiped the floor with them.

Don't worry though, they have taken the results of the debate to heart and according to their postmortem blogpost they learned many important lessons, like how they need to (checks notes) gameplan against the rules of the debate better? What a way to spend 100K... Maybe once you've reached a conclusion using the Sacred Method changing your mind becomes difficult.

I've included the novel-length judges opinions in the links below, where a cursory look indicates they are notably less charitable towards rootclaim's views than their postmortem indicates, pointing at stuff like logical inconsistencies and the inclusion of data that on closer look appear basically irrelevant to the thing they are trying to model probabilities for.

There's also like 18 hours of video of the debate if anyone wants to really get into it, but I'll tap out here.

ssc reddit thread

quantian's short writeup on the birdsite, will post screens in comments

pdf of judge's opinion that isn't quite book length, 27 pages, judge is a microbiologist and immunologist PhD

pdf of other judge's opinion that's 87 pages, judge is an applied mathematician PhD with a background in mathematical virology -- despite the length this is better organized and generally way more readable, if you can spare the time.

rootclaim's post mortem blogpost, includes more links to debate material and judge's opinions.

edit: added additional details to the pdf descriptions.

view more: next ›

Architeuthis

joined 1 year ago