… tests from earlier this year found that AI agents failed to complete tasks up to 70% of the time, making them almost entirely redundant as a workforce replacement tool. At best, they're a way for skilled employees to be more productive and save time on low-level tasks, but those tasks were already being handed off to lower-level employees. Having an AI do it and fail half the time isn't exactly a winning alternative.
"AI agents failed to complete tasks up to 70% of the time."
"Having AI do it and fail half the time"
Did AI write this article too? Fails again at basic math
50% is up to 70%.
Probably there are different use cases and some fail more than others.
Shit, there are more than 2 copilots at Microsoft alone

Could be 70% overall, but only half of the tasks it's good at.
Copilot's numbers are likely that high merely because it rides along with Office 365... I tried using it a few times, and it was completely useless. It even failed at sorting a spreadsheet with a few parameters
I think MS is just putting AI into everything, just for the sake of it. They don't really care if it's useful or not at this point; they just need to buy time to soften the blow when the bubble bursts.
We put it in everything, and bet all of our money on it to "soften the blow"
yeah it really feels like incompetence rather than a strategy to weather the bubble pop :p
MS is running into a real problem where its two major product lines, Windows and Office, don't have any major improvements that justify an upgrade. It is an existential crisis to the company's profitability.
Now, MS has been able to make Office into a subscription, but it can't do that with Windows.
I wrote a page-long piece of documentation on a project. I asked Copilot to "format it to look nice but do not change a word"... it told me how to make some headings bold (would not do it itself) and whatnot... that's the "assist" I got
I couldn't even copy/paste the formatting, since its reply window does not apply it, and the text it provided was interlaced with its own stupid comments letting me know bold headings are more visible than regular font
I get better results just bouncing ideas off my cats
That was my experience. Wife had work telling her to use it, she asked me for help. I tried to get it to do things and all it would do is suggest stuff that we both knew perfectly well how to do with shortcuts. As for anything complex like have a chat and generate a document: fuck no. Might as well go to chat-jippity and copypasta its result and format it yourself. Utter waste of time. I don't see why it's there, I can't find a use-case.
I work with sensitive data... so I often grab a real message, gut all the PHI and refill with some fake data.
I had a project where I needed to do a lot of these, so I got DeepSeek to give me a list of superhero "real names", DOBs, gender and a few other fake things so I could automate filling these messages with fake data. This is the most success I have had with AI and even then it messed up (minimally) thinking for some reason that Wanda was a guy hahahaha
HP: if your non original printer cartridge fails 7 out of 10 times, is it really a savings?
Microsoft: we want 80% of our work to be handled by these AIs that fail 70% of the time.
Also fuck HP.
And Microsoft.
Say you hire an employee and you know he fucks up 50% of his tasks, that means you still have to do 50% of the work PLUS examine 100% of his output in great detail to figure out which 50% he fucked up.
Even if the employee was paid 0, I would want him gone.
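To put rough numbers on that: here is a minimal back-of-the-envelope sketch in Python. The cost model is my own assumption, not anything from the article: doing a task yourself costs 1 unit, checking each output costs a hypothetical `review_cost` fraction of that, and every failed task gets redone from scratch.

```python
def human_effort_per_task(failure_rate: float, review_cost: float) -> float:
    """Expected human effort per task, in units of 'doing the task yourself'.

    Assumed model (hypothetical, for illustration only):
      - every output is reviewed, costing `review_cost` per task
      - every failed task is redone from scratch, costing 1 per failure
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if review_cost < 0.0:
        raise ValueError("review_cost must be non-negative")
    # Review is paid on 100% of tasks; redo is paid only on the failed share.
    return review_cost + failure_rate


def saves_time(failure_rate: float, review_cost: float) -> bool:
    """True only if delegating costs less human effort than doing it all yourself."""
    return human_effort_per_task(failure_rate, review_cost) < 1.0
```

Under these assumptions, a helper that fails half the time only pays off if checking its work costs less than half of doing the work yourself, e.g. `saves_time(0.5, 0.25)` is true while `saves_time(0.5, 0.5)` is not. At a 70% failure rate the review has to cost under 30% of the task, and that is before counting the helper's own price.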
My employer is all in on Microsoft. Copilot is terrible, it can't even find a word in a document. Ctrl+F finds it no problem.
Now when I have a tech issue that I need an answer for, the Bing AI generally gets me a detailed answer on the first try. But it's my understanding that Bing's AI is just a reskinned ChatGPT.
What's with all the shills in that comment section? Yeesh.
There's a lot of money and a lot of careers riding on this bullshit somehow becoming successful.
A sentence for the ages.
A timeless sentence.
That's like people trying to save the Titanic by bailing water with shot glasses.
There's a lot of money and a lot of careers ~~riding~~ spent on this bullshit somehow becoming successful.
Both can be true
MS middle management trying to save their lucrative jobs.
All of them have exactly 1 post too.
actually 70% of a post.
30%, get it right.
Sorry, used Copilot for the math
I hope all the money thrown at this "AI" (misnomer, IMHO - it's really just extremely overwrought pattern matching) causes at least some significant humbling (if not outright downfall) of some tech giants. I haven't programmed in a couple decades, and yet even I could tell they weren't gonna get to AGI offa this crap - I can't believe how badly some of these supposed techies fell for their own hype.
When discussing it, I often call it "simulated intelligence", because at the end of the day that's what neural networks are.
Edit: only to non-technical people, as simulations are a different thing.
In science fiction I've often seen the term VI (Virtual Intelligence) to refer to machines that look intelligent, and could probably pass a Turing test, but aren't really intelligent (normally VI coexists with actual AI, often used as interfaces, where it would be a waste, or too risky, to use a proper AI).
LLMs look a bit like that, though they're probably too unreliable to use as an interface for anything important.
The correct term is Stochastic Parrot... that is what LLMs do. It sounds even cooler than AI, IMHO
GitHub died for this.
Comments in the post contain so much cope.
Congrats to all the insiders at Microsoft who were able to make money off a hype cycle. They'll fall for the next one too.


