[-] TheGiantKorean@lemmy.world 124 points 11 months ago

Does it tell you to Google the problem and then downvote you?

[-] MagicShel@programming.dev 34 points 11 months ago

Hence recursion since Google just takes you back, which leads to stack overflow because there is no exit condition.

[-] TheGiantKorean@lemmy.world 7 points 11 months ago

Which would be especially messed up if your original question was about recursion.

[-] SkyeStarfall@lemmy.blahaj.zone 6 points 11 months ago

This bullshit happens too often lmao

"Googles problem, finds post"

"Why are you asking this use Google"

Gee, thanks

[-] AnonymousLlama@kbin.social 6 points 11 months ago

"to keep the quality of answers high, we may arbitrarily close questions, regardless of how many upvotes it gets and how helpful it is" - stackoverflow

[-] groucho@lemmy.sdf.org 54 points 11 months ago

That would be pretty easy.

return "Why are you even trying to do it this way?\n$link_to_language_spec\nThis should be closed.";

[-] iByteABit@lemm.ee 10 points 11 months ago

Meanwhile language spec:

  • Extremely high level description along with some implementation details you don't care about

  • function signature

[-] groucho@lemmy.sdf.org 5 points 11 months ago
[-] iByteABit@lemm.ee 3 points 11 months ago* (last edited 11 months ago)

I love how it was obvious what language I'm talking about without saying anything specific

[-] little_hoarse@sh.itjust.works 36 points 11 months ago

Good way to kill your own platform, the whole point is to ask questions to real people

[-] wagesj45@kbin.social 18 points 11 months ago

I thought the point was a mental BDSM exercise where you come to others for help and are instead punished for your ignorance.

[-] CeeBee@programming.dev 30 points 11 months ago

It really puts their stance on "no AI generated answers" in a different light.

Basically, "no AI generated answers unless we do it".

[-] lightsecond@programming.dev 3 points 11 months ago

Well, using AI-generated answers to train their own AI would bring down the quality of answers, and worse quality means less money. Don’t you want them to make any money??!!

[-] TheCee@programming.dev 27 points 11 months ago

Nice choice of logo colors, btw.

[-] Jummit@lemmy.one 9 points 11 months ago

I just noticed...

[-] csolisr@communities.azkware.net 27 points 11 months ago

Stack Overflow is unique as a page, in the sense that its contributions are under a license that allows for reuse (Creative Commons Share-Alike) as long as the individual users are properly credited. Does this mean that OverflowAI keeps the credit metadata and knows who wrote each individual part of an answer?

[-] MagicShel@programming.dev 17 points 11 months ago

AI doesn't work that way. No one wrote "part of the answer." It's more like each contributor cast a vote on what the next token should be and it randomly picks one of the top ten voted tokens. (Very very roughly.)

[-] csolisr@communities.azkware.net 4 points 11 months ago

Fair enough, but at least there should be a way for OverflowAI to list which contributors had the strongest link to the given answer, right?

[-] MagicShel@programming.dev 14 points 11 months ago* (last edited 11 months ago)

Edit: definitely read the other responses because apparently there are some techniques I wasn't aware of and don't understand nearly as well as I understand the underlying AI technology - and I'm only an enthusiast layman.

I don't think there is any way of doing that. AI is like a huge matrix that says 'if (' is followed by

' x': 60%

' foo': 19%

' person': 9%

Etc.

And then it does it all over again for the next token based on randomly selecting one of the tokens and then saying 'if ( person' is followed by

'.id': 30%

'.name': 27%

Etc.

So even a simple 'if person.name.startsWith("foo") {' is the aggregate result of thousands of contributors - really pretty much every author of every code snippet ingested from the training material.

There is no single author even if the code matches existing code token for token. The only exception would be code that is so esoteric that there is only a single author writing code that does a particular thing. But even in that case, there is nothing in the probability matrix to indicate that a particular sequence of tokens is unique to a certain author. Best you could do is full text search a line of code to see if it matches anything in the training data and if there is a very small set of authors to whom credit might be assigned. That might be possible, but it would be an add-on (and significant performance hit) to the actual AI itself. Sort of like how browser integrated AI just runs a search and feeds the result into the context to make the output more likely to contain information in the top results.
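
A rough sketch of the sampling loop described above, in Python. The table just reuses the made-up percentages from this comment (they don't add up to 100, so they are treated as relative weights). `NEXT_TOKEN_WEIGHTS`, `sample_next` and `generate` are hypothetical names, and a real model computes these weights with a neural network rather than looking them up in a dictionary.

```python
import random

# Toy "matrix": context -> relative weights for the next token.
# Numbers are the illustrative ones from the comment, not real model output.
NEXT_TOKEN_WEIGHTS = {
    "if (": {" x": 60, " foo": 19, " person": 9},
    "if ( person": {".id": 30, ".name": 27},
}


def sample_next(context, rng, top_k=10):
    """Pick one of the top_k highest-weighted next tokens at random, or None."""
    candidates = NEXT_TOKEN_WEIGHTS.get(context)
    if not candidates:
        return None  # the toy table has nothing for this context
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    tokens = [tok for tok, _ in ranked]
    weights = [w for _, w in ranked]
    return rng.choices(tokens, weights=weights, k=1)[0]


def generate(prompt, rng, max_tokens=5):
    """Repeatedly sample a token and append it to the context."""
    text = prompt
    for _ in range(max_tokens):
        token = sample_next(text, rng)
        if token is None:
            break
        text += token
    return text


print(generate("if (", random.Random()))
```

Note that the table holds only weights: nothing in it records who contributed to any of them, which is the point being made about attribution.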

[-] TehPers@beehaw.org 3 points 11 months ago* (last edited 11 months ago)

It depends. With the base model, sure, you can't really figure out what percentage of it came from which data source, since there are just too many data sources and that information is lost along the way. They're likely not using the entirety of SO to generate answers though. Retraining LLMs is ungodly expensive, so they can't retrain it every time a new Q or A is created, and even retraining on a regular basis would be impractical.

Instead, without knowing exactly how they're doing it of course, my guess is they're pulling relevant Q&As from their database, then using those results to improve the response (for example by providing them as context). If you're interested, look into retrieval-augmented generation.
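
For a feel of what retrieval-augmented generation might look like with attribution kept intact, here is a minimal Python sketch. Everything in it is assumed for illustration: the `Answer` records, the example authors and URLs, and the crude word-overlap scoring (real systems typically use embeddings or a search index). It is a guess at the general shape, not a description of how OverflowAI actually works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Answer:
    author: str  # hypothetical contributor name
    url: str     # hypothetical permalink
    text: str


def tokenize(text):
    """Lowercase and split on whitespace; deliberately crude."""
    return set(text.lower().split())


def retrieve(question, answers, top_n=2):
    """Return up to top_n answers sharing the most words with the question."""
    q_words = tokenize(question)
    scored = [(len(q_words & tokenize(a.text)), a) for a in answers]
    scored = [(score, a) for score, a in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [a for _, a in scored[:top_n]]


def build_context(question, retrieved):
    """Put retrieved answers in the prompt and keep the credit next to them."""
    sources = "\n".join(
        f"[{i}] {a.text} (by {a.author}, {a.url})"
        for i, a in enumerate(retrieved, start=1)
    )
    return f"Sources:\n{sources}\n\nQuestion: {question}"


ANSWERS = [
    Answer("alice", "https://example.com/a/1", "use a base case to stop recursion"),
    Answer("bob", "https://example.com/a/2", "increase the stack size limit"),
    Answer("carol", "https://example.com/a/3", "css grid centers the div"),
]
QUESTION = "how do I stop infinite recursion"
hits = retrieve(QUESTION, ANSWERS)
print(build_context(QUESTION, hits))
```

Because the author and link travel with each retrieved answer, credit can be shown without the model itself knowing anything about who wrote what.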

[-] ShustOne@lemmy.one 6 points 11 months ago

Check out the article and feature video. It does appear to link to answers it pulled from. Bing and Bard do the same. Posters saying it's impossible are mistaken.

[-] wagesj45@kbin.social 4 points 11 months ago

Posters aren't saying that it's impossible to put search results through an LLM and ask it to cite the sources it reads. They're saying that the neural networks, as used today in LLMs, do not store token attribution in the vocabulary or per node. You can build a system around the neural network that gives it the proper input (search results) and prodding (a prompt that biases the network toward citation), but the LLM by itself cannot come up with that attribution on its own.
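
Roughly, the "system around the network" could look like this Python sketch: the prompt carries the numbered search results plus a nudge to cite, and ordinary code (not the model) maps the cited numbers back to authors afterwards. The prompt wording, the `[n]` citation format, the author names and the sample model reply are all made up for illustration, and no actual LLM is called.

```python
import re

CITE_INSTRUCTION = "Answer using only the sources below and cite them as [n]."


def make_prompt(question, sources):
    """sources: list of (author, text) tuples produced by a search step."""
    numbered = "\n".join(
        f"[{i}] {text}" for i, (_, text) in enumerate(sources, start=1)
    )
    return f"{CITE_INSTRUCTION}\n{numbered}\nQuestion: {question}"


def cited_authors(model_reply, sources):
    """Map [n] markers in the reply back to authors; skip invalid numbers."""
    authors = []
    for match in re.finditer(r"\[(\d+)\]", model_reply):
        index = int(match.group(1)) - 1
        if 0 <= index < len(sources) and sources[index][0] not in authors:
            authors.append(sources[index][0])
    return authors


sources = [("alice", "Add a base case."), ("bob", "Raise the recursion limit.")]
# Stand-in for what a model might return; [7] is a citation it invented.
reply = "Add a base case [1]; raising the limit only hides the bug [2][7]."

print(make_prompt("Why does my function overflow the stack?", sources))
print(cited_authors(reply, sources))
```

The attribution lives entirely in the `sources` list and the lookup code; the model only echoes back numbers it was handed, which is why out-of-range citations have to be filtered.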

[-] csolisr@communities.azkware.net 4 points 11 months ago

Thanks for the TLDW - I could ogle a bit of the article but since I was at work, I couldn't just play the video out loud.

[-] em7@programming.dev 9 points 11 months ago

Then I'm guilty of breaking the license. I have always been stealing code from Stack Overflow. Well, since I'm a senior dev right now I steal only from answers.

[-] ShustOne@lemmy.one 5 points 11 months ago

It does seem to do that in the feature video. It appears to link to all the answers it pulled from.

[-] gbuttersnaps@lemmy.world 19 points 11 months ago

The only answer you ever get is "Closed: Marked as duplicate question."

[-] JackbyDev@programming.dev 17 points 11 months ago

I feel like a better solution is to have generative AI post a community answer to every new question and have folks upvote or downvote it like normal.

[-] TwinTurbo@lemmy.world 16 points 11 months ago

No users to answer questions? No problem…

[-] genericnickname@lemmy.world 15 points 11 months ago

I'm not liking the announced changes to search. That sounds like we will be losing the lexical search and in exchange we will be getting the same technology that allows google to answer questions different to the one we asked.

How many minutes after we start using OverflowAI until we get something like "As a large language model trained by the Stack Exchange Network I cannot answer duplicated questions"?

[-] Deely@programming.dev 15 points 11 months ago

Do we have a term for the combination of enshittification and LLM?

[-] maiskanzler@feddit.de 11 points 11 months ago

Maybe add NFTs into the mix too. But don't tell wsb and the GME gang.

[-] argv_minus_one@beehaw.org 14 points 11 months ago

I look forward to the AI trend fizzling out. It's only slightly less silly than the cryptocurrency trend was.

[-] alansuspect@aussie.zone 5 points 11 months ago

It reminds me of 3D

[-] beirut_bootleg@programming.dev 13 points 11 months ago

I get the whole community resource and all that hoorah, but what bothers me the most is that C*O somewhere that's padding his bonus and CV, waiting for the ship to sink so he can move on to the next thing where he can sing praises to the AI revolution.

[-] kiwiheretic@lemmy.ca 13 points 11 months ago

I understand Google and Microsoft getting into it as it makes sense as a "better" Google search but for StackOverflow that sounds like they have just given up on their current platform.

[-] whataboutshutup@discuss.online 10 points 11 months ago

Many coding languages, mixed text and code, just plain wrong answers (commented as such). What can go wrong?

They can DDoS themselves to show a rise in visits, but it won't help long-term.

[-] UlrikHD@programming.dev 7 points 11 months ago

Can someone tell me what their angle is? Are users supposed to curate and help train the model for free? Is it just a model trained on Stack Overflow data?

All their data is open, so what edge do they have over the already established competition?

[-] ShustOne@lemmy.one 7 points 11 months ago

This type of Q&A interface is very popular and is stealing traffic away from sites like Google and Stack Overflow. Stack Overflow can train it on their data and has a feature where it links to every answer it pulled from. I think that's a nice feature and like that I can troubleshoot further on my own, as AI can often hallucinate an answer or lose a piece of context I need.

[-] lazyvar@programming.dev 5 points 11 months ago

Well that explains why they did a 180 on their "no AI" rule, which has the mods in a tizzy.

Who knows, maybe it'll cut back on the toxicity in the sense that you don't have to interact with toxic people ¯\_(ツ)_/¯

[-] programmer@programming.dev 5 points 11 months ago

They only had to improve the search and keep it a human platform!

[-] NECOdes@burggit.moe 4 points 11 months ago
[-] Gsus4@lemmy.one 4 points 11 months ago* (last edited 11 months ago)

Hah, good to know that even on programming@programming.dev there are people who agree that stack overflow moderation is too draconian to ask questions in anymore. It's a good resource, though, so an LLM will probably be the answer to make the knowledge base more usable without angering its elder gods.

this post was submitted on 27 Jul 2023
192 points (97.1% liked)
