this post was submitted on 12 Nov 2024
636 points (96.0% liked)
Technology
That's one of the reasons I love StarCoder, even for non-coding tasks. Being trained only on GitHub means no "instruction finetuning" bullshit ChatGPT-speak.
People still run, or even continue pretraining, Llama 2 for that reason, since its training data is pre-slop.
I really wish it were easier to fine-tune and run inference on GPT-J-6B as well... that was a gem of a base model for research purposes, and for a hot minute circa Dolly there were finally some signs it would become more feasible to run locally. But all the effort going into llama.cpp and GGUF kinda left GPT-J behind. GPT4All used to support it, I think, but last I checked the documentation had huge holes as to how exactly that's done.