this post was submitted on 28 Apr 2026
Free Open-Source Artificial Intelligence
Thanks! It’s interesting that RAG alone would be better than a tuned model. Why is that? What if you have a very specific task, like writing copy based on existing documents, where decisions depend on a set of specific variables?
What if you use RAG and tune it? Any benefit there?
Last question. Would a fine-tuned model be more energy efficient than a model using RAG?
Honestly, it depends heavily on the use case, both for improving the model and for choosing between RAG and FT. The most important thing to consider is what sort of changes you want to make to the model's behavior. FT is still a good choice if you're looking for strict output formatting (JSON/YAML/...) or refinement on highly specific, narrow-domain tasks. RAG is better for knowledge freshness and source citations, and it tends to lower hallucinations considerably.
RAG will inflate your context windows (more tokens) at inference time, which means slower responses and more energy per request, whereas fine-tuning takes a ton of GPU compute up front (but keeps token counts smaller at inference). If you're doing 100,000 prompts a day and only need to train once, FT makes more sense; if you're doing 100 prompts a day and your knowledge base is constantly changing, RAG makes the most sense.
It's hard to give a formal estimate of energy efficiency: fine-tuning to a target accuracy can take an indeterminate amount of time (and money on rented GPU compute), but it could be the better choice if you use the model very frequently, only fine-tune once, and expect that up-front cost to pay off over time. On the other hand, the RAG route has almost no up-front compute (energy) cost, but costs slightly more at inference time due to the extra tokens.
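To make that trade-off a bit more concrete, here's a rough back-of-the-envelope sketch in Python. Every number in it is a made-up placeholder (the up-front fine-tuning energy, the extra tokens RAG adds per prompt, the energy per 1,000 tokens), and `break_even_days` is just a helper name I picked for illustration. It also assumes you fine-tune exactly once and that energy scales linearly with token count, which is a simplification.

```python
def break_even_days(ft_upfront_kwh, extra_tokens_per_prompt,
                    kwh_per_1k_tokens, prompts_per_day):
    """Days until a one-time fine-tuning energy cost is offset by
    the extra tokens RAG adds to every prompt."""
    # Extra energy RAG spends per day because of the larger context
    extra_kwh_per_day = (prompts_per_day * extra_tokens_per_prompt / 1000
                         * kwh_per_1k_tokens)
    if extra_kwh_per_day <= 0:
        # RAG adds no overhead, so fine-tuning never pays off on energy alone
        return float("inf")
    return ft_upfront_kwh / extra_kwh_per_day


# Placeholder values for illustration only, NOT measurements
FT_UPFRONT_KWH = 50.0        # assumed one-time fine-tuning energy
EXTRA_TOKENS = 2000          # assumed retrieved-context tokens per prompt
KWH_PER_1K_TOKENS = 0.0005   # assumed inference energy per 1,000 tokens

heavy = break_even_days(FT_UPFRONT_KWH, EXTRA_TOKENS, KWH_PER_1K_TOKENS, 100_000)
light = break_even_days(FT_UPFRONT_KWH, EXTRA_TOKENS, KWH_PER_1K_TOKENS, 100)

print(f"100,000 prompts/day: break-even after {heavy:.1f} days")
print(f"100 prompts/day:     break-even after {light:.1f} days")
```

With these placeholder inputs the heavy-usage case breaks even in about half a day and the light-usage case takes about 500 days. If your knowledge base changes and you have to re-tune, the up-front cost starts over, which pushes things further toward RAG. Swap in your own measured numbers before drawing any conclusions.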
What's the specific task you're considering FT for? That's the most important factor in the choice.
Thanks for the explanation!
The use case is writing marketing communications to match a library of content that a company has already written.
We’re currently using RAG and it’s okay, but I’m wondering how much better it would be if it were tuned.