this post was submitted on 28 Apr 2026
Free Open-Source Artificial Intelligence
Yeah, done two separate things in this space.
Cover letter fine-tuning:
Llama-3.2-3B-Instruct as the base, QLoRA via Unsloth (rank 16, 10 epochs). Trained on ~62 of my own cover letters, exported to GGUF, loaded into Ollama. Fits comfortably on 8GB VRAM with 4-bit quantisation. Noticeably more consistent than prompting a generic model for voice and style matching.
Email classification:
Completely different story. Classifier models for routing emails into categories (rejection, interview scheduled, offer, etc.) don't need a GPU at all. DeBERTa-small runs on CPU in milliseconds. The hard part is the labeling pipeline. We bootstrapped with deterministic heuristics to auto-label high-confidence cases, then routed uncertain ones to a human review queue. Around 2,000 labeled examples was enough for meaningful accuracy.
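A minimal sketch of that bootstrapping step, to make the idea concrete. Everything named here is an illustrative assumption: the `RULES` patterns, the category keys, and the function names are placeholders, not the actual heuristics. The idea is to auto-label only when exactly one category matches and send everything else to the review queue.

```python
import re
from dataclasses import dataclass, field

# Hypothetical keyword rules; the real heuristics are not described above.
RULES = {
    "rejection": [r"\bunfortunately\b", r"\bother candidates\b"],
    "interview_scheduled": [r"\bschedule (?:an? )?(?:interview|call)\b"],
    "offer": [r"\bpleased to offer\b", r"\boffer letter\b"],
}


@dataclass
class LabelingResult:
    labeled: list = field(default_factory=list)       # (text, label) pairs
    review_queue: list = field(default_factory=list)  # texts needing a human


def heuristic_label(text):
    """Return a label only when exactly one category matches, else None."""
    lowered = text.lower()
    hits = [
        label
        for label, patterns in RULES.items()
        if any(re.search(p, lowered) for p in patterns)
    ]
    return hits[0] if len(hits) == 1 else None


def bootstrap_labels(emails):
    """Auto-label unambiguous emails; route the rest to human review."""
    result = LabelingResult()
    for text in emails:
        label = heuristic_label(text)
        if label is None:
            result.review_queue.append(text)
        else:
            result.labeled.append((text, label))
    return result


emails = [
    "Unfortunately, we will not be moving forward with your application.",
    "We are pleased to offer you the position.",
    "Thanks for applying, we'll be in touch.",
    "Unfortunately the offer letter was delayed.",  # matches two categories
]
result = bootstrap_labels(emails)
```

Here the first two emails get auto-labeled, while the third (no match) and the fourth (conflicting matches) land in the review queue. Treating any conflict as "uncertain" keeps the auto-labeled set conservative, at the cost of more manual review.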
vs RAG: for classification, fine-tuning wins cleanly. RAG is better when you need to reason over retrieved documents. If you're making a consistent categorical judgment, you want it baked into the weights, not reconstructed from context at inference time.
I build local-first process pipeline tooling at circuitforge.tech
Oh that’s really interesting! I’m also interested in the classification case. Can you tell me more, or direct me to where I can learn more about DeBERTa? Do you train it the same way, with prompt and response sets? Does it work on any open source model? I can only run up to 4B right now.