this post was submitted on 11 Mar 2025
Community Promo
I use a fine-tuned T5 summarisation model that is relatively accurate. It has some minor issues with occasionally misassigning quotes, but it doesn't hallucinate the way a traditional GPT-style model does. Its summaries are about 60% identical to human-written ones and >95% accurate in terms of meaning. It is more accurate than traditional non-AI summarisation tools (I'm not sure how it compares to a human), and I believe it is as accurate and unbiased as possible.
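For anyone wondering what a figure like "60% identical" could mean in practice, here is a minimal sketch of one common way to score word overlap between a model summary and a human one (a ROUGE-1-style unigram overlap). This is an assumption for illustration only: the comment does not say how its numbers were measured, and the names `tokenize` and `unigram_overlap` are hypothetical.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def unigram_overlap(candidate: str, reference: str) -> dict[str, float]:
    """Return ROUGE-1-style precision, recall and F1 between two summaries.

    Illustrative only: not necessarily the metric behind the figures above.
    """
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    # Counter intersection keeps the minimum count of each shared token.
    shared = sum((cand & ref).values())
    if shared == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    precision = shared / sum(cand.values())
    recall = shared / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up example sentences, not real bot output.
    human = "The council approved the new transit budget on Monday."
    model = "On Monday the council approved a transit budget."
    print(unigram_overlap(model, human))
```

Word overlap only measures shared wording, so a summary can score low here and still be accurate in meaning, which would fit the gap between the 60% and >95% figures.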
Its biggest flaw is actually the traditional non-AI web scraper, which sometimes pulls the wrong content. It's all FOSS, so if you want to open a pull request to improve it, that would be greatly appreciated.
EDIT: I've been experimenting with having a traditional GPT-style LLM look over the summary and the original to catch these errors, but I have had little to no success without using large models, which I cannot run on my local hardware (unfortunately, I can't afford to pay for inference at the scale my bot runs).
Thanks for the explanation. I think if you combined that with a method to retract or edit summaries based on human reports, you can probably fill in the remaining 5%. I am unsure how feasible that would be though. Good luck with the community!
Yeah, I'm not sure how that can be achieved in a way where a single report can catch errors without letting every single user mess with it. I could perhaps expose the section breakdown to users and allow them to regenerate specific sections, but that would require much more complex interaction. Thanks for the suggestion though, I'll look into it.
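One possible middle ground between "a single report changes things" and "every user can mess with it" is to regenerate a section only after several distinct users have reported it. The sketch below is purely illustrative: the `SectionReports` class, its `threshold` value, and the string user IDs are hypothetical and are not part of the bot's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class SectionReports:
    """Tracks which users have reported each section of one summary."""

    threshold: int = 3  # hypothetical number of distinct reporters required
    reporters: dict[int, set[str]] = field(default_factory=dict)

    def report(self, section: int, user_id: str) -> bool:
        """Record a report; return True when the section should be regenerated."""
        users = self.reporters.setdefault(section, set())
        users.add(user_id)  # a set ignores repeat reports from the same user
        if len(users) >= self.threshold:
            users.clear()  # reset so the regenerated section starts fresh
            return True
        return False


if __name__ == "__main__":
    reports = SectionReports(threshold=2)
    print(reports.report(0, "alice"))  # first report: below threshold
    print(reports.report(0, "alice"))  # same user again: still below threshold
    print(reports.report(0, "bob"))    # second distinct user: regenerate
```

Counting distinct reporters rather than raw reports keeps one person from triggering a regeneration alone, though it would still need some protection against throwaway accounts.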