this post was submitted on 10 Aug 2025

DIY Electronics and Hardware

rbn@sopuli.xyz 11 points 2 months ago* (last edited 2 months ago)

For converting your spoken words into text, it taps into OpenAI’s Whisper model, an automatic speech recognition system renowned for its accuracy and ability to handle various accents and background noise.

Have the hardware requirements of Whisper dropped significantly over the last few months? I played around with it in the context of Home Assistant's Year of the Voice. Despite using a (4-year-old) ThinkPad with 32 GB of RAM and a 4-core (8-thread) i7, the accuracy and performance of Whisper were still not at a point where I'd use it productively.

A rather simple sentence like 'turn the light in the living room on' worked in maybe 70% of cases when I sat right next to the microphone with no background noise. With music playing in the background or other people talking at the same time, it dropped to ~25% accuracy.
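
For anyone who wants to put firmer numbers on percentages like these, here is a minimal sketch of how a per-command hit rate could be tallied. It assumes the transcripts were collected separately (for example, copied out of the speech-to-text log after each attempt). The names `normalize` and `hit_rate` and the sample transcripts are made up for illustration; they are not part of Whisper or Home Assistant, and the samples are not real measurements.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", "", text.lower())
    return " ".join(text.split())


def hit_rate(expected: str, transcripts: list[str]) -> float:
    """Fraction of transcripts that match the expected command after normalization."""
    if not transcripts:
        return 0.0
    target = normalize(expected)
    hits = sum(normalize(t) == target for t in transcripts)
    return hits / len(transcripts)


# Hypothetical transcripts, for illustration only.
samples = [
    "Turn the light in the living room on.",
    "turn the light in the living room on",
    "Turn the flight in the living room on",
    "Turn the light in the living room, on!",
]

rate = hit_rate("turn the light in the living room on", samples)
print(f"{rate:.0%} of attempts recognized")
```

Exact matching after normalization is deliberately strict: a single wrong word counts as a miss, which is roughly how a voice command behaves in practice. A word error rate would be a gentler metric, but it says less about whether the light actually turns on.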

If it now runs just fine on a Raspberry Pi Zero, that would be a massive improvement!