I've used Sayboard and Whisper, and liked both of them. Sayboard is faster, Whisper handles punctuation better.
If you don't have a real-time requirement and only a weak CPU/GPU, I can recommend whisper.cpp: https://github.com/ggerganov/whisper.cpp
It is quite fast and can transcribe with timestamps. I'd guess that with a fast CPU/GPU, transcription can happen faster than real time.
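Since the timestamped output is one of the nice things about Whisper-style tools, here's a minimal sketch of turning segments (start, end, text) into SRT subtitles. The tuple layout is an assumption for illustration; whisper.cpp can emit SRT itself, and the actual segment format depends on the tool and flags you use.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments):
    """Turn an iterable of (start, end, text) tuples into an SRT string."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, 1):
        blocks.append(
            f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(blocks)

# Hypothetical demo segments, not real transcription output:
demo = [(0.0, 2.5, "Hello there."), (2.5, 5.0, "Testing timestamps.")]
print(segments_to_srt(demo))
```

Handy if you want to post-process transcripts into subtitles for a video.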
Not sure about your specific needs, but I saw this recently:
Requires X11; it doesn't work well on Wayland.
I've never used this, but what an interesting question. KDE Connect seems to be able to input text from voice.
True, you could use its text input together with a voice input app on the phone, like FUTO Voice Input.
Well, I tinkered around a bit with Speech Note, which has a good amount of features and is easy to install as a Flatpak. I think it has an option to do this, but it requires a bit of fiddling, an extra tool, and extra permissions for the Flatpak. I didn't find any software with particularly good integration into the desktop, though.
I also read about Blahst but haven't tried it yet. Maybe that one is an option.
I use Talon Voice
https://github.com/Manish7093/IBus-Speech-To-Text
I tried this on Fedora/Wayland previously, and it seems to work in most applications. It uses VOSK models, which the GUI can download automatically: you just pick your language and desired model size when setting it up.
When I was exploring this a few months ago, I noticed that speech recognition models have advanced a lot recently (e.g. https://github.com/openai/whisper, which can be run locally), but I didn't see anything integrating them into an input method like the above.