#voicetyping

Debby ‬⁂📎🐧:disability_flag:debby@hear-me.social
2025-11-21

🗣️🎤📝 :linux: Speech to Text and Text to Speech on GNU/Linux :disability_flag: 📝🔊💻

Why This Matters to Me (and Maybe You Too)

If you’re anything like me, a Linux user who counts on voice typing and TTS because of visual impairment, you know that accessibility is not a luxury, it’s a necessity. Speaking from experience, the quest for a seamless, local, FLOSS speech-to-text (STT) setup on Linux can be frustrating.
Here’s how you can succeed with modern tools using Linux. FLOSS means freedom and privacy; working locally means real control.
Let’s dive in! I’ll tell you what I’ve learned and what I use—and hope you’ll share your favorite tools or tips!

System-Wide Voice Keyboard: Speak Directly in Any App

Want to speak and have your words typed wherever your cursor is—be it a terminal, browser, chat, or IDE? Here’s what actually works and how it feels day-to-day:

- Speak to AI (Offline, Whisper-based, global hotkeys)
This tool is my current go-to. It uses Whisper locally, lets you use global hotkeys (configurable) to type into any focused window, and doesn’t need internet. Runs smoothly on X11 and Wayland; just takes a bit of setup (AppImage available!).
GitHub Repo github.com/AshBuk/speak-to-ai | Dev.to Post dev.to/ashbuk/i-built-an-offli

- DIY: RealtimeSTT + PyAutoGUI
For the true tinkerers, RealtimeSTT plus a Python script lets you simulate keystrokes. You control every step, can lower latency with your tweaks, but you’ll need to be comfortable with scripting.
RealtimeSTT Guide github.com/KoljaB/RealtimeSTT#

- Handy (Free/Libre, offline, Whisper-based, acts as a keyboard)
I’ve read lots of positive feedback on Handy—even though I haven’t tried it myself. The workflow is simple: press a hotkey, speak, and Handy pastes your text in the active app. It’s fully offline, works on X11 and Wayland, and gets strong accuracy thanks to Whisper.
Heads up: Handy lets you pick your own shortcut for start/stop recording, but it takes over that key combination while it runs. That means it can clash with other tools that depend on the same shortcut combos, including Orca’s custom keybindings if you use a screen reader. If your workflow relies on certain shortcuts, plan your key choice carefully before you commit.
GitHub Repo github.com/cjpais/Handy | Demo handy.computer
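
For the DIY RealtimeSTT + PyAutoGUI route above, here is a minimal sketch of what such a script might look like. It assumes the `RealtimeSTT` and `pyautogui` packages are installed; the helper names `clean_phrase` and `dictate` are made up for illustration, and the `tiny` model and `en` language are only example settings. PyAutoGUI's simulated keystrokes may not work under Wayland, so treat this as a starting point for an X11 session rather than a finished tool.

```python
"""Minimal dictation loop: speech in, simulated keystrokes out (a sketch)."""


def clean_phrase(text):
    """Collapse whitespace and add a trailing space so phrases don't run together."""
    words = text.split()
    return " ".join(words) + " " if words else ""


def dictate(recorder, type_text, max_phrases=None):
    """Pass each transcribed phrase to type_text; return how many were typed.

    recorder must expose a blocking .text() method that returns one phrase.
    max_phrases limits how many phrases are listened for (None = run forever).
    """
    heard = 0
    typed = 0
    while max_phrases is None or heard < max_phrases:
        phrase = clean_phrase(recorder.text())
        heard += 1
        if phrase:  # skip empty transcriptions (silence, noise)
            type_text(phrase)
            typed += 1
    return typed


if __name__ == "__main__":
    # Imported here so the helpers above can be tried without audio hardware.
    from RealtimeSTT import AudioToTextRecorder
    import pyautogui

    # Example settings only; larger models are more accurate but slower.
    recorder = AudioToTextRecorder(model="tiny", language="en")
    print("Speak now; press Ctrl+C to stop.")
    try:
        # Types into whichever window currently has keyboard focus.
        dictate(recorder, lambda s: pyautogui.write(s, interval=0.01))
    except KeyboardInterrupt:
        recorder.shutdown()
```

Keeping the recorder and the typing function as parameters means you can swap in a different keystroke tool without touching the loop.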

Real-Time Transcription in a Window (Copy/Paste Workflow)

If you’re okay with speaking into a dedicated app, then copying, these options offer great GUIs and power features:

- Speech Note by @mkiol mastodon.social/@mkiol
FLOSS, offline, multi-language GUI app—perfect for quick notes and batch transcription. Not a system-wide keyboard, but super easy to use and works on both desktops and Linux phones.
Flathub flathub.org/apps/net.mkiol.Spe | LinuxPhoneApps linuxphoneapps.org/apps/net.mk

- WhisperLive (by Collabora)
Real-time transcription in a terminal or window—great for meetings, lectures, and captions. Manual copy/paste required to get the text to other apps.
GitHub Repo github.com/collabora/WhisperLi

More Tools for Tinkerers

If you like building your own or want extra control, check out:
- Vosk: Lightweight, lots of language support. Website alphacephei.com/vosk/
- Kaldi: Powerful, best for custom setups. Website kaldi-asr.org/
- Simon: Voice control automation. Website simon-listens.org/
- voice2json: Phrase-level and command recognition. GitHub github.com/synesthesiam/voice2

Pro Tips

- Desktop Environment: X11 vs. Wayland affects how keyboard hooks and app focus actually operate.
- Ready-Made vs. DIY: If you want plug-and-play, try Speech Note or Handy first. Into automation or customization? RealtimeSTT is perfect.
- Follow the Community: @thorstenvoice offers tons of open-source voice tech insights.
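
Since the X11 vs. Wayland tip above decides which tools will work for you, here is a small sketch for checking which one your session uses. It relies on the standard `XDG_SESSION_TYPE`, `WAYLAND_DISPLAY`, and `DISPLAY` environment variables; the function name `session_type` is just an illustrative choice, and the fallback order is an assumption, not an official rule.

```python
import os


def session_type(env=None):
    """Return "wayland", "x11", or "unknown" based on environment variables."""
    env = os.environ if env is None else env
    declared = env.get("XDG_SESSION_TYPE", "").strip().lower()
    if declared in ("wayland", "x11"):
        return declared
    # Fall back to display variables when the session type is missing or
    # something else (for example "tty").
    if env.get("WAYLAND_DISPLAY"):
        return "wayland"
    if env.get("DISPLAY"):
        return "x11"
    return "unknown"


if __name__ == "__main__":
    print(session_type())
```

Run it from a terminal inside your desktop session; over SSH or in a bare console it will likely report "unknown".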

Screen Reader Integration

Looking for robust screen reader support? Linux has you covered:

- Orca (GNOME/MATE): The most customizable GUI screen reader out there. The default voice (eSpeak) is robotic, but you can swap it for something better and fine-tune verbosity so it reads only what matters.
- Speakup: Console-based, ideal for the terminal.
- Emacspeak: The solution for Emacs fans.

💡 Orca is part of my daily toolkit. It took time to get the settings just right (especially verbosity!) but it’s absolutely worth it. If you use a screen reader—what setup makes it bearable or even enjoyable for you?

Final Thoughts

If you’re starting from scratch, try Handy for direct typing (just watch those shortcuts if you use a screen reader!) or Speech Note for GUI-based transcription. Both are privacy-friendly, local, and accessible—ideal for everyday Linux use.

Is there a FLOSS gem missing here?
Sharing what works (and what doesn’t!) helps the entire community.

Resources:
Speech Note on Flathub flathub.org/apps/net.mkiol.Spe
Handy GitHub github.com/cjpais/Handy
Speak to AI Guide dev.to/ashbuk/i-built-an-offli
RealtimeSTT github.com/KoljaB/RealtimeSTT

#Linux #SpeechToText #FLOSS #Accessibility #VoiceKeyboard #ScreenReader #Whisper #Handy #SpeechNote #OpenSource #Community #voicetyping #LocalSTT #TTStools #SpeechRecognition #A11y #Linuxtools #Voicekeyboard #Whisper #Handy #speech-to-text #SpeechNote #review #ScreenReaders #ORCA #FOSS

Speech to Text and Text to Speech on GNU/Linux 
A diagram showing the flow of speech to text and back to speech on GNU/Linux, with microphone, text document, and Linux logo icons illustrating open-source voice tools flexibility. 

Quick Comparison Table:
 Which One Should You Try First?
| Use Case           | Best Tool                  | Notes                            |
|--------------------|----------------------------|----------------------------------|
| System-wide typing | Handy or Speak to AI       | Acts like a keyboard in any app. |
| Real-time window   | Speech Note or WhisperLive | Copy/paste workflow.             |
| DIY flexibility    | RealtimeSTT + PyAutoGUI    | For those who love scripting.    |
Mide mikemikeemikeee
2025-09-26

Turn your words into text with the PenPower AI VoiceWriter - speech-to-text, AI proofreading, rewriting, etc. Speak, edit, translate… all through your PC/Mac setup. Order now:

aimartz.com/product/penpower-a

2025-08-29

Voice Typing Studio (VTS) is an open-source app that replaces macOS's built-in dictation, aimed at developers and power users. It lets you enter text by voice anywhere without switching apps, using the Deepgram, OpenAI, or Groq APIs.
#opensource #macos #voicetyping #manguonmo #docchinhta

reddit.com/r/SideProject/comme

2025-06-19

Lilbits: Hacking the Humane Ai Pin, Liberux NEXX Linux phone, and swearing at your Windows PC

The Humane Ai Pin was supposed to be a wearable device that allowed you to interact with an AI assistant throughout the day without using your phone or computer. But when it hit the streets last year it was widely panned as an overpriced, underpowered device that largely failed to deliver on its promise.

So it wasn’t a huge surprise that when HP acquired the company this year, it shut down […]

#accessibility #ai #aiPin #aokzoeA1x #humane #humaneAiPin #liberuxNexx #lilbits #linuxSmartphones #openSource #penumbraos #profanityFilter #Steam #steamClientBeta #voiceTyping #windows11

Read more: liliputing.com/lilbits-hacking

2025-06-19

Microsoft adds a profanity filter setting for voice typing in the latest Windows 11 test builds - now you can turn it off and you can swear at your computer (and have it transcribe your foul language). blogs.windows.com/windows-insi #Windows #Windows11 #Microsoft #VoiceTyping #ProfanityFilter

unfa🇺🇦unfa
2025-05-31

If you're using an android phone you need this:

keyboard.futo.org/
youtube.com/watch?v=cFP5bp3JvaU

I have been on the lookout for a sensible Gboard replacement that wasn't making my (voice) typing experience painful, and so far only FUTO Keyboard managed to provide that.

It has really good offline voice typing as well, which is something I use a lot.

I cannot recommend this enough!

Farooq | فاروق [Master Patata]farooqkz@cr8r.gg
2025-05-15

So I finally got some time to experiment with #VoiceTyping for #Luanti. But then I realized both main #MinetestCTF server and #JMA have disabled CSMs.

I asked Nanowolf to enable it and he agreed. LandarVargan hasn't given a reply yet.

#minetestcapturetheflag #foss #fossgaming #gaming #opensource #opensourcegaming

Joplin App Notes: Voice Typing and OCR

Joplin is multiplatform free software for taking and organising notes in Markdown: we can link notes, attach audio recordings, drawings, files, or photos to each note, and also save a URL, website, or part of it using the Web Clipper (browser extension).

Recent versions of the mobile app (not the desktop app) also support what is called Voice typing, which means you can talk to the app and it transforms what you say into text. I believe it only works with English (I tried to speak in Portuguese and it was translated to English text), but I’m finding it quite useful. You need to create a new note, click on the three dots in the top right corner and choose Voice typing.

On the desktop app (not the mobile app), another nice feature is the ability to extract text from images or PDFs. Just right-click on the file and choose View OCR text. You need to make sure it is enabled in the settings.

#FLOSS #Joplin #MobileApp #NoteTaking #NotesApp #OCR #Organization #Technology #VoiceTyping

PUPUWEB Blogpupuweb
2025-04-30

​Windows 11 now lets you disable the voice typing profanity filter—no more asterisks censoring your words. Press Win+H, click the gear icon, and toggle "Filter profanity" off. Available in Insider builds 26120.3941 and 26200.5570.

pupuweb.com/how-can-you-easily

#Windows11 #VoiceTyping
2025-04-25

Lilbits: Recall, AI-enhanced search, and Click to Do are rolling out for Copilot+ PCs, Microsoft also previews support for swearing while using voice typing

Microsoft’s Copilot+ PC platform was predicated on the idea that Windows computers with processors that have newfangled neural processing units would be able to do all sorts of nifty things with AI. But the most impressive features that Microsoft promised have been slow to arrive… after the company faced backlash over the potential privacy and security implications.

Now Microsoft is […]

#clickToDo #cosmic #lilbits #microsoft #profanityFilter #recall #retroid #swearing #tariffs #voiceTyping #windows #windowsMaps #windowsSearch

Read more: liliputing.com/lilbits-recall-

2025-04-25

Microsoft is adding a "filter profanity" toggle to Windows voice typing. This lets you DISABLE the filter that is in place by default so that your curse words are spelled out without any * symbols. Apparently this feature was based on "top customer feedback." blogs.windows.com/windows-insi #VoiceTyping #Windows #Microsoft #Profanity

unfa🇺🇦unfa
2024-07-01

You may recall, I was trying to find a -respecting keyboard for Android that could work with FUTO Voice Input.

The best I could figure out was .
But suddenly people who made Voice Input published a keyboard...
And it's all I could ever ask for, plus the voice recognition is a couple times faster! It's INSANE.

The software isn't exactly , it's what I'd call "fair software" (referencing ).

Keyboard.FUTO.org

kurtshkurtsh
2023-12-01

One of my favorite Windows 10/11 keyboard shortcuts remains Win-H for "voice typing" or voice-to-text recognition. Easy to access, super accurate & built into the OS.

Here's the list of special phrases to say for punctuation & voice commands:

support.microsoft.com/en-us/wi

Hi TechonolgyHitechonolgy
2023-11-06

Exciting news for Pixel fans! The Pixel 7 brings voice typing to a whole new level. 🎙️✍️ Get a sneak peek of what's in store for the Pixel 8 innovations. Stay tuned for more at hitechonolgy.com/2023/10/pixel

Paul O'Malleypaulomalley@c.im
2023-07-22

Did you know that you can use voice typing in Google Docs to create amazing content without typing a single word? 🗣️

Voice typing is an excellent feature that allows you to dictate your thoughts and ideas to Google Docs, and it automatically transcribes them for you. It's perfect for people who have difficulty typing, want to save time, or just prefer to speak rather than type. 🙌

You just need a microphone, a Google account, and a browser. You can also edit and format your document with voice commands, such as "bold that" or "insert bullet list". 🚀

Voice typing is not only a convenient tool, but also an accessible one. It can help people with disabilities, injuries, or other limitations to express themselves and create content. 🌎

What do you think about voice typing in Google Docs? Have you tried it before? Do you have any questions or feedback for me? Let me know below. I'd love to hear from you. 💬
youtu.be/WgANbfKi7Mk

#VoiceTyping #GoogleDocs #Accessibility #GoogleWorkspace #A11y #YouTube #Boost

YouTube Thumbnail Image featuring the Google Docs Logo and the caption "Type with Your Voice!"
Matt DainesMattDaines
2023-06-25

Any keyboards that support better voice typing in any app?

Something like the OneNote app voice typing where you can amend punctuation without the microphone disabling?
