Roadmap
7 May 2026
Manuel
The curated picture — not a backlog dump. Below: what's actually shipped (the receipts), what I'm building right now, where attention drifts next, and a short list of things I deliberately won't build.
read first
Directions, not promises.
Dates are quarter-grained. Priorities shift. A few of these will change before they ship — that's honest. Snapshot, not contract.
Built
The receipts — what's already in the desktop app and working for the dev cohort today.
—
built
Private-by-architecture pipeline
Audio captured, transcribed and structured entirely on your machine. Whisper runs on your GPU; the optional structuring LLM runs on your GPU. The server never sees voice or transcripts — by design, not by policy.
—
built
Thought-to-Structure
After Whisper produces a raw transcript, an optional on-device LLM reshapes it into prose, bullets or Markdown. One hotkey, your GPU, zero round-trips. Toggle per recording or set a default.
v0.2.8
built
Linux + macOS, native packages
One-line installer for Debian, Ubuntu, Mint, Pop!_OS, Fedora, RHEL, Rocky, AlmaLinux, Arch, CachyOS, EndeavourOS, Manjaro and close derivatives. Signed universal .dmg for Apple Silicon and Intel Macs.
—
built
Ten transcription languages
English, German, French, Spanish, Italian, Portuguese, Dutch, Japanese, Chinese and Korean — plus auto-detect. Structured output is sharpest in English and German today; the rest follow.
—
built
Bluetooth + AirPods, no profile dance
Audio capture goes through a modern browser audio interface inside the overlay, which negotiates the Bluetooth audio profile cleanly. AirPods and other BT headsets work on Linux and macOS without manual profile switching, and survive long recordings or mid-session device changes.
v0.2.9
built
In-app auto-update
Update banner with one-click Download & Install: the app fetches the right package for your distro, shows a clear confirmation (signature check, exact install command), installs under your sudo password and relaunches.
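As a rough illustration of "the right package for your distro", here is a minimal sketch of mapping os-release content to a package format. It is not VocaPulse's updater code: `parse_os_release`, `package_format` and the `FAMILY_TO_FORMAT` table are hypothetical names, and the only thing assumed about the system is the standard `ID` / `ID_LIKE` fields of `/etc/os-release`.

```python
from typing import Dict, Optional

# Hypothetical mapping from distro family to package format (assumption).
FAMILY_TO_FORMAT = {
    "debian": "deb",
    "ubuntu": "deb",
    "fedora": "rpm",
    "rhel": "rpm",
    "arch": "pacman",
}


def parse_os_release(text: str) -> Dict[str, str]:
    """Parse KEY=value lines of os-release content into a dict."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def package_format(os_release_text: str) -> Optional[str]:
    """Return 'deb', 'rpm' or 'pacman', or None for an unknown distro."""
    fields = parse_os_release(os_release_text)
    # Try the distro's own ID first, then the families it declares in ID_LIKE.
    candidates = [fields.get("ID", "")] + fields.get("ID_LIKE", "").split()
    for name in candidates:
        fmt = FAMILY_TO_FORMAT.get(name.lower())
        if fmt is not None:
            return fmt
    return None
```

Falling back to `ID_LIKE` is what lets close derivatives resolve without each one being listed by name; an unknown distro returns `None` so the caller can decline to install rather than guess.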
v0.2.12
built
Adaptive Bluetooth music resume
After auto-pausing for a Bluetooth dictation, VocaPulse watches PipeWire for the headset to settle back to A2DP and fires the resume signal the moment it's ready — instead of waiting a fixed 2 seconds. Music comes back when it's ready, not on a timer.
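The difference between a fixed delay and an adaptive wait can be sketched as a small poll-until-ready loop. This is an illustration under stated assumptions, not the shipped implementation: `wait_until_ready`, the `is_ready` callback and the timing values are hypothetical, and a real version would more likely react to PipeWire events than poll.

```python
import time
from typing import Callable


def wait_until_ready(
    is_ready: Callable[[], bool],
    timeout: float = 2.0,
    interval: float = 0.05,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True as soon as is_ready() holds, or False once timeout elapses.

    is_ready is a hypothetical callback, e.g. "the headset reports A2DP again".
    """
    deadline = clock() + timeout
    while True:
        if is_ready():
            return True  # resume immediately, no fixed delay
        if clock() >= deadline:
            return False  # give up; the caller decides what to do next
        sleep(interval)
```

The timeout keeps the old worst case as an upper bound, while the common case returns as soon as the check passes.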
Building now
Three active threads. ETAs are quarter-grained on purpose.
vocapulse.app — public launch
Q3 / Q4 2026
Pre-launch waitlist running now. Beta opens to the first 25 testers in Q3 2026 (free). Early access opens right after at €7.99/month, locked for 12 months. Public launch lands in Q3 / Q4 2026 at €9.99/month.
Multi-device routing (LAN-only, encrypted)
v1.x · post-launch
Speak into the laptop you carry; have the structured output land in the agent on the desktop you're sitting at. Peer-to-peer over your local network, end-to-end encrypted, no cloud relay. The marquee v1.x feature.
Windows support
exploration
Linux and macOS both ship today; Windows follows. Same private-by-architecture pipeline; the constraint is engineering parity for the audio capture and overlay layers, not philosophy.
On the radar
Directions I lean toward. No dates here. If a card moves up, it gets a real ETA.
Persona-based structuring
Per-context structuring templates — Coder, Designer, Writer, Reviewer — that tune the on-device LLM's reshape pass for the kind of work you're doing. Today's three formats already cover most cases; personas would sharpen the long tail.
Browser extension
Cleaner paste behaviour in web tools that don't play well with synthetic keystroke input. Privacy posture stays the same — no audio ever routed through the browser.
Higher-accuracy transcription tier
A larger, slower model for users who care more about every-word accuracy than millisecond latency. Opt-in, per-recording or per-device. Still local, still on your GPU.
Custom vocabulary + dictionaries
Teach VocaPulse the names, acronyms and domain terms you actually use, so transcripts stop mangling them. Per-account, per-device, never sent anywhere.
Flatpak distribution
Native packages cover most of Linux today; Flatpak is in development for distros where it's the preferred channel. Sandboxing constraints around audio capture are the engineering cost.
and a short list of nos
Not building
Asked about regularly, deliberately not on the roadmap. Saying no with reasons is part of the deal — and it's how you can tell the rest of this page is honest.
AppImage distribution
no
Asked occasionally, intentionally not built.
why: AppImage's runtime can't reliably access the Linux audio subsystem we depend on for low-latency capture. Native .deb / .rpm / pacman packages cover the same distros without that constraint, and Flatpak is in development for the rest.
Cloud transcription fallback
no
Asked occasionally, won't be built.
why: The whole product is the architecture: audio never leaves your machine. A cloud fallback — even opt-in — would erode the structural privacy claim and create a path for accidental leakage. The honest answer is no, not even as an option.
Voice cloning / TTS
no
Different product, not on the roadmap.
why: VocaPulse is dictation. Voice synthesis and cloning are adjacent but separate problem spaces with their own privacy and ethical surfaces. Out of scope.