macOS + Windows alpha · local-first voice workspace

Turn messy voice notes into local action docs

SpeechToDo watches your local workspace folder, processes your recordings, and creates markdown transcripts, summaries, and tasks you actually own.

SpeechToDo desktop app with an inbox of voice notes and extracted to-dos.
Built around the way founders already capture
Local-first · File-native · Cloud from $6 · In-app checkout · iCloud-friendly

A local voice workspace
for Mac and Windows.

SpeechToDo is built for people who want recordings to become files they can keep, move, and script without a closed notes dashboard.

1 workspace

Your recordings start in your workspace.

Choose a local or iCloud-backed folder. SpeechToDo records there, watches for new audio, and keeps generated markdown beside the source.

  • Record straight into a folder you control.
  • Drop any audio file into the watched workspace.
  • Keep source audio and outputs as normal files.
  • audio lecture-raw.m4a
  • md lecture-notes.md
  • md lecture-todos.md
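
Because outputs sit beside the source audio, a script can work out where to look for them. The snippet below is a minimal Python sketch of that idea, not SpeechToDo's actual logic: the naming rule (drop a trailing "-raw", then append "-notes" or "-todos") is an assumption inferred only from the example listing above, and `output_paths` is a hypothetical helper name.

```python
from pathlib import Path


def output_paths(audio: Path) -> dict[str, Path]:
    """Derive sibling markdown paths for a recording.

    Assumed naming: drop a trailing "-raw" from the file stem, then
    append "-notes" / "-todos". Inferred from the example listing only.
    """
    stem = audio.stem.removesuffix("-raw")
    return {
        # with_name keeps the same parent folder, so outputs stay beside the audio.
        "notes": audio.with_name(f"{stem}-notes.md"),
        "todos": audio.with_name(f"{stem}-todos.md"),
    }
```

Calling `output_paths(Path("VoiceWorkspace/lecture-raw.m4a"))` yields paths in the same folder as the recording, which is the property that keeps the workspace easy to move or sync.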
No meeting bot

Capture meetings and lectures without inviting a bot.

Record system audio from calls, classes, videos, and apps. Useful when you are listening with a headset and still want clean notes afterward.

  • No visible attendee added to the call.
  • Works with meetings, lectures, videos, and app audio.
  • Designed for the bot-fatigued note-taking workflow.
Zoom · Meet · Lecture · Video
system-audio.m4a · Captured locally
Workflow + CLI

Built for workflows, not another notes silo.

Use workflow tags, templates, and the CLI to route recordings into the right output. SpeechToDo fits into local files and agent workflows instead of forcing every step through a closed dashboard.

  • Tags can auto-pick the right template.
  • Custom templates shape outputs for each use case.
  • CLI support keeps the workflow scriptable and agent-native.
$ speechtodo watch ~/VoiceWorkspace
$ speechtodo process lecture-raw.m4a --tag '#lecture'
template: lecture-notes
created: lecture-notes.md
#meeting -> meeting-notes
#lecture -> study-notes
#customer-call -> follow-up
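
The tag-to-template examples above amount to a simple lookup. The following is a minimal Python sketch of that routing, not SpeechToDo's actual implementation: the three pairs are copied from the examples, while `TAG_TEMPLATES`, `pick_template`, and the "default" fallback template are hypothetical names used only for illustration.

```python
from typing import Iterable

# Tag -> template pairs copied from the examples above.
TAG_TEMPLATES = {
    "#meeting": "meeting-notes",
    "#lecture": "study-notes",
    "#customer-call": "follow-up",
}


def pick_template(tags: Iterable[str], default: str = "default") -> str:
    """Return the template for the first recognized tag, else the fallback.

    The "default" fallback name is an assumption for illustration.
    """
    for tag in tags:
        # Normalize so "#Meeting " and "#meeting" route the same way.
        template = TAG_TEMPLATES.get(tag.strip().lower())
        if template is not None:
            return template
    return default
```

Taking the first recognized tag keeps the behavior predictable when a recording carries several tags; a real router might instead rank tags or let a template be set explicitly.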
3 artifacts

Every recording becomes a clean artifact.

SpeechToDo turns voice notes into three practical outputs: clean notes, todos, and a share-ready artifact you can export or reuse.

  • Clean notes for sharing and review.
  • Todos separated from raw transcript noise.
  • One portable artifact ready for export or reuse.
notes meeting-notes.md
todos meeting-todos.md
export PDF / Markdown

Smaller details,
stronger workflows.

After capture, SpeechToDo keeps improving the artifacts: cleaner language, smarter tags, practical exports, and a path into the tools your team uses.

Refine

Teach SpeechToDo the words that matter.

Add names, product terms, acronyms, and repeated corrections to a vocabulary library so future notes sound closer to your real context.

Vocabulary library: names, apps, projects, acronyms
Smart tags

Let notes carry their own routing hints.

Smart AI tags help distinguish meetings, lectures, customer calls, ideas, and follow-ups without making every recording a setup chore.

  • tag #meeting
  • tag #lecture
  • tag #follow-up
PDF-ready

Shareable exports

Turn a cleaned note into something you can send after the call.

Coming soon

Notion upload

Keep capture local, then publish the finished artifact when it is ready.

  • Setup: Map your real capture flow
  • Templates: Tune the output shape
  • Review: Improve the workflow with use

Founder-led beta

Early users get help shaping SpeechToDo around real recordings.

Cloud path

Hosted AI without giving up your files.

Start with the hosted beta path for transcription and artifact generation, while SpeechToDo keeps the finished work in your local workspace.

Cloud from $6
Pro $16
Business $30/seat
Files

Questions before you hand voice work to a tool.

SpeechToDo is built for operators who want useful output without giving up their local files.

Where do my recordings live?

In your workspace. SpeechToDo watches the folder you choose and writes markdown outputs beside your source files.

What does each recording produce?

A transcript, a summary, and action items as editable markdown files you can keep, move, or reuse in your existing tools.

Is this only a hosted SaaS dashboard?

No. Cloud is the easiest default, but the product still centers on workspace-owned artifacts. Your recordings and markdown outputs stay in the folder you choose.

How does pricing work?

Cloud plans include hosted AI: Student is $6/month (20 hours), Pro is $16/month (60 hours), and Business is $30/month (120 hours). Choose the tier from the desktop Account panel.

What if I need more usage?

Move up a Cloud plan for a higher monthly limit. Business gives heavier workflows more hosted AI capacity.
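
To make the tier choice concrete, here is a minimal Python sketch that picks the lowest-priced plan covering a given monthly usage. The prices and limits are the ones listed above; treating each limit as inclusive hours per month is an assumption, and `PLANS` and `cheapest_plan` are hypothetical names, not part of the app.

```python
# (name, price in USD per month, included hours per month), as listed above.
PLANS = [
    ("Student", 6, 20),
    ("Pro", 16, 60),
    ("Business", 30, 120),
]


def cheapest_plan(hours_per_month: float):
    """Return the lowest-priced plan whose limit covers the usage, or None."""
    for name, price, limit in sorted(PLANS, key=lambda p: p[1]):
        # Assumption: usage exactly at the limit still fits the plan.
        if hours_per_month <= limit:
            return name
    return None
```

For example, about 45 hours a month lands on Pro, and anything above 120 hours returns `None` because no listed tier covers it.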

Built for the way
operators actually work.

"I capture decisions on a walk. By the time I'm at my desk, the markdown is sitting in the folder."

Solo founder · AI tooling

"I don't want my voice notes living inside someone else's dashboard. Files I own, full stop."

Operator · ops + automation

"Transcription was the easy part. Turning it into something I'd actually reuse — that's what was missing."

Indie hacker · shipping daily

Try the desktop alpha
with your own recordings.

Download the early Mac or Windows build, point it at a workspace you control, and turn recordings into markdown transcripts, summaries, and action docs.

Alpha release: SpeechToDo Alpha
  • Mac notarized build and unsigned Windows ZIP
  • Local workspace and markdown outputs
  • Feedback shapes the paid beta
Cloud plans

Choose a tier inside the desktop Account panel.

Create or sign in to a SpeechToDo account in the app, then pick the Cloud tier that matches your monthly recording volume. Checkout opens with your account attached.

Cloud checkout opens from the desktop app so your subscription attaches to the right account.

Markdown iCloud Gemini Obsidian Drive Notion soon MCP soon CLI