Local-first transcription

Transcription should produce files you own.

SpeechToDo starts from a simple premise: your voice notes should become portable markdown artifacts in your workspace, not another locked dashboard you have to remember to visit.

workspace/voice-artifacts.md
# Local-first voice workflow

## Source of truth
- Original audio file
- Transcript markdown
- Summary markdown
- Action document markdown

## Ownership model
Files live in the user's workspace first.

## Hosted help
- Used where the beta needs processing
- Kept separate from the file-native surface

Local-first is a workflow boundary.

It means the product is organized around files, folders, and user-owned artifacts before hosted dashboards or closed notes.

1. Your workspace stays central

SpeechToDo watches a folder you control and writes useful markdown artifacts beside the audio that created them.

2. Artifacts are portable

Transcript, summary, and action files can move into your editor, issue tracker, docs, Obsidian vault, or archive.

3. Hosted intelligence is optional

The beta may use hosted processing where needed, but the durable product surface remains your file system.
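
As a rough illustration of the first point above (artifacts written beside the audio that created them), here is a minimal Python sketch. It is not SpeechToDo's implementation: the file-name pattern, the list of audio extensions, and the helper names `artifact_paths` and `pending_audio` are all assumptions made for the example.

```python
from pathlib import Path

# Hypothetical naming scheme; SpeechToDo's real file names may differ.
ARTIFACT_KINDS = ("transcript", "summary", "actions")
# Assumed set of audio extensions, chosen only for illustration.
AUDIO_EXTENSIONS = {".m4a", ".mp3", ".wav"}


def artifact_paths(audio_file: Path) -> dict[str, Path]:
    """Return the markdown paths that would sit beside one audio file."""
    return {
        kind: audio_file.with_name(f"{audio_file.stem}.{kind}.md")
        for kind in ARTIFACT_KINDS
    }


def pending_audio(workspace: Path) -> list[Path]:
    """List audio files in the workspace that have no transcript file yet."""
    return sorted(
        path
        for path in workspace.rglob("*")
        if path.is_file()
        and path.suffix.lower() in AUDIO_EXTENSIONS
        and not artifact_paths(path)["transcript"].exists()
    )
```

Under these assumptions, a recording at `notes/standup.m4a` would get `notes/standup.transcript.md`, `notes/standup.summary.md`, and `notes/standup.actions.md` in the same folder, so moving or archiving the folder keeps audio and artifacts together.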

What we mean by private.

Privacy claims should be precise. SpeechToDo is early, so this page describes the workflow as it works today.

Original audio

Recordings start in the folder or synced workspace you choose. SpeechToDo treats that workspace as the durable source of truth.

Generated artifacts

Transcript, summary, decision, and task markdown are written back as editable files you can review, move, archive, or delete.

Hosted processing

The current beta may use hosted transcription or intelligence providers where needed to produce the requested artifact.

Policy surface

The privacy policy stays separate from positioning copy so beta users can review the current data handling terms directly.

Read the privacy policy

Current beta principles

  • Your workspace is the primary interface.
  • Generated markdown files are written back to your storage.
  • Hosted processing is used only where the current workflow needs it.
  • Setup is founder-led so the workflow can match real capture habits.

Claims we are not making yet

  • Fully offline transcription for every workflow.
  • Enterprise compliance controls.
  • Team-wide admin dashboards.
  • End-to-end encrypted hosted collaboration.

When local-first transcription fits.

This is a strong fit when voice capture is part of your operating system, not just a recording habit.

  • You want editable markdown instead of transcripts trapped in an app.
  • You care where source audio and generated artifacts live.
  • You use folders, notes, docs, scripts, or issues as your daily work surface.
  • You prefer a clear file workflow over a broad meeting-assistant dashboard.

Related workflow pages.

Local-first only matters when it makes the output easier to reuse.

Voice notes to markdown

See how SpeechToDo turns messy captured audio into transcript, summary, and action markdown.

Read the workflow page

Try the desktop alpha with your own recordings.

Download the early Mac or Windows build, point it at a workspace you control, and turn recordings into markdown transcripts, summaries, and action docs.

Alpha release: SpeechToDo Alpha

  • Notarized Mac build and unsigned Windows ZIP
  • Local workspace and markdown outputs
  • Feedback shapes the paid beta

Cloud plans

Choose a tier inside the desktop Account panel.

Create or sign in to a SpeechToDo account in the app, then pick the Cloud tier that matches your monthly recording volume. Checkout opens from the desktop app so your subscription attaches to the right account.

Markdown, iCloud, Gemini, Obsidian, Drive, Notion (soon), MCP (soon), CLI