transcribe
Transcribe audio from your microphone to text.
Usage
agent-cli transcribe [OPTIONS]
Description
This command:
Starts listening to your microphone immediately
Records your speech
When you press Ctrl+C, sends audio to a Whisper server
Copies the transcribed text to your clipboard
Optionally uses an LLM to clean up the transcript
Examples
# Basic transcription
agent-cli transcribe --input-device-index 1
# With LLM cleanup
agent-cli transcribe --input-device-index 1 --llm
# List available audio devices
agent-cli transcribe --list-devices
# Transcribe from a saved file
agent-cli transcribe --from-file recording.wav
# Re-transcribe most recent recording
agent-cli transcribe --last-recording 1
Options
Audio Configuration
| Option | Description |
| --- | --- |
| --input-device-index | Index of the audio input device |
| --input-device-name | Device name keywords for partial matching |
| --list-devices | List available audio devices and exit |
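Device indices can change when hardware is plugged in or removed, so matching by name may be more stable for scripts. The sketch below wraps the documented --input-device-name flag in a small shell function. The function name, the AGENT_CLI variable, and the "USB" keyword are illustrative assumptions, not part of agent-cli.

```shell
# Path to the binary; override if agent-cli is not on PATH (assumed variable name).
AGENT_CLI="${AGENT_CLI:-agent-cli}"

# Hypothetical helper: select the microphone by a name keyword instead of an index.
# $1 is a keyword expected to appear in the device name shown by --list-devices.
transcribe_with_mic() {
  "$AGENT_CLI" transcribe --input-device-name "$1"
}

# Example call (the keyword is a placeholder):
# transcribe_with_mic "USB"
```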
Audio Recovery
| Option | Description | Default |
| --- | --- | --- |
| --from-file PATH | Transcribe from a saved WAV file | - |
| --last-recording N | Transcribe Nth most recent recording (1=latest) | 0 |
| --save-recording / --no-save-recording | Save audio for recovery | true |
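Because recordings are saved by default, a failed or poor transcription can be retried without speaking again. A minimal sketch, assuming you want LLM cleanup on the retry; the function name and the AGENT_CLI variable are hypothetical, while the flags come from the tables in this page.

```shell
# Path to the binary; override if agent-cli is not on PATH (assumed variable name).
AGENT_CLI="${AGENT_CLI:-agent-cli}"

# Hypothetical helper: re-transcribe the Nth most recent saved recording
# with LLM cleanup enabled. N defaults to 1 (the latest recording).
retranscribe() {
  n="${1:-1}"
  "$AGENT_CLI" transcribe --last-recording "$n" --llm
}

# Example calls:
# retranscribe      # latest recording
# retranscribe 2    # the one before it
```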
Provider Selection
| Option | Description | Default |
| --- | --- | --- |
| --asr-provider | ASR provider: wyoming, openai | wyoming |
| --llm-provider | LLM provider: ollama, openai, gemini | ollama |
Wyoming (Local) Configuration
| Option | Description | Default |
| --- | --- | --- |
| --asr-wyoming-ip | Wyoming ASR server IP | localhost |
| --asr-wyoming-port | Wyoming ASR server port | 10300 |
OpenAI ASR Configuration
| Option | Description | Default |
| --- | --- | --- |
| --asr-openai-model | OpenAI ASR model | whisper-1 |
| --asr-openai-base-url | Custom Whisper server URL | - |
| --asr-openai-prompt | Custom prompt to guide transcription | - |
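To send audio to a self-hosted, OpenAI-compatible Whisper server instead of the default Wyoming server, the provider and base URL options can be combined. This is a sketch only: the function name, the AGENT_CLI variable, and the URL in the example call are assumptions, and the exact URL shape your server expects may differ.

```shell
# Path to the binary; override if agent-cli is not on PATH (assumed variable name).
AGENT_CLI="${AGENT_CLI:-agent-cli}"

# Hypothetical helper: $1 is the base URL of an OpenAI-compatible Whisper server.
transcribe_via_custom_whisper() {
  "$AGENT_CLI" transcribe --asr-provider openai --asr-openai-base-url "$1"
}

# Example call (placeholder URL, not a documented default):
# transcribe_via_custom_whisper "http://localhost:8000/v1"
```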
LLM (Ollama, local)
| Option | Description | Default |
| --- | --- | --- |
| --llm-ollama-model | Ollama model to use | gemma3:4b |
| --llm-ollama-host | Ollama server URL | http://localhost:11434 |
LLM (OpenAI)
| Option | Description | Default |
| --- | --- | --- |
| --llm-openai-model | OpenAI model to use | gpt-5-mini |
| --openai-api-key | OpenAI API key (or set OPENAI_API_KEY) | - |
| --openai-base-url | Custom OpenAI-compatible API URL (or set OPENAI_BASE_URL) | - |
LLM (Gemini)
| Option | Description | Default |
| --- | --- | --- |
| --llm-gemini-model | Gemini model to use | gemini-2.5-flash |
| --gemini-api-key | Gemini API key (or set GEMINI_API_KEY) | - |
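Passing the key through the GEMINI_API_KEY environment variable keeps it out of the command line and shell history. The sketch below checks that the variable is set before enabling Gemini cleanup. The function name and the AGENT_CLI variable are hypothetical; the up-front check is a convenience of this sketch, not documented agent-cli behavior.

```shell
# Path to the binary; override if agent-cli is not on PATH (assumed variable name).
AGENT_CLI="${AGENT_CLI:-agent-cli}"

# Hypothetical helper: transcribe with Gemini-based cleanup.
# Fails early if GEMINI_API_KEY is missing; extra arguments are passed through.
transcribe_with_gemini() {
  if [ -z "${GEMINI_API_KEY:-}" ]; then
    echo "GEMINI_API_KEY is not set" >&2
    return 1
  fi
  "$AGENT_CLI" transcribe --llm --llm-provider gemini "$@"
}

# Example call:
# transcribe_with_gemini --input-device-index 1
```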
LLM Cleanup
| Option | Description | Default |
| --- | --- | --- |
| --llm / --no-llm | Use LLM to process transcript | false |
| --extra-instructions | Additional LLM instructions | - |
Process Management
| Option | Description |
| --- | --- |
| --stop | Stop running background process |
| --status | Check if background process is running |
| --toggle | Toggle background process on/off |
General Options
| Option | Description | Default |
| --- | --- | --- |
| --clipboard / --no-clipboard | Copy result to clipboard | true |
| --transcription-log PATH | Log transcriptions with timestamps | - |
| --log-level | Set logging level | WARNING |
| --log-file PATH | Path to a file to write logs to | - |
| --quiet, -q | Suppress console output | false |
| --config PATH | Path to a TOML configuration file | - |
| --print-args | Print resolved arguments including config values | false |
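When options come from both a configuration file and the command line, --print-args can help confirm what the command will actually use. A minimal sketch, assuming a config file already exists at the path you pass in; the function name, the AGENT_CLI variable, and the example path are hypothetical. This page does not say whether the command goes on to record after printing, so be ready to press Ctrl+C.

```shell
# Path to the binary; override if agent-cli is not on PATH (assumed variable name).
AGENT_CLI="${AGENT_CLI:-agent-cli}"

# Hypothetical helper: print the arguments resolved from a TOML config file.
# $1 is the path to the config file.
show_resolved_args() {
  "$AGENT_CLI" transcribe --config "$1" --print-args
}

# Example call (placeholder path):
# show_resolved_args "$HOME/agent-cli-config.toml"
```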
Workflow Integration
Toggle Recording Hotkey
The --toggle flag is designed for hotkey integration:
# First press: starts recording
agent-cli transcribe --toggle --input-device-index 1
# Second press: stops recording and transcribes
agent-cli transcribe --toggle
macOS Hotkey (skhd)
cmd + shift + r : /path/to/agent-cli transcribe --toggle --input-device-index 1
Transcription Log
Log all transcriptions with timestamps:
agent-cli transcribe --transcription-log ~/.config/agent-cli/transcriptions.log
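Once a log exists, recent entries can be reviewed with standard tools. The sketch below uses tail on the log path from the example above. The function name and the TRANSCRIPTION_LOG variable are assumptions, and the log's exact format is not described here, so the count is in lines, which may not map one-to-one to entries.

```shell
# Log path from the example above; override via TRANSCRIPTION_LOG (assumed variable name).
TRANSCRIPTION_LOG="${TRANSCRIPTION_LOG:-$HOME/.config/agent-cli/transcriptions.log}"

# Hypothetical helper: show the last N lines of the transcription log (default 5).
recent_transcriptions() {
  tail -n "${1:-5}" "$TRANSCRIPTION_LOG"
}

# Example call:
# recent_transcriptions 10
```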
Tips
Use --list-devices to find your microphone's index
Enable --llm for cleaner output with proper punctuation
Use --last-recording 1 to re-transcribe the latest recording after adjusting settings, without recording again