When you press Ctrl+C, stops recording and finalizes transcription (Wyoming streams live; OpenAI uploads after stop)
Copies the transcribed text to your clipboard
Optionally uses an LLM to clean up the transcript
Examples
# Basic transcription
agent-cli transcribe --input-device-index 1

# With LLM cleanup
agent-cli transcribe --input-device-index 1 --llm

# List available audio devices
agent-cli transcribe --list-devices

# Transcribe from a saved file (supports wav, mp3, m4a, ogg, flac, aac, webm)
agent-cli transcribe --from-file recording.wav

# Transcribe an MP3 file with OpenAI
agent-cli transcribe --from-file podcast.mp3 --asr-provider openai

# Transcribe an M4A voice memo with Gemini
agent-cli transcribe --from-file voice_memo.m4a --asr-provider gemini

# Re-transcribe most recent recording
agent-cli transcribe --last-recording 1
Supported Audio Formats
The --from-file option supports multiple audio formats:
| Provider | Supported Formats |
| --- | --- |
| OpenAI | mp3, mp4, mpeg, mpga, m4a, wav, webm |
| Gemini | wav, mp3, aiff, aac, ogg, flac, m4a |
| Wyoming | Any format (converted via ffmpeg) |
Note
For non-WAV formats with the Wyoming provider, ffmpeg must be installed on your system.
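The format lists above can be checked before uploading a file. The following is a minimal sketch, not part of agent-cli: `supports_format` is a hypothetical helper that only compares the file extension against the table, so it cannot detect a mislabeled file.

```shell
# Hypothetical helper (not provided by agent-cli): succeed if the provider's
# documented format list contains the file's extension.
supports_format() {
  provider=$1
  file=$2
  # Take the text after the last dot and lowercase it.
  ext=$(printf '%s' "${file##*.}" | tr '[:upper:]' '[:lower:]')
  case "$provider" in
    openai)  formats="mp3 mp4 mpeg mpga m4a wav webm" ;;
    gemini)  formats="wav mp3 aiff aac ogg flac m4a" ;;
    wyoming) return 0 ;;  # any format, converted via ffmpeg
    *)       return 2 ;;  # unknown provider
  esac
  for f in $formats; do
    if [ "$f" = "$ext" ]; then
      return 0
    fi
  done
  return 1
}
```

A call such as `supports_format openai podcast.mp3 && agent-cli transcribe --from-file podcast.mp3 --asr-provider openai` would then skip the upload for an unsupported extension.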
Options
LLM Configuration
| Option | Default | Description |
| --- | --- | --- |
| --extra-instructions | - | Extra instructions appended to the LLM cleanup prompt (requires --llm). |
| --llm/--no-llm | false | Clean up transcript with LLM: fix errors, add punctuation, remove filler words. Uses --extra-instructions if set (via CLI or config file). |
Audio Recovery
| Option | Default | Description |
| --- | --- | --- |
| --from-file | - | Transcribe from audio file instead of microphone. Supports wav, mp3, m4a, ogg, flac, aac, webm. Requires ffmpeg for non-WAV formats with Wyoming. |
| --last-recording | 0 | Re-transcribe a saved recording (1=most recent, 2=second-to-last, etc). Useful after connection failures or to retry with different options. |
| --save-recording/--no-save-recording | true | Save recordings to ~/.cache/agent-cli/ for --last-recording recovery. |
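As an illustration of retrying after a connection failure, the sketch below re-runs a saved recording against a different ASR provider. It uses only the flags documented above. The function name `retry_last_recording` and the `AGENT_CLI` variable are hypothetical; the variable exists only so the command can be dry-run (for example with `AGENT_CLI=echo`).

```shell
# Hypothetical wrapper: re-transcribe saved recording N (1 = most recent)
# with another ASR provider. AGENT_CLI is an assumed override for dry runs.
retry_last_recording() {
  n=${1:-1}
  provider=${2:-openai}
  ${AGENT_CLI:-agent-cli} transcribe --last-recording "$n" --asr-provider "$provider"
}
```

For example, `retry_last_recording 1 gemini` would re-send the most recent recording to Gemini, assuming recordings were saved (the default for --save-recording).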
Provider Selection
| Option | Default | Description |
| --- | --- | --- |
| --asr-provider | wyoming | The ASR provider to use ('wyoming', 'openai', 'gemini'). |
| --llm-provider | ollama | The LLM provider to use ('ollama', 'openai', 'gemini'). |
Audio Input
| Option | Default | Description |
| --- | --- | --- |
| --input-device-index | - | Audio input device index (see --list-devices). Uses system default if omitted. |
| --input-device-name | - | Select input device by name substring (e.g., MacBook or USB). |
| --list-devices | false | List available audio devices with their indices and exit. |
Audio Input: Wyoming
| Option | Default | Description |
| --- | --- | --- |
| --asr-wyoming-ip | localhost | Wyoming ASR server IP address. |
| --asr-wyoming-port | 10300 | Wyoming ASR server port. |
Audio Input: OpenAI-compatible
| Option | Default | Description |
| --- | --- | --- |
| --asr-openai-model | whisper-1 | The OpenAI model to use for ASR (transcription). |
| --asr-openai-base-url | - | Custom base URL for OpenAI-compatible ASR API (e.g., for custom Whisper server: http://localhost:9898). |
| --asr-openai-prompt | - | Custom prompt to guide transcription (optional). |
Audio Input: Gemini
| Option | Default | Description |
| --- | --- | --- |
| --asr-gemini-model | gemini-3-flash-preview | The Gemini model to use for ASR (transcription). |
LLM: Ollama
| Option | Default | Description |
| --- | --- | --- |
| --llm-ollama-model | gemma3:4b | The Ollama model to use. |
| --llm-ollama-host | http://localhost:11434 | The Ollama server host. |
LLM: OpenAI-compatible
| Option | Default | Description |
| --- | --- | --- |
| --llm-openai-model | gpt-5-mini | The OpenAI model to use for LLM tasks. |
| --openai-api-key | - | Your OpenAI API key. Can also be set with the OPENAI_API_KEY environment variable. |
| --openai-base-url | - | Custom base URL for OpenAI-compatible API (e.g., for llama-server: http://localhost:8080/v1). |
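To show how these options combine, here is a hedged sketch that routes LLM cleanup to a local OpenAI-compatible server such as the llama-server example above. The function name and the `AGENT_CLI` override are hypothetical. Whether a local server also needs --llm-openai-model or an API key depends on that server and is not covered here.

```shell
# Hypothetical wrapper: transcribe with LLM cleanup served by a local
# OpenAI-compatible endpoint. AGENT_CLI is an assumed override for dry runs.
transcribe_with_local_llm() {
  base_url=${1:-http://localhost:8080/v1}
  ${AGENT_CLI:-agent-cli} transcribe --llm --llm-provider openai --openai-base-url "$base_url"
}
```

Running it with `AGENT_CLI=echo` set prints the assembled command instead of starting a recording, which is a convenient way to check the arguments first.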
LLM: Gemini
| Option | Default | Description |
| --- | --- | --- |
| --llm-gemini-model | gemini-3-flash-preview | The Gemini model to use for LLM tasks. |
| --gemini-api-key | - | Your Gemini API key. Can also be set with the GEMINI_API_KEY environment variable. |
Process Management
| Option | Default | Description |
| --- | --- | --- |
| --stop | false | Stop any running instance of this command. |
| --status | false | Check if an instance is currently running. |
| --toggle | false | Start if not running, stop if running. Ideal for hotkey binding. |
General Options
| Option | Default | Description |
| --- | --- | --- |
| --clipboard/--no-clipboard | true | Copy result to clipboard. |
| --log-level | warning | Set logging level. |
| --log-file | - | Path to a file to write logs to. |
| --quiet, -q | false | Suppress console output from rich. |
| --json | false | Output result as JSON (implies --quiet and --no-clipboard). |
| --config | - | Path to a TOML configuration file. |
| --print-args | false | Print the command line arguments, including variables taken from the configuration file. |
| --transcription-log | - | Append transcripts to JSONL file (timestamp, hostname, model, raw/processed text). Recent entries provide context for LLM cleanup. |
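Because a JSONL file holds one JSON object per line, the most recent transcripts in a --transcription-log file can be inspected with standard tools. The sketch below assumes nothing about the exact key names in each entry, since they are not specified here. `recent_transcripts` is a hypothetical helper, and the log path is whatever was passed to --transcription-log.

```shell
# Hypothetical helper: print the last N entries (default 5) of a JSONL
# transcription log. Each line of the file is one complete JSON object.
recent_transcripts() {
  log_file=$1
  count=${2:-5}
  if [ ! -f "$log_file" ]; then
    echo "no transcription log at $log_file" >&2
    return 1
  fi
  tail -n "$count" "$log_file"
}
```

The output can be piped into a JSON tool for pretty-printing if one is installed.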
Workflow Integration
Toggle Recording Hotkey
The --toggle flag is designed for hotkey integration:
# First press: starts recording
agent-cli transcribe --toggle --input-device-index 1

# Second press: stops recording and transcribes
agent-cli transcribe --toggle
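A hotkey daemon usually runs one fixed command per key, so the same invocation has to serve both presses. The following sketch wraps the toggle call in a function that could be saved as a script and bound to a key. The function name, the default device index of 1, and the `AGENT_CLI` override are assumptions for illustration. Passing --input-device-index on the stopping press is assumed to be harmless.

```shell
# Hypothetical hotkey handler: the same call starts recording when idle and
# stops (then transcribes) when a recording is already running.
# AGENT_CLI is an assumed override that allows a dry run with `echo`.
toggle_transcription() {
  device_index=${1:-1}
  ${AGENT_CLI:-agent-cli} transcribe --toggle --input-device-index "$device_index"
}
```

Binding the key to a script that calls `toggle_transcription` keeps the device choice in one place instead of repeating it in the hotkey configuration.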