Continuous live transcription with voice activity detection (VAD).
Usage
agent-cli transcribe-live [OPTIONS]
Description
Runs continuously, listening to your microphone and automatically segmenting speech using voice activity detection:
Starts listening immediately
Detects when you start and stop speaking
Automatically transcribes each speech segment
Logs results with timestamps
Optionally saves audio as MP3 files
Press Ctrl+C to stop.
Segments shorter than 0.3s are discarded even if --min-segment is set lower.
Saving MP3 files requires FFmpeg; if it's not available, audio saving is disabled with a warning.
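The interaction between the silence threshold and the minimum segment length can be illustrated with a small sketch. This is not agent-cli's implementation; it assumes a hypothetical VAD that yields one boolean per fixed-size frame, and the function name `segment_speech` and the 100 ms frame size are illustrative only.

```python
HARD_MIN_SEGMENT = 0.3  # seconds; the floor mentioned in the note above


def segment_speech(flags, frame_ms=100, silence_threshold=1.0, min_segment=0.25):
    """Group per-frame speech flags into (start_ms, end_ms) segments.

    flags: iterable of booleans, True when the (hypothetical) VAD
    reports speech for that frame.
    """
    # Work in integer milliseconds to avoid floating-point drift.
    silence_ms = round(silence_threshold * 1000)
    min_ms = round(max(min_segment, HARD_MIN_SEGMENT) * 1000)

    segments = []
    start = None          # start of the segment being built, if any
    last_speech_end = 0   # end time of the most recent speech frame

    for i, is_speech in enumerate(flags):
        frame_end = (i + 1) * frame_ms
        if is_speech:
            if start is None:
                start = i * frame_ms
            last_speech_end = frame_end
        elif start is not None and frame_end - last_speech_end >= silence_ms:
            # Enough trailing silence: finalize, keeping only long-enough speech.
            if last_speech_end - start >= min_ms:
                segments.append((start, last_speech_end))
            start = None

    # Flush a segment still open when the stream ends.
    if start is not None and last_speech_end - start >= min_ms:
        segments.append((start, last_speech_end))
    return segments
```

In this sketch, a 200 ms burst of speech is dropped even when `min_segment` is set to 0.1, because the 0.3 s floor takes precedence.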
Examples
# Basic daemon
agent-cli transcribe-live
# With custom role
agent-cli transcribe-live --role meeting
# With LLM cleanup
agent-cli transcribe-live --llm
# Custom silence threshold
agent-cli transcribe-live --silence-threshold 1.5
Options
| Option | Default | Description |
|---|---|---|
| --role, -r | user | Label for log entries. Use to distinguish speakers or contexts in logs. |
| --silence-threshold, -s | 1.0 | Seconds of silence after speech to finalize a segment. Increase for slower speakers. |
| --min-segment, -m | 0.25 | Minimum seconds of speech required before a segment is processed. Filters brief sounds. |
| --vad-threshold | 0.3 | Silero VAD confidence threshold (0.0-1.0). Higher values require clearer speech; lower values are more sensitive to quiet/distant voices. |
| --save-audio/--no-save-audio | true | Save each speech segment as MP3. Requires ffmpeg to be installed. |
| --audio-dir | - | Base directory for MP3 files. Files are organized by date: YYYY/MM/DD/HHMMSS_mmm.mp3. Default: ~/.config/agent-cli/audio. |
| --transcription-log, -t | - | JSONL file for transcript logging (one JSON object per line with timestamp, role, raw/processed text, audio path). Default: ~/.config/agent-cli/transcriptions.jsonl. |
| --clipboard/--no-clipboard | false | Copy each completed transcription to clipboard (overwrites previous). Useful with --llm to get cleaned text. |
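The date-based layout described for --audio-dir can be sketched as follows. This is only an illustration of the YYYY/MM/DD/HHMMSS_mmm.mp3 pattern; the helper name `segment_audio_path` is hypothetical, and which timestamp the tool actually uses (segment start or end, local time or UTC) is an assumption here.

```python
from datetime import datetime
from pathlib import Path


def segment_audio_path(base_dir: Path, stamp: datetime) -> Path:
    """Build <base_dir>/YYYY/MM/DD/HHMMSS_mmm.mp3 for a given timestamp."""
    millis = stamp.microsecond // 1000  # mmm = milliseconds, zero-padded
    return (
        base_dir
        / f"{stamp:%Y}"
        / f"{stamp:%m}"
        / f"{stamp:%d}"
        / f"{stamp:%H%M%S}_{millis:03d}.mp3"
    )


if __name__ == "__main__":
    base = Path("~/.config/agent-cli/audio").expanduser()
    print(segment_audio_path(base, datetime.now()))
```

A layout like this keeps each day's recordings in their own directory, so a single day can be located or pruned without scanning the whole tree.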
Provider Selection
| Option | Default | Description |
|---|---|---|
| --asr-provider | wyoming | The ASR provider to use ('wyoming', 'openai', 'gemini'). |
| --llm-provider | ollama | The LLM provider to use ('ollama', 'openai', 'gemini'). |
Audio Input
| Option | Default | Description |
|---|---|---|
| --input-device-index | - | Audio input device index (see --list-devices). Uses system default if omitted. |
| --input-device-name | - | Select input device by name substring (e.g., MacBook or USB). |
| --list-devices | false | List available audio devices with their indices and exit. |
Audio Input: Wyoming
| Option | Default | Description |
|---|---|---|
| --asr-wyoming-ip | localhost | Wyoming ASR server IP address. |
| --asr-wyoming-port | 10300 | Wyoming ASR server port. |
Audio Input: OpenAI-compatible
| Option | Default | Description |
|---|---|---|
| --asr-openai-model | whisper-1 | The OpenAI model to use for ASR (transcription). |
| --asr-openai-base-url | - | Custom base URL for OpenAI-compatible ASR API (e.g., for custom Whisper server: http://localhost:9898). |
| --asr-openai-prompt | - | Custom prompt to guide transcription (optional). |
Audio Input: Gemini
| Option | Default | Description |
|---|---|---|
| --asr-gemini-model | gemini-3-flash-preview | The Gemini model to use for ASR (transcription). |
LLM: Ollama
| Option | Default | Description |
|---|---|---|
| --llm-ollama-model | gemma3:4b | The Ollama model to use. Default is gemma3:4b. |
| --llm-ollama-host | http://localhost:11434 | The Ollama server host. Default is http://localhost:11434. |
LLM: OpenAI-compatible
| Option | Default | Description |
|---|---|---|
| --llm-openai-model | gpt-5-mini | The OpenAI model to use for LLM tasks. |
| --openai-api-key | - | Your OpenAI API key. Can also be set with the OPENAI_API_KEY environment variable. |
| --openai-base-url | - | Custom base URL for OpenAI-compatible API (e.g., for llama-server: http://localhost:8080/v1). |
LLM: Gemini
| Option | Default | Description |
|---|---|---|
| --llm-gemini-model | gemini-3-flash-preview | The Gemini model to use for LLM tasks. |
| --gemini-api-key | - | Your Gemini API key. Can also be set with the GEMINI_API_KEY environment variable. |
LLM Configuration
| Option | Default | Description |
|---|---|---|
| --llm/--no-llm | false | Clean up transcript with LLM: fix errors, add punctuation, remove filler words. Uses --extra-instructions if set (via CLI or config file). |
Process Management
| Option | Default | Description |
|---|---|---|
| --stop | false | Stop any running instance of this command. |
| --status | false | Check if an instance is currently running. |
General Options
| Option | Default | Description |
|---|---|---|
| --log-level | warning | Set logging level. |
| --log-file | - | Path to a file to write logs to. |
| --quiet, -q | false | Suppress console output from rich. |
| --config | - | Path to a TOML configuration file. |
| --print-args | false | Print the command line arguments, including variables taken from the configuration file. |
Output Files
Transcription Log
JSON Lines format at ~/.config/agent-cli/transcriptions.jsonl:
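A log in this format can be read back one line at a time. The sketch below is a minimal reader, not part of agent-cli; the function name `read_transcriptions` is hypothetical, and the `role` key is an assumed field name based on the fields listed for --transcription-log, so check a real entry for the actual keys.

```python
import json
from pathlib import Path


def read_transcriptions(path, role=None):
    """Load entries from a JSON Lines file, optionally filtered by role.

    The "role" key is an assumption about the log schema.
    """
    entries = []
    with Path(path).expanduser().open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            entry = json.loads(line)  # one JSON object per line
            if role is None or entry.get("role") == role:
                entries.append(entry)
    return entries


if __name__ == "__main__":
    log = "~/.config/agent-cli/transcriptions.jsonl"
    for entry in read_transcriptions(log, role="meeting"):
        print(entry)
```

Because each line is an independent JSON object, the file can be filtered or tailed without parsing it as a whole.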