A voice-powered clipboard assistant that edits text based on spoken commands.
Usage
    agent-cli voice-edit [OPTIONS]
Description
This command is designed for a hotkey-driven workflow to act on text you've already copied:
1. Copy a block of text to your clipboard (e.g., an email draft).
2. Press a hotkey to start the agent; it begins listening.
3. Speak a command: "Make this more formal" or "Summarize the key points".
4. Press the hotkey again to stop recording.
5. The agent transcribes your command and sends it, together with the clipboard text, to the LLM.
6. The result is copied back to your clipboard.
7. If --tts is enabled, the agent also speaks the result.
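The start/stop hotkey in the steps above could be wired up with a small toggle function. This is a minimal sketch, not the tool's documented integration: the function name toggle_voice_edit and the AGENT_CLI variable are hypothetical, and it assumes that --status exits with code 0 while a background instance is running and non-zero otherwise. Check that assumption on your install before relying on it.

```shell
# Hypothetical hotkey toggle for voice-edit.
# ASSUMPTION: `--status` exits 0 while a background instance is running
# and non-zero otherwise.
AGENT_CLI="${AGENT_CLI:-agent-cli}"   # overridable, e.g. for testing

toggle_voice_edit() {
  if "$AGENT_CLI" voice-edit --status >/dev/null 2>&1; then
    # Second hotkey press: stop the background process.
    "$AGENT_CLI" voice-edit --stop
  else
    # First hotkey press: start listening in the background,
    # forwarding any extra options (e.g. --input-device-index 1).
    "$AGENT_CLI" voice-edit "$@" &
  fi
}
```

Binding the hotkey to a script that sources this function and calls `toggle_voice_edit --input-device-index 1` would then start the agent on the first press and stop it on the second.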
Examples
    # Run in foreground
    agent-cli voice-edit --input-device-index 1

    # Run in background (for hotkey integration)
    agent-cli voice-edit --input-device-index 1 &

    # With text-to-speech response
    agent-cli voice-edit --tts

    # Check status
    agent-cli voice-edit --status

    # Stop background process
    agent-cli voice-edit --stop
Options
Provider Selection
| Option | Default | Description |
|---|---|---|
| --asr-provider | wyoming | The ASR provider to use ('wyoming', 'openai', 'gemini'). |
| --llm-provider | ollama | The LLM provider to use ('ollama', 'openai', 'gemini'). |
| --tts-provider | wyoming | The TTS provider to use ('wyoming', 'openai', 'kokoro', 'gemini'). |
Audio Input
| Option | Default | Description |
|---|---|---|
| --input-device-index | - | Audio input device index (see --list-devices). Uses system default if omitted. |
| --input-device-name | - | Select input device by name substring (e.g., MacBook or USB). |
| --list-devices | false | List available audio devices with their indices and exit. |
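The input device can be chosen either by index or by name substring. The sketch below wraps that choice in one function: a value made only of digits is passed as --input-device-index, anything else as --input-device-name, and an empty value leaves both flags off so the system default is used. The function name voice_edit_on and the AGENT_CLI variable are hypothetical helpers for illustration, not part of the tool.

```shell
# Hypothetical wrapper: pick the device flag from a single value.
AGENT_CLI="${AGENT_CLI:-agent-cli}"   # overridable, e.g. for testing

voice_edit_on() {
  device="${1:-}"
  [ "$#" -gt 0 ] && shift             # remaining arguments are passed through
  case "$device" in
    '')       "$AGENT_CLI" voice-edit "$@" ;;                                  # system default
    *[!0-9]*) "$AGENT_CLI" voice-edit --input-device-name "$device" "$@" ;;    # name substring
    *)        "$AGENT_CLI" voice-edit --input-device-index "$device" "$@" ;;   # numeric index
  esac
}
```

For example, `voice_edit_on USB --tts` would select a device whose name contains USB, while `voice_edit_on 1` would select index 1 as reported by --list-devices.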
Audio Input: Wyoming
| Option | Default | Description |
|---|---|---|
| --asr-wyoming-ip | localhost | Wyoming ASR server IP address. |
| --asr-wyoming-port | 10300 | Wyoming ASR server port. |
Audio Input: OpenAI-compatible
| Option | Default | Description |
|---|---|---|
| --asr-openai-model | whisper-1 | The OpenAI model to use for ASR (transcription). |
Audio Input: Gemini
| Option | Default | Description |
|---|---|---|
| --asr-gemini-model | gemini-3-flash-preview | The Gemini model to use for ASR (transcription). |
LLM: Ollama
| Option | Default | Description |
|---|---|---|
| --llm-ollama-model | gemma3:4b | The Ollama model to use. Default is gemma3:4b. |
| --llm-ollama-host | http://localhost:11434 | The Ollama server host. Default is http://localhost:11434. |
LLM: OpenAI-compatible
| Option | Default | Description |
|---|---|---|
| --llm-openai-model | gpt-5-mini | The OpenAI model to use for LLM tasks. |
| --openai-api-key | - | Your OpenAI API key. Can also be set with the OPENAI_API_KEY environment variable. |
| --openai-base-url | - | Custom base URL for OpenAI-compatible API (e.g., for llama-server: http://localhost:8080/v1). |
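To point the LLM step at a local OpenAI-compatible server, --llm-provider openai can be combined with --openai-base-url. The sketch below uses the llama-server URL from the --openai-base-url description; the function name voice_edit_local and the AGENT_CLI and LOCAL_LLM_URL variables are hypothetical. Depending on the server, you may also need to pass --llm-openai-model with a model name it recognizes.

```shell
# Hypothetical wrapper: run voice-edit against a local OpenAI-compatible server.
AGENT_CLI="${AGENT_CLI:-agent-cli}"                          # overridable, e.g. for testing
LOCAL_LLM_URL="${LOCAL_LLM_URL:-http://localhost:8080/v1}"   # llama-server example URL

voice_edit_local() {
  "$AGENT_CLI" voice-edit \
    --llm-provider openai \
    --openai-base-url "$LOCAL_LLM_URL" \
    "$@"
}
```

Extra options are forwarded unchanged, so `voice_edit_local --input-device-index 1` keeps the same device selection as the earlier examples.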
LLM: Gemini
| Option | Default | Description |
|---|---|---|
| --llm-gemini-model | gemini-3-flash-preview | The Gemini model to use for LLM tasks. |
| --gemini-api-key | - | Your Gemini API key. Can also be set with the GEMINI_API_KEY environment variable. |
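Because the key can come from the GEMINI_API_KEY environment variable, a launcher can check for it before starting the agent, so that a missing key fails immediately rather than after you have spoken a command. This is a sketch under stated assumptions: voice_edit_gemini and AGENT_CLI are hypothetical names, and the check covers only the environment variable, not a key passed with --gemini-api-key.

```shell
# Hypothetical wrapper: refuse to start if the Gemini key is missing.
AGENT_CLI="${AGENT_CLI:-agent-cli}"   # overridable, e.g. for testing

voice_edit_gemini() {
  if [ -z "${GEMINI_API_KEY:-}" ]; then
    echo "GEMINI_API_KEY is not set; export it or pass --gemini-api-key" >&2
    return 1
  fi
  "$AGENT_CLI" voice-edit --llm-provider gemini "$@"
}
```

After `export GEMINI_API_KEY=...` in your shell profile, `voice_edit_gemini --tts` would run the Gemini-backed workflow with spoken responses.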
Audio Output
| Option | Default | Description |
|---|---|---|
| --tts/--no-tts | false | Enable text-to-speech for responses. |
| --output-device-index | - | Audio output device index (see --list-devices for available devices). |
| --output-device-name | - | Partial match on device name (e.g., 'speakers', 'headphones'). |