speak
Convert text to speech using a local or remote TTS engine.
Usage
agent-cli speak [TEXT] [OPTIONS]
Description
A straightforward text-to-speech utility:
1. Takes text from a command-line argument or your clipboard
2. Sends the text to a TTS server
3. Plays the generated audio through your speakers
Examples
# Speak from argument
agent-cli speak "Hello, world!"
# Speak from clipboard
agent-cli speak
# Save to file instead of playing
agent-cli speak "Hello" --save-file hello.wav
# List audio output devices
agent-cli speak --list-devices
Options
Provider Selection
| Option | Default | Description |
|---|---|---|
| --tts-provider | wyoming | The TTS provider to use ('wyoming', 'openai', 'kokoro', 'gemini'). |
Audio Output
| Option | Default | Description |
|---|---|---|
| --output-device-index | - | Audio output device index (see --list-devices for available devices). |
| --output-device-name | - | Partial match on device name (e.g., 'speakers', 'headphones'). |
| --tts-speed | 1.0 | Speech speed multiplier (1.0 = normal, 2.0 = twice as fast, 0.5 = half speed). |
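
Because --tts-speed is a multiplier, the expected playback time is roughly the normal-speed duration divided by the multiplier. The helper below is a hypothetical illustration of that arithmetic only: scaled_duration is not part of agent-cli, and real durations depend on the TTS engine.

```shell
# scaled_duration SECONDS SPEED
# Hypothetical helper: estimates playback time at a given --tts-speed,
# assuming duration scales as 1 / speed.
scaled_duration() {
  awk -v d="$1" -v s="$2" 'BEGIN { printf "%.1f\n", d / s }'
}

scaled_duration 60 1.5   # a 60-second clip at 1.5x: prints 40.0
scaled_duration 60 0.5   # the same clip at half speed: prints 120.0
```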
Audio Output: Wyoming
| Option | Default | Description |
|---|---|---|
| --tts-wyoming-ip | localhost | Wyoming TTS server IP address. |
| --tts-wyoming-port | 10200 | Wyoming TTS server port. |
| --tts-wyoming-voice | - | Voice name to use for Wyoming TTS (e.g., 'en_US-lessac-medium'). |
| --tts-wyoming-language | - | Language for Wyoming TTS (e.g., 'en_US'). |
| --tts-wyoming-speaker | - | Speaker name for Wyoming TTS voice. |
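
The Wyoming options can be combined to target a server running on another machine. The sketch below assumes a Wyoming TTS server at 192.168.1.50, which is a made-up address, and wraps the call in a hypothetical speak_remote function. AGENT_CLI is an illustrative override introduced here, not an agent-cli feature; setting it to echo prints the arguments instead of running the command.

```shell
# Hypothetical wrapper: speak text via a Wyoming server on another host.
# 192.168.1.50 is a placeholder address; the flags are the documented ones.
speak_remote() {
  "${AGENT_CLI:-agent-cli}" speak "$1" \
    --tts-provider wyoming \
    --tts-wyoming-ip 192.168.1.50 \
    --tts-wyoming-port 10200 \
    --tts-wyoming-voice en_US-lessac-medium
}

# Usage:
#   speak_remote "Hello from the server"
#   AGENT_CLI=echo speak_remote "Hello"   # dry run: print the arguments
```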
Audio Output: OpenAI-compatible
| Option | Default | Description |
|---|---|---|
| --tts-openai-model | tts-1 | The OpenAI model to use for TTS. |
| --tts-openai-voice | alloy | Voice for OpenAI TTS (alloy, echo, fable, onyx, nova, shimmer). |
| --tts-openai-base-url | - | Custom base URL for OpenAI-compatible TTS API (e.g., http://localhost:8000/v1 for a proxy). |
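
Since --tts-openai-base-url accepts any OpenAI-compatible endpoint, the openai provider can be pointed at a proxy or self-hosted server. A minimal sketch, assuming such a server is listening at the example URL from the table and accepts the tts-1 model and nova voice (both assumptions about that server). The speak_local_openai function and the AGENT_CLI override are hypothetical names used only for illustration.

```shell
# Hypothetical wrapper: use the openai provider against a local,
# OpenAI-compatible endpoint (URL is the example value from the table).
speak_local_openai() {
  "${AGENT_CLI:-agent-cli}" speak "$1" \
    --tts-provider openai \
    --tts-openai-base-url http://localhost:8000/v1 \
    --tts-openai-model tts-1 \
    --tts-openai-voice nova
}

# Usage:
#   speak_local_openai "Hello through the proxy"
```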
Audio Output: Kokoro
| Option | Default | Description |
|---|---|---|
| --tts-kokoro-model | kokoro | The Kokoro model to use for TTS. |
| --tts-kokoro-voice | af_sky | The voice to use for Kokoro TTS. |
| --tts-kokoro-host | http://localhost:8880/v1 | The base URL for the Kokoro API. |
Audio Output: Gemini
| Option | Default | Description |
|---|---|---|
| --tts-gemini-model | gemini-2.5-flash-preview-tts | The Gemini model to use for TTS. |
| --tts-gemini-voice | Kore | The voice to use for Gemini TTS (e.g., 'Kore', 'Puck', 'Charon', 'Fenrir'). |
LLM: Gemini
| Option | Default | Description |
|---|---|---|
| --gemini-api-key | - | Your Gemini API key. Can also be set with the GEMINI_API_KEY environment variable. |
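
Supplying the key through GEMINI_API_KEY rather than --gemini-api-key keeps it out of the command line. The sketch below shows one way to do that; speak_gemini and the AGENT_CLI override are hypothetical names for illustration, and the early check is a convenience of this wrapper, not agent-cli behavior.

```shell
# Hypothetical wrapper: speak with the Gemini provider.
# Fails early if GEMINI_API_KEY is unset, so the key never appears in argv.
speak_gemini() {
  if [ -z "${GEMINI_API_KEY:-}" ]; then
    echo "GEMINI_API_KEY is not set" >&2
    return 1
  fi
  "${AGENT_CLI:-agent-cli}" speak "$1" \
    --tts-provider gemini \
    --tts-gemini-voice Puck
}

# Usage:
#   export GEMINI_API_KEY="your-key-here"
#   speak_gemini "Hello from Gemini"
```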
| Option | Default | Description |
|---|---|---|
| --list-devices | false | List available audio devices with their indices and exit. |
General Options
| Option | Default | Description |
|---|---|---|
| --save-file | - | Save audio to WAV file instead of playing through speakers. |
| --log-level | warning | Set logging level. |
| --log-file | - | Path to a file to write logs to. |
| --quiet, -q | false | Suppress console output from rich. |
| --json | false | Output result as JSON (implies --quiet and --no-clipboard). |
| --config | - | Path to a TOML configuration file. |
| --print-args | false | Print the command line arguments, including variables taken from the configuration file. |
Process Management
| Option | Default | Description |
|---|---|---|
| --stop | false | Stop any running instance of this command. |
| --status | false | Check if an instance is currently running. |
| --toggle | false | Start if not running, stop if running. Ideal for hotkey binding. |
Available Voices
Wyoming (Piper)
List available voices:
# Check Piper documentation or run with verbose logging
agent-cli speak --log-level DEBUG "test"
Common voices:
en_US-lessac-medium - US English, natural
en_GB-alan-medium - British English
de_DE-thorsten-medium - German
OpenAI
alloy, echo, fable, onyx, nova, shimmer
Kokoro
af_sky, af_bella, am_adam, and more
Gemini
Kore (default), Puck, Charon, Fenrir
Use Cases
Read Clipboard Aloud
# Copy text to the clipboard, then run:
agent-cli speak
Speed Up Audio
agent-cli speak "Long text here" --tts-speed 1.5
Save for Later
agent-cli speak "Important reminder" --save-file reminder.wav
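
The same --save-file flag can be used in a loop to render several reminders at once. The following is a sketch under stated assumptions: save_reminders, the reminder-N.wav naming scheme, and the AGENT_CLI override are all hypothetical, and only the speak command and --save-file come from this page.

```shell
# Hypothetical helper: render each non-empty line of a text file to its own
# WAV file (reminder-1.wav, reminder-2.wav, ...) using --save-file.
# Note: n and line are plain (global) shell variables.
save_reminders() {
  n=0
  while IFS= read -r line || [ -n "$line" ]; do
    [ -n "$line" ] || continue
    n=$((n + 1))
    # </dev/null keeps the command from consuming the loop's input
    "${AGENT_CLI:-agent-cli}" speak "$line" --save-file "reminder-$n.wav" </dev/null
  done < "$1"
}

# Usage:
#   save_reminders reminders.txt
```

Redirecting the command's standard input from /dev/null is a defensive choice, so the loop still sees every line even if the command reads from standard input.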