A hands-free voice assistant that activates on a wake word.
Usage
agent-cli assistant [OPTIONS]
Description
This agent continuously listens for a wake word (e.g., "Okay Nabu") and then works as follows:
1. Run the command; it starts listening for the wake word.
2. Say the wake word to start recording.
3. Speak your command or question.
4. Say the wake word again to stop recording.
5. The agent transcribes the recording, sends the text to the LLM, and speaks the response if TTS is enabled.
6. The agent immediately returns to listening for the wake word.
Examples
# Start with default wake word
agent-cli assistant --input-device-index 1

# With custom wake word
agent-cli assistant --wake-word "hey_jarvis" --input-device-index 1

# With TTS responses
agent-cli assistant --tts --input-device-index 1

# Custom wake word server
agent-cli assistant --wake-server-ip 192.168.1.100 --wake-server-port 10400
Options
Provider Selection
| Option | Default | Description |
|--------|---------|-------------|
| --asr-provider | wyoming | The ASR provider to use ('wyoming', 'openai', 'gemini'). |
| --llm-provider | ollama | The LLM provider to use ('ollama', 'openai', 'gemini'). |
| --tts-provider | wyoming | The TTS provider to use ('wyoming', 'openai', 'kokoro', 'gemini'). |
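Each stage takes its own provider, so they can be mixed freely. As a minimal sketch, the wrapper below keeps speech recognition on a local Wyoming server, sends the text to OpenAI, and speaks the answer through Kokoro. The function name `assistant_mixed` is hypothetical, and the sketch assumes `OPENAI_API_KEY` is already exported; the flags themselves are the ones documented on this page.

```shell
# Hypothetical wrapper (not part of agent-cli): local ASR, cloud LLM, Kokoro TTS.
# Assumes OPENAI_API_KEY is already exported in the environment.
assistant_mixed() {
  agent-cli assistant \
    --asr-provider wyoming \
    --llm-provider openai \
    --tts-provider kokoro \
    --tts \
    "$@"   # extra flags are passed through, e.g. --input-device-index 1
}
```

Calling `assistant_mixed --input-device-index 1` would then start the assistant with that provider mix on input device 1.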
Wake Word
| Option | Default | Description |
|--------|---------|-------------|
| --wake-server-ip | localhost | Wyoming wake word server IP (requires wyoming-openwakeword or similar). |
| --wake-server-port | 10400 | Wyoming wake word server port. |
| --wake-word | ok_nabu | Wake word to detect. Common options: ok_nabu, hey_jarvis, alexa. Must match a model loaded in your wake word server. |
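Because the assistant depends on a running wake word server, a quick reachability check before starting can save some confusion. The following is only a sketch: `wake_server_reachable` and `assistant_checked` are hypothetical helpers, the check assumes bash is installed (it relies on bash's `/dev/tcp` redirection), and it only confirms that something accepts TCP connections on the port, not that it speaks the Wyoming protocol.

```shell
# Hypothetical helper: succeed only if a TCP connection to host:port can be opened.
# Assumes bash is installed (for /dev/tcp); bounds the attempt with `timeout` when available.
wake_server_reachable() {
  local host="${1:-localhost}"
  local port="${2:-10400}"
  if command -v timeout >/dev/null 2>&1; then
    timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
  else
    bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
  fi
}

# Hypothetical wrapper: refuse to start when nothing listens on the default port.
assistant_checked() {
  if wake_server_reachable localhost 10400; then
    agent-cli assistant --wake-server-ip localhost --wake-server-port 10400 "$@"
  else
    echo "No wake word server reachable on localhost:10400" >&2
    return 1
  fi
}
```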
Audio Input
| Option | Default | Description |
|--------|---------|-------------|
| --input-device-index | - | Audio input device index (see --list-devices). Uses system default if omitted. |
| --input-device-name | - | Select input device by name substring (e.g., MacBook or USB). |
| --list-devices | false | List available audio devices with their indices and exit. |
Audio Input: Wyoming
| Option | Default | Description |
|--------|---------|-------------|
| --asr-wyoming-ip | localhost | Wyoming ASR server IP address. |
| --asr-wyoming-port | 10300 | Wyoming ASR server port. |
Audio Input: OpenAI-compatible
| Option | Default | Description |
|--------|---------|-------------|
| --asr-openai-model | whisper-1 | The OpenAI model to use for ASR (transcription). |
Audio Input: Gemini
| Option | Default | Description |
|--------|---------|-------------|
| --asr-gemini-model | gemini-3-flash-preview | The Gemini model to use for ASR (transcription). |
LLM: Ollama
| Option | Default | Description |
|--------|---------|-------------|
| --llm-ollama-model | gemma3:4b | The Ollama model to use. Default is gemma3:4b. |
| --llm-ollama-host | http://localhost:11434 | The Ollama server host. Default is http://localhost:11434. |
LLM: OpenAI-compatible
| Option | Default | Description |
|--------|---------|-------------|
| --llm-openai-model | gpt-5-mini | The OpenAI model to use for LLM tasks. |
| --openai-api-key | - | Your OpenAI API key. Can also be set with the OPENAI_API_KEY environment variable. |
| --openai-base-url | - | Custom base URL for OpenAI-compatible API (e.g., for llama-server: http://localhost:8080/v1). |
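To point the assistant at a local OpenAI-compatible server, the provider, base URL, and model name are typically set together. The wrapper below is a sketch under assumptions: `assistant_local_llm` and the `LOCAL_LLM_URL` variable are illustrative names, not agent-cli settings, and the model name must be one your server actually serves.

```shell
# Hypothetical wrapper: use a local OpenAI-compatible server as the LLM.
# LOCAL_LLM_URL is an illustrative variable, not an agent-cli setting.
assistant_local_llm() {
  [ $# -ge 1 ] || { echo "usage: assistant_local_llm MODEL [FLAGS...]" >&2; return 2; }
  local model="$1"; shift
  local url="${LOCAL_LLM_URL:-http://localhost:8080/v1}"
  agent-cli assistant \
    --llm-provider openai \
    --openai-base-url "$url" \
    --llm-openai-model "$model" \
    "$@"
}
```

For example, `assistant_local_llm my-model --tts` would use the model `my-model` (a placeholder) on the default llama-server URL and enable spoken responses.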
LLM: Gemini
| Option | Default | Description |
|--------|---------|-------------|
| --llm-gemini-model | gemini-3-flash-preview | The Gemini model to use for LLM tasks. |
| --gemini-api-key | - | Your Gemini API key. Can also be set with the GEMINI_API_KEY environment variable. |
Audio Output
| Option | Default | Description |
|--------|---------|-------------|
| --tts/--no-tts | false | Enable text-to-speech for responses. |
| --output-device-index | - | Audio output device index (see --list-devices for available devices). |
| --output-device-name | - | Partial match on device name (e.g., 'speakers', 'headphones'). |
|  |  | Voice name to use for Wyoming TTS (e.g., 'en_US-lessac-medium'). |
| --tts-wyoming-language | - | Language for Wyoming TTS (e.g., 'en_US'). |
| --tts-wyoming-speaker | - | Speaker name for Wyoming TTS voice. |
Audio Output: OpenAI-compatible
| Option | Default | Description |
|--------|---------|-------------|
| --tts-openai-model | tts-1 | The OpenAI model to use for TTS. |
| --tts-openai-voice | alloy | Voice for OpenAI TTS (alloy, echo, fable, onyx, nova, shimmer). |
| --tts-openai-base-url | - | Custom base URL for OpenAI-compatible TTS API (e.g., http://localhost:8000/v1 for a proxy). |
Audio Output: Kokoro
| Option | Default | Description |
|--------|---------|-------------|
| --tts-kokoro-model | kokoro | The Kokoro model to use for TTS. |
| --tts-kokoro-voice | af_sky | The voice to use for Kokoro TTS. |
| --tts-kokoro-host | http://localhost:8880/v1 | The base URL for the Kokoro API. |
Audio Output: Gemini
| Option | Default | Description |
|--------|---------|-------------|
| --tts-gemini-model | gemini-2.5-flash-preview-tts | The Gemini model to use for TTS. |
| --tts-gemini-voice | Kore | The voice to use for Gemini TTS (e.g., 'Kore', 'Puck', 'Charon', 'Fenrir'). |
Process Management
| Option | Default | Description |
|--------|---------|-------------|
| --stop | false | Stop any running instance of this command. |
| --status | false | Check if an instance is currently running. |
| --toggle | false | Start if not running, stop if running. Ideal for hotkey binding. |
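For hotkey use, a single command that flips the assistant on and off is usually enough. The helpers below are a sketch: `assistant_hotkey` and `assistant_shutdown` are hypothetical names, and how you bind them to a key depends on your desktop environment, which is not covered here.

```shell
# Hypothetical hotkey handler: bind this function (or a script containing it)
# to a keyboard shortcut.
assistant_hotkey() {
  # --toggle starts the assistant if it is not running and stops it otherwise
  agent-cli assistant --toggle "$@"
}

# Hypothetical helper: report whether an instance is running, then stop it.
assistant_shutdown() {
  agent-cli assistant --status
  agent-cli assistant --stop
}
```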
General Options
| Option | Default | Description |
|--------|---------|-------------|
| --save-file | - | Save audio to WAV file instead of playing through speakers. |
| --clipboard/--no-clipboard | true | Copy result to clipboard. |
| --log-level | warning | Set logging level. |
| --log-file | - | Path to a file to write logs to. |
| --quiet, -q | false | Suppress console output from rich. |
| --config | - | Path to a TOML configuration file. |
| --print-args | false | Print the command line arguments, including variables taken from the configuration file. |
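When settings come from both a configuration file and the command line, `--print-args` is a convenient way to see what the command actually received. The helpers below are a sketch with hypothetical names (`assistant_show_args`, `assistant_logged`); the config path is whatever TOML file you pass, and no default location or file layout is assumed.

```shell
# Hypothetical helper: show the arguments agent-cli resolves from a config file.
assistant_show_args() {
  [ $# -ge 1 ] || { echo "usage: assistant_show_args CONFIG.toml [FLAGS...]" >&2; return 2; }
  local config="$1"; shift
  agent-cli assistant --config "$config" --print-args "$@"
}

# Hypothetical helper: write logs to a file (assistant.log unless a path is given).
assistant_logged() {
  local logfile="${1:-assistant.log}"
  [ $# -ge 1 ] && shift
  agent-cli assistant --log-file "$logfile" "$@"
}
```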
Available Wake Words
Available wake words depend on which models you preload. The provided scripts preload ok_nabu by default.
Common models include:
ok_nabu (default in provided scripts)
hey_jarvis
alexa
Add more models via --preload-model when starting OpenWakeWord.
Custom wake words can be trained and added to the OpenWakeWord server.
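As a rough illustration of preloading several models, the launcher below repeats `--preload-model` once per model name. It is a sketch under assumptions: it presumes the server is installed so that a `wyoming-openwakeword` executable is on your PATH and accepts a `--uri` option, which may differ from the provided scripts; `start_wake_server` is a hypothetical name.

```shell
# Hypothetical launcher: preload every model name given as an argument
# (falls back to ok_nabu when called without arguments).
# Assumes a `wyoming-openwakeword` executable that accepts --uri and --preload-model.
start_wake_server() {
  [ $# -ge 1 ] || set -- ok_nabu
  local model
  local first=1
  # Rebuild "$@" as "--preload-model NAME" pairs; the loop list is expanded
  # once up front, so resetting the positional parameters inside is safe.
  for model in "$@"; do
    if [ "$first" -eq 1 ]; then set --; first=0; fi
    set -- "$@" --preload-model "$model"
  done
  wyoming-openwakeword --uri 'tcp://127.0.0.1:10400' "$@"
}
```

After `start_wake_server ok_nabu hey_jarvis`, either model can be selected with `agent-cli assistant --wake-word hey_jarvis` or `--wake-word ok_nabu`.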
Interaction Flow
┌─────────────────────────────────────────┐
│         Listening for wake word         │
│                "ok_nabu"                │
└───────────────────┬─────────────────────┘
                    │ Wake word detected
                    ▼
┌─────────────────────────────────────────┐
│            Recording speech             │
│          (speak your question)          │
└───────────────────┬─────────────────────┘
                    │ Wake word again
                    ▼
┌─────────────────────────────────────────┐
│   Transcribe → LLM → TTS (if enabled)   │
└───────────────────┬─────────────────────┘
                    │
                    ▼
            Back to listening
Tips
Speak clearly after the wake word is detected
Wait for the TTS response to finish before saying the wake word again
Use --tts for a more natural conversation experience