Getting Started

This guide walks you through installing Agent CLI and setting up your first voice-powered workflow.

Prerequisites

Before you begin, ensure you have:

  • uv (recommended) or Python 3.11+
  • A microphone for voice features
  • Speakers for text-to-speech features
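The first prerequisite can be checked from a terminal before installing. The following is a minimal sketch, not part of Agent CLI: the function name `check_prereqs` is hypothetical, and it covers only the uv/Python requirement, not the microphone or speakers.

```shell
#!/usr/bin/env sh
# Hypothetical helper: succeeds if uv or Python 3.11+ is available.
check_prereqs() {
    # uv is the recommended installer, so look for it first.
    if command -v uv >/dev/null 2>&1; then
        echo "uv found"
        return 0
    fi
    # Otherwise fall back to a Python interpreter that is new enough.
    if command -v python3 >/dev/null 2>&1 &&
       python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 11) else 1)'; then
        echo "python3 >= 3.11 found"
        return 0
    fi
    echo "neither uv nor Python 3.11+ found" >&2
    return 1
}

check_prereqs || echo "Install uv or Python 3.11+ before continuing." >&2
```

If the check fails, install one of the two before moving on to the installation steps.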

Installation

Option 1: CLI Tool Only

If you already have AI services set up, or plan to use cloud services (OpenAI/Gemini), install only the CLI:

# Using uv (recommended)
uv tool install agent-cli

# Using pip
pip install agent-cli

Option 2: Full Local Setup

For a complete local setup with all AI services:

# 1. Install agent-cli
uv tool install agent-cli

# 2. Install all required services
agent-cli install-services

# 3. Start all services
agent-cli start-services

# 4. (Optional) Set up system-wide hotkeys
agent-cli install-hotkeys

Alternatively, clone the repository and run the setup scripts:

# 1. Clone the repository
git clone https://github.com/basnijholt/agent-cli.git
cd agent-cli

# 2. Run setup
./scripts/setup-macos.sh  # or setup-linux.sh

# 3. Start services
./scripts/start-all-services.sh

# 4. (Optional) Set up hotkeys
./scripts/setup-macos-hotkeys.sh  # or setup-linux-hotkeys.sh

Verify Installation

agent-cli --version
agent-cli --help
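If the commands above fail with "command not found", the executable is probably not on your PATH. The sketch below wraps the check in a small helper; `verify_tool` is a hypothetical name, and the suggestion to run `uv tool update-shell` assumes you installed with uv.

```shell
#!/usr/bin/env sh
# Hypothetical helper: report whether a command is reachable on PATH.
verify_tool() {
    tool="${1:-agent-cli}"
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool found at $(command -v "$tool")"
        return 0
    fi
    echo "$tool is not on PATH" >&2
    return 1
}

if verify_tool agent-cli; then
    agent-cli --version
else
    # Assumption: a uv-based install whose tool directory is not yet on PATH.
    echo "If you installed with uv, try 'uv tool update-shell' and open a new shell." >&2
fi
```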

Test Your Setup

Test Autocorrect

agent-cli autocorrect "this has an eror"
# Output: this has an error

Test Transcription

# List available microphones
agent-cli transcribe --list-devices

# Start transcribing with the index of your microphone from the list above (press Ctrl+C to stop)
agent-cli transcribe --input-device-index 1
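Device indexes differ between machines, so it can help to validate the value before passing it on. This is a rough sketch using only the flags shown above; the helper names `is_device_index` and `transcribe_with_device` are hypothetical and not part of Agent CLI.

```shell
#!/usr/bin/env sh
# Hypothetical helper: true only for a non-empty string of digits.
is_device_index() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical wrapper: reject bad input before starting a recording.
transcribe_with_device() {
    if ! is_device_index "$1"; then
        echo "usage: transcribe_with_device <device-index>" >&2
        return 2
    fi
    agent-cli transcribe --input-device-index "$1"
}
```

After running `agent-cli transcribe --list-devices`, call `transcribe_with_device 1`, replacing `1` with the index of your microphone.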

Test Text-to-Speech

agent-cli speak "Hello, world!"

Platform-Specific Guides

For detailed installation instructions, see the platform-specific guides:

Platform   Guide            Notes
macOS      macOS Setup      Full Metal GPU acceleration
Linux      Linux Setup      NVIDIA GPU support
NixOS      NixOS Setup      Declarative configuration
Windows    Windows Setup    WSL2 recommended
Docker     Docker Setup     Cross-platform

First Workflow: Voice Transcription

Here's a typical workflow for using voice transcription:

  1. Copy some text you want to respond to (e.g., an email)
  2. Press your hotkey (Cmd+Shift+R on macOS) to start recording
  3. Speak your response naturally
  4. Press the hotkey again to stop recording
  5. Paste the transcribed text wherever you need it

What's Next?