Speaker profiles are stored voice embeddings created by `transcribe --diarize` or
`diarize-live-session` when you use `--remember-unknown-speakers` or
`--enroll-speakers`.

Use `speakers list` to see the stable profile IDs, `speakers rename` to give an
unknown profile a human name, `speakers merge` to fold duplicate profiles for
the same person together, and `speakers review` to listen to diarized snippets
and decide interactively whether each voice should be merged into an existing
profile or saved as a new named profile.
Examples

    # First diarize and remember unmatched voices
    agent-cli diarize-live-session --last-recording 1 --remember-unknown-speakers

    # Inspect the remembered profiles
    agent-cli speakers list

    # Name one remembered profile
    agent-cli speakers rename UNKNOWN_001 Alice

    # Merge a duplicate unknown profile into Alice
    agent-cli speakers merge UNKNOWN_002 Alice

    # Listen to snippets from the last saved recording and update profiles
    agent-cli speakers review --last-recording 1

    # Review a continuous transcribe-live session
    agent-cli speakers review --last-session 2

    # JSON output for scripts
    agent-cli speakers list --json

Notes
Pyannote labels such as SPEAKER_00 are local to one diarization run and may change between recordings.
Stored profile IDs such as UNKNOWN_001 are stable across runs.
Renaming a profile preserves its embeddings and changes the display name used by future diarization matches.
Merging moves embeddings from the source profile into the target profile and removes the source profile.
Review appends the current recording's speaker embedding to an existing profile when you choose merge.
speakers list --json shows profile metadata only; it does not print embedding vectors.
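The rename and merge behavior described in these notes can be sketched with a toy in-memory store. This is only an illustration: the `Profile` and `ProfileStore` classes, their fields, and the ID allocation are hypothetical and are not agent-cli's actual storage format or API. The sketch also assumes that renaming leaves the stable ID untouched.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical profile record, not agent-cli's real schema."""
    profile_id: str                      # stable ID such as UNKNOWN_001
    name: str                            # display name, changed by rename
    embeddings: list = field(default_factory=list)


class ProfileStore:
    """Toy in-memory store used only to illustrate the notes above."""

    def __init__(self):
        self.profiles = {}               # profile_id -> Profile
        self._next_unknown = 1

    def find(self, identifier):
        # Accept either the stable ID or the current display name.
        for profile in self.profiles.values():
            if identifier in (profile.profile_id, profile.name):
                return profile
        raise KeyError(f"no profile matches {identifier!r}")

    def remember_unknown(self, embedding):
        # Allocate the next UNKNOWN_NNN ID for an unmatched voice.
        profile_id = f"UNKNOWN_{self._next_unknown:03d}"
        self._next_unknown += 1
        profile = Profile(profile_id, profile_id, [list(embedding)])
        self.profiles[profile_id] = profile
        return profile

    def rename(self, identifier, new_name):
        # Only the display name changes; embeddings are preserved.
        profile = self.find(identifier)
        profile.name = new_name
        return profile

    def merge(self, source, target):
        # Move embeddings into the target, then remove the source profile.
        src = self.find(source)
        dst = self.find(target)
        if src is dst:
            raise ValueError("source and target are the same profile")
        dst.embeddings.extend(src.embeddings)
        del self.profiles[src.profile_id]
        return dst


store = ProfileStore()
store.remember_unknown([0.9, 0.1, 0.0])
store.remember_unknown([0.8, 0.2, 0.0])
store.rename("UNKNOWN_001", "Alice")
store.merge("UNKNOWN_002", "Alice")
print(store.find("Alice"))
```

After the merge, only one profile remains, and it carries both embeddings under the name Alice.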
Rename Arguments

| Argument | Description |
| --- | --- |
| `IDENTIFIER` | Existing profile ID or name, for example `UNKNOWN_001`. |
| `NAME` | New display name. Quote names with spaces, for example `"John Smith"`. |

Merge Arguments

| Argument | Description |
| --- | --- |
| `SOURCE` | Duplicate profile ID or name to remove, for example `UNKNOWN_002`. |
| `TARGET` | Profile ID or name to keep, for example `Alice` or `UNKNOWN_001`. |

Audio player command to use for snippets (default: afplay, ffplay, aplay, or paplay).
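One plausible reading of the default is that the first of these players found on `PATH` is used. The sketch below shows that kind of lookup with `shutil.which`. It is an assumption about the behavior, not agent-cli's actual code, and `pick_player` is a hypothetical helper.

```python
import shutil

# Candidate players in the order listed above.
DEFAULT_PLAYERS = ("afplay", "ffplay", "aplay", "paplay")


def pick_player(candidates=DEFAULT_PLAYERS, which=shutil.which):
    """Return the first candidate found on PATH, or None if none is installed.

    `which` is injectable so the lookup can be exercised without real binaries.
    """
    for name in candidates:
        if which(name) is not None:
            return name
    return None


print(pick_player())
```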
Diarization

| Option | Default | Description |
| --- | --- | --- |
| `--hf-token` | - | HuggingFace token for pyannote models. Required for diarization. Token must have 'Read access to contents of all public gated repos you can access' permission. Accept licenses at: https://hf.co/pyannote/speaker-diarization-3.1, https://hf.co/pyannote/segmentation-3.0, https://hf.co/pyannote/wespeaker-voxceleb-resnet34-LM |
| `--min-speakers` | - | Minimum number of speakers (optional hint for diarization). |
| `--max-speakers` | - | Maximum number of speakers (optional hint for diarization). |
| `--speaker-match-threshold` | 0.7 | Cosine-similarity threshold for matching diarized speakers to stored profiles. |
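
To make the effect of `--speaker-match-threshold` concrete, here is a minimal sketch of threshold-based matching on cosine similarity. It is not agent-cli's implementation: the `match_speaker` helper, the choice to score a profile by its best-matching embedding, and the inclusive comparison against the threshold are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_speaker(embedding, profiles, threshold=0.7):
    """Return the best-matching profile name, or None if nothing reaches the threshold.

    `profiles` maps a profile name to a list of stored embeddings (hypothetical layout).
    """
    best_name = None
    best_score = float("-inf")
    for name, stored in profiles.items():
        # Assumption: a profile's score is its best-matching stored embedding.
        score = max((cosine_similarity(embedding, e) for e in stored), default=float("-inf"))
        if score >= threshold and score > best_score:
            best_name, best_score = name, score
    return best_name


profiles = {
    "Alice": [[1.0, 0.0, 0.0]],
    "UNKNOWN_001": [[0.0, 1.0, 0.0]],
}
print(match_speaker([0.9, 0.1, 0.0], profiles))
print(match_speaker([0.0, 0.0, 1.0], profiles))
```

In this sketch, raising the threshold makes matching stricter, so more voices are left unmatched, while lowering it makes it more likely that a voice is attributed to an existing profile.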