What Are Speakers?

Vozo automatically assigns speaker tags (e.g., Speaker 1, Speaker 2) based on voice characteristics. These tags help distinguish each person’s lines for translation, dubbing, and voice cloning.

How Speakers Are Detected

Vozo analyzes voice features like tone, pitch, and timing to detect different speakers. While detection is automatic, you can manually correct tags if needed.
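To make the idea of grouping lines by voice features more concrete, here is a minimal sketch of one generic approach: greedy clustering of per-segment feature vectors. It is not Vozo's actual detection method, which is not documented here. The function name assign_speakers, the threshold value, and the two features used (mean pitch and speaking rate) are all assumptions chosen for illustration.

```python
import math

def assign_speakers(segments, threshold=30.0):
    """Greedy clustering sketch (hypothetical, not Vozo's algorithm).

    Each segment joins the nearest existing speaker centroid if it lies
    within `threshold`; otherwise it starts a new speaker.
    """
    centroids = []  # running-mean feature vector per speaker
    counts = []     # number of segments assigned to each speaker
    labels = []
    for features in segments:
        # Find the closest existing speaker centroid, if any.
        best, best_dist = None, float("inf")
        for i, centroid in enumerate(centroids):
            d = math.dist(features, centroid)
            if d < best_dist:
                best, best_dist = i, d
        if best is None or best_dist > threshold:
            # Too far from every known voice: start a new speaker.
            centroids.append(list(features))
            counts.append(1)
            best = len(centroids) - 1
        else:
            # Close enough: update that speaker's running mean.
            counts[best] += 1
            n = counts[best]
            centroids[best] = [
                c + (f - c) / n for c, f in zip(centroids[best], features)
            ]
        labels.append(f"Speaker {best + 1}")
    return labels

# Assumed per-segment features: (mean pitch in Hz, speaking rate in words/min)
segments = [(110.0, 150.0), (210.0, 170.0), (115.0, 148.0), (205.0, 175.0)]
print(assign_speakers(segments))
# ['Speaker 1', 'Speaker 2', 'Speaker 1', 'Speaker 2']
```

In a scheme like this, one person whose pitch shifts sharply between lines could exceed the threshold and be split into two tags, while two similar voices could fall under one. Those are the kinds of errors that manual correction is meant to fix.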

When to Manually Correct Speakers

Correct speaker tags manually in these cases:

  • A single speaker is mistakenly split into multiple tags.
  • Multiple speakers are grouped under the same tag.
  • A speaker’s voice varies dramatically in emotion (e.g., calm vs. angry). For lines with a distinct emotional tone, create a new speaker to help Vozo generate a more appropriate cloned voice.

How to Add or Edit Speakers

1. Click the Speaker Tag

For the segment you want to fix, click the current speaker tag.

2. Select or Create Speaker

Choose the correct speaker from the dropdown, or click New Speaker to add a new one.

3. Update Dubbing

Once all corrections are made, click Update Dubbing in the top right corner of the Speech section to apply the changes. A new cloned voice will be created for any newly added speakers.

Tips for Managing Speakers

  • Rename speakers for clarity (e.g., “Host”, “Narrator”, “Guest”).

  • Use the Filter tool at the top-left of the Speech section to view and edit one speaker’s segments at a time.

FAQ