Question 1

How much audio do I need to clone a voice?

Accepted Answer

The minimum is 30 seconds of clean, single-speaker audio. Providing 5–10 minutes noticeably improves prosody and dialect consistency.
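
The duration thresholds above can be checked locally before a sample is submitted. The following is a minimal sketch, assuming the sample is an uncompressed WAV file; the function names and the labels `too_short`, `minimum`, and `recommended` are hypothetical and not part of the product's API. It measures length only and does not check that the audio is clean or single-speaker.

```python
import wave

# Thresholds taken from the answer above.
MIN_SECONDS = 30.0           # stated minimum
RECOMMENDED_SECONDS = 300.0  # lower end of the 5-10 minute range


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def classify_sample(duration: float) -> str:
    """Label a duration against the thresholds (labels are illustrative)."""
    if duration < MIN_SECONDS:
        return "too_short"
    if duration < RECOMMENDED_SECONDS:
        return "minimum"
    return "recommended"
```

For example, `classify_sample(wav_duration_seconds("speaker.wav"))` would return `too_short` for a 20-second recording, where `speaker.wav` is a placeholder file name.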

Question 2

Can the cloned voice speak English as well?

Accepted Answer

Yes. Clones synthesize fluently in both Arabic and English, preserving the speaker's timbre across languages.

Question 3

What consent and safety controls are in place?

Accepted Answer

Every clone requires a signed consent record. Watermarking is on by default and can be verified via /v1/voice-changer.

Question 4

Is voice cloning available on-premises?

Accepted Answer

Yes. The full cloning pipeline ships with the on-prem stack, including the consent store and watermark verifier.

Clone any Arabic voice from 30 seconds of audio

From sample to clone in minutes