Generate single-speaker or multi-speaker audio. For single-speaker monologues, the system automatically switches to a specialized node that splits the text into chunks.
Upload a short, clear audio clip (3-30 seconds) for each speaker you want to clone.
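As a rough illustration of the 3-30 second guideline, a clip's length can be checked before uploading. This is only a sketch: it assumes the clip is an uncompressed WAV file, and the helper names `clip_duration_seconds` and `is_valid_reference_clip` are hypothetical, not part of this tool.

```python
import wave


def clip_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_valid_reference_clip(
    path: str, min_seconds: float = 3.0, max_seconds: float = 30.0
) -> bool:
    """Check that the clip length falls inside the recommended range.

    The 3-30 second bounds come from the guidance above; this does not
    judge audio clarity, only duration.
    """
    return min_seconds <= clip_duration_seconds(path) <= max_seconds
```

Other formats (MP3, FLAC, and so on) would need a different reader than the standard-library `wave` module.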
More steps generally improve quality but make generation slower.
Guidance scale. Higher values typically make the output follow the input more closely.
Enable for more varied, less deterministic output.
Only used when sampling is enabled.
For long single-speaker text. Splits the text into smaller chunks to avoid errors during generation.
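The kind of splitting described above can be sketched as follows. This is a minimal illustration, not the node's actual chunking logic: the function name `chunk_text`, the `max_chars` limit of 250, and the sentence-boundary rule are all assumptions.

```python
import re


def chunk_text(text: str, max_chars: int = 250) -> list[str]:
    """Group sentences into chunks of roughly max_chars characters.

    Sentences are never cut in the middle, so a single sentence longer
    than max_chars becomes its own oversized chunk.
    """
    # Split after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            # Still fits, or this is the first sentence of a new chunk.
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be generated separately and the resulting audio joined in order.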