Parameter Description

TTS Parameters

text: The input text for inference. URL-encoding the text is recommended.
character: Name of the character's folder. It is case-sensitive and should match the character's name in the system.
emotion: The character's emotion. It must be an emotion the character actually supports; otherwise, the default emotion is used.
text_language: Language of the input text (Chinese, English, Japanese, mixed Chinese-English, mixed Japanese-English, or multi-language mix). The default is multi-language mix.
- Language options:
  - "auto" - Automatically recognize and segment multi-language text (default).
  - "en" - Recognize entirely as English.
  - "all_ja" - Recognize entirely as Japanese.
  - "zh" - Recognize as mixed Chinese-English.
  - "ja" - Recognize as mixed Japanese-English.
  - "all_zh" - Recognize entirely as Chinese.
batch_size: Number of batches processed in parallel. Default is 1. Increasing this value (for example, to 10) can significantly speed up inference but consumes more GPU resources. Note: this may incur additional costs.
speed: Speech speed. Default is 1.0.
save_temp: Whether to save the generated audio as a temporary file. When set to true, subsequent identical requests return the saved audio immediately without waiting for inference. Default is false.
stream: Whether to stream the audio. When set to true, the audio is returned sentence by sentence. Default is false.
format: Output audio format. Supported values are wav, mp3, and flac. Default is wav.
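
The parameters above could be assembled into a URL-encoded query string roughly as follows. This is a minimal sketch, not the service's official client: the function name `build_tts_query`, the keyword `audio_format` (sent as `format`), the character name "Alice", and the choice to serialize booleans as lowercase `true`/`false` are all assumptions made for illustration.

```python
from urllib.parse import urlencode

# Allowed values copied from the parameter list above.
TEXT_LANGUAGES = {"auto", "en", "all_ja", "zh", "ja", "all_zh"}
AUDIO_FORMATS = {"wav", "mp3", "flac"}


def build_tts_query(text, character, emotion=None, text_language="auto",
                    batch_size=1, speed=1.0, save_temp=False, stream=False,
                    audio_format="wav"):
    """Return a URL-encoded query string for the TTS parameters.

    Hypothetical helper: defaults mirror the documented defaults.
    """
    if text_language not in TEXT_LANGUAGES:
        raise ValueError(f"unsupported text_language: {text_language!r}")
    if audio_format not in AUDIO_FORMATS:
        raise ValueError(f"unsupported format: {audio_format!r}")

    params = {
        "text": text,
        "character": character,  # case-sensitive
        "text_language": text_language,
        "batch_size": batch_size,
        "speed": speed,
        # Assumption: booleans are sent as lowercase "true"/"false".
        "save_temp": str(save_temp).lower(),
        "stream": str(stream).lower(),
        "format": audio_format,
    }
    # Omit emotion when unset so the character's default emotion applies.
    if emotion is not None:
        params["emotion"] = emotion

    # urlencode percent-encodes each value, covering the URL-encoding
    # recommended for `text`.
    return urlencode(params)


query = build_tts_query("Hello, world!", "Alice", audio_format="mp3")
print(query)
```

Validating `text_language` and `format` on the client side catches typos before a request is sent; whether the server rejects or silently ignores unknown values is not stated in this document.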

Account Parameters

uid: User ID.
session: Session ID obtained after logging in.
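
As an illustration, the account parameters might be sent alongside the TTS parameters in the same query string. This is a sketch under stated assumptions: the document does not say how `uid` and `session` are transmitted or what the endpoint URL is, so `BASE_URL`, `build_request_url`, and the sample values below are hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the document does not give the real URL.
BASE_URL = "https://example.invalid/tts"


def build_request_url(tts_params, uid, session, base_url=BASE_URL):
    """Append uid and session to the TTS parameters and build a full URL.

    Assumption: account parameters travel in the query string.
    """
    params = dict(tts_params)  # copy so the caller's dict is not modified
    params["uid"] = uid
    params["session"] = session  # session ID obtained after logging in
    return f"{base_url}?{urlencode(params)}"


url = build_request_url({"text": "Hi there", "character": "Alice"},
                        uid="u123", session="s456")
print(url)
```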