mstar.model.orpheus.config

Classes

OrpheusModelConfig(num_hidden_layers, ...)

class mstar.model.orpheus.config.OrpheusModelConfig(num_hidden_layers: int = 28, num_attention_heads: int = 24, num_key_value_heads: int = 8, hidden_size: int = 3072, head_dim: int = 128, intermediate_size: int = 8192, max_position_embeddings: int = 131072, rms_norm_eps: float = 1e-05, rope_theta: float = 500000.0, vocab_size: int = 156940, rope_scaling: dict = <factory>, start_token_id: int = 128259, end_token_ids: list[int] = <factory>, stop_token_id: int = 128258, pad_token_id: int = 128263, custom_token_base_id: int = 128256, snac_model_id: str = 'hubertsiuzdak/snac_24khz', tokens_per_frame: int = 7, sample_rate: int = 24000, snac_window_tokens: int = 28, snac_stride_tokens: int = 7, snac_audio_slice_start: int = 2048, snac_audio_slice_end: int = 4096, temperature: float = 0.6, top_p: float = 0.8, repetition_penalty: float = 1.3, ignore_eos: bool = False, max_new_tokens: int = 4096, available_voices: list[str] = <factory>)

Bases: object

Parameters:
  • num_hidden_layers (int)

  • num_attention_heads (int)

  • num_key_value_heads (int)

  • hidden_size (int)

  • head_dim (int)

  • intermediate_size (int)

  • max_position_embeddings (int)

  • rms_norm_eps (float)

  • rope_theta (float)

  • vocab_size (int)

  • rope_scaling (dict)

  • start_token_id (int)

  • end_token_ids (list[int])

  • stop_token_id (int)

  • pad_token_id (int)

  • custom_token_base_id (int)

  • snac_model_id (str)

  • tokens_per_frame (int)

  • sample_rate (int)

  • snac_window_tokens (int)

  • snac_stride_tokens (int)

  • snac_audio_slice_start (int)

  • snac_audio_slice_end (int)

  • temperature (float)

  • top_p (float)

  • repetition_penalty (float)

  • ignore_eos (bool)

  • max_new_tokens (int)

  • available_voices (list[str])
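The `<factory>` markers in the signature suggest that the class is a dataclass whose mutable fields use `default_factory`. The sketch below is a minimal illustration under that assumption. `OrpheusConfigSketch` and its two helper methods are hypothetical stand-ins, not part of `mstar`, and the empty-list factories are placeholders because the real factory values are not shown on this page. It copies a few documented defaults and checks how the attention fields relate to each other.

```python
from dataclasses import dataclass, field


@dataclass
class OrpheusConfigSketch:
    """Hypothetical stand-in for OrpheusModelConfig (subset of fields only)."""

    # Defaults copied from the documented signature.
    num_attention_heads: int = 24
    num_key_value_heads: int = 8
    hidden_size: int = 3072
    head_dim: int = 128
    temperature: float = 0.6
    # Placeholder factories: the page shows only <factory>, so the real
    # default contents are unknown and empty lists are used here.
    end_token_ids: list[int] = field(default_factory=list)
    available_voices: list[str] = field(default_factory=list)

    def queries_per_kv_head(self) -> int:
        """Number of query heads that share one key/value head."""
        if self.num_attention_heads % self.num_key_value_heads != 0:
            raise ValueError("query heads must be a multiple of key/value heads")
        return self.num_attention_heads // self.num_key_value_heads

    def attention_width(self) -> int:
        """Total width of the concatenated query heads."""
        return self.num_attention_heads * self.head_dim


cfg = OrpheusConfigSketch()
print(cfg.queries_per_kv_head())                 # 3
print(cfg.attention_width() == cfg.hidden_size)  # True, since 24 * 128 == 3072

# Keyword overrides leave every other default untouched.
warm = OrpheusConfigSketch(temperature=0.9)
print(warm.temperature, warm.head_dim)           # 0.9 128
```

With the documented defaults, `num_attention_heads * head_dim` equals `hidden_size`, and each key/value head would serve three query heads. Whether the real class validates these relationships is not stated here.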

available_voices: list[str]
custom_token_base_id: int = 128256
end_token_ids: list[int]
head_dim: int = 128
hidden_size: int = 3072
ignore_eos: bool = False
intermediate_size: int = 8192
max_new_tokens: int = 4096
max_position_embeddings: int = 131072
num_attention_heads: int = 24
num_hidden_layers: int = 28
num_key_value_heads: int = 8
pad_token_id: int = 128263
repetition_penalty: float = 1.3
rms_norm_eps: float = 1e-05
rope_scaling: dict
rope_theta: float = 500000.0
sample_rate: int = 24000
snac_audio_slice_end: int = 4096
snac_audio_slice_start: int = 2048
snac_model_id: str = 'hubertsiuzdak/snac_24khz'
snac_stride_tokens: int = 7
snac_window_tokens: int = 28
start_token_id: int = 128259
stop_token_id: int = 128258
temperature: float = 0.6
tokens_per_frame: int = 7
top_p: float = 0.8
vocab_size: int = 156940
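The SNAC-related defaults can be read together as a sliding-window decoding scheme. This page does not describe that behavior, so the following is only a sketch based on the field names. It assumes that audio tokens are decoded in windows of `snac_window_tokens`, that the window advances by `snac_stride_tokens`, and that only the samples between `snac_audio_slice_start` and `snac_audio_slice_end` of each decoded window are kept. The function names are hypothetical and no SNAC model is loaded.

```python
from typing import Iterator

# Defaults copied from the documented attributes.
TOKENS_PER_FRAME = 7
SNAC_WINDOW_TOKENS = 28
SNAC_STRIDE_TOKENS = 7
SNAC_AUDIO_SLICE_START = 2048
SNAC_AUDIO_SLICE_END = 4096
SAMPLE_RATE = 24000


def sliding_windows(
    tokens: list[int],
    window: int = SNAC_WINDOW_TOKENS,
    stride: int = SNAC_STRIDE_TOKENS,
) -> Iterator[list[int]]:
    """Yield the trailing `window` tokens each time `stride` new tokens arrive."""
    for end in range(window, len(tokens) + 1, stride):
        yield tokens[end - window:end]


def samples_kept_per_window() -> int:
    """Samples retained from each decoded window (assumed slicing behavior)."""
    return SNAC_AUDIO_SLICE_END - SNAC_AUDIO_SLICE_START


frames_per_window = SNAC_WINDOW_TOKENS // TOKENS_PER_FRAME
print(frames_per_window)                      # 4 frames per window

dummy_tokens = list(range(56))                # 8 frames of placeholder tokens
print(len(list(sliding_windows(dummy_tokens))))  # 5 windows

kept = samples_kept_per_window()
print(kept, round(kept / SAMPLE_RATE, 4))     # 2048 0.0853
```

Under these assumptions, each one-frame stride contributes 2048 samples, or roughly 85 ms of audio at 24 kHz, while the three extra frames in the window act as context for the decoder.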