vllm.transformers_utils.configs.mlp_speculator ¶
   MLPSpeculatorConfig ¶
  Bases: PretrainedConfig
Source code in vllm/transformers_utils/configs/mlp_speculator.py
   __init__ ¶
 __init__(
    vocab_size: int = 32000,
    emb_dim: int = 4096,
    inner_dim: int = 0,
    n_predict: int = 3,
    top_k_tokens_per_head: list[int] | None = None,
    n_candidates: int = 5,
    tie_weights: bool = False,
    scale_input: bool = False,
    **kwargs,
)
Initialize an MLPSpeculatorConfig
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| vocab_size | int | The model vocab size. | 32000 |
| emb_dim | int | The model embedding dimension. | 4096 |
| inner_dim | int | The inner dimension of the model. If 0, will be set to emb_dim. | 0 |
| n_predict | int | The number of lookaheads for the speculator. | 3 |
| top_k_tokens_per_head | list[int] \| None | Number of tokens to consider from each head when forming the candidate tree. For each candidate branch in the tree, head n produces topk[n] additional sub-branches. NOTE: This parameter is currently unused. | None |
| n_candidates | int | Number of child candidates to create per sequence. | 5 |
| tie_weights | bool | If True, use a single set of weights for every model head/stage after the first. The initial projection from the base model may have a different size, so that stays separate. | False |
| scale_input | bool | If True, scale the initial hidden states from the base model. | False |
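As a quick illustration, here is a minimal sketch of constructing the config for a hypothetical three-head speculator; the parameter values are illustrative and not taken from any released checkpoint.

```python
from vllm.transformers_utils.configs.mlp_speculator import MLPSpeculatorConfig

# Illustrative values for a hypothetical draft model; adjust them to match
# the speculator checkpoint actually being loaded.
config = MLPSpeculatorConfig(
    vocab_size=32000,   # must match the base model's vocabulary size
    emb_dim=4096,       # base model hidden/embedding dimension
    inner_dim=3072,     # speculator MLP width; 0 would fall back to emb_dim
    n_predict=3,        # number of lookahead tokens (speculator heads)
    n_candidates=5,     # candidate sequences generated per step
    tie_weights=True,   # reuse one set of weights for heads after the first
    scale_input=True,   # scale the base model's initial hidden states
)

# Since MLPSpeculatorConfig subclasses PretrainedConfig, it can be
# serialized like any other Hugging Face config.
print(config.to_json_string())
```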