vllm.model_executor.layers.logits_processor ¶
 A layer that computes logits from hidden_states.
  LogitsProcessor ¶
  Bases: CustomOp
Process logits and apply logits processors from sampling metadata.
This layer does the following:
1. Gather logits from model hidden_states.
2. Scale logits if needed.
3. Apply logits processors (if any).
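The three steps above can be sketched in plain Python. This is an illustrative toy, not the vLLM implementation: `process_logits`, `ban_token_0`, and the list-based "tensors" are hypothetical names and stand-ins chosen for this example, and the projection is assumed to be a simple matrix-vector product against the LM head weights.

```python
# Toy sketch of the three steps (hypothetical helper, not vLLM code).
def process_logits(hidden, lm_head_weight, scale=1.0, processors=()):
    # Step 1: project the hidden state onto the vocabulary
    # (one dot product per vocabulary row of the LM head weight).
    logits = [sum(h * w for h, w in zip(hidden, row)) for row in lm_head_weight]
    # Step 2: scale only when a non-default factor is configured.
    if scale != 1.0:
        logits = [x * scale for x in logits]
    # Step 3: apply each logits processor in order, if any were supplied.
    for proc in processors:
        logits = proc(logits)
    return logits


def ban_token_0(logits):
    # Hypothetical processor: make token 0 impossible to sample.
    return [float("-inf")] + logits[1:]


hidden = [1.0, 2.0]                              # hidden size 2
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]    # vocab size 3
result = process_logits(hidden, weight, scale=0.5, processors=[ban_token_0])
```

Here the raw projection is scaled by 0.5 before the processor masks the first token, which mirrors the order of the steps listed above.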
Source code in vllm/model_executor/layers/logits_processor.py
   __init__ ¶
 __init__(
    vocab_size: int,
    org_vocab_size: int | None = None,
    scale: float = 1.0,
    logits_as_input: bool = False,
    soft_cap: float | None = None,
) -> None
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `scale` | `float` | A scaling factor to apply to the logits. | `1.0` |
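As a rough illustration of how `scale` and `soft_cap` might interact, the sketch below assumes that soft-capping uses the common `soft_cap * tanh(logits / soft_cap)` form and that it is applied before scaling. Neither the formula nor the ordering is stated on this page, so treat both as assumptions; `scale_and_cap` is a hypothetical helper.

```python
import math


def scale_and_cap(logits, scale=1.0, soft_cap=None):
    # Assumed soft-cap form: squashes values smoothly into (-soft_cap, soft_cap).
    if soft_cap is not None:
        logits = [soft_cap * math.tanh(x / soft_cap) for x in logits]
    # Multiply by the scaling factor; the default of 1.0 is a no-op.
    if scale != 1.0:
        logits = [x * scale for x in logits]
    return logits


capped = scale_and_cap([0.0, 100.0, -100.0], scale=2.0, soft_cap=30.0)
unchanged = scale_and_cap([1.0, 2.0])  # defaults leave the logits as they are
```

Under these assumptions, large logits are bounded by `scale * soft_cap` in magnitude, while small logits pass through almost linearly.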
   _gather_logits ¶
  Gather or all-gather the logits tensor across the model-parallel group.
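The difference between the two collectives can be simulated without any distributed runtime. In this sketch each rank holds one vocabulary shard; it assumes that a plain gather delivers the full tensor only to a destination rank (other ranks get `None`, which would be consistent with the `Tensor | None` return types on this page), and that padded vocabulary entries are trimmed to `org_vocab_size`. Both behaviours, and the helper `gather_logits` itself, are assumptions made for illustration.

```python
def gather_logits(shards, rank, dst=0, use_all_gather=False, org_vocab_size=None):
    # With a plain gather, only the destination rank receives the result.
    if not use_all_gather and rank != dst:
        return None
    # Concatenate the per-rank shards along the vocabulary dimension.
    full = [x for shard in shards for x in shard]
    # Drop trailing padding entries, if the original vocab size is known.
    if org_vocab_size is not None:
        full = full[:org_vocab_size]
    return full


shards = [[0.1, 0.2], [0.3, 0.0]]  # 2 ranks, last entry is padding
on_dst = gather_logits(shards, rank=0, org_vocab_size=3)
on_other = gather_logits(shards, rank=1, org_vocab_size=3)
everywhere = gather_logits(shards, rank=1, use_all_gather=True, org_vocab_size=3)
```

With all-gather every rank ends up with the same full logits, at the cost of more communication; with gather only one rank does.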
   _get_logits ¶
 _get_logits(
    hidden_states: Tensor,
    lm_head: VocabParallelEmbedding,
    embedding_bias: Tensor | None,
) -> Tensor | None
   forward ¶
 forward(
    lm_head: VocabParallelEmbedding,
    hidden_states: Tensor,
    embedding_bias: Tensor | None = None,
) -> Tensor | None