vllm.attention.selector ¶
   _cached_get_attn_backend  cached  ¶
 _cached_get_attn_backend(
    head_size: int,
    dtype: dtype,
    kv_cache_dtype: str | None,
    block_size: int,
    use_v1: bool = False,
    use_mla: bool = False,
    has_sink: bool = False,
    use_sparse: bool = False,
) -> type[AttentionBackend]
Source code in vllm/attention/selector.py
   get_attn_backend ¶
 get_attn_backend(
    head_size: int,
    dtype: dtype,
    kv_cache_dtype: str | None,
    block_size: int,
    use_mla: bool = False,
    has_sink: bool = False,
    use_sparse: bool = False,
) -> type[AttentionBackend]
Selects which attention backend to use and lazily imports it.
Source code in vllm/attention/selector.py
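The "cached" and "lazily imports" behaviour described above can be illustrated with a small sketch. This is not vLLM's code: the mapping `_BACKEND_PATHS`, the helper `resolve_backend_cls`, and the functions `cached_get_backend` and `get_backend` are hypothetical names, and standard-library classes stand in for real backends so the example runs without vLLM installed. That a thin public function delegates to a cached inner function is an inference from the two signatures, not something the page states.

```python
import importlib
from functools import cache

# Hypothetical name -> "module:Class" table. Standard-library classes are
# used as stand-ins for attention backends so the sketch is self-contained.
_BACKEND_PATHS = {
    "ORDERED": "collections:OrderedDict",
    "COUNTER": "collections:Counter",
}


def resolve_backend_cls(qualname: str) -> type:
    """Import the module only when the backend is actually requested."""
    module_name, _, class_name = qualname.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


@cache
def cached_get_backend(name: str, head_size: int, block_size: int) -> type:
    # All arguments are hashable, so functools.cache can key on them;
    # repeated calls with the same arguments skip the import and lookup.
    return resolve_backend_cls(_BACKEND_PATHS[name])


def get_backend(name: str, head_size: int, block_size: int) -> type:
    # Thin public wrapper that delegates to the cached function.
    return cached_get_backend(name, head_size, block_size)
```

Because the cache key is the full argument tuple, changing any argument (for example `block_size`) produces a separate cache entry.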
   get_env_variable_attn_backend ¶
 get_env_variable_attn_backend() -> _Backend | None
Get the backend override specified by the vLLM attention backend environment variable, if one is specified.
Returns:
- _Backend enum value if an override is specified
- None otherwise
 
Source code in vllm/attention/selector.py
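A minimal sketch of this kind of lookup follows. It assumes the environment variable is named `VLLM_ATTENTION_BACKEND` (the page does not spell the name out), and it uses a hypothetical `Backend` enum with two illustrative members in place of `_Backend`. The function `get_env_backend` is likewise a stand-in, not the library function.

```python
import enum
import os
from typing import Mapping, Optional


class Backend(enum.Enum):
    """Hypothetical stand-in for _Backend; members are illustrative only."""
    FLASH_ATTN = enum.auto()
    XFORMERS = enum.auto()


# Assumed variable name; not stated on this page.
ENV_NAME = "VLLM_ATTENTION_BACKEND"


def get_env_backend(environ: Mapping[str, str] = os.environ) -> Optional[Backend]:
    """Return the enum member named by the variable, or None if it is unset."""
    name = environ.get(ENV_NAME)
    if name is None:
        return None
    # Lookup by member name; an unknown name raises KeyError.
    return Backend[name]
```

Passing the environment as a parameter keeps the sketch easy to exercise without mutating the real process environment.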
   get_global_forced_attn_backend ¶
 get_global_forced_attn_backend() -> _Backend | None
Get the currently-forced choice of attention backend, or None if auto-selection is currently enabled.
  global_force_attn_backend ¶
 global_force_attn_backend(
    attn_backend: _Backend | None,
) -> None
Force all attention operations to use a specified backend.
Passing None for the argument re-enables automatic backend selection.
Arguments:
- attn_backend: backend selection (None to revert to auto)
 
Source code in vllm/attention/selector.py
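The pairing of `global_force_attn_backend` and `get_global_forced_attn_backend` can be pictured as a setter and getter around one module-level variable. The sketch below shows that pattern only; `_forced_backend`, `force_backend`, and `get_forced_backend` are hypothetical names, and a plain string stands in for a `_Backend` value.

```python
from typing import Optional

# Module-level override; None means "auto-select a backend".
_forced_backend: Optional[str] = None


def force_backend(backend: Optional[str]) -> None:
    """Set the global override, or clear it by passing None."""
    global _forced_backend
    _forced_backend = backend


def get_forced_backend() -> Optional[str]:
    """Return the current override, or None when auto-selection is active."""
    return _forced_backend
```

In this sketch the override stays in effect until it is explicitly cleared with `force_backend(None)`.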
   global_force_attn_backend_context_manager ¶
  Globally force a vLLM attention backend override within a context manager, restoring the previous override on exit.
Arguments:
- attn_backend: attention backend to force
 
Returns:
- Generator
 
Source code in vllm/attention/selector.py
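The save-and-restore behaviour described above can be sketched with `contextlib.contextmanager`. The names `_forced`, `current_forced`, and `forced_backend` are hypothetical, and a string stands in for a `_Backend` value. The `try`/`finally` guard is a design choice of this sketch; the page does not say how the real implementation handles exceptions.

```python
from contextlib import contextmanager
from typing import Iterator, Optional

# Module-level override; None means "auto-select a backend".
_forced: Optional[str] = None


def current_forced() -> Optional[str]:
    """Return the override currently in effect."""
    return _forced


@contextmanager
def forced_backend(backend: Optional[str]) -> Iterator[None]:
    """Apply an override inside the with block, then restore the prior one."""
    global _forced
    previous = _forced
    _forced = backend
    try:
        yield
    finally:
        # Restore the previous value, not None, so nested uses unwind correctly.
        _forced = previous
```

Restoring the saved value rather than resetting to None is what lets nested `with` blocks unwind in order, each one returning to the override that was active when it was entered.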
   is_attn_backend_supported ¶
 is_attn_backend_supported(
    attn_backend: str | type[AttentionBackend],
    head_size: int,
    dtype: dtype,
    *,
    allow_import_error: bool = True,
) -> _IsSupported