vllm.v1.attention.backends
 Modules:
| Name | Description |
|---|---|
| cpu_attn | Attention backend for CPU. |
| flash_attn | Attention layer with FlashAttention. |
| flashinfer | Attention layer with FlashInfer. |
| flex_attention | Attention layer with FlexAttention. |
| gdn_attn | Backend for GatedDeltaNet attention. |
| linear_attn | Backend for linear attention. |
| mamba1_attn | Backend for Mamba1 attention. |
| mamba2_attn | Backend for Mamba2 attention. |
| mamba_attn | Shared code for the Mamba attention backends. |
| mla | Multi-head Latent Attention (MLA) backends. |
| pallas | Attention backend for TPUs using Pallas. |
| rocm_aiter_fa | Attention layer with AiterFlashAttention. |
| rocm_aiter_unified_attn | Attention layer with PagedAttention and Triton prefix prefill. |
| rocm_attn | Attention layer with PagedAttention and Triton prefix prefill. |
| short_conv_attn | Backend for short convolution attention. |
| tree_attn | Attention layer with TreeAttention. |
| triton_attn | High-performance Triton-only attention layer. |
| utils | Shared utilities for attention backends. |
| xformers | Attention layer with XFormersAttention. |
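
At runtime, vLLM normally picks one of these backends automatically based on the platform and model, but a backend can also be pinned explicitly. Below is a minimal sketch using the `VLLM_ATTENTION_BACKEND` environment variable (values such as `FLASH_ATTN` or `FLASHINFER` correspond to the modules above); the model name is only a placeholder:

```python
import os

# Pin the attention backend before the engine initializes;
# "FLASH_ATTN" selects the flash_attn module listed above.
os.environ["VLLM_ATTENTION_BACKEND"] = "FLASH_ATTN"

from vllm import LLM

# Placeholder model; any supported checkpoint works the same way.
llm = LLM(model="facebook/opt-125m")
outputs = llm.generate("Hello, my name is")
print(outputs[0].outputs[0].text)
```

If the requested backend is not supported on the current hardware or model configuration, engine initialization fails rather than silently falling back, so the variable is best set only when a specific backend is known to apply.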