vllm.v1.attention.backends
 Modules:
| Name | Description |
|---|---|
| cpu_attn | Attention backend for CPU. |
| flash_attn | Attention layer with FlashAttention. |
| flashinfer | Attention layer with FlashInfer. |
| flex_attention | Attention layer with FlexAttention. |
| gdn_attn | Backend for GatedDeltaNet attention. |
| linear_attn | Backend for linear attention. |
| mamba1_attn | Backend for Mamba1 attention. |
| mamba2_attn | Backend for Mamba2 attention. |
| mamba_attn | Shared code for the Mamba attention backends. |
| mla | Multi-head Latent Attention (MLA) backends. |
| pallas | Attention backend for TPUs using Pallas. |
| rocm_aiter_fa | Attention layer with AiterFlashAttention. |
| rocm_aiter_unified_attn | Attention layer with PagedAttention and Triton prefix prefill. |
| rocm_attn | Attention layer with PagedAttention and Triton prefix prefill. |
| short_conv_attn | Backend for short convolution attention. |
| tree_attn | Attention layer with TreeAttention. |
| triton_attn | High-performance Triton-only attention layer. |
| utils | Shared utilities for attention backends. |
| xformers | Attention layer with XFormersAttention. |
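
At runtime, vLLM normally picks one of these backends automatically based on the platform and model, but a backend can also be pinned explicitly. Below is a minimal sketch using the `VLLM_ATTENTION_BACKEND` environment variable (values such as `FLASH_ATTN` or `FLASHINFER` correspond to the modules above); the model name is only a placeholder:

```python
import os

# Pin the attention backend before the engine initializes;
# "FLASH_ATTN" selects the flash_attn module listed above.
os.environ["VLLM_ATTENTION_BACKEND"] = "FLASH_ATTN"

from vllm import LLM

# Placeholder model; any supported checkpoint works the same way.
llm = LLM(model="facebook/opt-125m")
outputs = llm.generate("Hello, my name is")
print(outputs[0].outputs[0].text)
```

If the requested backend is not supported on the current hardware or model configuration, engine initialization fails rather than silently falling back, so the variable is best set only when a specific backend is known to apply.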