vllm.attention.ops
 Modules:
| Name | Description |
|---|---|
| chunked_prefill_paged_decode | |
| common | |
| flashmla | |
| merge_attn_states | |
| paged_attn | |
| pallas_kv_cache_update | |
| prefix_prefill | |
| rocm_aiter_mla | |
| rocm_aiter_paged_attn | |
| triton_decode_attention | Memory-efficient attention for decoding. |
| triton_flash_attention | Fused attention. |
| triton_merge_attn_states | |
| triton_reshape_and_cache_flash | |
| triton_unified_attention | |
| vit_attn_wrappers | Ops for ViT attention, compatible with torch.compile. |
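
Several of these modules (for example merge_attn_states and triton_merge_attn_states) combine partial attention outputs that were computed over separate key/value chunks, as happens in chunked prefill or split-KV decoding. The sketch below illustrates the underlying log-sum-exp reweighting in plain PyTorch; the function name, argument names, and tensor layout are illustrative assumptions and do not reflect vLLM's actual signatures.

```python
# Hypothetical sketch of merging two partial attention results computed over
# disjoint key/value chunks. This shows the math only; it is not vLLM's API.
import torch


def merge_partial_attention(o1: torch.Tensor, lse1: torch.Tensor,
                            o2: torch.Tensor, lse2: torch.Tensor):
    """Combine partial outputs with their log-sum-exp normalizers.

    o1, o2:   [num_tokens, num_heads, head_dim] partial attention outputs.
    lse1, lse2: [num_tokens, num_heads] log-sum-exp of the attention logits
                over the corresponding key/value chunk.
    """
    # The combined normalizer is the log-sum-exp over both chunks,
    # computed stably by factoring out the per-element maximum.
    max_lse = torch.maximum(lse1, lse2)
    w1 = torch.exp(lse1 - max_lse)
    w2 = torch.exp(lse2 - max_lse)
    denom = w1 + w2
    merged_lse = max_lse + torch.log(denom)
    # Re-weight each partial output by its share of the total softmax mass.
    merged_o = (o1 * (w1 / denom).unsqueeze(-1) +
                o2 * (w2 / denom).unsqueeze(-1))
    return merged_o, merged_lse


if __name__ == "__main__":
    num_tokens, num_heads, head_dim = 4, 8, 64
    o1 = torch.randn(num_tokens, num_heads, head_dim)
    o2 = torch.randn(num_tokens, num_heads, head_dim)
    lse1 = torch.randn(num_tokens, num_heads)
    lse2 = torch.randn(num_tokens, num_heads)
    o, lse = merge_partial_attention(o1, lse1, o2, lse2)
    print(o.shape, lse.shape)  # torch.Size([4, 8, 64]) torch.Size([4, 8])
```

The Triton variants listed above fuse this reweighting into a single kernel; the sketch is only meant to convey what is being merged and why each partial result must carry its log-sum-exp alongside the output.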