vllm.v1.kv_offload.backend ¶
   Backend ¶
  Bases: ABC
An abstract class for allocating and returning specs for writing KV blocks to some backend.
Source code in vllm/v1/kv_offload/backend.py
   __init__ ¶
     allocate_blocks  abstractmethod  ¶
 allocate_blocks(
    block_hashes: list[BlockHash],
) -> list[BlockStatus]
Allocate space for writing blocks. This method assumes that enough space is available and does not check for it. Calling it without first checking get_num_free_blocks is unsafe.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
 block_hashes  |   list[BlockHash]  |    The hashes identifying the blocks to be written.  |  required | 
Returns:
| Type | Description | 
|---|---|
 list[BlockStatus]  |    A list of BlockStatus for the allocated blocks. The ref_cnt of each returned item will be -1, meaning the block is not yet ready to be read.  |  
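The check-then-allocate pattern described above can be sketched as follows. This is a minimal illustration, not vLLM code: `ToyBackend`, `ToyBlockStatus`, and `try_allocate` are hypothetical names, `bytes` stands in for `BlockHash`, and `get_num_free_blocks` is assumed to take no arguments and return an integer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyBlockStatus:
    """Hypothetical stand-in for BlockStatus; only ref_cnt is modeled."""

    ref_cnt: int = -1  # -1 means "not yet ready to be read"


class ToyBackend:
    """Hypothetical fixed-capacity backend mirroring the documented methods."""

    def __init__(self, num_blocks: int) -> None:
        self._free = num_blocks

    def get_num_free_blocks(self) -> int:
        # Assumed signature: no arguments, returns a count.
        return self._free

    def allocate_blocks(self, block_hashes: list[bytes]) -> list[ToyBlockStatus]:
        # Mirrors the documented contract: no capacity check happens here.
        self._free -= len(block_hashes)
        return [ToyBlockStatus() for _ in block_hashes]


def try_allocate(
    backend: ToyBackend, block_hashes: list[bytes]
) -> Optional[list[ToyBlockStatus]]:
    """Allocate only when the backend reports enough free blocks."""
    if backend.get_num_free_blocks() < len(block_hashes):
        return None  # the caller has to make room or skip the write
    return backend.allocate_blocks(block_hashes)


backend = ToyBackend(num_blocks=2)
blocks = try_allocate(backend, [b"h0", b"h1"])  # fits: two free blocks
rejected = try_allocate(backend, [b"h2"])       # no space left
```

Keeping the capacity check in the caller matches the documented split of responsibilities: the allocation method itself stays unconditional.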
   free  abstractmethod  ¶
 free(block: BlockStatus)
Free a previously allocated block. Call this function only with blocks returned by allocate_blocks, and only once per block.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
 block  |   BlockStatus  |    The block to be freed.  |  required | 
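The "once per block" rule can be illustrated with a toy backend that tracks live allocations and rejects a second free. This is a sketch under assumptions: `ToyBackend`, `ToyBlock`, and the `block_id` field are hypothetical, and the `ValueError` guard is an illustrative choice rather than documented vLLM behavior.

```python
class ToyBlock:
    """Hypothetical stand-in for BlockStatus with an illustrative slot id."""

    def __init__(self, block_id: int) -> None:
        self.block_id = block_id
        self.ref_cnt = -1  # not yet ready to be read


class ToyBackend:
    """Hypothetical backend that hands out slots from a fixed pool."""

    def __init__(self, num_blocks: int) -> None:
        self._free_ids = list(range(num_blocks))
        self._live = set()  # ids currently allocated

    def get_num_free_blocks(self) -> int:
        return len(self._free_ids)

    def allocate_blocks(self, block_hashes):
        blocks = []
        for _ in block_hashes:
            block_id = self._free_ids.pop()
            self._live.add(block_id)
            blocks.append(ToyBlock(block_id))
        return blocks

    def free(self, block: ToyBlock) -> None:
        # Illustrative guard: reject foreign blocks and double frees.
        if block.block_id not in self._live:
            raise ValueError("block was not allocated here or was already freed")
        self._live.remove(block.block_id)
        self._free_ids.append(block.block_id)


backend = ToyBackend(num_blocks=2)
(block,) = backend.allocate_blocks([b"h0"])
backend.free(block)  # first free returns the slot to the pool

try:
    backend.free(block)  # second free of the same block
    double_free_rejected = False
except ValueError:
    double_free_rejected = True
```

Without such a guard, a double free in this toy would put the same slot into the pool twice, which is the kind of corruption the documented rule is meant to prevent.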
    get_load_store_spec ¶
 get_load_store_spec(
    block_hashes: Iterable[BlockHash],
    blocks: Iterable[BlockStatus],
) -> LoadStoreSpec
Get backend-specific information on how to read/write blocks.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
 block_hashes  |   Iterable[BlockHash]  |    The block hashes identifying the blocks.  |  required | 
 blocks  |   Iterable[BlockStatus]  |    The blocks to read or write.  |  required | 
Returns:
| Type | Description | 
|---|---|
 LoadStoreSpec  |    A LoadStoreSpec that can be used by a worker to read/write the blocks.  |  
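As a rough illustration of what "backend-specific information" might look like, the sketch below has a toy backend describe blocks by slot index. All names here (`ToyBackend`, `ToyBlock`, `ToyLoadStoreSpec`, `block_ids`) are hypothetical, and `bytes` stands in for `BlockHash`. A real backend may encode something quite different, such as file paths or object keys derived from the hashes.

```python
from collections.abc import Iterable
from dataclasses import dataclass


@dataclass
class ToyBlock:
    """Hypothetical stand-in for BlockStatus with an illustrative slot id."""

    block_id: int
    ref_cnt: int = -1


@dataclass
class ToyLoadStoreSpec:
    """Hypothetical spec: the slots a worker should read from or write to."""

    block_ids: list[int]


class ToyBackend:
    def get_load_store_spec(
        self,
        block_hashes: Iterable[bytes],
        blocks: Iterable[ToyBlock],
    ) -> ToyLoadStoreSpec:
        # This toy backend addresses blocks by slot, so the hashes are unused.
        # Both arguments are plain iterables, so each is consumed at most once.
        return ToyLoadStoreSpec(block_ids=[b.block_id for b in blocks])


spec = ToyBackend().get_load_store_spec(
    [b"h0", b"h1"],
    iter([ToyBlock(3), ToyBlock(7)]),  # a one-shot iterator is acceptable input
)
```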
   get_num_free_blocks  abstractmethod  ¶
     BlockStatus ¶
  Bases: Structure
Offloading status for a single block of KV data. Holds the following information:
ref_cnt - the current number of transfers using this block as a source. A value of -1 indicates the block is not yet ready to be read.

load_store_spec - backend-specific information on how to actually read/write the block.
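The ref_cnt convention can be sketched with a small structure. This assumes that `Structure` in the base-class line refers to `ctypes.Structure`. `ToyBlockStatus` and its `is_ready` helper are hypothetical and model only the ref_cnt field, not load_store_spec.

```python
import ctypes


class ToyBlockStatus(ctypes.Structure):
    """Hypothetical stand-in for BlockStatus modeling only ref_cnt."""

    _fields_ = [("ref_cnt", ctypes.c_int32)]

    def __init__(self) -> None:
        super().__init__()
        self.ref_cnt = -1  # freshly allocated: not yet ready to be read

    @property
    def is_ready(self) -> bool:
        # Illustrative helper: any non-negative count means "readable".
        return self.ref_cnt >= 0


status = ToyBlockStatus()
not_ready = status.is_ready   # ref_cnt is -1 here

status.ref_cnt = 0            # write finished, no transfers reading yet
ready_idle = status.is_ready

status.ref_cnt += 1           # one transfer now uses the block as a source
```

Using -1 as a sentinel lets a single integer field encode both readiness and the number of active readers.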