Summary
Configuration
API documentation for vLLM's configuration classes. A short usage sketch follows the list below.
- vllm.config.ModelConfig
- vllm.config.CacheConfig
- vllm.config.LoadConfig
- vllm.config.ParallelConfig
- vllm.config.SchedulerConfig
- vllm.config.DeviceConfig
- vllm.config.SpeculativeConfig
- vllm.config.LoRAConfig
- vllm.config.MultiModalConfig
- vllm.config.PoolerConfig
- vllm.config.StructuredOutputsConfig
- vllm.config.ObservabilityConfig
- vllm.config.KVTransferConfig
- vllm.config.CompilationConfig
- vllm.config.VllmConfig
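As an orientation, here is a minimal sketch of how engine arguments surface as these config objects at runtime. The model name is a placeholder, and the attribute path to the assembled config is an assumption that may differ between vLLM versions.

```python
from vllm import LLM

# Constructor arguments are parsed into the config classes listed above
# (ModelConfig, CacheConfig, ParallelConfig, ...), which are bundled into
# a single VllmConfig.
llm = LLM(model="facebook/opt-125m", gpu_memory_utilization=0.8)

# NOTE: the attribute path below is an assumption and may vary by version.
vllm_config = llm.llm_engine.vllm_config
print(vllm_config.model_config.model)       # "facebook/opt-125m"
print(vllm_config.cache_config.block_size)  # KV cache block size in tokens
```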
 
Offline Inference
LLM Class.
LLM Inputs.
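A minimal end-to-end example of the LLM class; the model name is a placeholder for any supported checkpoint.

```python
from vllm import LLM, SamplingParams

# Load a model (placeholder checkpoint) and define sampling behaviour.
llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# generate() takes one or more prompts and returns a RequestOutput per prompt.
outputs = llm.generate(["Hello, my name is", "The capital of France is"], params)
for output in outputs:
    print(f"{output.prompt!r} -> {output.outputs[0].text!r}")
```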
vLLM Engines
Engine classes for offline and online inference.
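For offline use, LLMEngine can also be driven directly (AsyncLLMEngine wraps the same machinery for online serving). The sketch below uses the classic add_request/step loop; details of this API may differ between vLLM versions.

```python
from vllm import EngineArgs, LLMEngine, SamplingParams

# Build the offline engine from engine args (placeholder model).
engine = LLMEngine.from_engine_args(EngineArgs(model="facebook/opt-125m"))

# Queue a request, then drive the scheduling loop manually with step().
engine.add_request("req-0", "Hello, my name is", SamplingParams(max_tokens=32))
while engine.has_unfinished_requests():
    for request_output in engine.step():
        if request_output.finished:
            print(request_output.outputs[0].text)
```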
Inference Parameters
Inference parameters for vLLM APIs.
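SamplingParams carries the per-request decoding options (PoolingParams plays the analogous role for pooling models). A sketch of the common knobs:

```python
from vllm import SamplingParams

params = SamplingParams(
    n=2,              # completions to return per prompt
    temperature=0.7,  # 0 means greedy decoding
    top_p=0.9,
    top_k=50,
    max_tokens=128,
    stop=["\n\n"],    # stop generation at these strings
    seed=42,          # reproducible sampling
)
```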
Multi-Modality
vLLM provides experimental support for multi-modal models through the vllm.multimodal package.
Multi-modal inputs can be passed alongside text and token prompts to supported models via the multi_modal_data field in vllm.inputs.PromptType.
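For example, an image can be attached to a text prompt as shown below. The checkpoint and its chat template are placeholders for any supported vision-language model; the `<image>` token marks where image features are spliced in for LLaVA-style models.

```python
from PIL import Image
from vllm import LLM, SamplingParams

# Placeholder vision-language checkpoint and its LLaVA-style prompt format.
llm = LLM(model="llava-hf/llava-1.5-7b-hf")
image = Image.open("example.jpg")

outputs = llm.generate(
    {
        "prompt": "USER: <image>\nWhat is shown in this image? ASSISTANT:",
        "multi_modal_data": {"image": image},
    },
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```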
Looking to add your own multi-modal model? Please follow the instructions listed here.
Inputs
User-facing inputs.
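A prompt can be a plain string or one of the TypedDicts under vllm.inputs; the sketch below assumes the TextPrompt and TokensPrompt types.

```python
from vllm.inputs import TextPrompt, TokensPrompt

# PromptType accepts plain strings as well as these TypedDicts.
text_prompt = TextPrompt(prompt="Hello, my name is")
tokens_prompt = TokensPrompt(prompt_token_ids=[1, 15043, 29892])
```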
Internal data structures (a construction sketch follows this list).
- vllm.multimodal.inputs.PlaceholderRange
- vllm.multimodal.inputs.NestedTensors
- vllm.multimodal.inputs.MultiModalFieldElem
- vllm.multimodal.inputs.MultiModalFieldConfig
- vllm.multimodal.inputs.MultiModalKwargsItem
- vllm.multimodal.inputs.MultiModalKwargsItems
- vllm.multimodal.inputs.MultiModalKwargs
- vllm.multimodal.inputs.MultiModalInputs
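As one example of these internals, a PlaceholderRange records where a modality's placeholder tokens sit in the token sequence. The field names below (offset, length) are assumptions based on the class's role; check the class reference before relying on them.

```python
from vllm.multimodal.inputs import PlaceholderRange

# Assumed fields: `offset` is the index of the first placeholder token,
# `length` is how many placeholder tokens the item occupies.
image_placeholder = PlaceholderRange(offset=4, length=576)
```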