AnythingLLM¶
AnythingLLM is a full-stack application that enables you to turn any document, resource, or piece of content into context that any LLM can use as references during chatting.
It allows you to deploy a large language model (LLM) server with vLLM as the backend, which exposes OpenAI-compatible endpoints.
Prerequisites¶
Set up the vLLM environment:
Deploy¶
-  
Start the vLLM server with a supported chat-completion model, for example:
 -  
Download and install AnythingLLM Desktop.
 -  
Configure the AI provider:
- At the bottom, click the 🔧 wrench icon -> Open settings -> AI Providers -> LLM.
 - Enter the following values:
- LLM Provider: Generic OpenAI
 - Base URL: 
http://{vllm server host}:{vllm server port}/v1 - Chat Model Name: 
Qwen/Qwen1.5-32B-Chat-AWQ 
 
 -  
Create a workspace:
- At the bottom, click the ↺ back icon and back to workspaces.
 - Create a workspace (e.g., 
vllm) and start chatting. 
 -  
Add a document.
- Click the 📎 attachment icon.
 - Upload a document.
 - Select and move the document into your workspace.
 - Save and embed it.
 
 -  
Chat using your document as context.
 



