# Ollama + LiteLLM

## Overview

Skyvern can use local models via Ollama or any OpenAI-compatible endpoint (e.g., LiteLLM). There are two paths:

- (A) Direct Ollama: use the Ollama API (`/v1/chat/completions`)
- (B) OpenAI-compatible (LiteLLM): Skyvern connects to a proxy that exposes an OpenAI-style API
## A) Direct Ollama

### 1) Start Ollama locally

Install Ollama and run a model (example: llama3.1).
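A minimal sketch of the shell commands, assuming the Ollama CLI is already installed and you want the llama3.1 tag:

```bash
# Download the model weights (one-time)
ollama pull llama3.1

# Start the Ollama server if it is not already running as a service
ollama serve

# Optional sanity check: chat with the model once
ollama run llama3.1 "Say hello"
```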
The API is usually at http://localhost:11434.
### 2) Configure Skyvern (ENV)
Add to your .env:
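A minimal sketch of the Ollama-related variables (names come from the Internal References below; the values assume the default local install and the llama3.1 example):

```bash
# Enable the direct-Ollama route
ENABLE_OLLAMA=true

# Where the Ollama server listens (default local install)
OLLAMA_SERVER_URL=http://localhost:11434

# Model tag to use; must exist in `ollama list`
OLLAMA_MODEL=llama3.1
```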
Note: Ollama may not support `max_completion_tokens`; Skyvern handles this internally.
## B) OpenAI-compatible via LiteLLM

### 1) Run LiteLLM as a proxy
Minimal example (see LiteLLM docs for more options):
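A minimal sketch, assuming the proxy serves an Ollama-hosted llama3.1 on port 4000; the model name and port are placeholder choices:

```bash
pip install 'litellm[proxy]'
```

```yaml
# litellm_config.yaml
model_list:
  - model_name: llama3.1              # the name clients (Skyvern) will request
    litellm_params:
      model: ollama/llama3.1          # forward requests to local Ollama
      api_base: http://localhost:11434
```

```bash
litellm --config litellm_config.yaml --port 4000
```

The proxy should then expose an OpenAI-style API at http://localhost:4000/v1.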
### 2) Configure Skyvern (ENV)
Add to your .env:
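A minimal sketch using the variable names listed under Internal References; the values are assumptions that match the LiteLLM example above:

```bash
# Enable the OpenAI-compatible route
ENABLE_OPENAI_COMPATIBLE=true

# Must match a model_name served by the proxy
OPENAI_COMPATIBLE_MODEL_NAME=llama3.1

# Any key the proxy accepts (use a dummy value if auth is disabled)
OPENAI_COMPATIBLE_API_KEY=sk-anything

# Proxy base URL; note the trailing /v1
OPENAI_COMPATIBLE_API_BASE=http://localhost:4000/v1

# Optional: set to true only if the model accepts image input
OPENAI_COMPATIBLE_SUPPORTS_VISION=false
```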
## Start Skyvern (local)

After setting the environment variables in .env, start the backend and the UI:
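One possible way to do this, assuming you run Skyvern from a repo checkout that ships a docker-compose.yml (adjust to your own setup, e.g. the Skyvern CLI):

```bash
# From the repo root, with .env in place
docker compose up -d

# Tail the logs to confirm the backend picked up the LLM settings
docker compose logs -f
```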
Open the UI and pick the model (or keep the default); if only the Ollama or LiteLLM provider is enabled, Skyvern will use it.
## Verify your setup
Before starting Skyvern, quickly verify that your LLM endpoint is reachable.
### Ollama
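A quick check, assuming the default port; it should return the locally installed models:

```bash
# Lists installed models as JSON
curl http://localhost:11434/api/tags
```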
### LiteLLM (OpenAI-compatible)
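A quick check against the proxy's OpenAI-style API, assuming port 4000 and the key from your .env:

```bash
# Should list the models configured in the proxy
curl http://localhost:4000/v1/models \
  -H "Authorization: Bearer $OPENAI_COMPATIBLE_API_KEY"
```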
If your model doesn’t appear, re-check the proxy flags and your .env values (`OPENAI_COMPATIBLE_API_BASE`, `OPENAI_COMPATIBLE_MODEL_NAME`, etc.).
## Troubleshooting

- Model not responding / timeout: ensure `ollama serve` is running and `OLLAMA_MODEL` exists (`ollama list`).
- LiteLLM 401: set `OPENAI_COMPATIBLE_API_KEY` to a value accepted by the proxy, or disable auth on the proxy.
- CORS / wrong base URL: confirm `OPENAI_COMPATIBLE_API_BASE` and that it ends with `/v1`.
- Model not visible: ensure `ENABLE_OLLAMA=true` or `ENABLE_OPENAI_COMPATIBLE=true` in .env, then restart services.
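A couple of quick diagnostics covering the checks above (file paths and variable names assume the setup from this guide):

```bash
# Is the model actually pulled?
ollama list

# Are the provider toggles set in .env?
grep -E 'ENABLE_OLLAMA|ENABLE_OPENAI_COMPATIBLE' .env
```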
## Internal References

- Ollama vars: `ENABLE_OLLAMA`, `OLLAMA_SERVER_URL`, `OLLAMA_MODEL`
- OpenAI-compatible vars: `ENABLE_OPENAI_COMPATIBLE`, `OPENAI_COMPATIBLE_MODEL_NAME`, `OPENAI_COMPATIBLE_API_KEY`, `OPENAI_COMPATIBLE_API_BASE`, `OPENAI_COMPATIBLE_API_VERSION`, `OPENAI_COMPATIBLE_SUPPORTS_VISION`, `OPENAI_COMPATIBLE_REASONING_EFFORT`

