Other Local Servers
Connect local OpenAI compatible model servers
Connect Other Local Servers
Connect to llama.cpp, vLLM, LocalAI, LiteLLM Proxy, and other servers that expose OpenAI compatible routes.
SurfSense discovers models from:
/v1/models
Chat requests use the same /v1 base URL.
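As a minimal sketch of how the two routes relate, the helper below derives both URLs from one base URL. The function name `endpoint_urls` is hypothetical, and `/chat/completions` is the standard OpenAI chat route, assumed here rather than taken from SurfSense's code.

```python
def endpoint_urls(base_url):
    """Return the model discovery and chat URLs for a /v1 base URL.

    Hypothetical helper: it only illustrates that both routes hang off
    the same base; it is not SurfSense's implementation.
    """
    base = base_url.rstrip("/")  # tolerate a trailing slash
    return {
        "models": base + "/models",          # used for model discovery
        "chat": base + "/chat/completions",  # standard OpenAI chat route (assumed)
    }


if __name__ == "__main__":
    for name, url in endpoint_urls("http://localhost:8000/v1").items():
        print(name, url)
```

Both URLs share the same prefix, so a single API Base URL setting is enough for discovery and chat.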
Pick Your Setup
Use one of these URL patterns.
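As a rough summary of the three patterns in this section, the sketch below builds the base URL for each setup. The function `base_url` and the setup names `docker`, `local`, and `remote` are hypothetical labels chosen for illustration only.

```python
from typing import Optional


def base_url(setup, port, host=None):
    # type: (str, int, Optional[str]) -> str
    """Build an API Base URL for one of the three setups in this section.

    The setup names are illustrative labels, not SurfSense settings.
    """
    if setup == "docker":
        # SurfSense in Docker, model server on the host machine
        return "http://host.docker.internal:%d/v1" % port
    if setup == "local":
        # SurfSense and the model server on the same computer, no Docker
        return "http://localhost:%d/v1" % port
    if setup == "remote":
        # Model server on another machine in the network
        if not host:
            raise ValueError("remote setup needs a host name or IP address")
        return "http://%s:%d/v1" % (host, port)
    raise ValueError("unknown setup: %r" % setup)


if __name__ == "__main__":
    print(base_url("docker", 8000))
```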
SurfSense Runs in Docker
Use this when SurfSense is running from Docker and the model server is running on your computer.
http://host.docker.internal:<port>/v1
Common ports:
| Server | Port |
|---|---|
| llama.cpp | 8080 |
| vLLM | 8000 |
| LocalAI | 8080 |
| LiteLLM Proxy | 4000 |
| text-generation-webui | 5000 |
SurfSense Runs Without Docker
Use this when SurfSense and the model server both run directly on the same computer.
http://localhost:<port>/v1
Model Server Runs on Another Computer
Use this when the model server is running on another machine in your network.
http://<host>:<port>/v1
Add the Connection
- Open Search Space Settings.
- Go to Models.
- Select OpenAI Compatible.
- Set API Base URL.
- Add an API Key only if your server requires one.
- Select the models you want to enable.
- Save the connection.
If you enter the URL without /v1, SurfSense adds /v1 for requests.
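The exact normalization SurfSense applies is not described here, so the following is only a sketch of what adding /v1 could look like. `ensure_v1` is a hypothetical helper, and the decision to leave URLs that already end in /v1 untouched is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit


def ensure_v1(base_url):
    """Return base_url with a /v1 path suffix, adding it only if missing.

    Hypothetical illustration; SurfSense's real logic may differ.
    """
    parts = urlsplit(base_url.strip())
    path = parts.path.rstrip("/")  # drop trailing slashes before checking
    if not path.endswith("/v1"):
        path += "/v1"
    # Rebuild the URL without query string or fragment
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


if __name__ == "__main__":
    print(ensure_v1("http://localhost:8000"))
    print(ensure_v1("http://localhost:8000/v1/"))
```

Under these assumptions, entering the URL with or without /v1 leads to the same request URL.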
Verify
From the host:
curl http://localhost:<port>/v1/models
From the SurfSense backend container:
docker compose exec backend curl http://host.docker.internal:<port>/v1/models
A working server returns JSON with a data array.
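To check the response shape by script rather than by eye, a sketch like the one below parses the output of the curl command and lists the model IDs. It assumes the usual OpenAI list format, in which each entry of the data array carries an `id` field. The helper `model_ids` and the sample model name `my-model` are hypothetical.

```python
import json


def model_ids(response_text):
    """Extract model IDs from a /v1/models response body.

    Raises ValueError when the JSON has no "data" array, which is the
    sign that the server is not answering in the OpenAI list format.
    """
    payload = json.loads(response_text)
    data = payload.get("data") if isinstance(payload, dict) else None
    if not isinstance(data, list):
        raise ValueError("response has no 'data' array")
    # Keep only entries that look like model objects with an "id"
    return [m["id"] for m in data if isinstance(m, dict) and "id" in m]


if __name__ == "__main__":
    sample = '{"object": "list", "data": [{"id": "my-model", "object": "model"}]}'
    print(model_ids(sample))
```

An empty list means the server answered correctly but has no model loaded.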
When Not to Use This
Use the Ollama provider for Ollama. It uses native routes such as /api/tags.
Use the LM Studio provider for LM Studio. Its default URL is already set.
Troubleshooting
Endpoint returned 404
The server does not expose /v1/models.
Enable the server's OpenAI compatible mode.
Connection refused
The backend cannot reach the server.
Check that the server is running and that the port is open.
No models found
The server returned an empty model list.
Load or serve a model, then refresh model discovery in SurfSense.
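The three cases above can also be told apart with a small script. The sketch below uses only the Python standard library. The function names `diagnose` and `probe` and the message strings are hypothetical, and this is not how SurfSense itself reports errors.

```python
import json
import urllib.error
import urllib.request


def diagnose(status, body):
    """Map a /v1/models result to one of the troubleshooting cases.

    status is the HTTP status code, or None when no connection was made.
    """
    if status is None:
        return "connection refused: check that the server is running and the port is open"
    if status == 404:
        return "404: enable the server's OpenAI compatible mode"
    if status != 200:
        return "unexpected status %d" % status
    try:
        models = json.loads(body).get("data")
    except (TypeError, ValueError, AttributeError):
        return "response is not the expected JSON"
    if not models:
        return "no models found: load or serve a model, then refresh discovery"
    return "ok"


def probe(base_url, timeout=5.0):
    """Request <base_url>/models and classify the outcome."""
    url = base_url.rstrip("/") + "/models"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return diagnose(resp.status, resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:  # must come before URLError
        return diagnose(err.code, None)
    except (urllib.error.URLError, OSError):
        return diagnose(None, None)
```

Calling `probe("http://localhost:<port>/v1")` with your server's port from the machine or container that runs the SurfSense backend exercises the same network path as model discovery.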