Docker Installation

This guide explains how to run SurfSense with Docker Compose, the recommended deployment method.

Prerequisites

Before you begin, ensure you have:

- Docker and Docker Compose installed on your machine
- Git (to clone the repository)
- All prerequisite setup steps completed, including:
  - PGVector setup
  - File Processing ETL Service (choose one):
    - Unstructured.io API key (supports 34+ formats)
    - LlamaCloud API key (enhanced parsing, supports 50+ formats)
    - Docling (local processing, no API key required, supports PDF, Office docs, images, HTML, CSV)
  - Other required API keys

Installation Steps

Configure Environment Variables

Set up the necessary environment variables:

Linux/macOS:

# Copy example environment files
cp surfsense_backend/.env.example surfsense_backend/.env
cp surfsense_web/.env.example surfsense_web/.env
cp .env.example .env  # For Docker-specific settings

Windows (Command Prompt):

copy surfsense_backend\.env.example surfsense_backend\.env
copy surfsense_web\.env.example surfsense_web\.env
copy .env.example .env

Windows (PowerShell):

Copy-Item -Path surfsense_backend\.env.example -Destination surfsense_backend\.env
Copy-Item -Path surfsense_web\.env.example -Destination surfsense_web\.env
Copy-Item -Path .env.example -Destination .env
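
The three platform-specific variants above do the same thing, so a single cross-platform script can replace them. The following is a minimal sketch using only the Python standard library; the function name `copy_env_examples` is illustrative and not part of SurfSense. Unlike the plain copy commands, it skips any `.env` file that already exists, so re-running it does not overwrite edited values.

```python
import shutil
from pathlib import Path

# Directories that ship a .env.example, relative to the repository root
# (taken from the copy commands above).
ENV_DIRS = ["surfsense_backend", "surfsense_web", "."]


def copy_env_examples(repo_root: str) -> list[Path]:
    """Copy each .env.example to .env, skipping targets that already exist."""
    created = []
    for directory in ENV_DIRS:
        example = Path(repo_root) / directory / ".env.example"
        target = example.with_name(".env")
        if example.is_file() and not target.exists():
            shutil.copyfile(example, target)
            created.append(target)
    return created


if __name__ == "__main__":
    # Run from the repository root.
    for path in copy_env_examples("."):
        print(f"created {path}")
```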

Edit all .env files and fill in the required values:

Docker-Specific Environment Variables

ENV VARIABLE	DESCRIPTION	DEFAULT VALUE
FRONTEND_PORT	Port for the frontend service	3000
BACKEND_PORT	Port for the backend API service	8000
POSTGRES_PORT	Port for the PostgreSQL database	5432
PGADMIN_PORT	Port for pgAdmin web interface	5050
POSTGRES_USER	PostgreSQL username	postgres
POSTGRES_PASSWORD	PostgreSQL password	postgres
POSTGRES_DB	PostgreSQL database name	surfsense
PGADMIN_DEFAULT_EMAIL	Email for pgAdmin login	admin@surfsense.com
PGADMIN_DEFAULT_PASSWORD	Password for pgAdmin login	surfsense
NEXT_PUBLIC_API_URL	URL of the backend API (used by frontend)	http://backend:8000
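
To see which values Docker Compose will end up using, it can help to overlay your root `.env` on the defaults in the table above. The sketch below does that and flags passwords that are still at their documented default. The parser is deliberately simple (plain `KEY=VALUE` lines, no inline comments or variable expansion), and all function names are hypothetical helpers, not part of SurfSense.

```python
from pathlib import Path

# Defaults copied from the table above.
DOCKER_DEFAULTS = {
    "FRONTEND_PORT": "3000",
    "BACKEND_PORT": "8000",
    "POSTGRES_PORT": "5432",
    "PGADMIN_PORT": "5050",
    "POSTGRES_USER": "postgres",
    "POSTGRES_PASSWORD": "postgres",
    "POSTGRES_DB": "surfsense",
    "PGADMIN_DEFAULT_EMAIL": "admin@surfsense.com",
    "PGADMIN_DEFAULT_PASSWORD": "surfsense",
    "NEXT_PUBLIC_API_URL": "http://backend:8000",
}


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are ignored."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def effective_settings(env_text: str) -> dict[str, str]:
    """Overlay values from a .env file on top of the documented defaults."""
    return {**DOCKER_DEFAULTS, **parse_env(env_text)}


def default_secrets_in_use(settings: dict[str, str]) -> list[str]:
    """Return the password variables still set to their documented default."""
    return [
        name
        for name in ("POSTGRES_PASSWORD", "PGADMIN_DEFAULT_PASSWORD")
        if settings.get(name) == DOCKER_DEFAULTS[name]
    ]


if __name__ == "__main__":
    env_file = Path(".env")
    settings = effective_settings(env_file.read_text() if env_file.exists() else "")
    for name in default_secrets_in_use(settings):
        print(f"warning: {name} still uses the default value")
```

Changing the default passwords matters most when the PostgreSQL or pgAdmin ports are reachable from outside your machine.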

Backend Environment Variables:

ENV VARIABLE	DESCRIPTION
DATABASE_URL	PostgreSQL connection string (e.g., `postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense`)
SECRET_KEY	JWT Secret key for authentication (should be a secure random string)
NEXT_FRONTEND_URL	URL where your frontend application is hosted (e.g., `http://localhost:3000`)
AUTH_TYPE	Authentication method: `GOOGLE` for OAuth with Google, `LOCAL` for email/password authentication
GOOGLE_OAUTH_CLIENT_ID	(Optional) Client ID from Google Cloud Console (required if AUTH_TYPE=GOOGLE)
GOOGLE_OAUTH_CLIENT_SECRET	(Optional) Client secret from Google Cloud Console (required if AUTH_TYPE=GOOGLE)
EMBEDDING_MODEL	Name of the embedding model (e.g., `mixedbread-ai/mxbai-embed-large-v1`)
RERANKERS_MODEL_NAME	Name of the reranker model (e.g., `ms-marco-MiniLM-L-12-v2`)
RERANKERS_MODEL_TYPE	Type of reranker model (e.g., `flashrank`)
TTS_SERVICE	Text-to-Speech API provider for Podcasts (e.g., `openai/tts-1`). See supported providers
TTS_SERVICE_API_KEY	API key for the Text-to-Speech service
TTS_SERVICE_API_BASE	(Optional) Custom API base URL for the Text-to-Speech service
STT_SERVICE	Speech-to-Text API provider for Podcasts (e.g., `openai/whisper-1`). See supported providers
STT_SERVICE_API_KEY	API key for the Speech-to-Text service
STT_SERVICE_API_BASE	(Optional) Custom API base URL for the Speech-to-Text service
FIRECRAWL_API_KEY	API key for Firecrawl service for web crawling
ETL_SERVICE	Document parsing service: `UNSTRUCTURED` (supports 34+ formats), `LLAMACLOUD` (supports 50+ formats including legacy document types), or `DOCLING` (local processing, supports PDF, Office docs, images, HTML, CSV)
UNSTRUCTURED_API_KEY	API key for Unstructured.io service for document parsing (required if ETL_SERVICE=UNSTRUCTURED)
LLAMA_CLOUD_API_KEY	API key for LlamaCloud service for document parsing (required if ETL_SERVICE=LLAMACLOUD)
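
Several rows above are only required for particular `AUTH_TYPE` or `ETL_SERVICE` choices, and `SECRET_KEY` needs a secure random value. The sketch below shows one way to generate such a key and to check the conditional requirements before starting the containers. It assumes the backend `.env` has already been loaded into a dictionary; `generate_secret_key` and `missing_backend_vars` are illustrative names, not SurfSense functions.

```python
import secrets


def generate_secret_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random string suitable for SECRET_KEY."""
    return secrets.token_urlsafe(num_bytes)


# Variables that become mandatory for a given setting, per the table above.
CONDITIONAL_REQUIREMENTS = {
    ("AUTH_TYPE", "GOOGLE"): ["GOOGLE_OAUTH_CLIENT_ID", "GOOGLE_OAUTH_CLIENT_SECRET"],
    ("ETL_SERVICE", "UNSTRUCTURED"): ["UNSTRUCTURED_API_KEY"],
    ("ETL_SERVICE", "LLAMACLOUD"): ["LLAMA_CLOUD_API_KEY"],
}


def missing_backend_vars(env: dict[str, str]) -> list[str]:
    """List variables the chosen AUTH_TYPE / ETL_SERVICE require but that are unset."""
    missing = []
    for (name, value), required in CONDITIONAL_REQUIREMENTS.items():
        if env.get(name) == value:
            missing.extend(var for var in required if not env.get(var))
    return missing


if __name__ == "__main__":
    print("SECRET_KEY=" + generate_secret_key())
    # Example: Google auth selected, but no OAuth credentials provided yet.
    print(missing_backend_vars({"AUTH_TYPE": "GOOGLE", "ETL_SERVICE": "DOCLING"}))
```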

Optional Backend LangSmith Observability:

ENV VARIABLE	DESCRIPTION
LANGSMITH_TRACING	Enable LangSmith tracing (e.g., `true`)
LANGSMITH_ENDPOINT	LangSmith API endpoint (e.g., `https://api.smith.langchain.com`)
LANGSMITH_API_KEY	Your LangSmith API key
LANGSMITH_PROJECT	LangSmith project name (e.g., `surfsense`)

Backend Uvicorn Server Configuration:

ENV VARIABLE	DESCRIPTION	DEFAULT VALUE
UVICORN_HOST	Host address to bind the server	0.0.0.0
UVICORN_PORT	Port to run the backend API	8000
UVICORN_LOG_LEVEL	Logging level (e.g., info, debug, warning)	info
UVICORN_PROXY_HEADERS	Enable/disable proxy headers	false
UVICORN_FORWARDED_ALLOW_IPS	Comma-separated list of allowed IPs	127.0.0.1
UVICORN_WORKERS	Number of worker processes	1
UVICORN_ACCESS_LOG	Enable/disable access log (true/false)	true
UVICORN_LOOP	Event loop implementation	auto
UVICORN_HTTP	HTTP protocol implementation	auto
UVICORN_WS	WebSocket protocol implementation	auto
UVICORN_LIFESPAN	Lifespan implementation	auto
UVICORN_LOG_CONFIG	Path to logging config file or empty string
UVICORN_SERVER_HEADER	Enable/disable Server header	true
UVICORN_DATE_HEADER	Enable/disable Date header	true
UVICORN_LIMIT_CONCURRENCY	Max concurrent connections
UVICORN_LIMIT_MAX_REQUESTS	Max requests before worker restart
UVICORN_TIMEOUT_KEEP_ALIVE	Keep-alive timeout (seconds)	5
UVICORN_TIMEOUT_NOTIFY	Worker shutdown notification timeout (sec)	30
UVICORN_SSL_KEYFILE	Path to SSL key file
UVICORN_SSL_CERTFILE	Path to SSL certificate file
UVICORN_SSL_KEYFILE_PASSWORD	Password for SSL key file
UVICORN_SSL_VERSION	SSL version
UVICORN_SSL_CERT_REQS	SSL certificate requirements
UVICORN_SSL_CA_CERTS	Path to CA certificates file
UVICORN_SSL_CIPHERS	SSL ciphers
UVICORN_HEADERS	Comma-separated list of headers
UVICORN_USE_COLORS	Enable/disable colored logs	true
UVICORN_UDS	Unix domain socket path
UVICORN_FD	File descriptor to bind to
UVICORN_ROOT_PATH	Root path for the application

For more details, see the Uvicorn documentation.
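
As a rough illustration of how such variables typically reach the server, the sketch below turns a handful of them into keyword arguments accepted by `uvicorn.run`. How SurfSense itself reads these variables is not shown in this guide, so the mapping and the helper name `uvicorn_kwargs` are assumptions; only set, non-empty variables are passed on, which leaves Uvicorn's own defaults in place for the rest.

```python
import os


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


# (environment variable, uvicorn.run keyword, converter) for a subset of the
# table above. The selection is illustrative, not exhaustive.
UVICORN_SETTINGS = [
    ("UVICORN_HOST", "host", str),
    ("UVICORN_PORT", "port", int),
    ("UVICORN_LOG_LEVEL", "log_level", str),
    ("UVICORN_WORKERS", "workers", int),
    ("UVICORN_PROXY_HEADERS", "proxy_headers", _as_bool),
    ("UVICORN_ACCESS_LOG", "access_log", _as_bool),
    ("UVICORN_TIMEOUT_KEEP_ALIVE", "timeout_keep_alive", int),
    ("UVICORN_ROOT_PATH", "root_path", str),
]


def uvicorn_kwargs(env=None) -> dict:
    """Build uvicorn.run keyword arguments from set, non-empty variables."""
    env = os.environ if env is None else env
    kwargs = {}
    for var, keyword, convert in UVICORN_SETTINGS:
        raw = env.get(var, "")
        if raw != "":
            kwargs[keyword] = convert(raw)
    return kwargs


if __name__ == "__main__":
    print(uvicorn_kwargs({"UVICORN_HOST": "0.0.0.0", "UVICORN_PORT": "8000"}))
```

With Uvicorn installed, the result could be passed as `uvicorn.run("module:app", **uvicorn_kwargs())`, where `"module:app"` is a placeholder for the actual application import path.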

Frontend Environment Variables

ENV VARIABLE	DESCRIPTION
NEXT_PUBLIC_FASTAPI_BACKEND_URL	URL of the backend service (e.g., `http://localhost:8000`)
NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE	Must match the backend AUTH_TYPE: `GOOGLE` for OAuth with Google, `LOCAL` for email/password authentication
NEXT_PUBLIC_ETL_SERVICE	Document parsing service (should match backend ETL_SERVICE): `UNSTRUCTURED`, `LLAMACLOUD`, or `DOCLING` - affects supported file formats in upload interface
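
Because two of these values must mirror the backend configuration, a mismatch is an easy mistake to make. A minimal sketch of a consistency check follows, assuming both `.env` files have already been parsed into dictionaries; `find_mismatches` is a hypothetical helper.

```python
# Pairs of (backend variable, frontend variable) that the tables above say
# must carry the same value.
MIRRORED_VARS = [
    ("AUTH_TYPE", "NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE"),
    ("ETL_SERVICE", "NEXT_PUBLIC_ETL_SERVICE"),
]


def find_mismatches(backend: dict[str, str], frontend: dict[str, str]) -> list[str]:
    """Describe every mirrored variable whose backend and frontend values differ."""
    messages = []
    for backend_var, frontend_var in MIRRORED_VARS:
        backend_value = backend.get(backend_var)
        frontend_value = frontend.get(frontend_var)
        if backend_value != frontend_value:
            messages.append(
                f"{backend_var}={backend_value!r} but {frontend_var}={frontend_value!r}"
            )
    return messages


if __name__ == "__main__":
    backend_env = {"AUTH_TYPE": "GOOGLE", "ETL_SERVICE": "DOCLING"}
    frontend_env = {
        "NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE": "LOCAL",
        "NEXT_PUBLIC_ETL_SERVICE": "DOCLING",
    }
    for message in find_mismatches(backend_env, frontend_env):
        print("mismatch:", message)
```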

Build and Start Containers

Start the Docker containers:

Linux/macOS/Windows:
```
docker compose up --build
```
To run in detached mode (in the background):

Linux/macOS/Windows:
```
docker compose up -d
```
Note for Windows users: If you're using an older Docker Desktop version, you might need to use docker-compose (with a hyphen) instead of docker compose.

Access the Applications

Once the containers are running, you can access:
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- API Documentation: http://localhost:8000/docs
- pgAdmin: http://localhost:5050
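
The containers can take a while to become reachable after `docker compose up`, especially on a first build. The sketch below polls the URLs listed above until each one answers. It assumes the default ports from this guide and uses only the standard library; the helper names are illustrative.

```python
import time
import urllib.error
import urllib.request


def service_urls(frontend_port=3000, backend_port=8000, pgadmin_port=5050,
                 host="localhost"):
    """Build the URLs listed above from the configured ports."""
    return {
        "frontend": f"http://{host}:{frontend_port}",
        "backend docs": f"http://{host}:{backend_port}/docs",
        "pgadmin": f"http://{host}:{pgadmin_port}",
    }


def is_up(url: str, timeout: float = 3.0) -> bool:
    """True if the URL answers with any HTTP response, even an error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # The server responded, just not with a 2xx status.
    except (urllib.error.URLError, OSError):
        return False  # Connection refused, timeout, DNS failure, etc.


def wait_for_services(urls, attempts=30, delay=2.0) -> bool:
    """Poll until every URL responds; return False if attempts run out."""
    pending = dict(urls)
    for _ in range(attempts):
        pending = {name: url for name, url in pending.items() if not is_up(url)}
        if not pending:
            return True
        time.sleep(delay)
    return False


if __name__ == "__main__":
    ready = wait_for_services(service_urls())
    print("all services reachable" if ready else "some services did not respond")
```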

Using pgAdmin

pgAdmin is included in the Docker setup to help manage your PostgreSQL database. To connect:

Open pgAdmin at http://localhost:5050
Log in with the credentials from your .env file (default: admin@surfsense.com / surfsense)
Right-click "Servers" > "Create" > "Server"
In the "General" tab, name your connection (e.g., "SurfSense DB")
In the "Connection" tab:
- Host: db
- Port: 5432
- Maintenance database: surfsense
- Username: postgres (or your custom POSTGRES_USER)
- Password: postgres (or your custom POSTGRES_PASSWORD)
Click "Save" to connect
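
The same connection fields map onto the `DATABASE_URL` format shown in the backend variables. The host `db` is the Compose service name, so it resolves only from inside the Compose network (pgAdmin, backend); from your host machine you would normally use `localhost` and the published `POSTGRES_PORT` instead. The sketch below assembles the string for either case and percent-encodes the credentials so special characters in a password do not break the URL. `database_url` is a hypothetical helper, not a SurfSense function.

```python
from urllib.parse import quote


def database_url(user="postgres", password="postgres", database="surfsense",
                 host="localhost", port=5432, driver="postgresql+asyncpg"):
    """Assemble a connection string in the DATABASE_URL format."""
    # safe="" makes quote() escape characters such as "@" and "/" as well.
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return f"{driver}://{credentials}@{host}:{port}/{database}"


if __name__ == "__main__":
    # From the host machine, through the published port.
    print(database_url())
    # From another container on the Compose network.
    print(database_url(host="db"))
```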

Useful Docker Commands

Container Management

Stop containers:

Linux/macOS/Windows:
```
docker compose down
```

View logs:

Linux/macOS/Windows:

# All services
docker compose logs -f

# Specific service
docker compose logs -f backend
docker compose logs -f frontend
docker compose logs -f db

Restart a specific service:

Linux/macOS/Windows:
```
docker compose restart backend
```

Execute commands in a running container:

Linux/macOS/Windows:

# Backend
docker compose exec backend python -m pytest

# Frontend
docker compose exec frontend pnpm lint

Troubleshooting

Linux/macOS: If you encounter permission errors, you may need to run the docker commands with sudo.
Windows: If you see access denied errors, make sure you're running Command Prompt or PowerShell as Administrator.
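If ports are already in use, change the corresponding port variables in the root .env file (for example FRONTEND_PORT or BACKEND_PORT) or modify the port mappings in the docker-compose.yml file.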
For backend dependency issues, check the Dockerfile in the backend directory.
For frontend dependency issues, check the Dockerfile in the frontend directory.
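Windows-specific: If you encounter line ending issues (CRLF vs LF), configure Git's line ending handling before cloning the repository. Scripts that run inside the Linux containers need LF endings, so git config --global core.autocrlf input is usually the safer setting; core.autocrlf true checks files out with CRLF endings on Windows.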
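
For the "port already in use" case, you can check which of the default ports are occupied before starting the containers. This is a minimal sketch using a TCP connection attempt; the port numbers are the defaults from this guide, and `port_in_use` and `busy_ports` are illustrative names.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """True if something is already accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def busy_ports(ports: dict[str, int]) -> dict[str, int]:
    """Return the subset of named ports that are currently occupied."""
    return {name: port for name, port in ports.items() if port_in_use(port)}


if __name__ == "__main__":
    defaults = {
        "FRONTEND_PORT": 3000,
        "BACKEND_PORT": 8000,
        "POSTGRES_PORT": 5432,
        "PGADMIN_PORT": 5050,
    }
    for name, port in busy_ports(defaults).items():
        print(f"{name}: port {port} is already in use")
```

Run it while the SurfSense containers are stopped; otherwise their own ports will be reported as busy.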

Next Steps

Once your installation is complete, you can start using SurfSense! Navigate to the frontend URL and log in with your Google account (AUTH_TYPE=GOOGLE) or your email and password (AUTH_TYPE=LOCAL).