How to Set Up Open WebUI with Ollama (Complete Guide)

Ollama runs models efficiently from the command line, but most people find a chat interface far more practical for day-to-day use. Open WebUI is the best local frontend for Ollama — it gives you a ChatGPT-style interface, conversation history, model switching, file uploads, and multimodal support, all running on your own machine with no data leaving it. This guide covers installation, setup, and the most useful features.

What Is Open WebUI?

Open WebUI (formerly Ollama WebUI) is an open-source web interface designed specifically to work with Ollama and other local LLM backends. It runs as a Docker container or a Python app on your machine and connects to Ollama’s API on localhost. Once running, you access it through your browser at http://localhost:3000. It supports multiple users, persistent conversation history, model management, RAG (retrieval-augmented generation) with document uploads, image generation integration, and web search — all without any cloud dependency.

Prerequisites

You need Ollama installed and at least one model pulled before setting up Open WebUI. If you have not done that yet:

# Install Ollama (macOS/Linux)
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model to use
ollama pull llama3.2

# Verify Ollama is running
curl http://localhost:11434/api/tags

Installation: Docker (Recommended)

Docker is the easiest way to install Open WebUI. It handles all dependencies and makes updating simple.

# Install Docker Desktop if you do not have it:
# https://www.docker.com/products/docker-desktop

# Run Open WebUI — connects to Ollama on the host machine
docker run -d \
  -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main

Open your browser to http://localhost:3000. On first load you will be prompted to create an admin account — this is local only, not connected to any external service. After signing in you will see the chat interface with all your Ollama models available in the model dropdown.

Installation: Without Docker (pip)

If you prefer not to use Docker, Open WebUI can be installed directly with pip:

# Requires Python 3.11+
pip install open-webui

# Start the server
open-webui serve

# Open http://localhost:8080 in your browser

Connecting to Ollama

Open WebUI detects Ollama automatically if it is running on the default port (11434). To verify the connection, go to Settings → Connections in the UI. You should see Ollama listed as connected with a green indicator. If it shows disconnected, check that Ollama is running (ollama serve or the Ollama app), and that the URL is set to http://host.docker.internal:11434 (Docker) or http://localhost:11434 (pip install).

# If Ollama is not auto-detected, start it explicitly
ollama serve

# On Linux with Docker, you may need to set the host environment variable
docker run -d \
  -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://172.17.0.1:11434 \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main

Key Features Worth Knowing

Model switching: Select any Ollama model from the dropdown at the top of the chat. You can also send the same message to more than one model in a single conversation — select multiple models in the dropdown and compare their responses side by side.

System prompts and personas: Click the model name to set a system prompt for the current conversation, or go to Workspace → Models to create saved personas with preset system prompts, parameters, and profile images that appear in the model selector alongside your Ollama models.

Document uploads and RAG: Upload PDFs, text files, or web URLs directly in the chat. Open WebUI chunks and embeds the documents using a local embedding model, then retrieves relevant chunks to include in the context when you ask questions. This is fully local — documents never leave your machine.

Conversation history: All conversations are stored in a local SQLite database. You can search, rename, delete, and export conversations from the sidebar. The history persists across browser sessions and Docker container restarts because it is stored in the mounted volume (open-webui:/app/backend/data).
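Because the data lives in a named Docker volume, you can locate and inspect it with standard Docker commands. A quick sketch — the `webui.db` filename is the typical default and may differ between versions:

```shell
# Show where Docker stores the open-webui volume on disk
docker volume inspect open-webui

# List the data files inside the running container
docker exec open-webui ls /app/backend/data

# The SQLite database is typically named webui.db in that directory
docker exec open-webui ls -lh /app/backend/data/webui.db
```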

Pulling and Managing Models from the UI

You can pull new Ollama models directly from Open WebUI without touching the command line. Go to Settings → Models and type the model name in the pull field — for example qwen2.5-coder:7b — and click the download button. The download progress appears in the UI. This is the same as running ollama pull but more convenient when you are already in the interface.
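Since the UI pull goes through Ollama itself, the equivalent can also be done against Ollama's HTTP API directly — a useful sanity check if a download in the UI appears to stall:

```shell
# Pull a model through Ollama's HTTP API (same effect as `ollama pull`)
curl http://localhost:11434/api/pull -d '{"model": "qwen2.5-coder:7b"}'

# Confirm the model now appears in the model list
curl http://localhost:11434/api/tags
```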

Updating Open WebUI

# Stop and remove the current container
docker stop open-webui
docker rm open-webui

# Pull the latest image
docker pull ghcr.io/open-webui/open-webui:main

# Restart with the same run command — your data volume is preserved
docker run -d \
  -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main

Your conversation history, user accounts, and custom model configurations are stored in the Docker volume and are preserved across updates.
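If you want an extra safety net before updating, the volume can be backed up to a tarball using a throwaway container — a common Docker pattern, sketched here with an alpine image as an assumption:

```shell
# Back up the open-webui volume to a tarball in the current directory
docker run --rm \
  -v open-webui:/data:ro \
  -v "$(pwd)":/backup \
  alpine tar czf /backup/open-webui-backup.tar.gz -C /data .

# Restore later by extracting the tarball into the (empty) volume
docker run --rm \
  -v open-webui:/data \
  -v "$(pwd)":/backup \
  alpine tar xzf /backup/open-webui-backup.tar.gz -C /data
```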

Accessing Open WebUI on Other Devices on Your Network

By default Open WebUI listens on all interfaces, so other devices on your local network can access it by navigating to your machine’s local IP address instead of localhost. Find your machine’s local IP (ipconfig on Windows, ifconfig or ip addr on Linux/Mac), then access it from another device at http://192.168.x.x:3000. This lets you use your GPU-equipped desktop as a shared local LLM server for other machines on your home or office network — useful if you want to run inference on a powerful desktop but access it from a laptop.

# Find your local IP
# macOS/Linux:
ifconfig | grep 'inet ' | grep -v 127.0.0.1

# Windows:
ipconfig | findstr IPv4

# Other devices access via:
# http://YOUR_LOCAL_IP:3000

Useful Settings to Configure

A few settings in Open WebUI make a meaningful difference to the experience. Under Settings → Interface, turn on Rich Text Input if you paste code frequently — it adds syntax highlighting to code blocks in the input box. Under Settings → Models, set a default model so the interface opens on your preferred model rather than prompting you to select one every session. Under Admin Panel → Settings → General, you can disable user registration if you are running this as a single-user setup and do not want the login screen — though keep in mind this removes all access control, so only do it on a machine not exposed to untrusted networks.

Troubleshooting Common Issues

The most common issue is Open WebUI showing Ollama as disconnected after installation. This almost always means Ollama is not running, or there is a network configuration issue between the Docker container and the host. The quickest check is to run curl http://localhost:11434/api/tags from your terminal — if that returns a list of models, Ollama is running and the problem is Docker networking. Try setting OLLAMA_HOST=0.0.0.0 as an environment variable before starting Ollama, which makes it listen on all interfaces rather than just localhost, so the Docker container can reach it.

# Make Ollama listen on all interfaces (needed for some Docker setups)
OLLAMA_HOST=0.0.0.0 ollama serve

# Or set it permanently in your shell profile
export OLLAMA_HOST=0.0.0.0

# On macOS with the Ollama app, set it in launchd:
launchctl setenv OLLAMA_HOST 0.0.0.0
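You can also test the connection from inside the container, which isolates whether the fault is Docker networking or Ollama itself. The container ships with Python (curl may not be installed in the image), so a Python one-liner is the safer probe:

```shell
# From the host: confirms Ollama itself is up
curl http://localhost:11434/api/tags

# From inside the Open WebUI container: confirms the container can reach it
docker exec open-webui python3 -c \
  "import urllib.request; print(urllib.request.urlopen('http://host.docker.internal:11434/api/tags').read()[:200])"
```

If the host check succeeds but the in-container check fails, the fix is on the Docker side (host-gateway mapping or OLLAMA_HOST), not in Ollama.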

If conversations are slow to save or the UI feels sluggish, it is usually the SQLite database growing large from many long conversations. You can export and delete old conversations from the sidebar to keep the database size manageable. For very heavy use, Open WebUI also supports PostgreSQL as a backend database — see the documentation for the configuration steps if you need better performance at scale.
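Open WebUI reads a DATABASE_URL environment variable for this. A sketch of the Docker invocation, assuming a PostgreSQL server is already reachable at the host shown — the user, password, and database name below are placeholders:

```shell
# Point Open WebUI at PostgreSQL instead of the default SQLite file
docker run -d \
  -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -e DATABASE_URL="postgresql://webui:secret@host.docker.internal:5432/openwebui" \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
```

Note that switching databases does not migrate existing conversations automatically; consult the Open WebUI documentation before moving an established installation.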

Open WebUI is actively developed and releases updates frequently. The GitHub repository (github.com/open-webui/open-webui) is the best place to track new features and check for known issues with specific Ollama or Docker versions. For most single-user local setups, the default Docker installation above is all you need — it handles model switching, conversation history, and document uploads well out of the box without any additional configuration.

Using Open WebUI for RAG: A Practical Walkthrough

One of the most useful features in Open WebUI is the built-in RAG pipeline. You can upload a PDF, paste a URL, or add a text file directly in the chat, and the model will answer questions about it using the document content as context. This is entirely local — the document is chunked, embedded using a local embedding model (a small SentenceTransformers model by default, or an Ollama embedding model such as nomic-embed-text if you configure one), and stored in a local vector database. Nothing leaves your machine.

To use it, click the paperclip icon in the chat input, select your file, and wait for the processing indicator to finish. Once processed, ask questions about the document in the same chat window. For best results with long PDFs, set a higher num_ctx for the model you are using — go to Settings → Models, find your model, and increase the context length to at least 16384. With a short context window, the model may only see a small chunk of the document per query rather than the full relevant context.

For URL ingestion, paste a web URL directly in the chat prefixed with a hash — for example #https://example.com/article — and Open WebUI will fetch and embed the page content automatically. This is useful for asking questions about documentation pages, blog posts, or research papers without copying and pasting manually.
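If you point Open WebUI's embedding backend at Ollama (under the Documents section of the admin settings), the embedding model must be pulled in Ollama first — for example:

```shell
# Pull a dedicated embedding model for RAG (only needed if Open WebUI's
# embedding backend is set to Ollama rather than the built-in default)
ollama pull nomic-embed-text

# Verify it is available
ollama list
```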

Running Open WebUI with GPU Acceleration

If you are running Ollama with NVIDIA GPU acceleration, Open WebUI benefits automatically because inference happens in Ollama — Open WebUI just sends requests and displays responses. However, if you want Open WebUI’s embedding model (used for RAG) to also run on GPU, you need to pass the GPU device to the Docker container:

# Run with NVIDIA GPU access for Open WebUI's own embedding model
docker run -d \
  -p 3000:8080 \
  --gpus all \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:cuda

Note the :cuda tag instead of :main — this variant includes CUDA libraries for GPU-accelerated embedding. For most users the :main tag is fine since embedding is fast even on CPU, but for heavy document processing workloads the CUDA variant is meaningfully faster.

Multi-User Setup and Access Control

Open WebUI supports multiple user accounts with role-based access — admins can manage models and settings, while regular users can only chat. This makes it practical as a shared local LLM server for a small team or household. The first account created automatically becomes the admin. Additional users register via the sign-up page, and the admin can approve, suspend, or delete accounts from the Admin Panel.

For a team setup on a local network, the workflow is: run Open WebUI on the most powerful machine (the one with the GPU), find that machine’s local IP, share the URL with team members, and have each person create an account. Everyone shares the same Ollama models but has their own conversation history. The admin controls which models are visible to users and can set default parameters per model from the Admin Panel.

Open WebUI vs Other Ollama Frontends

Open WebUI is the most feature-complete local frontend, but it is not the only option. LM Studio has a built-in chat interface that is simpler and requires no Docker, but it only works with models loaded through LM Studio itself. Ollama’s own companion app (on macOS) offers only a minimal chat window, with no conversation history or document support. AnythingLLM is another full-featured alternative that adds more sophisticated RAG pipelines and agent support, at the cost of a more complex setup. For most users who want a ChatGPT-equivalent experience connected to Ollama, Open WebUI is the right choice — it has the best balance of features, active development, and ease of installation. For users who specifically need advanced agent workflows or enterprise features, AnythingLLM is worth evaluating as a complement or alternative.
