Open WebUI: Features, Settings, and Admin Guide

Open WebUI is the most fully-featured chat interface available for local LLMs, but most users only scratch the surface of what it can do. Beyond the basic chat interface, it includes multi-user management with role-based access control, custom system prompts per model, document collections (RAG), web search integration, function calling, model pipeline customisation, and an API that mirrors OpenAI’s. This guide covers the features and settings that make Open WebUI genuinely powerful for both personal and team use.

Installation Quick Recap

# With Ollama running locally
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main

# Access at http://localhost:3000
# First account created becomes admin

User Management and Roles

Open WebUI supports multi-user deployments with three roles: Admin, User, and Pending. The admin account created on first launch has full access to all settings and can create, modify, and delete other users. Regular users have access to chat and document features. Pending accounts are created when sign-up is enabled but require admin approval before they can log in.

To manage users: navigate to Admin Panel (top-right menu) → Users. From here you can invite users by email, change roles, reset passwords, and enable or disable sign-up. For a team deployment where you want everyone to access the same Ollama server, enable sign-up, share the URL, and set new accounts to require approval — this prevents unauthorised access while allowing easy onboarding.

Model Configuration and System Prompts

Each model in Open WebUI can be configured with a custom system prompt, default parameters, and a description. Navigate to Admin Panel → Models. Select any model from the list to edit its configuration. The system prompt set here becomes the default for all new conversations with that model — useful for giving models specific personas or instructions that match their intended use within your team.

You can also create custom models — essentially saved configurations that combine a base Ollama model with a specific system prompt, parameter set, and avatar. This is similar to Ollama’s Modelfile system but managed through the UI. You might create a “Customer Support Agent” custom model with a specific system prompt, a “Code Reviewer” model with different instructions, and a “Writing Assistant” model, all backed by the same Llama 3.2 base model but with distinct configured behaviours.
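For comparison, here is a rough Modelfile equivalent of such a configuration (the model name, prompt, and parameter value are illustrative):

```
# Illustrative Modelfile for a "Code Reviewer" configuration
FROM llama3.2
SYSTEM """You are a code reviewer. Give concise, actionable feedback."""
PARAMETER temperature 0.3
```

This would be registered with `ollama create code-reviewer -f Modelfile`; the Open WebUI equivalent achieves the same result entirely through the browser.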

Document Collections (RAG)

Open WebUI’s RAG feature lets you upload documents and ask questions about them in any conversation. Navigate to Workspace → Documents to manage your document library. Upload PDFs, text files, or Markdown documents. In a conversation, type # followed by the document name to attach it as context for that message.

For team knowledge bases, the document library is shared across all users (in the default configuration). All team members can upload documents and access them in conversations. For organisation-wide document search — company policies, technical documentation, product specs — this creates a lightweight internal knowledge base accessible through the familiar chat interface without any additional RAG infrastructure setup.

Web Search Integration

Open WebUI supports real-time web search in conversations. Configure it in Admin Panel → Settings → Web Search. Choose a search provider (SearXNG for fully local/private search, or DuckDuckGo, Google, Bing with API keys for higher quality). Once configured, toggle web search on in a conversation to have the model retrieve current information before responding. This is useful for questions about recent events, current prices, or any topic where the model’s training data may be outdated.

Conversation Features

Several conversation features are not obvious from the interface. Branching: hover over any assistant message and click the branch icon to fork the conversation from that point — useful for exploring different follow-up directions without losing the original thread. Message regeneration: click the regenerate icon on any response to get an alternative answer using the same prompt. Edit messages: click the edit icon on any user message to modify it and regenerate the response, which effectively lets you correct a prompt without starting a new conversation. Copy as Markdown: the copy button on responses outputs clean Markdown suitable for pasting into notes or documents. Share conversations: the share button generates a public link to a conversation (admin must enable sharing in settings).

Model Arena: Side-by-Side Comparison

Open WebUI includes an Arena mode where you can run the same prompt against two models simultaneously and compare outputs. Navigate to the model selector at the top of a new chat and select Arena. Choose two models, send your prompt, and both responses appear side by side. You can vote on which response was better — Open WebUI records these votes and displays aggregate quality rankings across your team’s comparisons, providing a lightweight model evaluation system built into your normal workflow.

Functions and Pipelines

Open WebUI’s Functions feature allows adding custom Python code that runs as part of the conversation pipeline. Functions can be tools (available to the model for function calling), filters (pre/post-processing of messages), or pipes (custom model implementations). Navigate to Workspace → Functions to add functions from the community library or write your own. This is the most powerful extensibility feature — it enables integrations like calendar lookup, database queries, API calls, and custom model routing without modifying the Open WebUI codebase.
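As a sketch of what a filter looks like, here is a minimal example assuming Open WebUI's inlet/outlet convention for filter functions (exact signatures vary between versions, so treat this as illustrative rather than a drop-in implementation):

```python
# Hypothetical minimal filter: prepends the current date to each user message,
# giving the model a sense of "today" without changing the system prompt.
from datetime import datetime, timezone


class Filter:
    def inlet(self, body: dict) -> dict:
        # Runs before the request reaches the model.
        msgs = body.get("messages", [])
        if msgs and msgs[-1].get("role") == "user":
            stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d")
            msgs[-1]["content"] = f"[{stamp}] " + msgs[-1]["content"]
        return body

    def outlet(self, body: dict) -> dict:
        # Runs after the model responds; pass through unchanged here.
        return body
```

A filter like this applies to every conversation that has it enabled, which is why filters are well suited to cross-cutting concerns such as logging, redaction, or context injection.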

API Access

Open WebUI exposes an OpenAI-compatible API at http://localhost:3000/api. This means any tool that targets the OpenAI API can be pointed at Open WebUI instead, with access to all your configured models, RAG collections, and custom pipelines. Generate an API key in User Settings → Account → API Keys. Use it exactly like an OpenAI API key but pointed at your local endpoint:

from openai import OpenAI

# Point the standard OpenAI client at the local Open WebUI endpoint
client = OpenAI(
    base_url='http://localhost:3000/api',
    api_key='your-open-webui-api-key'
)

response = client.chat.completions.create(
    model='llama3.2',
    messages=[{'role': 'user', 'content': 'Hello'}]
)
print(response.choices[0].message.content)
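For tools that cannot use the SDK, the same endpoint can be reached with nothing beyond the standard library. A sketch, assuming the /api/chat/completions path that the SDK's base_url implies (verify the path against your Open WebUI version); the model name and key are placeholders:

```python
import json
import urllib.request

API_URL = "http://localhost:3000/api/chat/completions"  # assumed endpoint path
API_KEY = "your-open-webui-api-key"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Assemble an OpenAI-style chat completion request for Open WebUI."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )


# With Open WebUI running, send the request and print the reply:
# with urllib.request.urlopen(build_request("llama3.2", "Hello")) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```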

Key Settings to Configure

A few settings in Admin Panel → Settings are worth reviewing when setting up for team use. Default Models: set the default model that appears when users start a new chat, so team members start with the recommended model rather than whatever happens to be first alphabetically. Sign-Up: control whether new users can self-register or require an invitation. Community Sharing: disable sharing of conversations externally if your team uses Open WebUI for internal-only content. Ollama API: verify the Ollama endpoint is correctly configured — if Ollama runs on a different host, update this to point at the correct URL. Image Generation: connect to a local Stable Diffusion or AUTOMATIC1111 instance to enable image generation within chats.

Open WebUI vs Other Local Chat Interfaces

The local LLM chat interface space has several options — Jan, LM Studio, Msty, and others — but Open WebUI occupies a distinct position as the most feature-complete option for team deployments. Jan and LM Studio are desktop-first, single-user applications; they are simpler and easier to set up for personal use but do not support multiple users, shared document libraries, or the extensibility that Open WebUI provides through functions and pipelines. Msty adds multi-model comparison (similar to Open WebUI’s Arena feature) but lacks the team management and RAG capabilities. For an individual developer who wants a quick local chat interface, Jan or LM Studio is the simpler choice. For a small team that wants a shared, managed AI interface with access controls and a knowledge base, Open WebUI has no real competition among open-source options.

Performance Considerations

Open WebUI runs as a Docker container and communicates with Ollama over HTTP. The interface itself is lightweight — a modern web app with a Python FastAPI backend — and adds negligible overhead to Ollama’s inference. The main performance-relevant setting is the connection between Open WebUI and Ollama. On the same machine, use the host.docker.internal hostname to keep traffic local. On a separate server, ensure the network path has sufficient bandwidth — streaming a response at 50 tokens/sec corresponds to only a few hundred bytes/sec of text (somewhat more once JSON streaming overhead is included), which is trivial on any local network but worth keeping in mind on slower connections.

For team deployments with many simultaneous users, the Ollama side (not Open WebUI) is the bottleneck. Each user generating a response consumes GPU compute for the duration of that response. If multiple users send requests simultaneously, Ollama queues them. For teams of 5–10 developers doing occasional queries during the workday, a single GPU running Ollama handles the load without issue — peak simultaneous usage is rarely more than 2–3 users. For larger deployments or teams with intensive usage patterns, configure OLLAMA_NUM_PARALLEL and scale the Ollama backend accordingly.
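A quick back-of-envelope calculation supports the claim that peak concurrency stays low for small teams. The numbers below (10 users, 20 queries per user per day, roughly 30 seconds per response) are illustrative assumptions, not measurements:

```python
# Back-of-envelope: expected concurrent requests for a small team.
# All inputs are assumed values; adjust for your own usage patterns.
users = 10
queries_per_user_per_day = 20
seconds_per_response = 30
workday_seconds = 8 * 3600

total_busy_seconds = users * queries_per_user_per_day * seconds_per_response
avg_concurrency = total_busy_seconds / workday_seconds
print(f"Average concurrent requests: {avg_concurrency:.2f}")
# → Average concurrent requests: 0.21
```

An average well below 1 means most requests arrive at an idle GPU; occasional overlaps of 2–3 simultaneous users are the realistic peak, which matches the sizing guidance above.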

Keeping Open WebUI Updated

Open WebUI releases updates frequently with new features and bug fixes. Because it runs in Docker, updating is clean and does not affect stored data (conversation history, documents, user accounts), all of which lives in the named volume:

# Pull latest image
docker pull ghcr.io/open-webui/open-webui:main

# Stop and remove the old container
docker stop open-webui && docker rm open-webui

# Start with the new image (same run command as initial setup)
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main

The open-webui named volume persists through container removal and recreation, so all user data survives the update. This update process takes about two minutes and is non-destructive, making it practical to keep Open WebUI current without risk to stored data.

Making the Most of Open WebUI in Daily Use

A few habits make Open WebUI significantly more useful day-to-day. First, create a custom model configuration for each distinct use case you have — a research model with a system prompt focused on summarisation and citation, a coding model with a system prompt focused on concise technical responses, a writing model with a prompt focused on tone and structure. Switching between configured models is faster than re-explaining context in every conversation. Second, build your document library proactively — upload documentation, internal guides, and reference materials as you encounter them rather than when you urgently need them. The #document retrieval is only useful when the relevant document is already there. Third, use conversation folders to organise ongoing projects — Open WebUI supports folder organisation in the sidebar, making it practical to maintain separate conversation histories for different projects or work areas without them blending together.

The Bigger Picture

Open WebUI has become the de facto standard interface for local LLM deployments that need more than a single-user chat window. Its combination of multi-user management, document RAG, model configuration, extensibility through functions, and an OpenAI-compatible API makes it the only open-source tool that covers the full spectrum from individual developer use to small-team deployment. The active development pace — with meaningful new features shipping regularly — means that capabilities that required custom development a year ago are now configuration options. For anyone running Ollama for more than personal experimentation, Open WebUI is the natural and most capable interface layer to run alongside it. Installing it takes ten minutes; the return on that investment compounds over time as your document library grows, your model configurations mature, and your team develops workflows built around a shared, persistent AI workspace rather than isolated individual sessions.

The community around Open WebUI has also produced a substantial library of contributed functions, tools, and pipeline integrations — from Jira and Notion connectors to custom voice interfaces and automated workflows. Browsing the community repository periodically surfaces integrations that would otherwise require custom development, making it worth checking whenever you encounter a workflow you wish Open WebUI could handle natively.
