Msty is a local LLM desktop application that takes a different approach from Ollama, LM Studio, and Jan. Rather than focusing on a single chat interface, Msty is built around the concept of parallel conversations — you can run the same prompt against multiple models simultaneously and compare the outputs side by side. It also supports branching conversations, conversation folders, and a knowledge layer that lets you attach documents and web content to any conversation. This guide covers what Msty does well and how to get it set up.
Installation
Download Msty from msty.app — it is free for local model use and available for macOS, Windows, and Linux. The installer is self-contained. Msty bundles its own inference backend, so you do not strictly need Ollama installed, but it also supports connecting to an existing Ollama instance if you already have models downloaded there.
Connecting to Ollama
If you already use Ollama, connect Msty to it rather than re-downloading models. Open Settings → Model Providers → Ollama, enter http://localhost:11434 as the endpoint, and click Connect. All your existing Ollama models appear immediately in Msty’s model selector. This is the fastest way to get started — your existing model library is available in Msty with no additional downloads.
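If you want to confirm the endpoint is reachable before (or instead of) clicking Connect, a quick script against Ollama's standard /api/tags route lists the models Msty will pick up. This is plain Ollama, nothing Msty-specific; the URL below is Ollama's default.

```python
# Minimal sketch (stdlib only): check that the local Ollama instance Msty
# will connect to is running, and list the models it has downloaded.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"

def list_ollama_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Return the names of models the local Ollama instance has downloaded."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        data = json.load(resp)
    return [m["name"] for m in data.get("models", [])]

if __name__ == "__main__":
    try:
        models = list_ollama_models()
        print(f"Ollama is running; {len(models)} models available:")
        for name in models:
            print(" -", name)
    except OSError as exc:
        print(f"Could not reach Ollama at {OLLAMA_URL}: {exc}")
```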
The Parallel Comparison Feature
Msty’s most distinctive feature is multi-model comparison. In any conversation, click the + icon next to the model selector to add a second (or third) model to the chat. When you send a message, Msty runs it against all selected models simultaneously and displays the responses side by side. This is genuinely useful for a few specific scenarios: deciding which model to use for a task by seeing their outputs on a real example, comparing a local model against a cloud model to assess quality, and catching cases where models disagree on factual questions (a useful signal that the question is uncertain or contested).
Knowledge: Attaching Documents to Conversations
Msty calls its RAG feature “Knowledge”. You can create Knowledge collections — groups of documents, URLs, or text notes — and attach them to any conversation. The collection is embedded and stored locally, and Msty retrieves relevant chunks when you ask questions. Unlike AnythingLLM’s workspace model, Msty’s Knowledge is attached per-conversation rather than being a shared resource — each conversation can have its own document context, which makes it easy to work on multiple projects simultaneously without worrying about document sets bleeding across conversations.
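Msty does not document its Knowledge internals, but the embed-and-retrieve pattern it describes looks roughly like the sketch below: chunks are turned into vectors, stored locally, and the ones closest to your question are pulled into context. The sketch uses Ollama's /api/embeddings endpoint and assumes an embedding model such as nomic-embed-text has been pulled; it illustrates the general mechanism, not Msty's actual implementation.

```python
# Illustrative embed-and-retrieve sketch. NOT Msty's internal code; it uses
# Ollama's /api/embeddings endpoint and assumes an embedding model such as
# nomic-embed-text is available (`ollama pull nomic-embed-text`).
import json
import math
import urllib.request

OLLAMA_URL = "http://localhost:11434"
EMBED_MODEL = "nomic-embed-text"  # assumption: any local embedding model works

def embed(text: str) -> list[float]:
    """Get an embedding vector for a piece of text from Ollama."""
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/embeddings",
        data=json.dumps({"model": EMBED_MODEL, "prompt": text}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Index a few document chunks, then retrieve the most relevant one for a question.
chunks = [
    "Invoices are due within 30 days of the billing date.",
    "The VPN requires two-factor authentication for all staff.",
    "Quarterly reports are published on the second Friday of the month.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

question = "When do invoices have to be paid?"
q_vec = embed(question)
best_chunk, _ = max(index, key=lambda item: cosine(q_vec, item[1]))
print("Most relevant chunk:", best_chunk)
```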
To add a Knowledge collection: click the book icon in the conversation sidebar, create a new collection, upload files or paste URLs, and wait for embedding to complete. The collection then appears as an option when you start or continue any conversation. You can reuse the same collection across multiple conversations without re-uploading.
Conversation Branching
Msty supports conversation branching — from any point in a conversation you can fork a new branch to explore a different direction without losing the original thread. This is useful when you want to try different approaches to a problem (ask the model to take different tones, explore different solutions) and compare the paths side by side. Each branch is visible in the conversation tree on the left sidebar.
Prompts Library
Msty has a built-in prompts library where you can save and organise system prompts and message templates. This is similar to Ollama’s Modelfile system prompt but more lightweight — you save a prompt in the library, give it a name, and can apply it to any new conversation with one click rather than creating a named model. For users who work with many different task types (writing, coding, analysis, summarisation) and want a different system prompt for each, the prompts library is a convenient alternative to managing multiple Modelfiles.
Cloud Model Integration
Unlike purely local tools, Msty supports connecting to cloud LLM providers alongside local models. You can add OpenAI, Anthropic, Google Gemini, Groq, and others by entering API keys in Settings → Model Providers. This means the parallel comparison feature works across local and cloud models — you can run a prompt against Llama 3.1 8B locally and Claude Sonnet in the same window, which is a practical way to calibrate how much quality difference the cloud model provides for your specific use cases.
Msty vs Other Local LLM Apps
Msty occupies a specific niche in the local LLM landscape. It is more opinionated than Ollama (GUI-first, not scriptable) and less technically configurable than LM Studio (no detailed GPU layer controls). Its unique strengths are the multi-model comparison and conversation branching features, which no other mainstream local LLM app offers as smoothly. The Knowledge system is solid but less powerful than AnythingLLM’s workspace model for large document collections. Jan is simpler and more privacy-focused. Open WebUI has better multi-user support.
The user Msty is best for is someone who regularly wants to compare model outputs — either to evaluate models, to explore different approaches to a problem, or to use local and cloud models together in a single workflow. If you primarily use a single model and just want a clean chat interface, Jan or Open WebUI are simpler choices. If model comparison and conversation organisation are important to your workflow, Msty is the most polished tool available for those specific use cases.
Performance and Resource Usage
When running parallel comparisons with multiple models simultaneously, Msty’s resource usage multiplies accordingly — two models generating at the same time use roughly twice the VRAM, and because they contend for the same GPU, each produces tokens noticeably slower than it would running alone. On 8GB VRAM, running two 7B models in parallel is tight and may cause one or both to fall back to CPU. For parallel comparisons on constrained hardware, use smaller models (3B range) or compare a local model against a cloud model (cloud inference uses no local VRAM). The comparison feature is most practical on machines with 16GB+ VRAM or Apple Silicon with 32GB+ unified memory, where two 7B models can comfortably run simultaneously.
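For a rough sense of why two 7B models are tight on 8GB, a back-of-envelope estimate helps. The bits-per-weight and overhead figures below are assumptions (a Q4-style quantisation and a modest context window), not measurements of Msty itself.

```python
# Back-of-envelope VRAM estimate for running models side by side.
# The bits-per-weight and overhead figures are rough assumptions
# (Q4_K_M-style quantisation, modest context), not measurements.
def estimate_vram_gb(params_billions: float,
                     bits_per_weight: float = 4.5,
                     overhead_gb: float = 1.2) -> float:
    """Approximate VRAM in GB: weights at the given quantisation plus a
    flat allowance for KV cache, activations, and runtime overhead."""
    weights_gb = params_billions * 1e9 * bits_per_weight / 8 / 1e9
    return weights_gb + overhead_gb

single_7b = estimate_vram_gb(7)
two_7b = 2 * single_7b
two_3b = 2 * estimate_vram_gb(3)

print(f"One 7B model:  ~{single_7b:.1f} GB")   # ~5.1 GB -- fits in 8 GB
print(f"Two 7B models: ~{two_7b:.1f} GB")      # ~10.3 GB -- over an 8 GB budget
print(f"Two 3B models: ~{two_3b:.1f} GB")      # ~5.8 GB -- comfortable on 8 GB
```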
Practical Workflows Where Msty Shines
A few specific workflows show Msty’s comparison feature at its best. The first is model evaluation for a new task. When you are deciding which model to use for a new type of work — say you want to start using a local LLM for drafting client emails — run five representative examples through Llama 3.1 8B, Qwen2.5 7B, and Mistral 7B simultaneously. In twenty minutes you have side-by-side outputs for all your test cases and can make an informed choice rather than guessing based on benchmarks that may not reflect your specific use case. This is faster and more informative than running the models sequentially and trying to remember how each one performed.
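If you want to reproduce this kind of evaluation outside the GUI, the same comparison can be scripted against the Ollama API. The sketch below sends one prompt to several models and prints the answers together for side-by-side reading; the model tags are examples, and unlike Msty it queries the models one after another rather than in parallel.

```python
# Send one prompt to several Ollama models and print the answers together.
# Model tags are examples -- use whatever you have pulled locally.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"
MODELS = ["llama3.1:8b", "qwen2.5:7b", "mistral:7b"]  # example model tags
PROMPT = "Draft a short, polite email telling a client their invoice is overdue."

def generate(model: str, prompt: str) -> str:
    """Run a single non-streaming completion against Ollama."""
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps({"model": model, "prompt": prompt, "stream": False}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

for model in MODELS:
    print(f"\n=== {model} ===")
    print(generate(model, PROMPT))
```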
The second workflow is calibrating local vs cloud quality. If you are considering switching from a cloud API to a local model for a production application, run a week’s worth of representative queries through both and compare outputs in Msty. This gives you a concrete, task-specific quality comparison rather than relying on abstract benchmark scores. The cases where the local model falls short are often identifiable patterns — a specific type of reasoning, a particular output format, an edge case in your domain — that can be addressed with prompt engineering or a targeted Modelfile before making the switch.
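A scripted version of this batch comparison might look like the sketch below, which assumes a plain-text file of representative queries (one per line), the openai Python package, and an OPENAI_API_KEY in the environment; the model names on both sides are examples.

```python
# Batch comparison sketch: run saved queries through a local Ollama model and
# a cloud model, writing both answers to a CSV for review. Assumes a file
# "queries.txt" with one query per line and an OPENAI_API_KEY in the environment.
import csv
import json
import urllib.request

from openai import OpenAI

LOCAL_MODEL = "llama3.1:8b"   # example local model tag
CLOUD_MODEL = "gpt-4o-mini"   # example cloud model name
client = OpenAI()             # reads OPENAI_API_KEY from the environment

def local_answer(prompt: str) -> str:
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps({"model": LOCAL_MODEL, "prompt": prompt, "stream": False}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

def cloud_answer(prompt: str) -> str:
    result = client.chat.completions.create(
        model=CLOUD_MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    return result.choices[0].message.content

with open("queries.txt") as f, open("comparison.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["query", "local", "cloud"])
    for line in f:
        query = line.strip()
        if query:
            writer.writerow([query, local_answer(query), cloud_answer(query)])
```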
The third workflow is using Msty’s branching to explore solution approaches. When solving a complex problem — debugging a subtle issue, designing a system architecture, writing a difficult piece of communication — fork the conversation at the point where the approach diverges and explore two or three paths simultaneously. This is more structured than just running multiple separate chats because the shared context up to the branch point is preserved in both branches, and you can switch between them to compare progress without losing context.
Setting Up Your First Knowledge Collection
Knowledge collections in Msty work best when you match the scope of the collection to the scope of questions you will ask. A collection containing all your company’s documentation will give worse retrieval precision than a collection containing just the documentation relevant to your current project, because the retrieval model has to distinguish relevant from irrelevant content within the collection. Start with small, focused collections — one collection per project or topic — and only consolidate into larger collections if you find yourself frequently needing to query across them.
For document formats, Msty handles PDFs, text files, Markdown, and web URLs well. For long PDFs with dense technical content (manuals, research papers), consider splitting them into logical sections before uploading — Msty’s automatic chunking may split the document at awkward points that break the logical structure. For web content that updates frequently, re-adding the URL refreshes the cached content with the current page version. The collection shows the last-updated timestamp for each source, making it easy to identify stale content that needs refreshing.
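If your source documents are in Markdown (or can be converted to it), a small helper like the one below splits a long file at top-level headings so each section can be uploaded separately. The heading level and filenames are assumptions; adjust them to how your documents are actually structured.

```python
# Split a long Markdown file at top-level "# " headings into separate files,
# so each section can be added to a Knowledge collection on its own.
# Heading level and output naming are assumptions -- adjust as needed.
import re
from pathlib import Path

def split_by_heading(path: str, out_dir: str = "sections") -> list[Path]:
    text = Path(path).read_text(encoding="utf-8")
    # Split just before each line that starts with a single '#' heading.
    parts = re.split(r"\n(?=# )", text)
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    written = []
    for i, part in enumerate((p for p in parts if p.strip()), start=1):
        title = part.splitlines()[0].lstrip("# ").strip() or f"section-{i}"
        safe = re.sub(r"[^a-zA-Z0-9-]+", "-", title.lower()).strip("-")
        dest = out / f"{i:02d}-{safe}.md"
        dest.write_text(part, encoding="utf-8")
        written.append(dest)
    return written

if __name__ == "__main__":
    for f in split_by_heading("manual.md"):
        print("wrote", f)
```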
Keyboard Shortcuts Worth Learning
Msty has a good set of keyboard shortcuts that speed up the workflows described above. The most useful are: Cmd+K (Mac) or Ctrl+K (Windows) to open the command palette and quickly switch between conversations or models; Cmd+Enter to send a message; Cmd+Shift+B to branch the current conversation; and Cmd+P to open the prompts library and apply a saved system prompt. Spending five minutes learning these shortcuts makes the multi-model comparison workflow significantly faster than using only the mouse interface — the ability to quickly fork a conversation, apply a different system prompt, and run the comparison again is where Msty’s interface really clicks once you have the shortcuts internalised.
Getting Started: The Fastest Path
If you already have Ollama installed, the fastest way to try Msty is: download and install the app, connect it to Ollama in settings, open a new conversation, add a second model using the + icon, and type a prompt you normally use. Within two minutes you will have a side-by-side comparison of two models on a real task from your own workflow — which is a much more informative first experience than any benchmark score or review article. Most users either immediately see the comparison value and start using Msty regularly, or decide their use case does not need it and go back to a simpler tool. Either outcome is useful information. The download is small and the setup is reversible, so there is little cost to trying it alongside whatever local LLM tool you are currently using.