How to Use Local AI with Obsidian: Smart Notes Without the Cloud

Obsidian is a popular note-taking app built around a local folder of Markdown files. Its plugin ecosystem includes several integrations with local AI — particularly Ollama — that add AI-powered features without any cloud dependency. This guide covers the most useful Obsidian AI plugins, how to connect them to Ollama, and practical workflows for AI-assisted note-taking.

Why Local AI and Obsidian Are a Natural Fit

Obsidian stores notes as plain Markdown files on your machine. This aligns naturally with local AI: your notes are already private, stored locally, and never sync to any cloud service unless you choose to sync them. Adding a cloud AI assistant to Obsidian would undermine this privacy advantage — every note you process with AI would be sent to a third-party server. Local AI via Ollama keeps the entire workflow on your machine: notes, embeddings, summaries, and the AI model itself all live in your local file system and VRAM.

Ollama for Obsidian (Plugin)

The most direct integration is the “Ollama” community plugin for Obsidian. Install it via Settings → Community Plugins → Browse, then search for “Ollama”. Configuration is minimal: set the Ollama host (default http://localhost:11434) and choose your model. The plugin adds a command palette option that sends the current note’s content to the model and inserts the response inline. Use cases: summarise the active note, extract action items from meeting notes, ask a question about the note’s content, or rewrite a section in a different style.
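
If you want to see roughly what the plugin does under the hood, or script the same behaviour outside Obsidian, the equivalent raw call is a single POST to Ollama’s /api/generate endpoint. A minimal sketch, assuming llama3.2 is pulled and the note file name is hypothetical:

```python
# Summarise a note via Ollama's /api/generate endpoint, roughly
# the call the Ollama plugin makes for the active note.
import requests

note_text = open("meeting-2024-01-15.md", encoding="utf-8").read()  # hypothetical note

response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.2",
        "prompt": f"Summarise the following note in three bullet points:\n\n{note_text}",
        "stream": False,  # return one JSON object instead of a token stream
    },
    timeout=120,
)
print(response.json()["response"])  # the generated summary
```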

Smart Connections Plugin

Smart Connections is arguably the most capable AI plugin for Obsidian. It embeds all your notes using a local embedding model and builds a semantic search index over your entire vault. Features include: a sidebar that shows notes semantically related to the one you are reading, a chat interface that answers questions using your notes as context (RAG over your vault), and automatic link suggestions based on semantic similarity. To use Ollama’s nomic-embed-text model for embeddings, set the embedding model to Ollama in the plugin settings and specify http://localhost:11434 as the API endpoint.

```
# Smart Connections Settings
Embedding Model: Ollama (nomic-embed-text)
Chat Model: Ollama (llama3.2)
API Base URL: http://localhost:11434
# Then run: "Smart Connections: Refresh All"
# This indexes your entire vault — may take minutes for large vaults
```
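
Under the hood, each note is embedded through Ollama’s embeddings endpoint. A minimal sketch of that call, useful for sanity-checking that nomic-embed-text is responding before you kick off a full vault index:

```python
# Request an embedding for a piece of note text from
# nomic-embed-text via Ollama's /api/embeddings endpoint.
import requests

resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "nomic-embed-text", "prompt": "Quarterly planning notes for project X"},
    timeout=60,
)
vector = resp.json()["embedding"]
print(len(vector))  # 768 dimensions for nomic-embed-text
```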

Text Generator Plugin

Text Generator is a versatile AI writing plugin that works with Ollama’s OpenAI-compatible endpoint. It supports prompt templates, inline generation, batch processing, and custom commands. Configure it by setting the API to OpenAI-compatible, base URL to http://localhost:11434/v1, model name to your Ollama model, and API key to any non-empty string (Ollama ignores it). Useful for: generating first drafts from outlines, completing unfinished sentences, expanding bullet points into paragraphs, and applying prompt templates to notes.
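
Because Text Generator talks to the OpenAI-compatible endpoint, you can verify that endpoint independently with the standard openai Python client before wiring up the plugin. A quick sketch, assuming the openai package is installed and llama3.2 is pulled:

```python
# Verify Ollama's OpenAI-compatible endpoint, the same API
# surface Text Generator is configured against.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="ollama",  # required by the client, ignored by Ollama
)

completion = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Expand this bullet into a paragraph: local-first note-taking"}],
)
print(completion.choices[0].message.content)
```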

Practical Workflow: Smart Meeting Notes

A workflow that combines Obsidian’s note structure with Ollama summarisation: record meeting notes in a standard template, then use a Templater plugin command that calls the Ollama plugin to extract action items, decisions, and key points from the raw notes. The processed output is inserted into a separate section of the note automatically, giving you structured meeting minutes from rough notes in seconds.

```markdown
<!-- Meeting note template (Templater) -->
## Raw Notes
<!-- Paste or type notes here -->

## AI Summary
<!-- Cursor here, then run: Ollama: Summarise note -->

## Action Items
<!-- Run: Ollama: Extract action items -->
```
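
If you prefer to run the extraction outside Obsidian, for example over a folder of past meeting notes, the same step can be scripted. A hedged sketch: the vault path, folder name, and the "## Action Items (AI)" marker heading are assumptions for illustration, not plugin requirements.

```python
# Batch-extract action items from past meeting notes: read the
# "## Raw Notes" section of each file and append a generated
# "## Action Items (AI)" section.
from pathlib import Path
import requests

MEETINGS = Path.home() / "Vault" / "Meetings"  # hypothetical vault location

def extract_actions(raw_notes: str) -> str:
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3.2",
            "prompt": "List the action items in these meeting notes, one per line:\n\n" + raw_notes,
            "stream": False,  # single JSON response, no token stream
        },
        timeout=120,
    )
    return resp.json()["response"]

for note in MEETINGS.glob("*.md"):
    text = note.read_text(encoding="utf-8")
    if "## Raw Notes" not in text or "## Action Items (AI)" in text:
        continue  # no raw section, or already processed
    raw = text.split("## Raw Notes", 1)[1].split("## AI Summary", 1)[0]
    note.write_text(text + "\n## Action Items (AI)\n" + extract_actions(raw) + "\n", encoding="utf-8")
```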

Building a Personal Knowledge Base with RAG

With Smart Connections indexing your vault and Ollama serving both embeddings and chat, you have a fully local personal knowledge assistant. Ask it questions like “What did I write about project X last month?” or “Summarise everything in my vault about topic Y” and it retrieves relevant notes, synthesises the information, and answers in your chat interface — with citations pointing to the source notes. This turns your Obsidian vault from a collection of notes you have to manually search into a queryable knowledge base you can converse with.
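
Conceptually, the retrieval step is simple: embed the question, rank note embeddings by cosine similarity, and hand the top matches to the chat model. A stripped-down sketch of that loop (Smart Connections’ actual implementation differs; the in-memory index and note paths are assumptions):

```python
# Minimal RAG loop over pre-computed note embeddings: embed the
# query, rank notes by cosine similarity, answer from the top matches.
import math
import requests

def embed(text: str) -> list[float]:
    resp = requests.post(
        "http://localhost:11434/api/embeddings",
        json={"model": "nomic-embed-text", "prompt": text},
        timeout=60,
    )
    return resp.json()["embedding"]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# index maps note_path -> (embedding, note_text); assumed built beforehand
def answer(question: str, index: dict, top_k: int = 5) -> str:
    q = embed(question)
    ranked = sorted(index.items(), key=lambda kv: cosine(q, kv[1][0]), reverse=True)
    context = "\n\n".join(f"[{path}]\n{text}" for path, (_, text) in ranked[:top_k])
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3.2",
            "prompt": f"Answer using only these notes, citing paths:\n\n{context}\n\nQuestion: {question}",
            "stream": False,
        },
        timeout=180,
    )
    return resp.json()["response"]
```

A plain linear scan over a few thousand 768-dimensional vectors takes milliseconds, which is why no dedicated vector database is needed at personal-vault scale.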

Hardware Recommendations

For Obsidian AI workflows, a small model in the 3–8B range (llama3.2 3B, Qwen2.5 7B, Llama 3.1 8B) is usually sufficient — most note summarisation and Q&A tasks do not require larger models. nomic-embed-text is the recommended embedding model: 768 dimensions, good semantic quality, and fast to run even on CPU. Initial vault indexing with Smart Connections can be slow on CPU for large vaults (1,000+ notes) — plan to run it once overnight if your vault is large. Subsequent updates are incremental and fast. The day-to-day memory footprint is modest: a 7B model at Q4_K_M (~4GB VRAM) plus the small embedding model is the main resource requirement.

Getting Started

Start with the Ollama plugin for immediate note summarisation, then add Smart Connections if you want semantic search and vault-wide Q&A. Both plugins require only a running Ollama instance and the appropriate models pulled — `ollama pull llama3.2` and `ollama pull nomic-embed-text`. The combination transforms Obsidian from a static note collection into a dynamic knowledge system where your accumulated writing becomes a resource you can query and build on, with all processing staying local on your machine.

The Smart Connections Embedding Workflow

When you first enable Smart Connections with local embeddings, it processes every note in your vault — sending each note’s content to nomic-embed-text running locally via Ollama and storing the resulting 768-dimensional vector alongside the note. This initial indexing pass takes roughly 1–3 seconds per note depending on note length and your hardware. For a vault of 500 notes, expect 10–30 minutes for the initial index build. After that, only modified and new notes are re-indexed, making daily updates nearly instantaneous.

The resulting semantic search capability is qualitatively different from Obsidian’s built-in text search. Text search finds notes containing specific words. Semantic search finds notes related to a concept even when they use different vocabulary. Searching “project deadline pressure” finds your notes about stress and time management even if they never use those exact words. For large vaults where you have forgotten exactly what you wrote or how you phrased it, semantic search dramatically improves recall.

Vault Q&A in Practice

The chat interface in Smart Connections lets you ask questions that are answered from your own notes rather than from the model’s training data. This is particularly useful for personal knowledge management: “What did I conclude about X when I researched it last year?”, “What are all the action items I have marked as pending across my project notes?”, “Summarise everything I have written about topic Y”. The model retrieves the most semantically relevant notes, synthesises the content, and answers — with links back to the source notes so you can navigate directly to the original context.

The quality of answers depends on the quality of your notes. Well-structured notes with clear headings and explicit conclusions produce much better RAG answers than stream-of-consciousness writing. If you find the Q&A quality disappointing, improving how you structure your notes (adding explicit summaries, conclusions, and key points) often helps more than changing the AI model.

Privacy Implications of AI Plugins

Before installing any Obsidian AI plugin, verify that it uses local inference rather than cloud APIs. Some plugins that appear to support local AI actually send data to cloud endpoints by default, with local models only as an opt-in alternative. With the Ollama plugin and Smart Connections configured for local models, no note content leaves your machine. If you have strict privacy requirements, verify this by checking your network traffic with a tool like Little Snitch (macOS) or Wireshark. Three things to check in the configuration: the embedding model endpoint points to localhost, the chat model endpoint points to localhost, and no telemetry or usage data is sent to external servers. Both plugins recommended in this article support fully local operation when configured as described.
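
One lightweight check that complements network monitoring: plugin settings live as JSON files inside the vault’s .obsidian/plugins/ folder, so you can scan them for any configured URL that is not local. A hedged sketch (most plugins persist settings as data.json, but this is a convention, not a guarantee; the vault path is an assumption):

```python
# Scan Obsidian plugin settings for configured URLs that do not
# point at localhost.
import re
from pathlib import Path

VAULT = Path.home() / "Vault"  # hypothetical vault location

for cfg in (VAULT / ".obsidian" / "plugins").glob("*/data.json"):
    for url in re.findall(r"https?://[^\s\"']+", cfg.read_text(encoding="utf-8")):
        if "localhost" not in url and "127.0.0.1" not in url:
            print(f"{cfg.parent.name}: non-local endpoint {url}")
```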

Combining Obsidian with Other Local AI Tools

Obsidian’s local Markdown vault integrates naturally with other local AI tools beyond the plugins. Your notes are plain text files that any script can read — a Python script using Ollama for batch processing can read Markdown files from the vault directory, process them, and write results back as new notes. The Whisper-based audio transcription pipeline from this blog’s earlier article can transcribe meeting recordings and write the transcript as a new Obsidian note automatically. A local RAG pipeline using your vault as the document corpus can serve as an external knowledge base for other applications. Obsidian’s design philosophy — plain files, local storage, open format — makes it the ideal hub for a local AI workflow that spans multiple tools and scripts, all operating on the same underlying note files without any cloud dependency.
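
As a concrete instance of that scripting pattern, here is a hedged sketch that walks the vault, summarises each note, and writes the result back as a companion note. The vault path and the "(summary)" naming scheme are assumptions for illustration:

```python
# Batch-summarise every note in a vault, writing each summary back
# as a new "<name> (summary).md" note alongside the original.
from pathlib import Path
import requests

VAULT = Path.home() / "Vault"  # hypothetical vault location

for note in VAULT.rglob("*.md"):
    if note.stem.endswith("(summary)"):
        continue  # skip notes this script already produced
    summary_note = note.with_name(f"{note.stem} (summary).md")
    if summary_note.exists():
        continue  # already summarised on a previous run
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3.2",
            "prompt": "Summarise this note in five sentences:\n\n" + note.read_text(encoding="utf-8"),
            "stream": False,
        },
        timeout=120,
    )
    summary_note.write_text(resp.json()["response"], encoding="utf-8")
```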

Limitations and Workarounds

The main limitation of Obsidian AI plugins is latency — a local 7B model takes 5–20 seconds to generate a summary, which feels slow compared to the near-instant responses of cloud-based writing assistants. For interactive use during active writing sessions, this latency can interrupt flow. The practical workaround is to use AI assistance as an asynchronous step rather than a synchronous one: finish writing a note, then trigger the summarisation or Q&A as a background task while you move to the next note. Treating AI assistance as a processing step rather than a real-time collaborator matches the actual performance characteristics of local inference and leads to a workflow that feels natural rather than frustrating. For the Smart Connections semantic search, results are near-instantaneous after indexing — only the chat generation step has significant latency.
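
One way to make the asynchronous pattern concrete: fire the summarisation off on a background thread and collect the result when it is ready, instead of blocking while you write. A minimal sketch using Python’s standard library (the placeholder note text is an assumption):

```python
# Run a slow summarisation call in the background so note-taking
# can continue while the local model generates.
from concurrent.futures import ThreadPoolExecutor
import requests

def summarise(text: str) -> str:
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3.2", "prompt": "Summarise:\n\n" + text, "stream": False},
        timeout=120,
    )
    return resp.json()["response"]

executor = ThreadPoolExecutor(max_workers=1)
future = executor.submit(summarise, "...text of the note you just finished...")
# ...keep writing the next note; collect the summary when it is done...
print(future.result())
```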

The second limitation is context window size for vault-wide Q&A. Smart Connections retrieves the most relevant note chunks but cannot include your entire vault in a single prompt — the retrieved context is typically 5–10 note snippets. For questions that require synthesising information spread across many notes, the answers may be incomplete. Models with larger context windows (Mistral Nemo 12B and Llama 3.3 70B both advertise 128K-token contexts) improve this but do not eliminate it for very large vaults. Regular vault hygiene — consolidating related notes, writing explicit summaries, and maintaining an index note for important topics — helps the retrieval system find the right context more reliably than relying solely on model size.

Getting Started Today

Pull two models — `ollama pull llama3.2` and `ollama pull nomic-embed-text` — then install the Ollama plugin and Smart Connections from Obsidian’s community plugin browser. Configure both to point at http://localhost:11434 and trigger your first note summarisation. The initial Smart Connections vault index takes time for large vaults, but once complete the semantic search and vault chat are immediately usable. Start by asking it questions about topics you have written about extensively — the answers reveal both the capability of the system and areas where your notes could be better structured for retrieval. This feedback loop between AI-assisted retrieval and note quality improvement is one of the more valuable side effects of adding local AI to an Obsidian workflow.

The combination of Obsidian’s local-first design philosophy and Ollama’s local inference capability creates something genuinely useful: a personal knowledge system where AI enhances rather than replaces your thinking, all running on your own hardware, with your data staying exactly where it belongs — on your own machine, under your own control, with no compromise on privacy and no ongoing cost.
