How to Use Ollama with Jupyter Notebook

Jupyter Notebook is the standard environment for data science, machine learning research, and exploratory Python work. Connecting it to Ollama gives you an interactive local AI assistant that understands your code and data — you can ask questions about dataframes, generate visualisation code, explain statistical results, and iterate on analysis with a local LLM, all without leaving your notebook or sending data to a cloud API. This guide covers the core integration patterns: calling Ollama directly from notebook cells, building a chat widget with ipywidgets, using Ollama for code explanation and generation, and integrating it into data analysis workflows.

The notebook environment is particularly well-suited for LLM integration because cells execute incrementally — you can build up context gradually, inspect intermediate results, and refine prompts interactively in a way that scripted applications cannot match. Ollama’s local operation means there are no API rate limits or costs that would discourage this kind of iterative experimentation.

Setup

Install the required packages in your notebook environment:

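pip install httpx ipywidgets
# Only needed on old classic Notebook installs (5.2 or earlier); newer versions enable the widgets extension automatically
jupyter nbextension enable --py widgetsnbextension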

Make sure Ollama is running with at least one model pulled. For data science work, llama3.2 is a good general-purpose choice. For code-heavy notebooks, qwen2.5-coder:7b generates more accurate Python and handles pandas, numpy, and matplotlib idioms better.
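Before running the cells below, it can help to confirm that the model you plan to use has actually been pulled. The following is a minimal sketch, assuming Ollama's /api/tags endpoint returns a JSON object with a models list whose entries carry a name such as llama3.2:latest. The helper names model_available and fetch_tags are hypothetical, and the sample response is hand-written for illustration.

```python
import json
import urllib.request

def model_available(tags: dict, wanted: str) -> bool:
    """Return True if `wanted` appears in an /api/tags style response."""
    names = {m.get("name", "") for m in tags.get("models", [])}
    # "llama3.2" and "llama3.2:latest" refer to the same model tag
    candidates = {wanted} if ":" in wanted else {wanted, f"{wanted}:latest"}
    return bool(names & candidates)

def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Fetch the list of local models (requires a running Ollama server)."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=10) as resp:
        return json.load(resp)

# Offline illustration with a hand-written response of the assumed shape
sample = {"models": [{"name": "llama3.2:latest"}, {"name": "qwen2.5-coder:7b"}]}
print(model_available(sample, "llama3.2"))
print(model_available(sample, "mistral"))
```

In a notebook with Ollama running, model_available(fetch_tags(), "llama3.2") performs the same check against the live server.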

Basic Cell-Level Queries

The simplest integration is a helper function you can call from any notebook cell:

import httpx, json
from IPython.display import display, Markdown

OLLAMA_URL = "http://localhost:11434"
MODEL = "llama3.2"

def ask(prompt: str, model: str = MODEL) -> str:
    """Send a prompt to Ollama and display the response as Markdown."""
    with httpx.Client(timeout=120) as client:
        resp = client.post(
            f"{OLLAMA_URL}/api/chat",
            json={"model": model, "messages": [{"role": "user", "content": prompt}], "stream": False}
        )
        reply = resp.json()["message"]["content"]
    display(Markdown(reply))
    return reply

# Usage in any cell:
ask("Explain the difference between loc and iloc in pandas")
ask("Write a Python function to calculate rolling z-scores")

Displaying the response with display(Markdown(reply)) renders code blocks, bold text, and bullet points properly in the notebook output area — much more readable than a raw string. The function also returns the reply so you can assign it to a variable and use it programmatically in subsequent cells.
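The ask helper assumes every response contains a message field. When something goes wrong on the server side (for example, the requested model has not been pulled), Ollama typically answers with a JSON body carrying an error field instead, and the lookup fails with an unhelpful KeyError. A small sketch of a more defensive parser follows; extract_reply is a hypothetical helper name and the error shape shown is an assumption.

```python
def extract_reply(payload: dict) -> str:
    """Pull the assistant text out of a non-streaming /api/chat response."""
    if "error" in payload:
        # Assumed error shape: {"error": "<message>"}
        raise RuntimeError(f"Ollama returned an error: {payload['error']}")
    try:
        return payload["message"]["content"]
    except (KeyError, TypeError) as exc:
        raise RuntimeError(f"Unexpected response shape: {payload!r}") from exc

print(extract_reply({"message": {"role": "assistant", "content": "Use loc for labels."}}))
```

Inside ask, the line that reads the reply could then become reply = extract_reply(resp.json()).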

Streaming Output in Notebook Cells

For long responses, stream tokens as they arrive and re-render the cell output on each update:

from IPython.display import clear_output

def ask_stream(prompt: str, model: str = MODEL):
    """Stream Ollama response token by token into the cell output."""
    full = ""
    with httpx.Client(timeout=120) as client:
        with client.stream(
            "POST", f"{OLLAMA_URL}/api/chat",
            json={"model": model, "messages": [{"role": "user", "content": prompt}], "stream": True}
        ) as resp:
            for line in resp.iter_lines():
                if not line:
                    continue
                chunk = json.loads(line)
                token = chunk.get("message", {}).get("content", "")
                full += token
                clear_output(wait=True)
                display(Markdown(full))
                if chunk.get("done"):
                    break
    return full

ask_stream("Explain transformer attention mechanisms in detail")

The clear_output(wait=True) and re-rendering approach gives a smooth typewriter effect in the notebook. The wait=True parameter prevents flickering by only clearing the output immediately before the next render, rather than briefly showing a blank cell. This pattern works in both classic Jupyter Notebook and JupyterLab.
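The streaming loop mixes two concerns: parsing the newline-delimited JSON chunks and rendering them. Separating the parsing into a generator makes it possible to check that part without a running server. This is a sketch under the assumption that each line is a JSON object with message.content and a done flag, as in the loop above; iter_tokens is a hypothetical name and the sample lines are hand-written.

```python
import json
from typing import Iterable, Iterator

def iter_tokens(lines: Iterable[str]) -> Iterator[str]:
    """Yield content fragments from newline-delimited JSON chat chunks."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines between chunks
        chunk = json.loads(line)
        token = chunk.get("message", {}).get("content", "")
        if token:
            yield token
        if chunk.get("done"):
            break

# Hand-written lines in the assumed stream format
sample_lines = [
    '{"message": {"role": "assistant", "content": "Hel"}, "done": false}',
    "",
    '{"message": {"role": "assistant", "content": "lo"}, "done": false}',
    '{"message": {"role": "assistant", "content": ""}, "done": true}',
]
print("".join(iter_tokens(sample_lines)))
```

In ask_stream, the body of the with block could iterate over iter_tokens(resp.iter_lines()) and keep only the display logic.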

An Interactive Chat Widget

Build a persistent chat interface inside the notebook using ipywidgets:

import ipywidgets as widgets
from IPython.display import display

class NotebookChat:
    def __init__(self, model: str = MODEL, system: str = ""):
        self.model = model
        self.history = ([{"role":"system","content":system}] if system else [])

        self.out = widgets.Output(layout={"border":"1px solid #ccc","min_height":"200px","padding":"10px"})
        self.inp = widgets.Text(placeholder="Ask something...", layout={"width":"80%"})
        self.btn = widgets.Button(description="Send", button_style="primary")
        self.clr = widgets.Button(description="Clear", button_style="warning")
        self.btn.on_click(self._send)
        self.clr.on_click(self._clear)
        display(self.out, widgets.HBox([self.inp, self.btn, self.clr]))

    def _send(self, _):
        prompt = self.inp.value.strip()
        if not prompt: return
        self.inp.value = ""
        self.history.append({"role":"user","content":prompt})
        with self.out:
            display(Markdown(f"**You:** {prompt}"))
        try:
            with httpx.Client(timeout=120) as client:
                resp = client.post(
                    f"{OLLAMA_URL}/api/chat",
                    json={"model":self.model,"messages":self.history,"stream":False}
                )
            reply = resp.json()["message"]["content"]
            self.history.append({"role":"assistant","content":reply})
            with self.out:
                display(Markdown(f"**Assistant:** {reply}"))
        except Exception as e:
            with self.out:
                display(Markdown(f"**Error:** {e}"))

    def _clear(self, _):
        self.history = [h for h in self.history if h["role"]=="system"]
        self.out.clear_output()

# Launch:
chat = NotebookChat(model="llama3.2", system="You are a data science assistant.")
# Type in the input box and click Send

The widget renders inline in the notebook cell output, giving you a persistent chat panel below any cell. The conversation history accumulates across exchanges, so the model maintains context. The Clear button resets the conversation but preserves the system prompt. This widget is particularly useful during exploratory data analysis — you can ask questions about your dataset, generate code suggestions, and keep the conversation running while you work in other cells above it.
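Because the history grows with every exchange, a long session eventually sends a very large message list on each request. One possible mitigation, sketched here as an assumption rather than part of the widget above, is to keep the system prompt and only the most recent messages. The name trim_history and the default of 20 messages are illustrative.

```python
def trim_history(history: list[dict], max_messages: int = 20) -> list[dict]:
    """Keep system messages plus the most recent `max_messages` other messages."""
    system = [m for m in history if m["role"] == "system"]
    rest = [m for m in history if m["role"] != "system"]
    # Guard against max_messages == 0, since rest[-0:] would return everything
    return system + (rest[-max_messages:] if max_messages > 0 else [])

history = [{"role": "system", "content": "You are a data science assistant."}]
for i in range(6):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = trim_history(history, max_messages=4)
print([m["content"] for m in trimmed])
```

In NotebookChat._send, passing trim_history(self.history) as the messages value would cap the request size while leaving the full history available in the widget.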

Explaining Code and DataFrames

One of the most useful Jupyter integrations is asking Ollama to explain code or DataFrame contents. Wrap these in convenience functions:

import inspect
import pandas as pd

def explain_code(func_or_code):
    """Explain a function or code string."""
    code = inspect.getsource(func_or_code) if callable(func_or_code) else func_or_code
    return ask(f"Explain what this Python code does, step by step:\n\n```python\n{code}\n```")

def explain_df(df: pd.DataFrame, question: str = "Describe this dataset"):
    """Explain a DataFrame based on its structure and sample data."""
    # Send structure, a small sample, and summary stats rather than the whole dataset
    info = (
        f"Shape: {df.shape}\n"
        f"Columns: {list(df.columns)}\n"
        f"Dtypes:\n{df.dtypes}\n\n"
        f"Sample (5 rows):\n{df.head().to_string()}\n\n"
        f"Basic stats:\n{df.describe().to_string()}"
    )
    return ask(f"{question}\n\n{info}")

def suggest_analysis(df: pd.DataFrame):
    """Get analysis suggestions for a DataFrame."""
    return explain_df(df, "What analyses would you suggest for this dataset? List 5 specific analyses with Python code.")

# Usage:
explain_code(my_preprocessing_function)
explain_df(df, "What patterns do you notice in this sales data?")
suggest_analysis(df)

The inspect.getsource call retrieves the actual source code of any Python function defined in the notebook or imported from a module, making it easy to get an explanation of any function without copying and pasting. The DataFrame explanation includes shape, dtypes, a sample of rows, and descriptive statistics — enough context for the model to give meaningful analysis suggestions without sending the entire dataset.
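One caveat: inspect.getsource raises TypeError for built-ins and compiled functions, and OSError when the source file cannot be located. A hedged sketch of a wrapper that degrades gracefully in those cases follows; safe_source is a hypothetical helper, not part of the functions above.

```python
import inspect

def safe_source(obj) -> str:
    """Return source for a callable, a code string unchanged, or a fallback note."""
    if isinstance(obj, str):
        return obj
    try:
        return inspect.getsource(obj)
    except (OSError, TypeError):
        # Built-ins and C extensions have no retrievable Python source
        name = getattr(obj, "__name__", repr(obj))
        return f"# Source unavailable for {name}"

print(safe_source("total = sum(values)"))
print(safe_source(len))
```

explain_code could call safe_source(func_or_code) in place of the inline inspect.getsource expression.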

Generating and Executing Code

Ask Ollama to generate Python code and optionally execute it in the notebook:

import re

def generate_code(task: str, context: str = "", execute: bool = False):
    """Generate Python code for a task, optionally execute it."""
    prompt = f"Write Python code to: {task}"
    if context:
        prompt += f"\n\nContext:\n{context}"
    prompt += "\n\nRespond with only the code, no explanation. Use markdown code fences."

    # Send the full prompt (task, context, and formatting instructions), not just the task
    reply = ask(prompt, model="qwen2.5-coder:7b")

    # Extract code from markdown fences
    code_match = re.search(r"```(?:python)?\n([\s\S]*?)```", reply)
    code = code_match.group(1).strip() if code_match else reply.strip()

    if execute:
        print("=== Generated Code ===")
        print(code)
        print("=== Executing ===")
        exec(code, globals())

    return code

# Example: generate a visualisation
generate_code(
    "Create a seaborn heatmap of the correlation matrix for df",
    context=f"df has columns: {list(df.columns)}",
    execute=True
)

The execute=True flag runs the generated code in the global namespace via exec, making any created variables and plots available in subsequent cells. Use this with caution: exec runs whatever the model returns with your full permissions, and incorrect code could modify your dataset in unexpected ways. The safer workflow is to call generate_code with execute=False, read the returned code, and only then run it. Auto-execution is lower risk for read-only exploration and visualisation code, where it speeds up the workflow significantly, but it still means running code you have not reviewed.
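A lightweight pre-check can catch some problems before exec is called. The sketch below parses the generated code with the standard ast module and flags syntax errors, imports of a few modules that touch the filesystem or processes, and direct calls to exec, eval, open, and __import__. The module list and the name review_flags are illustrative assumptions. This is not a sandbox, and an empty result does not mean the code is safe.

```python
import ast

RISKY_MODULES = {"os", "subprocess", "shutil", "sys"}  # illustrative, not exhaustive

def review_flags(code: str) -> list[str]:
    """Return human-readable warnings for generated code; empty list if none found."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg} (line {exc.lineno})"]
    flags = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            modules = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            modules = [(node.module or "").split(".")[0]]
        else:
            modules = []
        flags += [f"imports {m}" for m in modules if m in RISKY_MODULES]
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in {"exec", "eval", "open", "__import__"}:
                flags.append(f"calls {node.func.id}()")
    return flags

print(review_flags("import seaborn as sns\nsns.heatmap(df.corr())"))
print(review_flags("import os\nos.remove('data.csv')"))
```

One way to use it is to skip execution whenever review_flags(code) is non-empty and print the flags alongside the code for manual review.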

A Data Analysis Assistant

Combine the pieces into a dedicated data analysis assistant that maintains context about your current dataset throughout the notebook session:

class DataAssistant:
    def __init__(self, df: pd.DataFrame, model: str = "llama3.2"):
        self.df = df
        self.model = model
        self.history = [{
            "role": "system",
            "content": f"""You are a data analysis assistant. The current dataset has:
- Shape: {df.shape}
- Columns: {list(df.columns)}
- Dtypes: {df.dtypes.to_dict()}
- Sample:\n{df.head(3).to_string()}

Provide concise, actionable analysis advice and working Python code using pandas, numpy, matplotlib, and seaborn."""
        }]

    def ask(self, question: str) -> str:
        self.history.append({"role": "user", "content": question})
        with httpx.Client(timeout=120) as client:
            resp = client.post(
                f"{OLLAMA_URL}/api/chat",
                json={"model": self.model, "messages": self.history, "stream": False}
            )
        reply = resp.json()["message"]["content"]
        self.history.append({"role": "assistant", "content": reply})
        display(Markdown(reply))
        return reply

assistant = DataAssistant(df)
assistant.ask("What are the top 3 insights from this data?")
assistant.ask("Write code to visualise the distribution of the main metric")

The system prompt embeds a snapshot of the DataFrame metadata at initialisation, giving the model persistent context about the dataset structure for the entire session. Subsequent questions can refer to column names and expect the model to generate accurate code without repeating the schema every time.

Using Ollama for Notebook Documentation

Automatically generate docstrings and markdown documentation for your notebook code. This is particularly useful when sharing notebooks with colleagues who need to understand your analysis without reading every line of code:

def document_notebook_cell(code: str) -> str:
    """Generate a markdown explanation for a notebook cell."""
    return ask(
        "Write a brief (2-3 sentence) markdown explanation of what this code does, "
        "suitable for a Jupyter notebook markdown cell that precedes this code cell:\n\n"
        f"```python\n{code}\n```\n\n"
        "Respond with only the markdown text, no code fences.",
        model="llama3.2"
    )

# Example usage: document your preprocessing pipeline
preprocessing_code = """
df = df.dropna(subset=['revenue', 'date'])
df['date'] = pd.to_datetime(df['date'])
df['month'] = df['date'].dt.to_period('M')
df['revenue_log'] = np.log1p(df['revenue'])
"""
document_notebook_cell(preprocessing_code)

Generated documentation can be copied directly into a markdown cell above the code cell. For systematic documentation of an entire notebook, loop over all code cells using nbformat and the Jupyter API, generate documentation for each, and insert markdown cells automatically. This turns an undocumented exploratory notebook into a readable report without manual writing effort.

Model Selection for Notebook Work

For general Q&A, explanation, and documentation tasks in Jupyter, llama3.2 is the right default — fast, coherent responses that render well in Markdown. For code generation tasks — writing pandas transforms, matplotlib plots, scikit-learn pipelines — switch to qwen2.5-coder:7b by passing it as the model argument. It generates significantly more accurate and idiomatic Python than general-purpose models, particularly for pandas operations and NumPy array manipulations that require precise API knowledge.

For wide datasets where you are sending schema information as context, the context window matters. llama3.2 supports up to 128k tokens, but Ollama applies a much smaller default context length unless you raise num_ctx, so a long prompt can be truncated at the default setting even though the model itself could handle it. Set the default model at the top of your notebook as a constant so you can switch all function calls at once, and use the larger code model selectively for the generate_code function where code accuracy matters most.
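Context length is a per-request setting in Ollama. The sketch below builds a request body for /api/chat with an options object carrying num_ctx, which is how Ollama's API documentation describes passing runtime parameters. The helper name chat_payload and the value 16384 are illustrative assumptions; larger values use more memory.

```python
from typing import Optional

def chat_payload(model: str, messages: list[dict], num_ctx: Optional[int] = None,
                 stream: bool = False) -> dict:
    """Build a request body for /api/chat, optionally raising the context length."""
    payload = {"model": model, "messages": messages, "stream": stream}
    if num_ctx is not None:
        # Assumption: num_ctx is passed under "options", per Ollama's API docs
        payload["options"] = {"num_ctx": num_ctx}
    return payload

body = chat_payload("llama3.2", [{"role": "user", "content": "Summarise df"}], num_ctx=16384)
print(body["options"])
```

The resulting dict can be passed as the json argument of client.post in any of the helpers in this guide.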

Using Ollama for Error Diagnosis

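One of the most immediately practical uses of Ollama in a Jupyter environment is diagnosing errors. When a cell raises an exception, ask Ollama to explain the error and suggest a fix. Wrap this in a helper that picks up the exception currently being handled, or failing that the last uncaught one:

import traceback, sys

def diagnose():
    """Explain the exception being handled, or the last uncaught one."""
    exc = sys.exc_info()
    if exc[0] is None:
        # Outside an except block: fall back to the last uncaught exception,
        # which the interpreter and IPython record in the sys.last_* attributes
        exc = (getattr(sys, "last_type", None), getattr(sys, "last_value", None),
               getattr(sys, "last_traceback", None))
    if exc[0] is None:
        print("No recent exception.")
        return
    tb = "".join(traceback.format_exception(*exc))
    return ask(f"Explain this Python error and suggest how to fix it:\n\n{tb}")

# Usage: call inside an except block, or on its own in the cell after one that failed
try:
    result = df.groupby("nonexistent_column").sum()
except Exception:
    diagnose()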

The traceback includes the full call stack and error message, giving Ollama enough context to explain not just what went wrong but why it happened and what the fix typically looks like. For common pandas errors like KeyError on missing columns, shape mismatches, and dtype incompatibilities, the explanations are usually accurate and actionable, though the suggested fix is still worth checking against your data. This pattern is particularly valuable for less experienced Python users who find tracebacks cryptic: Ollama translates them into plain language with suggested corrections.
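Tracebacks from deep library call stacks can run to hundreds of lines, most of which add little to the prompt. A possible refinement, sketched here with a hypothetical helper name and an arbitrary character budget, keeps the tail of the traceback, since the final lines name the exception and the failing call.

```python
import traceback

def traceback_for_prompt(exc: BaseException, max_chars: int = 4000) -> str:
    """Format an exception's traceback, keeping only the last `max_chars` characters."""
    text = "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    if len(text) <= max_chars:
        return text
    # The final lines name the exception, so keep the tail rather than the head
    return "...(truncated)...\n" + text[-max_chars:]

try:
    {"revenue": 1}["profit"]
except KeyError as err:
    print(traceback_for_prompt(err, max_chars=200))
```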

Integrating with nbformat for Notebook Processing

The nbformat library lets you read, modify, and write Jupyter notebook files programmatically. Combine it with Ollama to process existing notebooks — adding documentation, generating summaries, or creating teaching annotations:

import nbformat

def annotate_notebook(input_path: str, output_path: str):
    """Add AI-generated explanations to each code cell in a notebook."""
    with open(input_path, encoding="utf-8") as f:
        nb = nbformat.read(f, as_version=4)

    new_cells = []
    for cell in nb.cells:
        if cell.cell_type == "code" and cell.source.strip():
            # One model call per non-empty code cell
            explanation = ask(
                f"Write a 1-2 sentence explanation of this code for a markdown cell:\n\n{cell.source}",
            ).strip()
            new_cells.append(nbformat.v4.new_markdown_cell(explanation))
        new_cells.append(cell)

    nb.cells = new_cells
    with open(output_path, "w", encoding="utf-8") as f:
        nbformat.write(nb, f)
    print(f"Annotated notebook saved to {output_path}")

annotate_notebook("analysis.ipynb", "analysis_annotated.ipynb")

This produces a notebook where every code cell is preceded by a plain-language explanation — useful for turning a working analysis into a teaching document, or for documenting a research notebook before sharing it with colleagues from different technical backgrounds. The annotation quality is best for self-contained code cells where the context is clear from the cell alone; cells that depend heavily on variables defined elsewhere may get less accurate explanations.
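Since annotate_notebook makes one model call per non-empty code cell, a dry run that counts those cells gives a sense of how long annotation will take. The sketch below works on the notebook's raw JSON structure using only the standard library; code_cell_sources is a hypothetical helper and the sample notebook is hand-written.

```python
def code_cell_sources(notebook: dict) -> list[str]:
    """Return the source of every non-empty code cell in a notebook dict."""
    sources = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        src = cell.get("source", "")
        # In the on-disk JSON, source may be a string or a list of lines
        text = "".join(src) if isinstance(src, list) else src
        if text.strip():
            sources.append(text)
    return sources

# Hand-written notebook fragment for illustration
sample_nb = {"cells": [
    {"cell_type": "markdown", "source": ["# Title"]},
    {"cell_type": "code", "source": ["import pandas as pd\n", "df = pd.read_csv('sales.csv')"]},
    {"cell_type": "code", "source": ""},
]}
sources = code_cell_sources(sample_nb)
print(f"{len(sources)} model call(s) needed")
```

For a real file, load the notebook with json.load on a file opened with encoding="utf-8" and pass the resulting dict to code_cell_sources.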

Working with JupyterLab vs Classic Notebook

The integration patterns in this guide work in both classic Jupyter Notebook and JupyterLab (both serve on port 8888 by default). The ipywidgets chat interface renders correctly in both environments, though JupyterLab needs the jupyterlab-widgets extension. It is normally installed alongside ipywidgets; if the widgets do not render, run pip install jupyterlab_widgets and restart JupyterLab. The older jupyter labextension install @jupyter-widgets/jupyterlab-manager step applies only to JupyterLab 2 and earlier. The display and Markdown imports from IPython.display work identically in both environments.

For JupyterLab specifically, consider using the jupyter-ai extension as an alternative to the custom integration shown here. It provides a built-in AI chat panel, cell magic commands, and model configuration through the JupyterLab interface. As of 2026 it supports Ollama as a backend through its model configuration, giving you a polished integrated experience without writing any integration code. The custom integration in this guide is more flexible and requires no additional extension installation, but jupyter-ai is worth evaluating if you want a zero-configuration setup.

Privacy and Data Considerations

One of the primary advantages of using Ollama in Jupyter is data privacy. When you pass DataFrame contents, file paths, or business data to the LLM for analysis suggestions, that data stays on your machine as long as Ollama runs locally: it goes from your notebook to the local Ollama process and back. This is in stark contrast to cloud-based AI coding assistants like GitHub Copilot, which send code and context to remote servers. For notebooks containing sensitive data (financial records, medical data, customer information, or proprietary business logic), local Ollama integration lets you get AI assistance without sending that data to a third party, which removes a major data governance and compliance concern. If you point the notebook at a shared Ollama server instead, the data travels over the network to that host, so the same guarantee no longer applies automatically.

The practical implication is that you can be more generous with the context you provide to the model. Rather than carefully sanitising your prompts to avoid sending sensitive column names or values, you can pass the full DataFrame schema and a representative sample of real data, resulting in more accurate and relevant suggestions. This is one of the less obvious but practically significant benefits of running the model locally — the privacy guarantee changes how you use the tool, not just where the data goes.

Practical Tips for Daily Use

A few habits make Ollama integration more useful in daily notebook work. Define your helper functions in a dedicated cell near the top of every notebook and run it as part of your standard notebook setup, so the functions are always available without importing from a separate file. Keep a running list of your most-used prompts as a dictionary at the top of the notebook — prompts for generating visualisations, explaining errors, suggesting tests — so you can call them by name rather than retyping the same instructions. Version your prompts alongside your code: prompts that produce consistently good results are as valuable as the code itself and should be preserved in the notebook rather than lost between sessions.
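A prompt dictionary of the kind described above could look like the following sketch. The template names, their wording, and the render_prompt helper are all illustrative assumptions. Templates use str.format placeholders, so a template containing literal braces would need them doubled.

```python
PROMPTS = {
    "plot": "Write matplotlib code to plot {y} against {x} for the DataFrame `df`.",
    "explain_error": "Explain this Python error and suggest a fix:\n\n{traceback}",
    "tests": "Suggest pytest test cases for this function:\n\n{code}",
}

def render_prompt(name: str, **fields: str) -> str:
    """Fill a named template; raises KeyError for unknown names or missing fields."""
    return PROMPTS[name].format(**fields)

print(render_prompt("plot", x="month", y="revenue"))
```

The rendered string can be passed straight to ask, for example ask(render_prompt("plot", x="month", y="revenue")).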

For team workflows where multiple analysts share notebooks, store the Ollama base URL in a configuration file or environment variable rather than hardcoding it in each notebook. This lets different team members point to different Ollama instances — one person running locally, another using a shared GPU server — without modifying the notebook code. A simple OLLAMA_URL = os.getenv("OLLAMA_URL", "http://localhost:11434") at the top of the helper functions cell handles this cleanly. The environment variable approach also makes it easy to switch between different Ollama endpoints for testing versus production analysis workflows.
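A slightly more defensive variant of that one-liner is sketched below. It treats an empty variable as unset and strips a trailing slash so that appending /api/chat does not produce a double slash. The function name ollama_url and the host gpu-server.example are hypothetical.

```python
import os

def ollama_url(default: str = "http://localhost:11434") -> str:
    """Read the base URL from the OLLAMA_URL environment variable."""
    url = os.getenv("OLLAMA_URL", "").strip() or default
    return url.rstrip("/")  # avoid double slashes when appending /api/chat

os.environ["OLLAMA_URL"] = "http://gpu-server.example:11434/"  # hypothetical host
print(ollama_url())
```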

Ollama in Jupyter is most powerful when it becomes a habit rather than a one-off experiment. Keep the helper functions in one place, whether a setup cell or a shared file, define a keyboard shortcut to run the ask cell, and treat the LLM as a peer reviewer who is always available in the same environment where you are doing your analysis.

The cell-by-cell execution model of Jupyter makes prompt iteration fast and natural — change a word in your prompt, rerun the cell, compare the output. That tight loop is what makes Jupyter the best environment for learning how to get reliable, useful results from local LLMs.