How to Use Ollama with Deno

Introduction

Deno is a modern JavaScript and TypeScript runtime built by the original creator of Node.js. It ships with TypeScript support out of the box, a secure-by-default permissions model, a built-in standard library, and a native HTTP server — all without needing a package.json or a separate build step. These qualities make Deno an attractive choice for writing backend services and scripts that talk to Ollama: you get a clean, minimal setup with strong defaults and no dependency management overhead.

In this guide you will build a Deno HTTP server that proxies requests to a locally-running Ollama instance, streams responses back to the client, and serves a simple HTML chat interface. Everything runs locally with no external services.

Prerequisites

Deno — Install from deno.com or with curl -fsSL https://deno.land/install.sh | sh. Check the version with deno --version.
Ollama — Install from ollama.com and pull a model: ollama pull llama3.2.

Start Ollama with ollama serve and verify it responds at http://localhost:11434 before continuing.
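
One way to script that check is to call Ollama's model-listing endpoint, GET /api/tags, which reports the models that have been pulled locally. The sketch below assumes the default address shown above; listModels and its injectable fetchImpl parameter are illustrative names, not part of Deno or Ollama.

```typescript
// check.ts: a minimal sketch; listModels and fetchImpl are hypothetical names.
type FetchLike = (input: string) => Promise<Response>;

// Returns the names of locally available models.
// Rejects if Ollama is unreachable or answers with a non-2xx status.
async function listModels(
  baseUrl = "http://localhost:11434",
  fetchImpl: FetchLike = fetch,
): Promise<string[]> {
  const res = await fetchImpl(`${baseUrl}/api/tags`);
  if (!res.ok) throw new Error(`Ollama responded with HTTP ${res.status}`);
  const data = await res.json() as { models?: { name: string }[] };
  return (data.models ?? []).map((m) => m.name);
}
```

Adding console.log(await listModels()) at the bottom and running the file with deno run --allow-net=localhost:11434 check.ts should list the model pulled in the previous step, provided Ollama is running.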

Calling Ollama Directly from Deno

Deno has a native fetch API that works identically to the browser’s. The simplest way to start is a short script that sends a prompt and prints the streamed response to the terminal:

// chat.ts
const res = await fetch("http://localhost:11434/api/generate", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ model: "llama3.2", prompt: "What is Deno?", stream: true }),
});

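const reader = res.body!.getReader();
const decoder = new TextDecoder();
const encoder = new TextEncoder();
let buffer = "";

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // A chunk can end mid-line, so keep the trailing partial line for the next read.
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop() ?? "";
  for (const line of lines.filter(Boolean)) {
    const data = JSON.parse(line);
    if (data.response) await Deno.stdout.write(encoder.encode(data.response));
  }
}
console.log();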

Run it with the network permission flag:

deno run --allow-net chat.ts

Deno’s permissions model requires you to explicitly grant network access. The --allow-net flag is the minimum needed for this script. You can restrict it further to only allow the Ollama host: --allow-net=localhost:11434.
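
Ollama streams newline-delimited JSON, and a network chunk can end in the middle of a line or even in the middle of a multi-byte character, so every reader in this guide needs the same buffering logic. One way to factor it out is a small stateful parser. This is a sketch rather than a library API: createNdjsonParser is a hypothetical helper name.

```typescript
// ndjson.ts: a minimal sketch; createNdjsonParser is a hypothetical helper.
function createNdjsonParser() {
  const decoder = new TextDecoder();
  let buffer = "";

  // Parses the complete lines in the buffer. Unless `final` is set,
  // the trailing partial line is held back for the next chunk.
  function drain(final: boolean): unknown[] {
    const lines = buffer.split("\n");
    buffer = final ? "" : lines.pop() ?? "";
    return lines.filter((line) => line.trim() !== "").map((line) => JSON.parse(line));
  }

  return {
    // Feed one chunk from reader.read(); returns the objects it completed.
    push(chunk: Uint8Array): unknown[] {
      buffer += decoder.decode(chunk, { stream: true });
      return drain(false);
    },
    // Call once the stream is done to parse a last line without a trailing newline.
    flush(): unknown[] {
      buffer += decoder.decode();
      return drain(true);
    },
  };
}

// Usage: the first chunk ends mid-object, so nothing is returned until the second arrives.
const parser = createNdjsonParser();
const enc = new TextEncoder();
const early = parser.push(enc.encode('{"response":"Hel'));         // no complete line yet
const later = parser.push(enc.encode('lo"}\n{"response":"!"}\n')); // two complete objects
console.log(early.length, later.length);
```

Inside a read loop, each value from reader.read() goes to push, and flush is called once after done is true.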

Building an HTTP Server That Proxies Ollama

Deno’s built-in HTTP server makes it straightforward to expose Ollama to a browser. Create a file called server.ts:

Deno.serve({ port: 8000 }, async (req) => {
  const url = new URL(req.url);

  if (url.pathname === "/" && req.method === "GET") {
    const html = await Deno.readTextFile("./index.html");
    return new Response(html, { headers: { "Content-Type": "text/html" } });
  }

  if (url.pathname === "/api/chat" && req.method === "POST") {
    const body = await req.json();
    const ollamaRes = await fetch("http://localhost:11434/api/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ ...body, stream: true }),
    });
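    return new Response(ollamaRes.body, {
      status: ollamaRes.status, // pass Ollama's status through so upstream errors are not reported as 200
      headers: { "Content-Type": "application/x-ndjson" },
    });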
  }

  return new Response("Not found", { status: 404 });
});

Run the server:

deno run --allow-net --allow-read server.ts

The --allow-read flag lets Deno read the index.html file from disk. Without it, Deno prompts for read access when run in an interactive terminal; in a non-interactive run, the Deno.readTextFile call fails with a permission error.
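
The /api/chat handler above forwards whatever JSON the client sends. If the server might be reachable by anything other than your own page, one option is to forward only the fields the chat actually needs. The sketch below is an assumption about what you would want to allow: buildChatPayload, ChatPayload, and the ALLOWED_MODELS list are hypothetical names, not part of Deno or Ollama.

```typescript
// payload.ts: a minimal sketch; all names and the allowlist are illustrative assumptions.
interface ChatMessage { role: string; content: string }
interface ChatPayload { model: string; messages: ChatMessage[]; stream: true }

// Assumption: only the models used in this guide may be requested.
const ALLOWED_MODELS = ["llama3.2", "mistral"];

// Validates an untrusted request body and returns only the fields to forward.
function buildChatPayload(body: unknown): ChatPayload {
  if (typeof body !== "object" || body === null) {
    throw new TypeError("body must be a JSON object");
  }
  const { model, messages } = body as Record<string, unknown>;
  if (typeof model !== "string" || !ALLOWED_MODELS.includes(model)) {
    throw new TypeError("unknown model");
  }
  if (!Array.isArray(messages)) throw new TypeError("messages must be an array");
  const clean = messages.map((m) => {
    if (typeof m?.role !== "string" || typeof m?.content !== "string") {
      throw new TypeError("each message needs a string role and content");
    }
    return { role: m.role, content: m.content }; // drop any extra fields
  });
  return { model, messages: clean, stream: true };
}
```

In the handler, the forwarded body would then become JSON.stringify(buildChatPayload(await req.json())), with a TypeError mapped to a 400 response.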

Creating the Chat Frontend

Create index.html in the same directory as server.ts:

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8" />
  <title>Ollama + Deno</title>
  <style>
    body { font-family: sans-serif; max-width: 720px; margin: 2rem auto; padding: 0 1rem; }
    #messages { border: 1px solid #ddd; border-radius: 8px; padding: 1rem; min-height: 300px; max-height: 480px; overflow-y: auto; background: #fafafa; margin-bottom: 1rem; }
    .user { color: #1a56db; margin-bottom: 0.75rem; }
    .assistant { color: #111; margin-bottom: 0.75rem; }
    #input-row { display: flex; gap: 0.5rem; }
    input { flex: 1; padding: 0.5rem; border: 1px solid #ccc; border-radius: 4px; }
    button { padding: 0.5rem 1rem; background: #00a86b; color: white; border: none; border-radius: 4px; cursor: pointer; }
    button:disabled { background: #aaa; }
  </style>
</head>
<body>
  <h1>Ollama + Deno</h1>
  <select id="model">
    <option value="llama3.2">llama3.2</option>
    <option value="mistral">mistral</option>
  </select>
  <div id="messages"></div>
  <div id="input-row">
    <input id="prompt" placeholder="Type a message..." />
    <button id="send">Send</button>
  </div>
  <script>
    const messagesEl = document.getElementById('messages');
    const promptEl = document.getElementById('prompt');
    const sendBtn = document.getElementById('send');
    const modelEl = document.getElementById('model');
    let history = [];

    async function send() {
      const text = promptEl.value.trim();
      if (!text) return;
      history.push({ role: 'user', content: text });
      promptEl.value = '';
      sendBtn.disabled = true;
      render();

      const assistantMsg = { role: 'assistant', content: '' };
      history.push(assistantMsg);

      const res = await fetch('/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ model: modelEl.value, messages: history.slice(0, -1) })
      });

      const reader = res.body.getReader();
      const decoder = new TextDecoder();
      let buffer = '';
      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        // Chunks can split a JSON line, so hold back the trailing partial line.
        buffer += decoder.decode(value, { stream: true });
        const lines = buffer.split('\n');
        buffer = lines.pop();
        for (const line of lines.filter(Boolean)) {
          try {
            const data = JSON.parse(line);
            if (data.message?.content) { assistantMsg.content += data.message.content; render(); }
          } catch {}
        }
      }
      sendBtn.disabled = false;
    }

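    function render() {
      // Build nodes with textContent so user input and model output are never parsed as HTML.
      messagesEl.replaceChildren(...history.map(m => {
        const div = document.createElement('div');
        div.className = m.role;
        const label = document.createElement('strong');
        label.textContent = m.role + ':';
        div.append(label, ' ' + m.content);
        return div;
      }));
      messagesEl.scrollTop = messagesEl.scrollHeight;
    }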

    sendBtn.addEventListener('click', send);
    promptEl.addEventListener('keydown', e => e.key === 'Enter' && send());
  </script>
</body>
</html>

Visit http://localhost:8000 and you have a working chat interface backed by Ollama, served entirely by a Deno HTTP server with no npm packages involved.

Using Deno’s Standard Library for a Cleaner Server

For a more structured approach, Deno’s standard library includes a file server utility. Import it directly from the JSR registry:

import { serveDir } from "jsr:@std/http/file-server";

Deno.serve({ port: 8000 }, async (req) => {
  const url = new URL(req.url);
  if (url.pathname === "/api/chat" && req.method === "POST") {
    const body = await req.json();
    const res = await fetch("http://localhost:11434/api/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ ...body, stream: true }),
    });
    return new Response(res.body, { status: res.status, headers: { "Content-Type": "application/x-ndjson" } });
  }
  return serveDir(req, { fsRoot: ".", quiet: true });
});

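serveDir serves every file under fsRoot, so with fsRoot set to "." the whole current directory is exposed, including server.ts itself. Pointing fsRoot at a dedicated folder that holds only the frontend files avoids that. Deno resolves the import from JSR on first run and caches it locally, so no install step is required.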

Writing a CLI Chat Script with Deno

Deno is also excellent for terminal tools. Here is a simple interactive REPL that lets you chat with Ollama from the command line:

// repl.ts
const model = Deno.args[0] || "llama3.2";
const messages: { role: string; content: string }[] = [];
const encoder = new TextEncoder();

console.log("Chatting with " + model + ". Press Ctrl+C to quit.\n");

while (true) {
  Deno.stdout.write(encoder.encode("You: "));
  const buf = new Uint8Array(4096);
  const n = await Deno.stdin.read(buf);
  if (n === null) break;
  const input = new TextDecoder().decode(buf.subarray(0, n)).trim();
  if (!input) continue;

  messages.push({ role: "user", content: input });

  const res = await fetch("http://localhost:11434/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, messages, stream: true }),
  });

  Deno.stdout.write(encoder.encode("Assistant: "));
  let assistantContent = "";
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // Chunks can split a JSON line, so hold back the trailing partial line.
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? "";
    for (const line of lines.filter(Boolean)) {
      const data = JSON.parse(line);
      if (data.message?.content) {
        await Deno.stdout.write(encoder.encode(data.message.content));
        assistantContent += data.message.content;
      }
    }
  }
  messages.push({ role: "assistant", content: assistantContent });
  console.log("\n");
}

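Run it with: deno run --allow-net repl.ts mistral (pull the model first with ollama pull mistral if you have not already). The script needs only network access, because reading from stdin requires no permission flag. This gives you a fast, dependency-free CLI chat tool in under 50 lines of TypeScript.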

Deploying a Deno Ollama App

For a long-running local deployment, you can run the Deno server as a background process using nohup or a process manager like pm2. To compile the server to a self-contained executable for easier distribution:

deno compile --allow-net --allow-read --output ollama-server server.ts

This produces a single binary that includes the Deno runtime and your server code. You can copy it to any machine with Ollama installed and run it without installing Deno separately. The compiled binary respects the same permission flags that were baked in at compile time, so users cannot expand its permissions at runtime — a useful security property for tools you share with teammates.

Error Handling and Timeouts

Robust error handling is important when working with local LLMs, which can be slow to start or occasionally unresponsive. Add a timeout to your Ollama fetch calls using AbortSignal.timeout:

const ollamaRes = await fetch("http://localhost:11434/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ ...body, stream: true }),
  signal: AbortSignal.timeout(120_000),
}).catch(err => { throw new Error("Ollama unreachable: " + err.message); });

In your server handler, catch errors and return a JSON error response so the frontend can display a useful message rather than hanging indefinitely:

if (url.pathname === "/api/chat" && req.method === "POST") {
  try {
    const body = await req.json();
    const ollamaRes = await fetch("http://localhost:11434/api/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ ...body, stream: true }),
      signal: AbortSignal.timeout(120_000),
    });
    return new Response(ollamaRes.body, { status: ollamaRes.status, headers: { "Content-Type": "application/x-ndjson" } });
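  } catch (e) {
    // e is typed unknown in TypeScript, so narrow it before reading .message.
    const message = e instanceof Error ? e.message : String(e);
    return Response.json({ error: message }, { status: 503 });
  }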
}

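These additions cover the failure modes that occur before a response starts, such as Ollama not running or a model that takes too long to load, without requiring any external dependencies or middleware libraries. A failure after streaming has begun, including the timeout firing in the middle of a long generation, happens after the handler has already returned, so the try/catch cannot turn it into a JSON error and the client sees a truncated stream instead.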
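
AbortSignal.timeout stays armed while the response body streams, so the 120-second limit bounds the whole generation. If you would rather bound only the wait for Ollama to start responding, one option is to manage an AbortController yourself and cancel the timer as soon as fetch resolves. The following is a sketch under that assumption; fetchWithHeaderTimeout and its fetchImpl parameter are hypothetical names.

```typescript
// timeout.ts: a minimal sketch; fetchWithHeaderTimeout is a hypothetical helper.
type FetchLike = (input: string, init?: RequestInit) => Promise<Response>;

// Aborts the request if no response headers arrive within `ms` milliseconds.
// Once headers arrive the timer is cleared, so the streamed body is not cut off.
async function fetchWithHeaderTimeout(
  url: string,
  init: RequestInit,
  ms: number,
  fetchImpl: FetchLike = fetch,
): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(
    () => controller.abort(new Error(`Ollama sent no response within ${ms} ms`)),
    ms,
  );
  try {
    return await fetchImpl(url, { ...init, signal: controller.signal });
  } finally {
    clearTimeout(timer); // runs on success and on failure
  }
}
```

The trade-off is that a generation that stalls after its first token is no longer interrupted, so this suits cases where slow startup is the main concern.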

Conclusion

Deno’s built-in TypeScript support, native fetch API, secure permissions model, and zero-install imports make it an ideal runtime for lightweight Ollama integrations. Whether you need a quick CLI script, a proxying HTTP server, or a compiled standalone binary, Deno handles all of these with minimal setup and no node_modules directory in sight. The permissions model also gives you a clear audit trail of exactly what network and file system access your tooling requires — a useful property when building tools you plan to share or run on shared machines. From this foundation you can add JSR packages for routing, templating, or database access to build out a more complete application, all while keeping the clean, explicit dependency style that makes Deno projects easy to understand and maintain.

Structuring Larger Deno Projects

For a simple Ollama proxy script, a single file is all you need. But as your project grows — adding multiple endpoints, configuration options, or a more sophisticated frontend — it is worth adopting a light structure. A typical Deno project for an Ollama-backed tool might look like this:

ollama-deno/
├── server.ts          # Entry point, route dispatch
├── handlers/
│   ├── chat.ts        # POST /api/chat handler
│   └── models.ts      # GET /api/models handler
├── static/
│   ├── index.html
│   └── app.js
└── deno.json          # Task runner and import map

The deno.json file replaces package.json for task definitions and import maps. Define your run command there so you do not have to remember the permission flags:

{
  "tasks": {
    "start": "deno run --allow-net --allow-read server.ts",
    "dev": "deno run --watch --allow-net --allow-read server.ts"
  }
}

The --watch flag in the dev task restarts the server automatically whenever a source file changes, an automatic-restart workflow similar to what tools like nodemon provide for Node.js. Run the dev server with deno task dev and the production server with deno task start. This lightweight structure keeps the project easy to navigate while giving each handler its own file as complexity grows.
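
To make the route dispatch in server.ts concrete, here is one way it could hand requests to per-route handlers. It is a sketch, not a prescribed pattern: the Handler type, createRouter, and the stub handlers are illustrative, and in the layout above the real handlers would be imported from handlers/chat.ts and handlers/models.ts.

```typescript
// A minimal dispatch sketch; Handler, createRouter, and the stubs are hypothetical.
type Handler = (req: Request) => Response | Promise<Response>;

// Looks up a handler by "METHOD /path" and falls back to a 404 response.
function createRouter(routes: Record<string, Handler>): Handler {
  return (req) => {
    const { pathname } = new URL(req.url);
    const handler = routes[`${req.method} ${pathname}`];
    return handler ? handler(req) : new Response("Not found", { status: 404 });
  };
}

// Stubs standing in for the handlers in handlers/chat.ts and handlers/models.ts.
const router = createRouter({
  "POST /api/chat": () => new Response("chat handler goes here"),
  "GET /api/models": () =>
    new Response(JSON.stringify({ models: [] }), {
      headers: { "Content-Type": "application/json" },
    }),
});

// In server.ts this would be wired up with: Deno.serve({ port: 8000 }, router);
```

Keeping the table keyed by method and path means adding an endpoint is a one-line change in server.ts plus a new file under handlers/.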