How to Use Ollama with Deno and Bun

The official Ollama JavaScript library works in any runtime that can consume npm packages. Deno and Bun are both fast-growing alternatives to Node.js, and both run the official ollama npm package without extra configuration. If you are building with either runtime, you can integrate local LLMs through the same API as in Node.js, and in many cases with faster startup times and simpler scripting.

Deno

Deno supports npm packages natively via the npm: specifier. No installation step needed:

// deno run --allow-net deno_ollama.ts
import ollama from 'npm:ollama';

// Basic chat
const response = await ollama.chat({
  model: 'llama3.2',
  messages: [{ role: 'user', content: 'What is Deno?' }]
});
console.log(response.message.content);

// Streaming
const stream = await ollama.chat({
  model: 'llama3.2',
  messages: [{ role: 'user', content: 'Count to 5 slowly.' }],
  stream: true
});
const encoder = new TextEncoder();
for await (const chunk of stream) {
  await Deno.stdout.write(encoder.encode(chunk.message.content));
}
console.log();

The --allow-net flag is required because Deno grants network access only with explicit permission. All other Ollama library features (embeddings, model management, and the rest of the API) work identically to Node.js.
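
For example, model management and embeddings use the same calls as in Node.js. A minimal sketch (assumes nomic-embed-text has been pulled with ollama pull nomic-embed-text):

// deno run --allow-net deno_extras.ts
import ollama from 'npm:ollama';

// Model management: list what is installed locally
const { models } = await ollama.list();
console.log(models.map((m) => m.name));

// Embeddings
const { embedding } = await ollama.embeddings({
  model: 'nomic-embed-text',
  prompt: 'Deno runs npm packages natively'
});
console.log('Dimension:', embedding.length);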

Deno with the OpenAI SDK

Ollama exposes an OpenAI-compatible endpoint at http://localhost:11434/v1, so the official OpenAI SDK works here too, pulled in with the npm: specifier. The apiKey can be any non-empty string, since Ollama does not validate it:

// deno run --allow-net openai_deno.ts
import OpenAI from 'npm:openai';

const client = new OpenAI({
  baseURL: 'http://localhost:11434/v1',
  apiKey: 'ollama'
});

const response = await client.chat.completions.create({
  model: 'llama3.2',
  messages: [{ role: 'user', content: 'Hello from Deno!' }]
});
console.log(response.choices[0].message.content);
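
Streaming also works through the compatibility layer. A short sketch using the same client as above:

// Streamed completion via the OpenAI SDK compatibility endpoint
const stream = await client.chat.completions.create({
  model: 'llama3.2',
  messages: [{ role: 'user', content: 'Count to 5.' }],
  stream: true
});
const encoder = new TextEncoder();
for await (const part of stream) {
  await Deno.stdout.write(encoder.encode(part.choices[0]?.delta?.content ?? ''));
}
console.log();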

Deno Configuration File

For projects, a deno.json import map avoids repeating the npm: specifier, and tasks bundle the run commands:

{
  "imports": {
    "ollama": "npm:ollama",
    "openai": "npm:openai"
  },
  "tasks": {
    "chat": "deno run --allow-net src/chat.ts",
    "embed": "deno run --allow-net src/embed.ts"
  }
}

With the import map in place, imports in your source files become clean:

import ollama from 'ollama';
import OpenAI from 'openai';

Bun

Bun is a drop-in Node.js replacement with a built-in package manager, test runner, and bundler. Install the Ollama library as you would with npm:

bun add ollama
# or, for the OpenAI SDK
bun add openai

// bun run bun_ollama.ts
import ollama from 'ollama';

// Streaming with Bun's fast I/O
const stream = await ollama.chat({
  model: 'llama3.2',
  messages: [{ role: 'user', content: 'What makes Bun fast?' }],
  stream: true
});
for await (const chunk of stream) {
  process.stdout.write(chunk.message.content);
}
console.log();

// Embeddings
const result = await ollama.embeddings({
  model: 'nomic-embed-text',
  prompt: 'Bun is a fast JavaScript runtime'
});
console.log('Dimension:', result.embedding.length);

Bun Scripts as CLI Tools

Bun’s fast startup (typically just a few milliseconds) makes it excellent for CLI scripts that call Ollama. Because Bun transpiles and runs TypeScript directly without a separate tsc step, you can write TypeScript Ollama scripts with minimal overhead:

#!/usr/bin/env bun
// summarise.ts — bun run summarise.ts input.txt
import ollama from 'ollama';
import { readFileSync } from 'fs';

const [,, filePath] = process.argv;
if (!filePath) {
  console.error('Usage: bun run summarise.ts <input-file>');
  process.exit(1);
}

const content = readFileSync(filePath, 'utf-8');
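// Crude context-window guard: keep only the first ~4,000 words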
const words = content.split(/\s+/);
const text = words.slice(0, 4000).join(' ');

const stream = await ollama.chat({
  model: 'llama3.2',
  messages: [{
    role: 'user',
    content: `Summarise this in 5 bullet points:\n\n${text}`
  }],
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk.message.content);
}
console.log();

# Run directly
bun run summarise.ts my_document.txt

# Or compile to a single executable
bun build --compile summarise.ts --outfile summarise
./summarise my_document.txt

Bun’s Test Runner with Ollama

// ollama.test.ts
import { describe, it, expect, beforeAll } from 'bun:test';
import ollama from 'ollama';

describe('Ollama integration', () => {
  beforeAll(async () => {
    // Verify Ollama is running
    const models = await ollama.list();
    expect(models.models.length).toBeGreaterThan(0);
  });

  it('generates a response', async () => {
    const response = await ollama.chat({
      model: 'llama3.2',
      messages: [{ role: 'user', content: 'Reply with just: OK' }],
      options: { temperature: 0 }
    });
    expect(response.message.content.trim().toUpperCase()).toContain('OK');
  });

  it('generates embeddings', async () => {
    const result = await ollama.embeddings({
      model: 'nomic-embed-text',
      prompt: 'test'
    });
    expect(result.embedding).toHaveLength(768);
  });
});

// Run: bun test

Choosing Between Node.js, Deno, and Bun for Ollama Projects

For new Ollama projects, all three runtimes work equally well with the official library. Node.js has the largest ecosystem and most third-party library compatibility, making it the default for complex projects with many dependencies. Deno is the strongest choice for security-conscious projects (explicit permission grants for filesystem, network, and environment access), TypeScript-first development without build configuration, and short scripts where Deno’s built-in formatter and linter add value without overhead. Bun is the best choice for performance-sensitive CLI tools and scripts where startup time matters, or for projects where you want a single binary that includes package management, testing, and bundling without separate tooling. For most Ollama integration work the differences are minor — the library API is identical across all three, and you are likely to spend more time thinking about the LLM prompts than the runtime choice.

Why Deno and Bun Are Growing in the AI Space

The rise of Deno and Bun alongside Ollama is not coincidental — all three represent a push toward simpler, faster developer tooling that removes friction from the development loop. Deno eliminates the node_modules directory and the package.json ecosystem complexity by fetching dependencies on demand and caching them locally. Bun eliminates the distinction between package manager, test runner, bundler, and runtime by combining all four into one binary. Ollama eliminates the need for Python environments, CUDA configuration, and inference server setup by providing a single binary that runs models with one command. For developers who want to build AI-powered tools without extensive infrastructure setup, this combination of simpler runtimes and simpler inference backends is particularly appealing.

The practical consequence is that a Bun + Ollama CLI script goes from concept to working executable in minutes rather than the hour it might take to set up a comparable Python environment with virtual environments, dependency management, and packaging. For developers who primarily work in JavaScript and TypeScript, this is the path of least resistance to local AI integration — no new language, no new package manager, no new mental model for async I/O.

Deno’s Permission Model and Ollama

Deno’s explicit permission system is one of its most distinctive features — scripts cannot access the network, filesystem, or environment variables without explicit flags. For Ollama integration, the relevant permissions are --allow-net (to connect to localhost:11434) and optionally --allow-read (if reading files to process) and --allow-env (if reading environment variables for configuration). This explicit permission model is a security feature — a malicious or compromised dependency cannot silently exfiltrate data to an external server without the network permission being granted. For LLM tools that process sensitive content (documents, notes, emails), Deno’s permission model provides a layer of reassurance that the script is doing only what it appears to do.

In practice, the permission flags become second nature quickly and add minimal friction to the development workflow. You can also specify permissions in a deno.json file’s task definitions, so running a task like deno task chat automatically includes the right permissions without typing them each time. For production scripts, the principle of least privilege applies — grant only the permissions the script actually needs rather than using broad --allow-all, which defeats the security purpose of the permission model.
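
As a concrete sketch, --allow-net accepts a host allowlist and --allow-read accepts paths, so a deno.json task can grant exactly what an Ollama script needs (task and path names here are illustrative):

{
  "tasks": {
    "chat": "deno run --allow-net=localhost:11434 src/chat.ts",
    "summarise": "deno run --allow-net=localhost:11434 --allow-read=./docs src/summarise.ts"
  }
}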

Using a Deno or Bun Server as a Lightweight Ollama Wrapper

A practical pattern for teams is hosting a thin Deno or Bun HTTP server that wraps your local Ollama instance, adding authentication, logging, and rate limiting. Team members hit the wrapper endpoint rather than Ollama directly, which keeps Ollama’s unauthenticated API internal while exposing a managed interface:

// server.ts — Bun HTTP server wrapping Ollama
import ollama from 'ollama';

const AUTH_TOKEN = process.env.AUTH_TOKEN ?? 'dev-token';

Bun.serve({
  port: 8090,
  async fetch(req) {
    // Basic auth
    if (req.headers.get('Authorization') !== `Bearer ${AUTH_TOKEN}`) {
      return new Response('Unauthorized', { status: 401 });
    }
    if (req.method !== 'POST' || new URL(req.url).pathname !== '/chat') {
      return new Response('Not Found', { status: 404 });
    }
    const { messages, model = 'llama3.2' } = await req.json();
    const encoder = new TextEncoder();
    const stream = new ReadableStream({
      async start(controller) {
        try {
          const gen = await ollama.chat({ model, messages, stream: true });
          for await (const chunk of gen) {
            controller.enqueue(encoder.encode(chunk.message.content));
          }
          controller.close();
        } catch (err) {
          // Propagate Ollama errors to the client instead of hanging the stream
          controller.error(err);
        }
      }
    });
    return new Response(stream, {
      headers: { 'Content-Type': 'text/plain; charset=utf-8' }
    });
  }
});
console.log('Ollama wrapper running on :8090');

// Run: AUTH_TOKEN=mytoken bun run server.ts
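
A minimal client for this wrapper might look like the following (a sketch; assumes the server above is running locally with AUTH_TOKEN set to mytoken):

// client.ts — consume the wrapper's streamed response
// Run: bun run client.ts
const res = await fetch('http://localhost:8090/chat', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer mytoken',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'llama3.2',
    messages: [{ role: 'user', content: 'Hello through the wrapper!' }]
  })
});

const decoder = new TextDecoder();
for await (const chunk of res.body!) {
  process.stdout.write(decoder.decode(chunk, { stream: true }));
}
console.log();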

Deno’s Fresh Framework with Ollama

Deno’s Fresh web framework (a server-side rendering framework with islands architecture) pairs naturally with Ollama for building local AI web applications. Fresh’s zero-configuration approach — no build step, TypeScript by default, JSX without compilation — matches Ollama’s simplicity. A Fresh + Ollama application can be running in under ten minutes with no webpack config, no tsconfig adjustments, and no dependency installation beyond deno run -A -r https://fresh.deno.dev my-ai-app. For developers who want to build a local AI web interface without the Node.js/React/webpack complexity, this is the most frictionless path to a working web application backed by a local LLM.
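
As a sketch of what that looks like (assuming a Fresh 1.x project, where $fresh/server.ts is mapped in the project's import map), an API route that forwards a prompt to a local model might be:

// routes/api/chat.ts — hypothetical Fresh API route backed by Ollama
import { Handlers } from '$fresh/server.ts';
import ollama from 'npm:ollama';

export const handler: Handlers = {
  async POST(req) {
    const { prompt } = await req.json();
    const response = await ollama.chat({
      model: 'llama3.2',
      messages: [{ role: 'user', content: prompt }]
    });
    return new Response(JSON.stringify({ reply: response.message.content }), {
      headers: { 'Content-Type': 'application/json' }
    });
  }
};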

Performance Comparison for CLI Scripts

For local LLM scripts where startup time is visible (running a short script from the terminal), the runtime choice affects the perceived snappiness of the tool. Bun typically starts in a few milliseconds, which feels essentially instantaneous. Deno starts in roughly 20–50ms for a cached script, and Node.js in roughly 50–100ms. For interactive CLI tools where you run the script many times per session, Bun’s startup advantage is genuinely noticeable. For long-running servers or batch processing scripts where startup time is amortised over minutes of runtime, the difference is irrelevant. The other performance dimension, the speed of the HTTP calls to Ollama, is effectively identical across all three runtimes, since Ollama’s inference speed is the bottleneck rather than the client’s HTTP implementation.
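
These figures vary by machine and runtime version; you can measure them on your own setup with a benchmarking tool such as hyperfine (assuming it is installed, and that hello.ts and hello.js simply print a line):

# Hypothetical startup comparison of the three runtimes
hyperfine 'bun run hello.ts' 'deno run hello.ts' 'node hello.js'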

Practical Workflow: Bun + Ollama CLI in 5 Minutes

To experience the full simplicity of the Bun + Ollama combination: install Bun (curl -fsSL https://bun.sh/install | bash), create a new directory, run bun add ollama, and paste the streaming chat example from this article into chat.ts. Run it with bun run chat.ts. The entire workflow — from a fresh machine to a running local AI chat script — takes under five minutes with no virtual environments, no tsconfig.json, no webpack, and no Python. For developers who have been frustrated by the Python toolchain complexity typical of LLM development, this is a genuinely refreshing alternative that produces professional-quality scripts with minimal setup friction.
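
The full sequence, as a copy-paste sketch (the directory name is illustrative):

# Install Bun, create a project, add the library, and run
curl -fsSL https://bun.sh/install | bash
mkdir ollama-chat && cd ollama-chat
bun add ollama
# paste the streaming chat example into chat.ts, then:
bun run chat.ts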

The Deno equivalent is even simpler for quick experiments: there is no install step, no package.json, and no node_modules directory. Deno can run a script straight from a URL, so sharing an Ollama script with a colleague is as simple as sharing a link: the recipient runs deno run --allow-net followed by the gist’s raw URL (the raw script source, not the gist’s HTML page) with no setup beyond having Deno and Ollama installed. For teams that want to share internal AI tools without setting up a package registry or repository, this URL-based execution model is a practical distribution mechanism that Deno uniquely supports.

Getting Started

If you are already a JavaScript developer comfortable with either Deno or Bun, the path to local AI integration is straightforward: install the ollama package, ensure Ollama is running locally, and copy the examples from this article. The library API is identical to the Node.js version, all TypeScript types are included, and streaming works with the same for-await loop pattern. The only Ollama-specific considerations are that your model names will be Ollama model names (llama3.2, qwen2.5-coder:7b) rather than OpenAI model names, and that the OpenAI SDK’s baseURL must point at Ollama’s compatibility endpoint if you prefer that interface. Either path works well, and the choice comes down to which API style you find more natural for your specific project.

The Bigger Picture: JavaScript-First AI Development

The availability of a polished official Ollama library for JavaScript — one that works across Node.js, Deno, and Bun without modification — signals a maturing of the local AI ecosystem beyond its Python-first origins. For the large population of developers who work primarily in JavaScript and TypeScript, this removes the main barrier to local LLM integration: the need to learn Python tooling, manage virtual environments, and maintain a language boundary between the AI backend and the application frontend. Local AI development in JavaScript is now a first-class experience, not a workaround using curl commands or raw HTTP calls. Deno and Bun accelerate this further by reducing the tooling overhead to near zero — the gap between having an idea for a local AI script and having a working implementation has compressed to minutes for JavaScript developers comfortable with async/await and TypeScript.

For developers evaluating whether to invest time in local LLM integration, this runtime flexibility removes a significant objection. You do not need to learn Python to use Ollama. You do not need to abandon your existing JavaScript toolchain. You can build on top of your existing knowledge and project structure, adding local AI capabilities incrementally rather than through a parallel Python service that your JavaScript application has to call. That accessibility is ultimately what will drive broader adoption of local AI beyond the subset of developers who are already comfortable with Python ML tooling. The scripts and patterns in this article give you a working foundation for any of the three runtimes — pick the one that fits your existing workflow, and the Ollama integration will feel like a natural extension of the JavaScript development you already do rather than a detour into unfamiliar territory.

As Ollama’s model library continues to expand and the Deno and Bun ecosystems mature, the combination will only become more capable: new models appear in Ollama within days of release, and both runtimes ship meaningful performance improvements with each version. Staying current with all three requires nothing more than occasional version checks: bun upgrade, deno upgrade, and ollama pull [model] to pick up the latest improvements. The convergence of simpler runtimes, simpler inference backends, and improving model quality is making JavaScript a genuinely first-class language for local AI development in 2026, and it is bringing useful AI capabilities within reach of every developer regardless of language background or prior experience with machine learning infrastructure.
