How to Use Ollama with Nuxt and Vue

Introduction

Nuxt and Vue are two of the most developer-friendly frameworks in the JavaScript ecosystem. Nuxt builds on Vue to provide server-side rendering, file-based routing, and a powerful module system, making it ideal for building full-stack web applications. When you combine Nuxt and Vue with Ollama — the tool that lets you run large language models locally on your own machine — you get a powerful stack for building AI-powered web apps without sending data to external APIs.

In this guide you will learn how to set up Ollama on your machine, connect a Vue component to the Ollama API, and then build a simple Nuxt application that streams responses from a local LLM directly into your browser. All inference happens on your own hardware, which means no API keys, no usage costs, and no data leaving your machine.

Prerequisites

Before you begin, make sure you have the following installed and ready:

Node.js 18+ — Nuxt 3 requires Node 18 or later. Check your version with node -v.
Ollama — Download and install Ollama from ollama.com. Once installed, pull a model: ollama pull llama3.2.
A terminal — You will run a few commands to scaffold the project and start the dev server.

Verify Ollama is running before you start. Open a terminal and run ollama serve if it is not already running as a background service. You should be able to visit http://localhost:11434 and see a response from the Ollama server.

Scaffolding a New Nuxt Project

Create a new Nuxt 3 project using the official scaffolding tool:

npx nuxi@latest init ollama-nuxt
cd ollama-nuxt
npm install

This creates a minimal Nuxt 3 project with a default app layout. Open the project in your editor and take a moment to look at the structure. The app.vue file is the root component. The minimal template may not include pages/ and components/ directories, so create them when you add your first page or component.

Calling the Ollama API from Vue

Ollama exposes a simple REST API on http://localhost:11434. The two endpoints you will use most often are /api/generate for single-turn completions and /api/chat for multi-turn conversations. Both support streaming responses via newline-delimited JSON.
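To make the two endpoints concrete, here is a small sketch of the request bodies used in this guide and of how one streamed line can be read. The interface names and the extractToken helper are illustrative assumptions, not part of any Ollama client library, and only the fields used in this guide are listed.

```typescript
// Request bodies (only the fields used in this guide).
interface GenerateRequest { model: string; prompt: string; stream?: boolean }
interface ChatMessage { role: 'system' | 'user' | 'assistant'; content: string }
interface ChatRequest { model: string; messages: ChatMessage[]; stream?: boolean }

// One line of a streamed reply.
// /api/generate carries text in `response`, /api/chat carries it in `message.content`.
interface StreamLine {
  response?: string
  message?: ChatMessage
  done?: boolean
}

// Hypothetical helper: returns the text carried by one NDJSON line, or '' if it has none.
function extractToken(line: string): string {
  const data = JSON.parse(line) as StreamLine
  return data.response ?? data.message?.content ?? ''
}

// Example bodies for the two endpoints.
const generateBody: GenerateRequest = {
  model: 'llama3.2',
  prompt: 'Why is the sky blue?',
  stream: true
}
const chatBody: ChatRequest = {
  model: 'llama3.2',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }],
  stream: true
}
```

The chat endpoint takes the whole conversation as a messages array, which is why the chat page later in this guide keeps a running list of messages.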

In a plain Vue component, you can call the Ollama API using the native fetch API. Here is a minimal Vue 3 component that sends a prompt and reads the streamed response:

<script setup>
import { ref } from 'vue'

const prompt = ref('')
const response = ref('')
const loading = ref(false)

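async function generate() {
  loading.value = true
  response.value = ''
  try {
    const res = await fetch('http://localhost:11434/api/generate', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model: 'llama3.2', prompt: prompt.value, stream: true })
    })
    if (!res.ok || !res.body) throw new Error(`Ollama request failed: ${res.status}`)
    const reader = res.body.getReader()
    const decoder = new TextDecoder()
    let buffer = ''
    while (true) {
      const { done, value } = await reader.read()
      if (done) break
      // A chunk can end mid-line, so keep the unfinished tail for the next read.
      buffer += decoder.decode(value, { stream: true })
      const lines = buffer.split('\n')
      buffer = lines.pop() ?? ''
      for (const line of lines.filter(Boolean)) {
        const data = JSON.parse(line)
        if (data.response) response.value += data.response
      }
    }
  } catch (e) {
    // Show the failure in the output area instead of leaving the UI stuck.
    response.value = `Error: ${e.message}`
  } finally {
    loading.value = false
  }
}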
</script>

<template>
  <div>
    <textarea v-model="prompt" placeholder="Ask something..." />
    <button @click="generate" :disabled="loading">
      {{ loading ? 'Generating...' : 'Generate' }}
    </button>
    <pre>{{ response }}</pre>
  </div>
</template>

This component uses Vue’s ref for reactive state and reads the streamed response line by line. Each line from the Ollama API is a JSON object with a response field containing the next token. By appending each token to the response ref, you get a live typewriter effect in the UI.
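Network chunks are not guaranteed to end on a line boundary, so line-by-line parsing is most reliable with a small buffer that holds back an unfinished line. The class below is a framework-free sketch of that idea. The name NdjsonBuffer is hypothetical and not part of Vue or Ollama.

```typescript
// Collects raw byte chunks and returns only the JSON lines that are complete.
class NdjsonBuffer {
  private decoder = new TextDecoder()
  private pending = ''

  // Feed one network chunk; returns the objects completed by this chunk.
  push(chunk: Uint8Array): unknown[] {
    // { stream: true } keeps multi-byte characters split across chunks intact.
    this.pending += this.decoder.decode(chunk, { stream: true })
    const lines = this.pending.split('\n')
    // The last piece is either '' or an unfinished line; hold it back.
    this.pending = lines.pop() ?? ''
    return lines.filter(l => l.trim() !== '').map(l => JSON.parse(l))
  }

  // Call once the stream has ended to parse any final line without a newline.
  flush(): unknown[] {
    const rest = this.pending + this.decoder.decode()
    this.pending = ''
    return rest.trim() !== '' ? [JSON.parse(rest)] : []
  }
}
```

Inside the read loop you would call push(value) for every chunk and append the response field of each returned object, then call flush() once after the loop ends.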

Creating a Nuxt Page for the Chat UI

Now let’s integrate this into a proper Nuxt page. First, enable page routing by replacing the contents of app.vue with <NuxtPage />, which renders whatever file in pages/ matches the current route:

<!-- app.vue -->
<template>
  <NuxtPage />
</template>

Then create a new page at pages/index.vue:

<script setup>
import { ref } from 'vue'

const model = ref('llama3.2')
const prompt = ref('')
const messages = ref([])
const loading = ref(false)

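async function send() {
  if (!prompt.value.trim() || loading.value) return
  const userMsg = prompt.value.trim()
  messages.value.push({ role: 'user', content: userMsg })
  prompt.value = ''
  loading.value = true
  messages.value.push({ role: 'assistant', content: '' })
  // Read the placeholder back through the ref so updates go through Vue's reactive proxy.
  const assistantMsg = messages.value[messages.value.length - 1]

  try {
    const res = await fetch('http://localhost:11434/api/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model: model.value,
        // Skip the empty assistant placeholder added above.
        messages: messages.value.filter(m => m.role !== 'assistant' || m.content),
        stream: true
      })
    })
    if (!res.ok || !res.body) throw new Error(`Ollama request failed: ${res.status}`)
    const reader = res.body.getReader()
    const decoder = new TextDecoder()
    let buffer = ''
    while (true) {
      const { done, value } = await reader.read()
      if (done) break
      // A chunk can end mid-line, so keep the unfinished tail for the next read.
      buffer += decoder.decode(value, { stream: true })
      const lines = buffer.split('\n')
      buffer = lines.pop() ?? ''
      for (const line of lines.filter(Boolean)) {
        const data = JSON.parse(line)
        if (data.message?.content) assistantMsg.content += data.message.content
      }
    }
  } catch (e) {
    console.error(e)
  } finally {
    loading.value = false
  }
}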
</script>

<template>
  <div class="chat-container">
    <h1>Ollama Chat</h1>
    <select v-model="model">
      <option value="llama3.2">llama3.2</option>
      <option value="mistral">mistral</option>
      <option value="gemma3">gemma3</option>
    </select>
    <div class="messages">
      <div v-for="(msg, i) in messages" :key="i" :class="msg.role">
        <strong>{{ msg.role }}: </strong>{{ msg.content }}
      </div>
    </div>
    <div class="input-row">
      <input v-model="prompt" @keyup.enter="send" placeholder="Type a message..." />
      <button @click="send" :disabled="loading">Send</button>
    </div>
  </div>
</template>

Using a Nuxt Server Route to Proxy Ollama

One issue with calling Ollama directly from the browser is that the Ollama server is only available on localhost. If you ever deploy your Nuxt app to a server or share it on a local network, the browser on the remote machine cannot reach your local Ollama instance. The solution is to proxy Ollama requests through a Nuxt server route.

Create a file at server/api/ollama.post.ts:

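export default defineEventHandler(async (event) => {
  const body = await readBody(event)
  const response = await fetch('http://localhost:11434/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body)
  })
  // Fail with a gateway error if Ollama is unreachable or returns no body.
  if (!response.ok || !response.body) {
    throw createError({ statusCode: 502, statusMessage: 'Ollama request failed' })
  }
  // Ollama streams newline-delimited JSON; pass it through unchanged.
  // The server adds chunked transfer encoding itself, so it is not set here.
  setResponseHeaders(event, {
    'Content-Type': 'application/x-ndjson'
  })
  return sendStream(event, response.body)
})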

Now update your Vue component to call /api/ollama instead of the Ollama server directly:

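const res = await fetch('/api/ollama', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: model.value,
    // Keep skipping the empty assistant placeholder, as in the direct call.
    messages: messages.value.filter(m => m.role !== 'assistant' || m.content),
    stream: true
  })
})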

With this setup, the browser always talks to your Nuxt server, and the Nuxt server proxies the request to Ollama on the backend. This pattern also makes it easy to add authentication, rate limiting, or model selection logic on the server side without changing the client code.
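As one possible form of server-side model selection logic, the proxy could validate the incoming body before forwarding it. The sketch below is a plain function with no Nuxt dependencies. The name sanitizeChatBody and the ALLOWED_MODELS list are hypothetical, and the allowlist should match whichever models you have actually pulled.

```typescript
interface ChatMessage { role: string; content: string }
interface ChatBody { model: string; messages: ChatMessage[]; stream: boolean }

// Hypothetical allowlist; adjust it to the models available on your machine.
const ALLOWED_MODELS = ['llama3.2', 'mistral', 'gemma3']

// Returns a cleaned body that is safe to forward, or null if the input is not acceptable.
function sanitizeChatBody(body: unknown, allowed: string[] = ALLOWED_MODELS): ChatBody | null {
  if (typeof body !== 'object' || body === null) return null
  const { model, messages } = body as { model?: unknown; messages?: unknown }
  if (typeof model !== 'string' || !allowed.includes(model)) return null
  if (!Array.isArray(messages) || messages.length === 0) return null

  const clean: ChatMessage[] = []
  for (const m of messages) {
    if (typeof m?.role !== 'string' || typeof m?.content !== 'string') return null
    // Copy only the known fields so unexpected extras are not forwarded.
    clean.push({ role: m.role, content: m.content })
  }
  // Streaming is always enabled because the client expects NDJSON chunks.
  return { model, messages: clean, stream: true }
}
```

In the server route, this could run right after readBody, with the handler responding with a 400 error when the function returns null and forwarding the cleaned body otherwise.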

Adding a Composable for Reusable Ollama Logic

As your app grows, you will likely want to use Ollama in multiple pages and components. Nuxt’s composable pattern is a great way to encapsulate the Ollama API logic and share it across your app. Create a file at composables/useOllama.ts:

import { ref } from 'vue'

export function useOllama(model = 'llama3.2') {
  const response = ref('')
  const loading = ref(false)
  const error = ref<string | null>(null)

  async function generate(prompt: string) {
    loading.value = true
    error.value = null
    response.value = ''
    try {
      const res = await fetch('/api/ollama', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        // The proxy forwards to /api/chat, which expects a messages array rather than a prompt.
        body: JSON.stringify({ model, messages: [{ role: 'user', content: prompt }], stream: true })
      })
      if (!res.ok || !res.body) throw new Error(`Request failed: ${res.status}`)
      const reader = res.body.getReader()
      const decoder = new TextDecoder()
      let buffer = ''
      while (true) {
        const { done, value } = await reader.read()
        if (done) break
        // A chunk can end mid-line, so keep the unfinished tail for the next read.
        buffer += decoder.decode(value, { stream: true })
        const lines = buffer.split('\n')
        buffer = lines.pop() ?? ''
        for (const line of lines.filter(Boolean)) {
          const data = JSON.parse(line)
          if (data.message?.content) response.value += data.message.content
        }
      }
    } catch (e) {
      error.value = e instanceof Error ? e.message : String(e)
    } finally {
      loading.value = false
    }
  }

  return { response, loading, error, generate }
}

Now in any page or component you can simply do:

<script setup>
const { response, loading, generate } = useOllama('llama3.2')
</script>

Nuxt auto-imports composables from the composables/ directory, so you do not even need to import the function manually — it is available everywhere in your app.

Styling the Chat Interface

A bare-bones chat UI works, but a bit of styling makes it much more usable. Nuxt supports scoped styles in single-file components, and you can also use any CSS framework such as Tailwind CSS. Here is a minimal scoped style block to add to your pages/index.vue:

<style scoped>
.chat-container {
  max-width: 720px;
  margin: 2rem auto;
  font-family: sans-serif;
}
.messages {
  border: 1px solid #e0e0e0;
  border-radius: 8px;
  padding: 1rem;
  min-height: 300px;
  max-height: 500px;
  overflow-y: auto;
  margin-bottom: 1rem;
  background: #fafafa;
}
.user { color: #1a56db; margin-bottom: 0.75rem; }
.assistant { color: #111; margin-bottom: 0.75rem; }
.input-row { display: flex; gap: 0.5rem; }
.input-row input { flex: 1; padding: 0.5rem; border: 1px solid #ccc; border-radius: 4px; }
.input-row button { padding: 0.5rem 1rem; background: #1a56db; color: white; border: none; border-radius: 4px; cursor: pointer; }
.input-row button:disabled { background: #aaa; }
</style>

Switching Models at Runtime

One of the best features of Ollama is that you can run multiple models and switch between them with a single string change. In your Nuxt app, you can expose a model selector to let users choose which model to use for each conversation. The select element in the template above already does this — just make sure you have pulled the models you want to offer:

ollama pull llama3.2
ollama pull mistral
ollama pull gemma3

You can list all locally available models by running ollama list, or by calling the Ollama API endpoint GET /api/tags. To populate your model selector dynamically from the API, add a server route at server/api/models.get.ts:

export default defineEventHandler(async () => {
  const res = await fetch('http://localhost:11434/api/tags')
  const data = await res.json()
  return data.models.map(m => m.name)
})

Then in your component, use useFetch to populate the select:

const { data: models } = await useFetch('/api/models')

This gives you a fully dynamic model list that reflects exactly what is installed on the machine running Ollama.
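Names returned by /api/tags usually carry a tag suffix such as llama3.2:latest, so a default like llama3.2 may not match an entry exactly. The sketch below shows one way to extract the names and choose a sensible default. The helper names modelNames and pickModel are hypothetical, and the response type lists only the name field used here.

```typescript
// Minimal view of the /api/tags response: only the field used here.
interface TagsResponse { models?: { name: string }[] }

// Returns the installed model names in alphabetical order.
function modelNames(tags: TagsResponse): string[] {
  return (tags.models ?? []).map(m => m.name).sort()
}

// Picks the preferred model if installed (with or without a tag suffix),
// otherwise falls back to the first available name.
function pickModel(names: string[], preferred: string): string | undefined {
  return names.find(n => n === preferred || n.startsWith(preferred + ':')) ?? names[0]
}

// Example with a hand-written response object.
const sample: TagsResponse = {
  models: [{ name: 'mistral:latest' }, { name: 'llama3.2:latest' }]
}
const names = modelNames(sample)
const selected = pickModel(names, 'llama3.2')
```

In the page, the result of pickModel could seed the model ref once the list has loaded, and the select options could be rendered from the list with v-for instead of being hard-coded.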

Conclusion

Combining Ollama with Nuxt and Vue gives you a productive, privacy-preserving stack for building AI-powered web applications. You can develop and test locally with no API costs, and the Nuxt server route pattern makes it straightforward to proxy Ollama securely when you are ready to deploy. The composable pattern keeps your Ollama logic reusable and testable across a growing application. From here, you could add conversation history persistence, system prompt configuration, or even a RAG pipeline that feeds local documents to the model — all running entirely on your own infrastructure.

Tips for Production Use

When moving beyond local development, keep a few practical considerations in mind. First, Ollama’s default server binds only to localhost, so you will need to set the OLLAMA_HOST environment variable to 0.0.0.0 if you want it reachable on a network interface. This is only necessary when Ollama runs on a different machine from your Nuxt server; if both share a host, the server route can keep calling localhost. Second, a reverse proxy such as Nginx buffers upstream responses by default, so streamed tokens may arrive in large bursts or only once generation has finished. Set proxy_buffering off in the relevant Nginx location block to pass chunks through as they are produced. Third, consider adding a simple API key check in your Nuxt server route to prevent unauthorized access to your Ollama instance if you expose it beyond localhost. These small adjustments make the difference between a toy project and a reliable internal tool your team can actually use day to day.
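The API key check mentioned above could look something like the following sketch. It assumes the client sends the key in an Authorization header using the Bearer scheme, which is a choice made for this example rather than anything Ollama or Nuxt requires. The function names are hypothetical.

```typescript
// Compares two strings without returning early on the first mismatching character.
function safeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false
  let diff = 0
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i)
  }
  return diff === 0
}

// Returns true only when the header carries the expected key as a Bearer token.
// An empty expected key never authorizes, so a missing config value fails closed.
function isAuthorized(authorization: string | undefined, expectedKey: string): boolean {
  if (!expectedKey || !authorization) return false
  const match = /^Bearer (.+)$/.exec(authorization)
  return match !== null && safeEqual(match[1] ?? '', expectedKey)
}
```

In a server route, the header value could be read with getHeader(event, 'authorization') and the expected key taken from an environment variable or runtime config, with the handler throwing a 401 error when isAuthorized returns false.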