How to Use Ollama with Puppeteer

Puppeteer is Google’s Node.js library for controlling a headless Chrome browser. Paired with Ollama, it gives you a powerful combination for automating web tasks that require intelligence: scraping content that needs JavaScript execution, extracting structured data from dynamic pages, automating form submissions with AI-generated content, and building screenshot-based visual analysis pipelines. This guide covers the core Puppeteer and Ollama integration patterns using Node.js — from basic page scraping and summarisation to structured extraction, site monitoring, and visual analysis with a vision model.

Puppeteer and Playwright serve similar purposes, since both control a browser programmatically, but Puppeteer has been around longer, has a larger ecosystem of community plugins, and is widely used in the Node.js scraping community. If you are already working in a Node.js project that uses Puppeteer or depends on its plugins, Puppeteer is the natural choice.

Setup

npm init -y
npm install puppeteer node-fetch@2
ollama pull llama3.2
ollama pull llava  # for vision analysis

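Puppeteer downloads a compatible browser build automatically during installation. The node-fetch package provides the fetch API for calling the Ollama HTTP endpoint from Node.js. The examples load it with require(), which works with node-fetch version 2; version 3 is ESM-only and must be loaded with import. On Node.js 18 or later, fetch is available globally, so you can drop the package and the require line entirely.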

Scraping and Summarising

The fundamental pattern: launch a browser, navigate to a URL, extract text, send to Ollama:

const puppeteer = require('puppeteer');
const fetch = require('node-fetch');

const OLLAMA = 'http://localhost:11434';
const MODEL = 'llama3.2';

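async function ask(prompt) {
  const resp = await fetch(`${OLLAMA}/api/chat`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: MODEL,
      messages: [{ role: 'user', content: prompt }],
      stream: false
    })
  });
  // Fail loudly on HTTP errors (for example, a model that has not been pulled)
  if (!resp.ok) {
    throw new Error(`Ollama request failed: ${resp.status} ${await resp.text()}`);
  }
  const data = await resp.json();
  return data.message.content;
}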

async function scrapeAndSummarise(url) {
  const browser = await puppeteer.launch({ headless: 'new' });
  let text;
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'domcontentloaded' });

    // Strip boilerplate elements, then read the remaining visible text
    text = await page.evaluate(() => {
      document.querySelectorAll('nav,footer,header,aside,script,style').forEach(el => el.remove());
      return document.body.innerText;
    });
  } finally {
    // Close the browser even if navigation or extraction throws
    await browser.close();
  }

  return ask(`Summarise this web page in 5 bullet points:

${text.slice(0, 6000)}`);
}

scrapeAndSummarise('https://example.com')
  .then(console.log)
  .catch(console.error);

The headless: 'new' option selects the newer headless mode introduced in Chrome 112, which runs the same browser code as regular Chrome and so behaves more like a real browser than the legacy headless mode. In Puppeteer 22 and later this mode is the default for headless: true, and the 'new' value is deprecated. Removing nav, footer, header, aside, script, and style elements before extracting text keeps the content clean and prevents boilerplate from consuming context window tokens.

Structured Data Extraction

Extract structured data from pages using Ollama’s JSON schema mode:

async function extractProducts(url) {
  const browser = await puppeteer.launch({ headless: 'new' });
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: 'networkidle2' });
  const text = await page.evaluate(() => document.body.innerText);
  await browser.close();

  const schema = {
    type: 'object',
    properties: {
      products: {
        type: 'array',
        items: {
          type: 'object',
          properties: {
            name: { type: 'string' },
            price: { type: 'number' },
            currency: { type: 'string' },
            in_stock: { type: 'boolean' },
            rating: { type: 'number' }
          },
          required: ['name', 'price']
        }
      }
    }
  };

  const resp = await fetch(`${OLLAMA}/api/chat`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: MODEL,
      messages: [{ role: 'user', content: `Extract products from:

${text.slice(0, 8000)}` }],
      format: schema,
      stream: false
    })
  });
  const data = await resp.json();
  return JSON.parse(data.message.content).products;
}

The networkidle2 wait condition waits until there are no more than 2 active network requests for at least 500ms, which ensures dynamic content loaded via API calls has finished rendering before the text is extracted. This is more reliable than domcontentloaded for JavaScript-heavy pages where the initial HTML is just a loading skeleton.

Screenshot Analysis with a Vision Model

Take a screenshot and send it to a vision-capable model for visual analysis:

async function analyseScreenshot(url, question = 'Describe what you see on this page') {
  const browser = await puppeteer.launch({ headless: 'new' });
  const page = await browser.newPage();
  await page.setViewport({ width: 1280, height: 900 });
  await page.goto(url, { waitUntil: 'networkidle2' });
  const screenshot = await page.screenshot({ encoding: 'base64' });
  await browser.close();

  const resp = await fetch(`${OLLAMA}/api/chat`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'llava',
      messages: [{
        role: 'user',
        content: question,
        images: [screenshot]
      }],
      stream: false
    })
  });
  const data = await resp.json();
  return data.message.content;
}

analyseScreenshot('https://example.com', 'What is the main call to action on this page?')
  .then(console.log)
  .catch(console.error);

Setting a consistent viewport with setViewport before navigating ensures screenshots are reproducible and comparable across runs. This is important for monitoring use cases where you want to detect visual changes to a page over time. The encoding: 'base64' option returns the screenshot as a base64 string ready to pass directly to Ollama without any file I/O.

Batch Scraping with Concurrency Control

Process multiple URLs concurrently with a limit on simultaneous browser pages:

async function batchScrape(urls, maxConcurrent = 3) {
  const browser = await puppeteer.launch({ headless: 'new' });
  const semaphore = { count: 0, queue: [] };

  const acquire = () => new Promise(resolve => {
    if (semaphore.count < maxConcurrent) {
      semaphore.count++;
      resolve();
    } else {
      semaphore.queue.push(resolve);
    }
  });

  const release = () => {
    semaphore.count--;
    if (semaphore.queue.length > 0) {
      semaphore.count++;
      semaphore.queue.shift()();
    }
  };

  const scrapeOne = async (url) => {
    await acquire();
    let page;
    try {
      page = await browser.newPage();
      await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 20000 });
      const text = await page.evaluate(() => document.body.innerText);
      // Free the tab before the slower model call
      await page.close();
      const summary = await ask(`Summarise in 2 sentences: ${text.slice(0, 4000)}`);
      return { url, summary, ok: true };
    } catch (e) {
      return { url, error: e.message, ok: false };
    } finally {
      // If navigation or extraction threw, the tab is still open: close it here
      if (page && !page.isClosed()) await page.close().catch(() => {});
      release();
    }
  };

  const results = await Promise.all(urls.map(scrapeOne));
  await browser.close();
  return results;
}

Sharing a single browser instance across all pages is more efficient than launching a new browser per URL — Chromium startup time adds up across many URLs. Each URL gets its own page (tab) within the shared browser, isolated from other pages but sharing the browser process. The semaphore limits active pages to maxConcurrent, balancing throughput against memory usage.

AI-Powered Form Automation

Use Ollama to generate context-appropriate content for form fields — useful for testing forms with realistic data or automating content submission workflows you own:

async function fillFormWithAI(url, formContext) {
  const browser = await puppeteer.launch({ headless: 'new' });
  const page = await browser.newPage();
  await page.goto(url);

  // Discover form fields
  const fields = await page.evaluate(() => {
    return Array.from(document.querySelectorAll('input,textarea')).map(el => ({
      name: el.name || el.id || el.placeholder,
      type: el.type,
      placeholder: el.placeholder
    })).filter(f => f.name && f.type !== 'hidden' && f.type !== 'submit');
  });

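  // Ask Ollama to generate appropriate values; format: 'json' constrains the reply to valid JSON
  const resp = await fetch(`${OLLAMA}/api/chat`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: MODEL,
      messages: [{ role: 'user', content: `Generate realistic test data for these form fields: ${JSON.stringify(fields)}.
    Context: ${formContext}.
    Return as JSON object with field names as keys.` }],
      format: 'json',
      stream: false
    })
  });
  const generated = JSON.parse((await resp.json()).message.content);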

  // Fill the fields
  for (const [name, value] of Object.entries(generated)) {
    const selector = `[name="${name}"],[id="${name}"]`;
    await page.focus(selector).catch(() => {});
    await page.type(selector, String(value), { delay: 50 }).catch(() => {});
  }

  await browser.close();
  return generated;
}

This pattern discovers form fields dynamically rather than hardcoding selectors, making it reusable across different forms. Note that the example fills the fields but does not submit the form. Only use it on forms you own or have explicit permission to automate: automated form submission on third-party sites without permission may violate their terms of service and, depending on jurisdiction, applicable laws.

Puppeteer vs Playwright for Ollama Workflows

Both Puppeteer and Playwright automate browsers from JavaScript, but there are meaningful differences. Playwright supports Chromium, Firefox, and WebKit from a single API and includes conveniences such as automatic waiting before actions. Puppeteer is primarily Chromium-focused (recent releases also support Firefox) and has a larger ecosystem of community plugins for tasks like ad blocking and stealth mode (making the browser less detectable as automated). For new projects, Playwright is often the better choice for its multi-browser support and more ergonomic API. For projects that need specific Puppeteer plugins or that already use Puppeteer, there is little reason to switch.

The Ollama API call is identical regardless of whether you use Puppeteer or Playwright to collect the content — a POST request to /api/chat with the extracted text as the user message. The browser automation layer and the LLM layer are completely decoupled, so you can swap one without changing the other. Choose the browser automation library based on your existing stack and specific requirements, and use the Ollama integration patterns from this guide with either library.

Stealth Mode and Anti-Detection

Many websites attempt to detect and block headless browser automation. Puppeteer’s default configuration sets several browser properties that identify it as automated — the navigator.webdriver property is set to true, the Chrome DevTools Protocol connection is detectable, and certain browser features that headless Chrome lacks (like specific graphics APIs) can be fingerprinted. The puppeteer-extra package and its puppeteer-extra-plugin-stealth plugin patch these detection vectors, making the browser significantly harder to distinguish from a real user’s Chrome.

Install stealth mode with npm install puppeteer-extra puppeteer-extra-plugin-stealth and replace your puppeteer import with the extra version: const puppeteer = require('puppeteer-extra'); puppeteer.use(require('puppeteer-extra-plugin-stealth')()). The rest of your code stays identical. Stealth mode patches over a dozen common detection vectors including navigator properties, plugin lists, WebGL fingerprinting, and Chrome-specific JavaScript APIs. It is not a guarantee against all detection — sophisticated anti-bot systems use behavioural analysis beyond simple JavaScript property checks — but it handles the most common checks used by most websites.

For Ollama-powered scraping workflows that need to access sites with bot detection, combine stealth mode with realistic behaviour: add random delays between actions with await new Promise(r => setTimeout(r, Math.random() * 2000 + 1000)) (older guides use page.waitForTimeout, which recent Puppeteer releases have removed), scroll the page before extracting content to simulate reading, and use a realistic user agent string. Human-like behaviour is harder for a site to flag than property-level fingerprints, and it can noticeably improve success rates on sites with active bot protection.
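The delay-and-scroll behaviour described above can be sketched as follows. This is a minimal illustration, not a complete anti-detection setup: the helper names (randomDelayMs, sleep, readLikeAHuman) and the number of scroll steps are assumptions, and page is assumed to be a Puppeteer page that has already navigated to the target URL.

```javascript
// Hypothetical helpers: names, delay range, and step count are illustrative.

// Random delay in milliseconds; the defaults mirror the 1 to 3 second range above.
function randomDelayMs(minMs = 1000, maxMs = 3000) {
  return Math.floor(Math.random() * (maxMs - minMs)) + minMs;
}

// Plain promise-based sleep; it does not depend on any Puppeteer wait helper.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Scroll down one viewport at a time with random pauses, then return the page text.
// `page` is assumed to be an already-navigated Puppeteer Page.
async function readLikeAHuman(page, steps = 3) {
  for (let i = 0; i < steps; i++) {
    await page.evaluate(() => window.scrollBy(0, window.innerHeight));
    await sleep(randomDelayMs());
  }
  return page.evaluate(() => document.body.innerText);
}
```

Call readLikeAHuman(page) in place of a bare page.evaluate when extracting text, and set the user agent with page.setUserAgent before navigating.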

PDF Generation with AI Content

Puppeteer can generate PDF files from any web page, including pages you construct programmatically with AI-generated content. This is useful for generating reports, invoices, or documentation where you want professional PDF output without a complex PDF library. Build an HTML template, populate it with content generated by Ollama, inject the HTML into a Puppeteer page, and render it to PDF with page.pdf().

The pattern works particularly well for automated report generation: Ollama generates the executive summary, key findings, and recommendations in Markdown, you convert the Markdown to HTML, inject it into a styled HTML template, and Puppeteer renders the full-page PDF with your company’s formatting, fonts, and logo. The output is indistinguishable from a manually formatted PDF but generated entirely automatically. For recurring reports — weekly analytics summaries, monthly financial overviews, quarterly reviews — combine this with a cron job and you have a fully automated report pipeline that produces publication-ready PDFs without any manual formatting work.
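A minimal sketch of the template-and-render step might look like this. To stay dependency-free it uses plain-text sections instead of converting Markdown, so a real pipeline would swap in a Markdown-to-HTML library at that point. The function names, template, and styling are hypothetical; browser is assumed to be an already launched Puppeteer browser, and the section text is assumed to come from Ollama.

```javascript
// Hypothetical helpers: the template, styling, and function names are illustrative.

// Escape text so model output cannot break the HTML template.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build a simple styled HTML report from a title and a list of { heading, text } sections.
function buildReportHtml(title, sections) {
  const body = sections
    .map((s) => `<h2>${escapeHtml(s.heading)}</h2><p>${escapeHtml(s.text)}</p>`)
    .join('\n');
  return `<!doctype html><html><head><meta charset="utf-8">
<style>body { font-family: sans-serif; margin: 2cm; } h1 { color: #224; }</style>
</head><body><h1>${escapeHtml(title)}</h1>
${body}
</body></html>`;
}

// Render the HTML to a PDF file. `browser` is assumed to be a launched Puppeteer browser.
async function renderPdf(browser, html, outputPath) {
  const page = await browser.newPage();
  try {
    await page.setContent(html, { waitUntil: 'load' });
    await page.pdf({ path: outputPath, format: 'A4', printBackground: true });
  } finally {
    await page.close();
  }
}
```

A typical call would generate each section with the ask helper from earlier, pass the results to buildReportHtml, and hand the HTML to renderPdf.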

Monitoring and Change Detection

Puppeteer combined with Ollama is an excellent stack for intelligent website monitoring. Traditional monitoring tools detect when a page changes but cannot tell you whether the change is significant — a spelling correction, a price drop, a new product launch, and an accidental deletion all look the same to a hash-based change detector. Adding Ollama to the pipeline lets you describe what changed in plain English, categorise the change by type and significance, and only alert when the change matches criteria you care about.

Build a monitoring script that runs on a schedule, takes a screenshot and extracts the text of each monitored URL, compares both to the previous version stored in a SQLite database, and when a change is detected sends the old and new text to Ollama asking it to describe what changed, whether the change seems intentional, and whether it is significant enough to warrant a notification. Route significant changes to email, Slack, or a Discord channel using the appropriate webhook or API. This gives you a monitoring system that notifies you when a competitor changes their pricing, when a government regulation page is updated, when a job posting closes, or when a product comes back in stock — without alerting you to every CSS tweak and navigation menu change that traditional hash-based monitors would flag.
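The comparison step of such a monitor could be sketched as below. For brevity this uses an in-memory Map as a stand-in for the SQLite table and covers text only, not screenshots. The function names, the prompt wording, and the SIGNIFICANT/MINOR convention are assumptions made for illustration, and askModel is any async function that sends a prompt to Ollama and returns the reply (such as the ask helper from earlier).

```javascript
const crypto = require('crypto');

// Hash of the page text: a cheap check for whether anything changed at all.
function hashText(text) {
  return crypto.createHash('sha256').update(text).digest('hex');
}

// Build the prompt asking the model to describe and rate a change.
function buildChangePrompt(oldText, newText, maxChars = 4000) {
  return [
    'Compare the two versions of a web page below.',
    'Describe what changed and whether it looks intentional.',
    'On the last line, answer with exactly SIGNIFICANT or MINOR.',
    '',
    'OLD VERSION:',
    oldText.slice(0, maxChars),
    '',
    'NEW VERSION:',
    newText.slice(0, maxChars),
  ].join('\n');
}

// Check one URL against the stored snapshot and update the store.
// `store` is assumed to be a Map of url -> { hash, text } (stand-in for a database table).
async function checkForChange(url, newText, store, askModel) {
  const previous = store.get(url);
  const hash = hashText(newText);
  store.set(url, { hash, text: newText });

  if (!previous) return { url, changed: false, firstSeen: true };
  if (previous.hash === hash) return { url, changed: false, firstSeen: false };

  // Only changed pages reach the model, which keeps scheduled runs cheap.
  const analysis = await askModel(buildChangePrompt(previous.text, newText));
  const significant = /\bSIGNIFICANT\W*$/i.test(analysis.trim());
  return { url, changed: true, firstSeen: false, significant, analysis };
}
```

Results with significant set to true are the ones to forward to email, Slack, or Discord.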

Handling Pagination and Infinite Scroll

Many sites paginate content across multiple pages or use infinite scroll to load content dynamically as the user scrolls. Puppeteer handles both patterns well. For traditional pagination, extract the “next page” link or button after processing each page, navigate to it, and repeat until there is no next page link. For infinite scroll, simulate scrolling with page.evaluate(() => window.scrollTo(0, document.body.scrollHeight)), pause briefly so new content can load (for example with await new Promise(r => setTimeout(r, 1000)), since page.waitForTimeout has been removed from recent Puppeteer releases), and repeat until the page height stops increasing, which indicates all content has loaded.
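Both loops can be sketched as small helpers that take an existing Puppeteer page. The names scrollToEnd and collectPaginated, the pause length, and the round and page limits are illustrative assumptions; nextSelector is a hypothetical CSS selector for the site's next-page link, and the sketch assumes that link is an anchor with an href.

```javascript
// Scroll to the bottom repeatedly until the page height stops growing.
// Returns the number of scroll rounds performed.
async function scrollToEnd(page, pauseMs = 1000, maxRounds = 30) {
  let previousHeight = 0;
  for (let round = 0; round < maxRounds; round++) {
    const height = await page.evaluate(() => document.body.scrollHeight);
    if (height === previousHeight) return round; // no new content was loaded
    previousHeight = height;
    await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
    await new Promise((resolve) => setTimeout(resolve, pauseMs));
  }
  return maxRounds; // gave up: the feed may be effectively endless
}

// Follow "next" links until none remains, collecting the text of every page.
async function collectPaginated(page, nextSelector, maxPages = 50) {
  const texts = [];
  for (let i = 0; i < maxPages; i++) {
    texts.push(await page.evaluate(() => document.body.innerText));
    const nextUrl = await page.evaluate((sel) => {
      const link = document.querySelector(sel);
      return link ? link.href : null;
    }, nextSelector);
    if (!nextUrl) break; // no next link: last page reached
    await page.goto(nextUrl, { waitUntil: 'domcontentloaded' });
  }
  return texts;
}
```

The maxRounds and maxPages caps guard against feeds that never stop growing and against pagination loops.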

When processing paginated content with Ollama, avoid sending every page to the model individually — the redundancy of headers, navigation, and boilerplate adds up. Instead, accumulate the content-specific text across all pages, strip the repeated elements, and send the combined unique content to Ollama in the chunked map-reduce pattern: summarise chunks of combined content independently, then synthesise the chunk summaries into a final result. For a 50-page paginated product catalogue this approach produces a much more coherent overall summary than processing each page in isolation.
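The accumulate, deduplicate, and map-reduce steps could look roughly like this. It is a sketch under stated assumptions: repeated elements are detected by exact paragraph match, which is a simplification, the chunk size and prompt wording are arbitrary, and askModel is any async function that sends a prompt to Ollama and returns the reply text.

```javascript
// Keep each distinct paragraph once, so navigation and footers repeated
// on every page appear only a single time in the combined text.
function dedupeParagraphs(pageTexts) {
  const seen = new Set();
  const unique = [];
  for (const text of pageTexts) {
    for (const para of text.split(/\n{2,}/)) {
      const key = para.trim();
      if (key && !seen.has(key)) {
        seen.add(key);
        unique.push(key);
      }
    }
  }
  return unique.join('\n\n');
}

// Split text into chunks of at most maxChars, breaking on paragraph boundaries where possible.
function chunkText(text, maxChars = 4000) {
  const chunks = [];
  let current = '';
  for (const para of text.split(/\n{2,}/)) {
    if (current && current.length + para.length + 2 > maxChars) {
      chunks.push(current);
      current = '';
    }
    // A single paragraph longer than maxChars is split at hard character limits.
    let rest = para;
    while (rest.length > maxChars) {
      chunks.push(rest.slice(0, maxChars));
      rest = rest.slice(maxChars);
    }
    current = current ? `${current}\n\n${rest}` : rest;
  }
  if (current) chunks.push(current);
  return chunks;
}

// Map: summarise each chunk independently. Reduce: merge the partial summaries.
async function mapReduceSummarise(pageTexts, askModel, maxChars = 4000) {
  const chunks = chunkText(dedupeParagraphs(pageTexts), maxChars);
  const partials = [];
  for (const chunk of chunks) {
    partials.push(await askModel(`Summarise this content in 3 bullet points:\n\n${chunk}`));
  }
  if (partials.length <= 1) return partials[0] || '';
  return askModel(`Combine these partial summaries into one coherent summary:\n\n${partials.join('\n\n')}`);
}
```

Chunks are summarised sequentially here because a single local Ollama instance usually gains little from parallel requests.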

Error Handling and Resilience

Production Puppeteer scripts fail in predictable ways: pages time out when the server is slow, network errors occur during large batch runs, and some pages throw JavaScript errors that prevent content from rendering. Wrap every page interaction in a try-catch, close the page in a finally block to prevent orphaned tabs from accumulating, and implement exponential backoff retry logic for transient failures. For large batch scraping jobs, checkpoint progress to disk after every N pages so the job can resume from where it left off if interrupted rather than starting over from the beginning.
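The retry and checkpoint ideas above can be sketched with a few small helpers. The names, the base delay, the cap, and the JSON checkpoint format are illustrative assumptions rather than a fixed recipe.

```javascript
const fs = require('fs');

// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt, baseMs = 500, maxMs = 10000) {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Run an async task, retrying on failure with exponential backoff.
// After `retries` failed retries the last error is rethrown.
async function withRetry(task, retries = 3, baseMs = 500) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      if (attempt >= retries) throw err;
      await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt, baseMs)));
    }
  }
}

// Load the set of URLs that a previous run already finished.
function loadCheckpoint(file) {
  if (!fs.existsSync(file)) return new Set();
  return new Set(JSON.parse(fs.readFileSync(file, 'utf8')));
}

// Persist the finished URLs so an interrupted run can resume.
function saveCheckpoint(file, doneUrls) {
  fs.writeFileSync(file, JSON.stringify([...doneUrls]));
}
```

In a batch job, skip any URL already in the loaded set, wrap each page visit in withRetry, and call saveCheckpoint after every N completed URLs.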

The Ollama side can also fail — the model may time out on very long content, return malformed JSON for schema-constrained requests, or simply be unavailable if Ollama is not running. Handle these separately from browser errors: a Puppeteer timeout means the page did not load, while an Ollama timeout means the content was extracted successfully but the analysis failed. Log both error types with enough context — the URL, the error message, the timestamp — to diagnose patterns and fix systematic problems rather than just knowing that some percentage of pages failed.
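One way to keep Ollama failures distinguishable from browser failures is a dedicated error type plus defensive parsing, sketched below. The OllamaError class and helper names are hypothetical, the two-minute timeout is an arbitrary default, and the code assumes Node.js 18 or later for the global fetch and AbortSignal.timeout.

```javascript
// Error type that lets callers tell analysis failures apart from browser failures.
class OllamaError extends Error {
  constructor(message, context = {}) {
    super(message);
    this.name = 'OllamaError';
    this.context = context;
  }
}

// Parse model output as a JSON object, tolerating extra text around it.
function parseModelJson(raw) {
  const start = raw.indexOf('{');
  const end = raw.lastIndexOf('}');
  try {
    if (start === -1 || end <= start) throw new Error('no JSON object found');
    return JSON.parse(raw.slice(start, end + 1));
  } catch (err) {
    throw new OllamaError('Model returned malformed JSON', { sample: raw.slice(0, 200) });
  }
}

// Call /api/chat with a timeout; every failure surfaces as an OllamaError.
async function askWithTimeout(prompt, options = {}) {
  const { url = 'http://localhost:11434', model = 'llama3.2', timeoutMs = 120000 } = options;
  const timestamp = new Date().toISOString();
  let resp;
  try {
    resp = await fetch(`${url}/api/chat`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model, messages: [{ role: 'user', content: prompt }], stream: false }),
      signal: AbortSignal.timeout(timeoutMs),
    });
  } catch (err) {
    // Covers both a refused connection (Ollama not running) and a timeout.
    throw new OllamaError(`Ollama request failed: ${err.message}`, { timestamp });
  }
  if (!resp.ok) throw new OllamaError(`Ollama returned HTTP ${resp.status}`, { timestamp });
  const data = await resp.json();
  return data.message.content;
}
```

In a catch block, checking err.name === 'OllamaError' separates "content extracted but analysis failed" from Puppeteer errors, and the context object carries details for the log entry alongside the URL.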

Puppeteer and Ollama together make it practical to automate research, monitoring, data collection, and content analysis tasks that previously required either manual effort or expensive cloud services. The browser handles the JavaScript execution and page interaction that makes modern web content accessible; Ollama handles the language understanding that turns raw page text into structured, analysed, actionable information. Both run locally, both are free to use at any scale, and the combination is genuinely more capable than either tool alone.

Start with the basic scrape-and-summarise pattern and build from there as your use case demands.
