mljourney, Author at ML Journey

How to Build a Local AI API with Ollama and FastAPI

June 1, 2026 by mljourney

FastAPI has become the go-to Python framework for building APIs quickly: it is fast, well documented, and generates OpenAPI docs automatically. Paired with Ollama, it gives you a clean way to expose a local LLM as an HTTP API that any client can consume: a web frontend, a mobile app, a CLI tool, or another …

How to Deploy Ollama with Ansible

June 1, 2026 by mljourney

Ansible is one of the most widely used tools for automating server configuration, and it is a natural fit for deploying Ollama across multiple machines. Whether you are setting up a single GPU workstation, a fleet of developer machines, or a homelab cluster, an Ansible playbook lets you install Ollama, configure it as a service, pull models, …

How to Use Ollama with Rust

May 31, 2026 by mljourney

Rust is an increasingly popular choice for systems programming, CLI tools, and high-performance web services. If you are building a Rust application and want to add local LLM capabilities without a cloud dependency, Ollama exposes a straightforward HTTP API that any Rust HTTP client can call. This guide covers everything from basic chat completions to …

How to Use Ollama with Dart and Flutter

May 31, 2026 by mljourney

Dart and Flutter have matured into a serious cross-platform development stack. If you are building a Flutter app and want to add AI capabilities without depending on a cloud API (no monthly bill, no data leaving the device or the local network), Ollama gives you a simple HTTP interface that any Dart application …

How to Use Ollama with Kotlin

May 30, 2026 by mljourney

Kotlin has become the language of choice for Android development and is increasingly popular on the server side thanks to frameworks like Ktor and Spring Boot. If you are building a Kotlin application and want to add local LLM capabilities without routing requests through a cloud API, Ollama is the most straightforward way to do …

How to Build a Discord Bot with Ollama

May 30, 2026 by mljourney

Discord has become one of the most popular platforms for developer communities, gaming groups, and hobbyist projects alike. If you’re already running a local LLM with Ollama, building a Discord bot that connects to it is a natural next step: you get a private, free AI assistant available to your entire server, with no …

How to Build a Chat UI for Ollama with Gradio

May 29, 2026 by mljourney

A practical guide to building Ollama chat interfaces with Gradio: a basic ChatInterface with conversation history, a streaming version that uses a Python generator to display tokens as they arrive, a model selector dropdown that reads the available Ollama models dynamically, and deployment options including LAN sharing, the Gradio public tunnel, and running as a background service.

How to Evaluate Ollama Prompts with Langfuse

May 28, 2026 by mljourney

A complete guide to evaluating Ollama prompts with Langfuse: self-hosting Langfuse with Docker, wrapping ollama.chat with trace and generation spans that record prompts, responses, and token usage, versioning and A/B testing prompts to compare output quality across versions, recording quality scores from human raters or an automated judge model, and using the Langfuse dashboard alongside Prometheus for comprehensive AI observability.

How to Generate Git Commit Messages with a Local LLM

May 27, 2026 by mljourney

A practical guide to generating git commit messages with Ollama: reading staged diffs with git diff --cached, a Conventional Commits format prompt with low temperature, a prepare-commit-msg git hook that prepends AI suggestions as editable comments, and an interactive CLI tool with accept, edit, regenerate, and quit options for full control over every commit.

How to Build an AI Stack with Ollama and Docker Compose

May 26, 2026 by mljourney

A complete guide to composing Ollama with other services in Docker Compose: a full stack with Ollama, Open WebUI, FastAPI, and pgvector Postgres, a model init container, a CPU-only variant, a FastAPI Dockerfile and client that use the compose service hostname, health checks with the service_healthy dependency condition, and essential compose commands for managing models and containers.