Generative AI Archives

How to Use Ollama with Kotlin

May 30, 2026 by mljourney

Kotlin has become the language of choice for Android development and is increasingly popular on the server side thanks to frameworks like Ktor and Spring Boot. If you are building a Kotlin application and want to add local LLM capabilities without routing requests through a cloud API, Ollama is the most straightforward way to do …

How to Build a Discord Bot with Ollama

May 30, 2026 by mljourney

Discord has become one of the most popular platforms for developer communities, gaming groups, and hobbyist projects alike. If you’re already running a local LLM with Ollama, building a Discord bot that connects to it is a natural next step: you get a private, free AI assistant available to your entire server, with no …

How to Build a Chat UI for Ollama with Gradio

May 29, 2026 by mljourney

A practical guide to building Ollama chat interfaces with Gradio: a basic ChatInterface with conversation history, a streaming version using Generator to display tokens as they arrive, a model selector dropdown that reads available Ollama models dynamically, and deployment options including LAN sharing, the Gradio public tunnel, and running as a background service.

How to Evaluate Ollama Prompts with Langfuse

May 28, 2026 by mljourney

A complete guide to evaluating Ollama prompts with Langfuse: self-hosting Langfuse with Docker, wrapping ollama.chat with trace and generation spans that record prompts, responses, and token usage, versioning and A/B testing prompts to compare output quality across versions, recording quality scores from human raters or an automated judge model, and using the Langfuse dashboard alongside Prometheus for comprehensive AI observability.

How to Generate Git Commit Messages with a Local LLM

May 27, 2026 by mljourney

A practical guide to generating git commit messages with Ollama: reading staged diffs with git diff --cached, a conventional commits format prompt with low temperature, a prepare-commit-msg git hook that prepends AI suggestions as editable comments, and an interactive CLI tool with accept, edit, regenerate, and quit options for full control over every commit.

How to Build an AI Stack with Ollama and Docker Compose

May 26, 2026 by mljourney

A complete guide to composing Ollama with other services in Docker Compose: a full stack with Ollama, Open WebUI, FastAPI, and pgvector Postgres, a model init container, CPU-only variant, FastAPI Dockerfile and client using the compose service hostname, health checks with service_healthy dependency condition, and essential compose commands for managing models and containers.

How to Draft Emails with a Local LLM

May 25, 2026 by mljourney

A practical guide to email drafting with a local LLM: a basic draft_email function with tone and length parameters, subject line generation, drafting replies to existing emails with intent specification, batch personalisation of email templates for multiple recipients with company and role context, and a reference of effective tone and style prompt parameters for reliably varied email outputs.

How to Filter and Deduplicate Pretraining Data for LLMs

May 25, 2026 by mljourney

A practical guide to LLM pretraining data pipelines: language identification with fastText, heuristic quality filtering using character-to-word ratios, symbol ratios, and repeated line detection, perplexity-based filtering with KenLM to catch templated and garbled text, MinHash LSH deduplication with datasketch, exact substring deduplication with suffix arrays, building a full pipeline with Hugging Face datatrove including Gopher and C4 quality filters, training a fastText classifier for quality scoring, and balancing the data mix across web, books, and code sources.
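As a rough illustration of the heuristic filtering step mentioned above, the sketch below computes three signals (mean characters per word, symbol-to-word ratio, and the fraction of repeated lines) and applies thresholds to them. The function names and the threshold values are assumptions chosen for illustration, not the ones used in the post, and a real pipeline would tune them per corpus.

```python
from collections import Counter


def quality_signals(text: str) -> dict:
    """Compute simple heuristic signals for one document (illustrative only)."""
    words = text.split()
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    n_words = max(len(words), 1)

    # Character-to-word ratio: average number of characters per word.
    mean_word_len = sum(len(w) for w in words) / n_words

    # Symbol ratio: hash marks and ellipses per word.
    symbols = sum(text.count(s) for s in ("#", "...", "…"))
    symbol_ratio = symbols / n_words

    # Repeated lines: extra copies of any line, as a fraction of all lines.
    counts = Counter(lines)
    extra_copies = sum(c - 1 for c in counts.values())
    repeated_line_frac = extra_copies / max(len(lines), 1)

    return {
        "mean_word_len": mean_word_len,
        "symbol_ratio": symbol_ratio,
        "repeated_line_frac": repeated_line_frac,
    }


def passes_filters(text: str) -> bool:
    """Apply hypothetical thresholds; tune these for a real corpus."""
    s = quality_signals(text)
    return (
        3.0 <= s["mean_word_len"] <= 10.0
        and s["symbol_ratio"] <= 0.1
        and s["repeated_line_frac"] <= 0.3
    )
```

A document made of the same short line repeated many times fails on the repeated-line signal, while ordinary prose passes all three checks.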

How to Stream Ollama Responses over WebSockets

May 24, 2026 by mljourney

A complete guide to streaming Ollama token output to browser clients via WebSocket: why WebSockets suit interactive AI chat better than SSE, a FastAPI WebSocket endpoint using run_in_executor for sync Ollama, a fully async version using httpx streaming, a vanilla JS browser client with real-time token display and stop button, and a multi-client broadcast connection manager for shared AI sessions.

Model Merging: Weight Averaging, TIES, and DARE Explained

May 24, 2026 by mljourney

A practical guide to model merging for ML engineers: how linear weight averaging and model soups work, computing and applying task vectors, TIES merging with trimming and sign election to resolve conflicts between task vectors, DARE with random dropout and rescaling before merging, combining DARE with TIES for large task vectors, using mergekit with a YAML config for production merges, SLERP for smoother two-model interpolation, and a decision guide for choosing between merging methods based on task overlap and fine-tuning intensity.
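To make the first two ideas in this summary concrete, here is a minimal sketch of uniform weight averaging and task-vector arithmetic. It uses plain Python lists as a hypothetical stand-in for model state dicts; the function names and the `scale` parameter are assumptions for illustration. It does not implement TIES trimming, sign election, or DARE dropout.

```python
from typing import Dict, List

# Hypothetical stand-in for a state dict: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def average_weights(models: List[Weights]) -> Weights:
    """Uniform linear average of parameter sets with identical names and shapes."""
    n = len(models)
    return {
        name: [sum(m[name][i] for m in models) / n for i in range(len(params))]
        for name, params in models[0].items()
    }


def task_vector(finetuned: Weights, base: Weights) -> Weights:
    """Task vector: fine-tuned weights minus base weights, per parameter."""
    return {
        name: [f - b for f, b in zip(finetuned[name], base[name])]
        for name in base
    }


def apply_task_vectors(base: Weights, vectors: List[Weights], scale: float = 1.0) -> Weights:
    """Add the scaled sum of task vectors back onto the base weights."""
    return {
        name: [p + scale * sum(v[name][i] for v in vectors) for i, p in enumerate(params)]
        for name, params in base.items()
    }
```

With two fine-tuned models, applying both task vectors at `scale=0.5` gives the same result as averaging the two models directly, which is one way to check that the arithmetic is consistent.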