How to Add Tracing to Ollama with OpenTelemetry
A practical guide to instrumenting Ollama calls with OpenTelemetry. It covers setting up the OTel SDK with an OTLP exporter and Jaeger; writing a traced_chat wrapper function that records model name, token counts, latency, and tokens per second as span attributes; tracing a full RAG pipeline with nested spans for embedding, retrieval, and generation; viewing traces in the Jaeger UI; and exporting to cloud backends, including Honeycomb and Grafana Tempo.
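As a rough illustration of the traced_chat idea, the sketch below shows one way such a wrapper could be shaped. It rests on several assumptions. The response is treated as a plain dict carrying the `model`, `prompt_eval_count`, `eval_count`, and `eval_duration` (nanoseconds) fields that Ollama's chat API reports. The `llm.*` attribute names and the `chat_span_attributes` helper are hypothetical choices made for this example. The tracer and chat function are passed in as arguments, so the sketch itself needs only the standard library.

```python
import time


def chat_span_attributes(response: dict, latency_s: float) -> dict:
    """Derive span attributes from an Ollama-style chat response dict.

    The attribute names are illustrative, not an established convention.
    """
    eval_count = response.get("eval_count", 0)
    eval_ns = response.get("eval_duration", 0)  # reported in nanoseconds
    return {
        "llm.model": response.get("model", "unknown"),
        "llm.prompt_tokens": response.get("prompt_eval_count", 0),
        "llm.completion_tokens": eval_count,
        "llm.latency_s": latency_s,
        # Generated tokens divided by generation time in seconds.
        "llm.tokens_per_second": eval_count / (eval_ns / 1e9) if eval_ns else 0.0,
    }


def traced_chat(tracer, chat_fn, model: str, messages: list) -> dict:
    """Call chat_fn inside a span and record the derived attributes on it.

    `tracer` is assumed to expose start_as_current_span(name), as an
    OpenTelemetry tracer does; `chat_fn` is assumed to accept `model` and
    `messages` keyword arguments and return a dict.
    """
    with tracer.start_as_current_span("ollama.chat") as span:
        start = time.perf_counter()
        response = chat_fn(model=model, messages=messages)
        latency_s = time.perf_counter() - start
        for key, value in chat_span_attributes(response, latency_s).items():
            span.set_attribute(key, value)
        return response
```

In a real setup, `tracer` would typically come from `opentelemetry.trace.get_tracer(__name__)` and `chat_fn` would be `ollama.chat`. If the client returns a response object rather than a dict, it may need converting before the helper reads it. Passing both in as parameters keeps the wrapper easy to exercise with stand-ins before an exporter or a model is available.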