Bash is the glue language of Linux and macOS systems, and Ollama’s HTTP API is simple enough to call with nothing more than curl. This means you can integrate a local LLM into shell scripts, cron jobs, aliases, and pipelines without installing any additional tools or language runtimes. This guide covers the essential patterns for using Ollama from Bash — one-shot queries, streaming output, piping file content as context, processing multiple files in a loop, and building reusable shell functions you can source into any script.
Shell-based Ollama integration is particularly useful for tasks already in the terminal: summarising log files, generating commit messages from diffs, explaining error output, classifying files, or converting content formats. Keeping the LLM call in Bash means the output flows naturally through pipes and redirections to other shell tools, making it easy to compose AI-powered steps into larger automation pipelines.
The Simplest Possible Query
The absolute minimum to query Ollama from Bash is a single curl command:
curl -s http://localhost:11434/api/chat \
-d '{"model":"llama3.2","messages":[{"role":"user","content":"What is bash?"}],"stream":false}' \
| python3 -c "import sys,json; print(json.load(sys.stdin)['message']['content'])"
The python3 pipe extracts the content field from the JSON response. If you have jq installed (and you should, since it is the standard JSON processor for shell scripts), the extraction is cleaner:
curl -s http://localhost:11434/api/chat \
-d '{"model":"llama3.2","messages":[{"role":"user","content":"What is bash?"}],"stream":false}' \
| jq -r '.message.content'
Install jq with sudo apt install jq on Debian/Ubuntu or brew install jq on macOS. It is worth having for any serious shell scripting work, and we will use it throughout this guide.
A Reusable ollama_ask Function
Wrap the curl call in a shell function you can source into your .bashrc or any script:
#!/usr/bin/env bash
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"
OLLAMA_MODEL="${OLLAMA_MODEL:-llama3.2}"
ollama_ask() {
  local prompt="$1"
  local model="${2:-$OLLAMA_MODEL}"
  local escaped
  # JSON-encode the prompt (adds the surrounding quotes and escapes special characters)
  escaped=$(printf '%s' "$prompt" | python3 -c "import sys,json; print(json.dumps(sys.stdin.read()))")
  curl -s "$OLLAMA_URL/api/chat" \
    -H "Content-Type: application/json" \
    -d "{\"model\":\"$model\",\"messages\":[{\"role\":\"user\",\"content\":$escaped}],\"stream\":false}" \
    | jq -r '.message.content'
}
The python3 -c JSON-escaping step is important: without it, prompts containing quotes, newlines, or backslashes will break the JSON payload and produce an API error or a garbled response. Python's json.dumps handles all of the escaping in one line. The payload's own double quotes are backslash-escaped so that they survive inside the double-quoted -d argument while $model and $escaped are still expanded. After sourcing this file with source ollama.sh, you can use ollama_ask "your question" anywhere in the same shell session.
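If you would rather not depend on python3 for the escaping, jq can build the whole request body itself. The following is a minimal sketch rather than part of the functions above: build_chat_payload and ollama_ask_jq are hypothetical names, and the sketch assumes jq is installed and that the server accepts the same /api/chat body shown earlier.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the /api/chat request body with jq.
# --arg passes each value in as a JSON string, so quotes, newlines and
# backslashes in the prompt or model name are escaped automatically.
build_chat_payload() {
  local prompt="$1"
  local model="${2:-${OLLAMA_MODEL:-llama3.2}}"
  jq -cn --arg model "$model" --arg prompt "$prompt" \
    '{model: $model, messages: [{role: "user", content: $prompt}], stream: false}'
}

# Hypothetical variant of ollama_ask that sends the jq-built payload.
# --data-binary @- makes curl read the request body from stdin unchanged.
ollama_ask_jq() {
  build_chat_payload "$@" \
    | curl -s "${OLLAMA_URL:-http://localhost:11434}/api/chat" \
        -H "Content-Type: application/json" --data-binary @- \
    | jq -r '.message.content'
}
```

Because the payload is assembled by a JSON tool rather than by string interpolation, there is no hand-escaped quoting to get wrong, at the cost of making jq a hard requirement.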
Piping File Content as Context
To send a file’s contents as context, read it into a variable and include it in the prompt. The JSON escaping step handles multi-line file contents correctly:
ollama_file() {
  local instruction="$1"
  local file="$2"
  local content prompt
  content=$(cat "$file")
  # Instruction first, then a blank line, then the file contents
  prompt=$(printf '%s\n\n%s' "$instruction" "$content")
  ollama_ask "$prompt"
}
# Examples:
ollama_file "Summarise this log file in 3 bullet points:" /var/log/syslog
ollama_file "Explain what this script does:" ./deploy.sh
ollama_file "List any security issues in this config:" /etc/nginx/nginx.conf
Keep in mind that the context window is limited. Most models support between 8k and 128k tokens, and Ollama applies its own context length setting (the num_ctx option), which may be lower than the model's maximum. Input beyond the limit is typically truncated without warning, so a very large file can produce an answer based on only part of its content, or an error. For log files or other large text, pipe the content through head or tail first to limit input size: content=$(tail -n 200 "$file") sends only the last 200 lines, which is usually the most relevant part for log analysis.
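A small helper can make that truncation explicit instead of silent. This is a sketch under stated assumptions: clip_file is a hypothetical name, the 200-line default simply mirrors the tail example above, and counting lines is only a rough stand-in for counting tokens.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print at most max_lines lines from the end of a file,
# preceded by a one-line notice when something was cut, so the model is told
# that the input is partial.
clip_file() {
  local file="$1"
  local max_lines="${2:-200}"
  local total
  # Arithmetic expansion strips the padding that BSD wc adds around the count
  total=$(( $(wc -l < "$file") ))
  if [ "$total" -gt "$max_lines" ]; then
    printf '[truncated: showing last %d of %d lines]\n' "$max_lines" "$total"
    tail -n "$max_lines" "$file"
  else
    cat "$file"
  fi
}
```

Inside ollama_file, content=$(clip_file "$file" 200) could replace the plain cat call.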
Streaming Output to the Terminal
To stream tokens as they are generated rather than waiting for the full response, enable streaming in the API call and process the newline-delimited JSON output line by line:
ollama_stream() {
  local prompt="$1"
  local model="${2:-$OLLAMA_MODEL}"
  local escaped line
  escaped=$(printf '%s' "$prompt" | python3 -c "import sys,json; print(json.dumps(sys.stdin.read()))")
  # -N disables curl's output buffering so chunks are passed on as they arrive
  curl -sN "$OLLAMA_URL/api/chat" \
    -H "Content-Type: application/json" \
    -d "{\"model\":\"$model\",\"messages\":[{\"role\":\"user\",\"content\":$escaped}],\"stream\":true}" \
    | while IFS= read -r line; do
        # -j prints the raw token with no trailing newline
        printf '%s\n' "$line" | jq -j '.message.content // empty'
      done
  echo  # final newline
}
The while IFS= read -r line loop reads one JSON object per line from curl's output. jq -j '.message.content // empty' prints each token raw and without a trailing newline, and prints nothing for lines where the field is absent or empty, so tokens appear consecutively and give the typewriter streaming effect in the terminal. Printing directly from jq, rather than capturing the token in a variable first, matters: command substitution strips trailing newlines, which would silently drop tokens that consist only of line breaks. Note that jq is called once per line here, which works but has overhead. For high-volume processing, consider using Python's json module in a single subprocess instead.
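jq can also consume a whole stream of newline-delimited JSON objects by itself, which avoids starting one process per chunk. The sketch below assumes a jq build that supports -j and --unbuffered; print_stream_tokens is a hypothetical name, and the three sample chunks are hand-written stand-ins for what curl would deliver.

```shell
#!/usr/bin/env bash
# Reads newline-delimited JSON chunks on stdin and prints the tokens.
# A single jq process handles the whole stream:
#   -j            raw output, no newline after each token
#   --unbuffered  flush after every token instead of when the buffer fills
print_stream_tokens() {
  jq -j --unbuffered '.message.content // empty'
  echo  # final newline
}

# Offline demonstration with hand-written chunks; in real use the input
# would come from: curl -sN "$OLLAMA_URL/api/chat" ... with "stream":true
printf '%s\n' \
  '{"message":{"role":"assistant","content":"Hel"},"done":false}' \
  '{"message":{"role":"assistant","content":"lo"},"done":false}' \
  '{"done":true}' \
  | print_stream_tokens
```

In ollama_stream, the whole while loop could then be replaced by piping curl straight into print_stream_tokens.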
Practical Scripts
Here are four immediately useful scripts built on the functions above.
Git commit message generator — pass the staged diff to Ollama and print a suggested commit message:
#!/usr/bin/env bash
source ~/.ollama.sh
diff=$(git diff --cached)
if [ -z "$diff" ]; then
  echo "No staged changes." >&2
  exit 1
fi
ollama_ask "Write a concise git commit message for this diff. Output only the message, no explanation:
$diff"
Log analyser — summarise the last 100 lines of a log file:
#!/usr/bin/env bash
source ~/.ollama.sh
LOG="${1:-/var/log/syslog}"
ollama_ask "Summarise the key events and any errors in these log lines:
$(tail -n 100 "$LOG")"
Command explainer — explain what a shell command does:
#!/usr/bin/env bash
source ~/.ollama.sh
ollama_ask "Explain what this shell command does, step by step: $*"
File classifier — classify files in a directory by type and purpose:
#!/usr/bin/env bash
source ~/.ollama.sh
DIR="${1:-.}"
for f in "$DIR"/*; do
  [ -f "$f" ] || continue
  result=$(ollama_ask "In 10 words or less, what is this file? Filename: $(basename "$f")")
  # Left-align the filename in a 40-character column, then the description
  printf '%-40s %s\n' "$(basename "$f")" "$result"
done
Processing Multiple Files in Parallel
For batch processing, use Bash’s background job operator and wait to process multiple files concurrently:
#!/usr/bin/env bash
source ~/.ollama.sh
DIR="${1:-.}"
MAX_JOBS=4
job_count=0
for f in "$DIR"/*.txt; do
  [ -f "$f" ] || continue
  (
    result=$(ollama_ask "Summarise this document in one sentence:
$(cat "$f")")
    echo "$f: $result"
  ) &
  job_count=$((job_count + 1))
  if [ "$job_count" -ge "$MAX_JOBS" ]; then
    # Wait for any one job to finish (Bash 4.3+); otherwise wait for all of them
    if wait -n 2>/dev/null; then
      job_count=$((job_count - 1))
    else
      wait
      job_count=0
    fi
  fi
done
wait
The MAX_JOBS=4 limit prevents overwhelming Ollama with too many simultaneous requests. Ollama queues concurrent requests internally, so they will not fail, but running more than 4 or 5 at once rarely improves throughput: how many requests are actually served in parallel depends on the server's configuration (the OLLAMA_NUM_PARALLEL setting) and on available memory, and the rest simply wait in the queue. The wait -n command (available in Bash 4.3+) waits for any one background job to complete before launching the next, maintaining a rolling window of active jobs. Where wait -n is unavailable, the script falls back to a plain wait and processes the files in batches instead. On macOS, the default Bash is 3.2, so install a current Bash via Homebrew to use wait -n.
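On a Bash that lacks wait -n, a rolling window can still be approximated by polling the job table. The following is a sketch, not the script above: run_limited is a hypothetical name, the worker is any command or function that takes one item, and the 0.2-second polling interval is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Sketch: cap concurrency without wait -n by polling the job table.
#   $1      maximum number of concurrent jobs
#   $2      command or function to run once per item
#   $3...   items, for example file paths
run_limited() {
  local max_jobs="$1" worker="$2"
  shift 2
  local item
  for item in "$@"; do
    # Block while the number of running background jobs is at the limit.
    # jobs -rp lists one PID per running job; wc -l counts them.
    while (( $(jobs -rp | wc -l) >= max_jobs )); do
      sleep 0.2
    done
    "$worker" "$item" &
  done
  wait  # let the remaining jobs finish
}
```

With a function such as summarise_file() { ollama_ask "Summarise this document in one sentence: $(cat "$1")"; } defined, run_limited 4 summarise_file "$DIR"/*.txt would play the role of the loop above. Polling costs a little latency compared with wait -n, which returns as soon as a job exits.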
Getting Structured JSON Output
For scripts that need to parse the model’s response programmatically, use Ollama’s JSON schema mode to guarantee structured output:
ollama_json() {
  local prompt="$1"
  local schema="$2"
  local model="${3:-$OLLAMA_MODEL}"
  local escaped_prompt
  escaped_prompt=$(printf '%s' "$prompt" | python3 -c "import sys,json; print(json.dumps(sys.stdin.read()))")
  # The schema is already valid JSON, so it is inserted verbatim rather than JSON-escaped
  curl -s "$OLLAMA_URL/api/chat" \
    -H "Content-Type: application/json" \
    -d "{\"model\":\"$model\",\"messages\":[{\"role\":\"user\",\"content\":$escaped_prompt}],\"format\":$schema,\"stream\":false}" \
    | jq -r '.message.content'
}
# Usage: extract sentiment from text
schema='{"type":"object","properties":{"label":{"type":"string","enum":["positive","negative","neutral"]},"confidence":{"type":"number"}},"required":["label","confidence"]}'
result=$(ollama_json "Classify the sentiment: Great product, highly recommend!" "$schema")
echo "$result" | jq .
Note that the format field takes the schema object directly, not a JSON string of the schema, which is why $schema is inserted into the -d payload as-is, without the JSON-escaping that is applied to the prompt. The response content is a JSON string you can pipe directly to jq for further processing, or you can assign fields to shell variables with jq -r '.label'.
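Pulling several fields into shell variables can be done with a single jq call. The sketch below uses a hand-written sample in place of a real model response, so the values are illustrative only; it assumes Bash (for process substitution and here-strings) and a jq version that supports the @tsv filter.

```shell
#!/usr/bin/env bash
# Sample response content, hand-written to match the sentiment schema above;
# in a real script this would be the output of ollama_json.
result='{"label":"positive","confidence":0.93}'

# One jq call emits both fields separated by a tab; read splits them
# into two variables.
IFS=$'\t' read -r label confidence < <(jq -r '[.label, .confidence] | @tsv' <<<"$result")

echo "label=$label confidence=$confidence"

# Branch on the label like any other shell string.
if [ "$label" = "negative" ]; then
  echo "Flag for follow-up"
fi
```

Reading both fields in one pass avoids running jq once per field, which adds up when the script processes many responses in a loop.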
Adding to .bashrc for Daily Use
Save the core functions to ~/.ollama.sh and source it from your .bashrc or .zshrc:
# Add to ~/.bashrc
export OLLAMA_MODEL="llama3.2"
export OLLAMA_URL="http://localhost:11434"
[ -f ~/.ollama.sh ] && source ~/.ollama.sh
# Convenient aliases
alias ai='ollama_stream'     # ai "explain grep"
alias aifile='ollama_file'   # aifile "summarise" report.txt
# An alias cannot receive arguments through $*, so this shortcut is a function
explain() { ollama_ask "Explain this shell command: $*"; }   # explain tar -xzf backup.tar.gz
With these aliases active, ai "what does awk do" gives you a streaming explanation in the terminal, and aifile "find bugs in" script.py sends the file with a review instruction. The OLLAMA_MODEL environment variable lets you switch models for the session with a single export — export OLLAMA_MODEL=qwen2.5-coder:7b before running a code-related task, for example, without editing any scripts.
Error Handling and Fallbacks
Shell scripts calling external services should handle failures gracefully. Add a health check at the start of any script that depends on Ollama:
check_ollama() {
  if ! curl -sf "$OLLAMA_URL/api/tags" > /dev/null 2>&1; then
    echo "Error: Ollama is not running at $OLLAMA_URL" >&2
    echo "Start it with: ollama serve" >&2
    return 1
  fi
}
# Call at the top of any script:
check_ollama || exit 1
The -sf flags make curl silent and treat HTTP errors as failures. If Ollama is not running, the script exits immediately with a clear message rather than producing a confusing JSON parse error from the failed curl response. The function returns a status instead of calling exit itself, so the caller decides how to react, and an interactive shell that has sourced the file is not closed by a failed check. This pattern is essential for scripts run via cron or as part of automated pipelines where there is no interactive terminal to display cryptic error messages.
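A single probe can fail simply because the server is still starting. A retrying variant is sketched below under stated assumptions: wait_for_ollama is a hypothetical name, the attempt count, delay, and 3-second probe timeout are arbitrary defaults, and /api/tags is used as the health endpoint as in check_ollama.

```shell
#!/usr/bin/env bash
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Hypothetical helper: probe the server a few times before giving up.
#   $1  number of attempts (default 5)
#   $2  seconds to sleep between attempts (default 2)
wait_for_ollama() {
  local attempts="${1:-5}"
  local delay="${2:-2}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # --max-time bounds each probe so a hung connection cannot stall the script
    if curl -sf --max-time 3 "$OLLAMA_URL/api/tags" > /dev/null 2>&1; then
      return 0
    fi
    # Sleep between attempts, but not after the last one
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "Error: Ollama did not respond at $OLLAMA_URL after $attempts attempts" >&2
  return 1
}
```

A script would call wait_for_ollama 5 2 || exit 1 in place of the single check; the function returns 0 as soon as one probe succeeds.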
Using Ollama with stdin Pipes
One of the most natural ways to use Ollama in shell workflows is to read the input from stdin, which lets you pipe output from other commands directly into the LLM. The function below takes the instruction as an optional argument and reads the text to work on from stdin:
ollama_pipe() {
  local instruction="${1:-Summarise this:}"
  local input
  input=$(cat)
  ollama_ask "$instruction
$input"
}
# Examples:
ps aux | grep python | ollama_pipe "What python processes are running?"
git log --oneline -20 | ollama_pipe "Write a summary of recent changes"
df -h | ollama_pipe "Which filesystems are nearly full?"
curl -s https://api.github.com/repos/ollama/ollama/releases/latest \
  | jq -r '.body' | ollama_pipe "Summarise this release note"
The pipe pattern is powerful because it composes naturally with every other Unix tool. Anything that produces text output (ps, df, netstat, git log, API responses, file listings) can be fed to Ollama for analysis, summarisation, or classification with a two-part pipeline: the tool that generates the data, and ollama_pipe with an instruction. This is the pattern that makes shell-based Ollama integration qualitatively different from a web chat interface: the LLM becomes a processing step in your automation pipeline rather than a separate tool you switch to.
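One practical wrinkle with reading stdin is that cat waits forever when nothing is piped in, and an empty pipe sends a prompt with no data. A guard along the following lines is one way to handle both cases; read_piped_input is a hypothetical name and the error messages are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read all of stdin, refusing to hang on a terminal
# or to pass on empty input.
read_piped_input() {
  # -t 0 is true when stdin is a terminal, i.e. nothing was piped in
  if [ -t 0 ]; then
    echo "Error: expected input on stdin (pipe a command into this function)" >&2
    return 1
  fi
  local input
  input=$(cat)
  if [ -z "$input" ]; then
    echo "Error: stdin was empty" >&2
    return 1
  fi
  printf '%s\n' "$input"
}
```

In ollama_pipe, input=$(read_piped_input) || return 1 could stand in for input=$(cat), so that a failed upstream command does not trigger a pointless model call.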
Cron and Automated Reporting
Because Ollama is a local HTTP service that runs continuously in the background, you can call it from cron jobs without any authentication or network requirements. A daily log summary sent to your email is a practical example:
#!/usr/bin/env bash
# /etc/cron.daily/ollama-log-summary
source /home/youruser/.ollama.sh
# Run the check in a subshell with its message discarded, so the script can skip quietly
( check_ollama ) 2>/dev/null || exit 0
DATE=$(date +%Y-%m-%d)
LOG_SUMMARY=$(tail -n 500 /var/log/syslog \
  | ollama_pipe "Summarise key events and any errors from today's system log. Be concise.")
echo "Log summary for $DATE:
$LOG_SUMMARY" | mail -s "Daily Server Summary $DATE" admin@example.com
The check_ollama || exit 0 pattern exits silently if Ollama is not running rather than generating a cron error email. This is important for automated scripts — failing loudly when Ollama is temporarily unavailable (after a reboot, for example) would generate false-alarm alerts. Cron also sets a minimal environment with no shell aliases or sourced files, so the script must source ~/.ollama.sh with an absolute path rather than relying on the interactive shell configuration.
Model Selection from the Command Line
Add a wrapper that lists available models and lets you select one interactively using fzf, the fuzzy finder that integrates cleanly with shell workflows:
ollama_select_model() {
  local model
  model=$(curl -s "$OLLAMA_URL/api/tags" \
    | jq -r '.models[].name' \
    | fzf --prompt="Select model: " --height=10)
  if [ -n "$model" ]; then
    export OLLAMA_MODEL="$model"
    echo "Model set to: $OLLAMA_MODEL"
  fi
}
Install fzf with brew install fzf or apt install fzf. Binding ollama_select_model to a key sequence in your shell (bind -x '"\C-o": ollama_select_model' in .bashrc) lets you switch the active model with a keyboard shortcut at any point during your terminal session. The new model is exported as an environment variable so all subsequent ollama_ask calls in the same session use it automatically.
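On a machine where fzf cannot be installed, Bash's built-in select loop offers a plainer numbered menu. This is a hypothetical fallback rather than part of the function above: pick_from_menu is an illustrative name, and it expects the candidate names as arguments because select reads the user's answer from stdin.

```shell
#!/usr/bin/env bash
# Hypothetical fallback for machines without fzf: a numbered menu using
# Bash's built-in select. Prints the chosen name on stdout.
pick_from_menu() {
  local PS3="Select model (number): "
  local choice
  # select writes the menu and prompt to stderr and reads the answer from stdin
  select choice in "$@"; do
    if [ -n "$choice" ]; then
      printf '%s\n' "$choice"
      return 0
    fi
    echo "Invalid selection, try again." >&2
  done
  return 1  # reached when input ends without a valid choice (Ctrl-D)
}
```

Because the menu goes to stderr, model=$(pick_from_menu llama3.2 qwen2.5-coder:7b) still shows the choices while capturing only the selected name; the names could equally come from the /api/tags call used above.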
Measuring and Controlling Response Length
LLMs tend to produce verbose responses by default. For shell scripts where the output feeds into other tools or gets stored in variables, shorter responses are almost always better. The most reliable lever is the prompt itself. Ollama's native chat API has no top-level max_tokens field like OpenAI's; the closest equivalent is the num_predict option, which caps the number of generated tokens but simply cuts the response off when the cap is reached. Explicit length constraints in the prompt, by contrast, lead the model to write a complete but shorter answer.
Adding “in one sentence”, “in 10 words or less”, “as a single line”, or “as a comma-separated list” to your prompts reliably produces more compact output. For structured output intended for further shell processing, request the response as a single JSON object or a plain list with one item per line — both formats are easy to parse with jq or standard shell tools like awk and cut. Prompts that specify both the format and the maximum length consistently produce the most shell-friendly output from any model.
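When a hard upper bound is still wanted as a safety net, for example to stop a runaway response from filling a cron email, Ollama's request options include num_predict, which limits the number of generated tokens. The sketch below only builds the request body; build_capped_payload is a hypothetical name, the default of 60 tokens is arbitrary, and jq is assumed to be installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: /api/chat request body with a hard cap on output tokens.
# num_predict goes inside "options"; --argjson inserts the cap as a JSON number
# rather than a string.
build_capped_payload() {
  local prompt="$1"
  local max_tokens="${2:-60}"
  local model="${3:-${OLLAMA_MODEL:-llama3.2}}"
  jq -cn --arg model "$model" --arg prompt "$prompt" --argjson cap "$max_tokens" \
    '{model: $model,
      messages: [{role: "user", content: $prompt}],
      options: {num_predict: $cap},
      stream: false}'
}
```

The output can be piped to curl with --data-binary @- as the request body. A cap like this truncates mid-sentence when it is hit, so it works best combined with a length instruction in the prompt rather than as a replacement for one.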
Bash vs Python for Shell LLM Scripting
Bash is the right choice when your script is primarily about orchestrating other command-line tools — running commands, piping output, checking exit codes, and processing file paths. The Ollama call is one step in a larger pipeline, and Bash’s strength is composing those steps cleanly. Python becomes the better choice when the script needs to do significant text processing on the model’s output, maintain complex state across multiple calls, handle retries and timeouts robustly, or parse structured data beyond what jq handles comfortably. The two approaches are complementary rather than competing — a common pattern is a Bash script that orchestrates the overall workflow (finding files, running commands, sending notifications) and delegates to a small Python helper for the Ollama call and any complex response parsing. Starting with Bash and extracting to Python when the logic gets complex is a practical approach that keeps simple scripts simple while giving you a clear upgrade path for scripts that grow in complexity.
Whichever approach you use, the core loop is the same: build a prompt from the available context, call the Ollama API, parse the response, and feed the output into the next step of your pipeline. Bash makes each of those steps visible and composable with standard Unix tools. Once that loop is working, the rest is just applying it to whatever automation problems your workflow has.
The full set of functions described in this guide — ollama_ask, ollama_stream, ollama_file, ollama_pipe, ollama_json, check_ollama, and ollama_select_model — fits in under 80 lines of Bash. Save them to ~/.ollama.sh, source that file from your shell profile, and you have a local AI toolkit available in every terminal session on your machine. No web browser, no API key, no cloud costs — just Ollama, curl, and jq working together the way Unix tools were designed to.
Happy scripting.