Ollama’s OpenAI-compatible API works from PHP using the popular openai-php/client library or direct Guzzle HTTP calls. This guide covers both approaches for integrating Ollama into PHP and Laravel applications, including streaming responses and a clean service class pattern.
Option 1: openai-php/client (Recommended for Laravel)
composer require openai-php/client
composer require openai-php/laravel # for Laravel integration
<?php
// config/openai.php (or .env)
// OPENAI_API_KEY=ollama
// OPENAI_BASE_URL=http://localhost:11434/v1
require __DIR__ . '/vendor/autoload.php'; // Composer autoloader (Laravel loads this for you)
$client = OpenAI::factory()
->withApiKey('ollama')
->withBaseUri('http://localhost:11434/v1')
->make();
$response = $client->chat()->create([
'model' => 'llama3.2',
'messages' => [
['role' => 'user', 'content' => 'Summarise REST APIs in one paragraph.']
],
]);
echo $response->choices[0]->message->content;
Laravel Service Class
<?php
namespace App\Services;
use OpenAI\Client;
class AiService
{
public function __construct(private Client $client) {}
public function summarise(string $text, string $model = 'llama3.2'): string
{
$response = $this->client->chat()->create([
'model' => $model,
'messages' => [
['role' => 'user', 'content' => "Summarise in 3 bullet points:\n\n{$text}"]
],
'temperature' => 0.3,
]);
return $response->choices[0]->message->content;
}
public function classify(string $text): array
{
$response = $this->client->chat()->create([
'model' => 'llama3.2',
'messages' => [
['role' => 'user',
// "\n\n" must be double-quoted: single-quoted PHP strings keep the backslash literally.
'content' => 'Classify as invoice/contract/email/other. Reply JSON only: {"category":"..."}' . "\n\n" . substr($text, 0, 1000)]
],
'temperature' => 0,
]);
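// json_decode() returns null on malformed JSON; fall back to an empty array to honour the return type.
return json_decode($response->choices[0]->message->content ?? '', true) ?? [];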
}
}
// Register in AppServiceProvider
$this->app->singleton(AiService::class, fn() => new AiService(
\OpenAI::factory() // leading backslash: the provider class lives in a namespace
->withApiKey(config('openai.api_key', 'ollama'))
->withBaseUri(config('openai.base_url', 'http://localhost:11434/v1'))
->make()
));
Streaming in Laravel
<?php
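use OpenAI\Client;
// AiService::$client is private, so the route resolves the client itself.
// Assumes OpenAI\Client is bound in the container (the openai-php/laravel package is expected to register it).
Route::get('/stream', function (Client $client) {
return response()->stream(function () use ($client) {
$stream = $client->chat()->createStreamed([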
'model' => 'llama3.2',
'messages' => [['role' => 'user', 'content' => request('message')]],
]);
foreach ($stream as $response) {
$token = $response->choices[0]->delta->content ?? '';
echo "data: " . json_encode(['token' => $token]) . "\n\n";
if (ob_get_level() > 0) { ob_flush(); } // ob_flush() raises a notice when no buffer is active
flush();
}
}, 200, ['Content-Type' => 'text/event-stream', 'Cache-Control' => 'no-cache']);
});
Embeddings
$response = $client->embeddings()->create([
'model' => 'nomic-embed-text',
'input' => 'The quick brown fox',
]);
$vector = $response->embeddings[0]->embedding;
echo 'Dimensions: ' . count($vector);
Laravel Queue Job
<?php
namespace App\Jobs;
use App\Models\Document;
use App\Services\AiService;
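use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
class SummariseDocument implements ShouldQueue
{
// Dispatchable provides the static dispatch() used below.
use Dispatchable, Queueable;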
public int $timeout = 120;
public int $tries = 3;
public function __construct(private int $documentId) {}
public function handle(AiService $ai): void
{
$document = Document::findOrFail($this->documentId);
$summary = $ai->summarise($document->content);
$document->update(['summary' => $summary, 'summarised_at' => now()]);
}
}
// Dispatch
SummariseDocument::dispatch($document->id);
Why PHP for AI Integration
PHP runs more websites than any other server-side language, and a significant portion of those run Laravel or WordPress. Adding local AI capabilities to PHP applications via Ollama gives PHP developers access to the same local LLM features available to Python and Node.js teams — without learning a new language or runtime. The openai-php/client library’s OpenAI-compatible interface means the Ollama integration uses the same API patterns as a cloud OpenAI integration, and switching between them is a configuration change. For Laravel teams in particular, the combination of the openai-php/laravel package’s automatic service provider registration, Laravel’s queue system for background AI jobs, and Ollama’s local inference provides a complete, privacy-first AI stack that integrates naturally with existing Laravel infrastructure.
WordPress Integration
<?php
// Simple WordPress AI helper (no composer required — uses wp_remote_post)
function ollama_chat(string $message, string $model = 'llama3.2'): string|WP_Error {
$response = wp_remote_post('http://localhost:11434/api/chat', [
'headers' => ['Content-Type' => 'application/json'],
'body' => json_encode([
'model' => $model,
'messages' => [['role' => 'user', 'content' => $message]],
'stream' => false,
]),
'timeout' => 120,
]);
if (is_wp_error($response)) return $response;
$body = json_decode(wp_remote_retrieve_body($response), true);
return $body['message']['content'] ?? new WP_Error('ollama', 'No content in response');
}
// Usage in a plugin or theme
add_action('save_post', function(int $post_id) {
$post = get_post($post_id);
if ($post->post_status !== 'publish') return;
$summary = ollama_chat("Write a 2-sentence excerpt for: {$post->post_content}");
if (!is_wp_error($summary)) {
update_post_meta($post_id, '_ai_excerpt', $summary);
}
});
Laravel Configuration Best Practices
<?php
// config/ai.php
return [
'provider' => env('AI_PROVIDER', 'ollama'), // 'ollama' or 'openai'
'ollama' => [
'base_url' => env('OLLAMA_HOST', 'http://localhost:11434') . '/v1',
'api_key' => 'ollama',
'model' => env('OLLAMA_MODEL', 'llama3.2'),
'timeout' => (int) env('OLLAMA_TIMEOUT', 120),
],
'openai' => [
'base_url' => 'https://api.openai.com/v1',
'api_key' => env('OPENAI_API_KEY'),
'model' => env('OPENAI_MODEL', 'gpt-4o-mini'),
'timeout' => 30,
],
];
// AppServiceProvider: bind based on AI_PROVIDER env var
// Switch between Ollama and OpenAI with no code changes
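The two comments above leave the selection step itself to the reader. A minimal sketch of it follows, assuming the config/ai.php layout shown above. The function name resolveAiProvider is hypothetical, and the config array is passed in explicitly so the logic can run outside Laravel; the OpenAI API key in the sample data is a placeholder.

```php
<?php
// Hypothetical helper: pick the active provider block from the config/ai.php array.
// In Laravel you would pass config('ai'); here the array is supplied directly.
function resolveAiProvider(array $config): array
{
    $name = $config['provider'] ?? 'ollama';
    if (!isset($config[$name]) || !is_array($config[$name])) {
        throw new InvalidArgumentException("Unknown AI provider: {$name}");
    }
    // Return the provider's settings together with its name (useful for logging).
    return ['name' => $name] + $config[$name];
}

$settings = resolveAiProvider([
    'provider' => 'ollama',
    'ollama' => [
        'base_url' => 'http://localhost:11434/v1',
        'api_key' => 'ollama',
        'model' => 'llama3.2',
        'timeout' => 120,
    ],
    'openai' => [
        'base_url' => 'https://api.openai.com/v1',
        'api_key' => 'sk-placeholder', // placeholder, not a real key
        'model' => 'gpt-4o-mini',
        'timeout' => 30,
    ],
]);

echo $settings['base_url'], PHP_EOL; // prints http://localhost:11434/v1
```

In a service provider, the returned base_url and api_key would feed the same withBaseUri() and withApiKey() calls used in the registration code earlier in this article.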
Testing PHP AI Services
<?php
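// Feature test using Laravel HTTP faking.
// Http::fake() only intercepts requests sent through Laravel's Http facade, so this test
// applies to a service that calls /v1/chat/completions via Http, not one wrapping openai-php/client.
public function test_summarise_returns_string(): void
{
Http::fake([
'localhost:11434/*' => Http::response([
'choices' => [['message' => ['content' => "• Point 1\n• Point 2"]]]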
])
]);
$service = app(AiService::class);
$result = $service->summarise('Test document text');
$this->assertStringContainsString('Point 1', $result);
}
Getting Started
Run composer require openai-php/laravel, add OPENAI_API_KEY=ollama and OPENAI_BASE_URL=http://localhost:11434/v1 to your .env file, pull a model with ollama pull llama3.2, and inject the OpenAI client into your service class. The openai-php/laravel package auto-registers the client in Laravel’s DI container. For WordPress without Composer, the wp_remote_post approach in this article requires no dependencies. Both integration paths work with the same Ollama endpoint, and the clean service class pattern makes adding caching, switching models, or writing tests straightforward from the first working feature.
PHP and Laravel integrate with Ollama through standard HTTP clients. The OpenAI PHP client library (configured for Ollama’s compatible endpoint) is the most convenient option, and Laravel’s HTTP client works equally well for direct API calls. This guide covers both approaches for adding local AI to PHP and Laravel applications.
Option 1: OpenAI PHP Client
composer require openai-php/client
require __DIR__ . '/vendor/autoload.php'; // Composer autoloader (Laravel loads this for you)
$client = OpenAI::factory()
->withApiKey('ollama')
->withBaseUri('http://localhost:11434/v1')
->make();
// Chat
$response = $client->chat()->create([
'model' => 'llama3.2',
'messages' => [
['role' => 'user', 'content' => 'Why is PHP still popular?']
],
]);
echo $response->choices[0]->message->content;
Laravel Service Provider
# config/ollama.php
return [
'host' => env('OLLAMA_HOST', 'http://localhost:11434'),
'model' => env('OLLAMA_MODEL', 'llama3.2'),
'timeout' => (int) env('OLLAMA_TIMEOUT', 120), // env() returns strings; AiService expects an int
];
// app/Services/AiService.php
namespace App\Services;
use Illuminate\Support\Facades\Http;
class AiService
{
private string $host;
private string $model;
private int $timeout;
public function __construct()
{
$this->host = config('ollama.host');
$this->model = config('ollama.model');
$this->timeout = config('ollama.timeout');
}
public function chat(string $message, ?string $model = null): string
{
$response = Http::timeout($this->timeout)
->post("{$this->host}/api/chat", [
'model' => $model ?? $this->model,
'messages' => [['role' => 'user', 'content' => $message]],
'stream' => false,
]);
return $response->json('message.content');
}
public function summarise(string $text): string
{
return $this->chat("Summarise in 3 bullet points:\n\n{$text}");
}
public function embed(string $text): array
{
$response = Http::timeout($this->timeout)
->post("{$this->host}/api/embeddings", [
'model' => 'nomic-embed-text',
'prompt' => $text,
]);
return $response->json('embedding');
}
}
Laravel Queue for Async Processing
// app/Jobs/SummariseDocument.php
namespace App\Jobs;
use App\Models\Document;
use App\Services\AiService;
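use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\SerializesModels;
class SummariseDocument implements ShouldQueue
{
// Dispatchable provides the static dispatch(); SerializesModels stores only the model's ID on the queue.
use Dispatchable, Queueable, SerializesModels;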
public int $timeout = 180; // 3 minutes max
public function __construct(private Document $document) {}
public function handle(AiService $ai): void
{
$summary = $ai->summarise($this->document->content);
$this->document->update(['summary' => $summary]);
}
}
// Dispatch from controller:
SummariseDocument::dispatch($document);
Streaming Response in Laravel
// routes/web.php
Route::get('/ai/stream', function () {
return response()->stream(function () {
$host = config('ollama.host');
$client = new \GuzzleHttp\Client();
$response = $client->post("{$host}/api/chat", [
'json' => [
'model' => config('ollama.model'),
'messages' => [['role' => 'user', 'content' => request('message')]],
'stream' => true,
],
'stream' => true,
]);
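$body = $response->getBody();
$buffer = '';
while (!$body->eof()) {
// Ollama streams newline-delimited JSON; a read can end mid-line, so buffer until a full line arrives.
$buffer .= $body->read(1024);
while (($pos = strpos($buffer, "\n")) !== false) {
$data = json_decode(substr($buffer, 0, $pos), true);
$buffer = substr($buffer, $pos + 1);
if (isset($data['message']['content'])) {
echo "data: " . json_encode(['token' => $data['message']['content']]) . "\n\n";
if (ob_get_level() > 0) { ob_flush(); }
flush();
}
}
}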
}, 200, [
'Content-Type' => 'text/event-stream',
'Cache-Control' => 'no-cache',
]);
});
Why PHP + Ollama Makes Sense
PHP powers a large fraction of the web — WordPress, Drupal, Magento, and countless custom Laravel applications are all potential candidates for local AI enhancement. The pattern of calling an HTTP API for AI features is not new to PHP — most PHP applications already call external APIs for payment processing, email, SMS, and analytics. Ollama is simply another HTTP endpoint, and PHP’s HTTP client ecosystem (Guzzle, Symfony HttpClient, Laravel’s Http facade) handles it seamlessly. The local processing advantage is particularly relevant for PHP applications handling user-submitted content where sending that content to OpenAI’s servers may be inappropriate — legal documents, medical forms, confidential customer data. Ollama processes everything locally, maintaining the data isolation expectations that many PHP enterprise applications require.
WordPress Integration
WordPress runs on PHP, and the same AiService pattern works in WordPress plugins:
// my-ai-plugin.php
function ai_summarise_post($post_content) {
$response = wp_remote_post('http://localhost:11434/api/chat', [
'headers' => ['Content-Type' => 'application/json'],
'body' => json_encode([
'model' => 'llama3.2',
'messages' => [['role' => 'user',
'content' => 'Summarise in 2 sentences: ' . substr($post_content, 0, 3000)]],
'stream' => false,
]),
'timeout' => 120,
]);
if (is_wp_error($response)) return '';
$body = json_decode(wp_remote_retrieve_body($response), true);
return $body['message']['content'] ?? '';
}
Caching with Laravel Cache
public function summariseCached(string $text): string
{
$key = 'ai:summary:' . md5($text . $this->model);
return Cache::remember($key, now()->addDay(), function () use ($text) {
return $this->summarise($text);
});
}
Testing PHP AI Services
// tests/Unit/AiServiceTest.php
use Tests\TestCase;
use Illuminate\Support\Facades\Http;
class AiServiceTest extends TestCase
{
public function test_summarise_returns_content(): void
{
Http::fake([
'*/api/chat' => Http::response([
'message' => ['content' => "Summary bullet 1\nSummary bullet 2"]
])
]);
$service = new \App\Services\AiService();
$result = $service->summarise('Long document text here...');
$this->assertStringContainsString('Summary', $result);
}
}
Getting Started
Create the AiService class from this article, bind it in your AppServiceProvider, and inject it into a controller or queue job; the openai-php/client package is only needed if you use the OpenAI client variant rather than the Http facade. Test with php artisan tinker: app(App\Services\AiService::class)->chat('Hello'). Add the queue job for async processing and configure OLLAMA_HOST in your .env file for easy environment switching. The Laravel Http facade’s built-in fake support makes unit testing straightforward without requiring a running Ollama instance in CI. PHP’s broad deployment options (shared hosting, VPS, Docker, Laravel Forge, Laravel Vapor) all work with this pattern as long as the Ollama server is reachable from the PHP process.
Choosing Between Direct HTTP and the OpenAI PHP Client
Both the openai-php/client package and Laravel’s built-in Http facade work well for Ollama integration. The openai-php/client is the better choice if: you want the same interface used in OpenAI tutorials and documentation (making it easy to adapt existing examples), you might switch between Ollama and OpenAI in the same application, or you want access to the wider OpenAI API surface (embeddings, plus image generation and audio when the backend supports them) with a consistent interface. The Laravel Http facade is the better choice if: you want minimal dependencies, you are building a simple integration without plans to swap backends, or you prefer staying within the Laravel ecosystem’s built-in tools. For most new Laravel AI projects, starting with the Http facade and migrating to the OpenAI client if needed is a reasonable approach. Both talk to the same Ollama server, though the OpenAI client uses the OpenAI-compatible /v1 endpoints while the Http examples in this article call the native /api endpoints, so the migration is a refactor of the service class (request paths and response parsing) rather than a change in application behaviour.
PHP AI in 2026
PHP’s position in the AI ecosystem has improved significantly since dedicated client libraries (openai-php/client, prism-php for multi-provider AI) emerged alongside Laravel’s opinionated tooling for AI features. Teams building AI-powered PHP applications in 2026 have a mature set of tools: HTTP clients for model access, queue systems for async processing, cache layers for response memoisation, and testing utilities for mocking AI responses in unit tests. The combination of Ollama’s local inference and PHP’s mature web framework ecosystem makes local AI integration straightforward for the vast installed base of PHP applications — from small Laravel projects to large enterprise WordPress deployments. The patterns in this article apply across all of these contexts with minimal adaptation.
PHP Configuration for Production
In production, several PHP configuration settings affect Ollama integration quality. max_execution_time (default 30 seconds) must be increased for synchronous inference; set it to at least 180 seconds for 7B models. Note that on Linux max_execution_time does not count time spent waiting on network I/O, so PHP-FPM’s request_terminate_timeout is often the limit that actually cuts a long inference request short. default_socket_timeout governs PHP’s stream-based connections (file_get_contents, Guzzle’s stream handler), while cURL-based clients such as Guzzle’s default handler take their limit from the request’s timeout option, so set that explicitly as well. For Laravel queue workers processing AI jobs, set the queue timeout higher than the inference timeout to prevent workers from killing jobs mid-generation. A production checklist: verify max_execution_time and request_terminate_timeout (PHP-FPM) allow long-running requests, configure the queue job $timeout property to match expected inference time, test timeout behaviour explicitly with a deliberately slow model call, and monitor queue job duration via Laravel Horizon or Telescope. These configuration details are where most PHP + Ollama integrations run into trouble, and addressing them proactively saves significant debugging time in production.
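The checklist can be partly automated. The sketch below compares a set of limits against the expected inference time and reports mismatches. The helper name timeoutWarnings and the sample figures are hypothetical, and it only checks the relationships described above rather than reading FPM or queue configuration itself.

```php
<?php
// Hypothetical helper: compare configured limits (all in seconds) against the
// expected inference time. A max_execution_time of 0 means "no limit" in PHP.
function timeoutWarnings(int $inferenceSeconds, int $maxExecutionTime, int $httpTimeout, int $jobTimeout): array
{
    $warnings = [];
    if ($maxExecutionTime !== 0 && $maxExecutionTime < $inferenceSeconds) {
        $warnings[] = 'max_execution_time is shorter than the expected inference time';
    }
    if ($httpTimeout < $inferenceSeconds) {
        $warnings[] = 'HTTP client timeout is shorter than the expected inference time';
    }
    if ($jobTimeout <= $httpTimeout) {
        $warnings[] = 'queue job $timeout should exceed the HTTP client timeout';
    }
    return $warnings;
}

// Read the live limit from the running PHP process (the CLI normally reports 0).
$current = (int) ini_get('max_execution_time');
print_r(timeoutWarnings(90, $current, 120, 180));
```

Running a check like this from a deployment script or a health-check command is one way to catch a misconfigured environment before a real request times out.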
Structured Output in PHP
public function classify(string $text): array
{
$schema = [
'type' => 'object',
'properties' => [
'category' => ['type' => 'string',
'enum' => ['invoice', 'contract', 'report', 'email', 'other']],
'confidence' => ['type' => 'string',
'enum' => ['high', 'medium', 'low']],
],
'required' => ['category', 'confidence'],
];
$response = Http::timeout($this->timeout)
->post("{$this->host}/api/chat", [
'model' => $this->model,
'messages' => [['role' => 'user',
'content' => 'Classify this document: ' . substr($text, 0, 1000)]],
'format' => $schema,
'stream' => false,
'options' => ['temperature' => 0],
]);
// Fall back to an empty array if the reply is missing or not valid JSON.
return json_decode($response->json('message.content') ?? '', true) ?? [];
}
Performance Tuning for PHP + Ollama
PHP’s synchronous execution model means inference calls block the PHP process for their full duration. With PHP-FPM, this is manageable: FPM spawns multiple worker processes, so blocked workers do not prevent other requests from being served. The key configuration parameter is pm.max_children, which controls how many concurrent PHP-FPM workers can run; with long-running AI requests you need enough workers that AI calls do not exhaust the pool. For an application where 10% of requests trigger 30-second AI calls, you need far more FPM workers than a pure web application would. Monitor the listen queue reported on FPM’s status page and the slow request log to detect worker pool saturation. The background job pattern (Laravel Queue) addresses this architectural problem directly: AI calls run in separate worker processes, leaving FPM workers free to serve web requests without being blocked by inference latency.
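To put rough numbers on the pool-size point, the sketch below applies Little's law (average concurrency = arrival rate × time per request) to the 10% / 30-second scenario. The traffic figure of 5 requests per second, the 0.1-second baseline request time, and the 1.5× headroom factor are all assumptions chosen for illustration, and estimateFpmWorkers is a hypothetical helper.

```php
<?php
// Estimate how many FPM workers are busy on average using Little's law
// (concurrency = arrival rate x time in system), then add headroom.
function estimateFpmWorkers(
    float $requestsPerSecond,
    float $aiFraction,
    float $aiSeconds,
    float $normalSeconds,
    float $headroom = 1.5
): int {
    $aiBusy = $requestsPerSecond * $aiFraction * $aiSeconds;
    $normalBusy = $requestsPerSecond * (1 - $aiFraction) * $normalSeconds;
    return (int) ceil(($aiBusy + $normalBusy) * $headroom);
}

// 5 req/s, 10% of them trigger a 30-second AI call, the rest take 0.1 s.
echo estimateFpmWorkers(5, 0.10, 30, 0.1), PHP_EOL; // prints 24

// Same traffic with the AI work moved to a queue, so FPM sees none of it.
echo estimateFpmWorkers(5, 0.0, 30, 0.1), PHP_EOL;  // prints 1
```

Under these assumptions the synchronous design needs roughly 24 workers where the queued design needs one, which is the gap the queue pattern closes.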
Getting the Most from PHP + Ollama
The combination of PHP’s dominant position in web hosting, Laravel’s excellent developer experience, and Ollama’s local inference provides a practical path to AI features for the vast majority of web applications that still run on PHP. The integration is not glamorous — HTTP calls, queue jobs, cache layers — but it is reliable, testable, and easy for PHP developers to understand and maintain without learning new paradigms. Pull a model, write the service class, dispatch a job, and your PHP application has local AI capabilities within an afternoon. That low barrier to entry, combined with the privacy and cost advantages of local inference, makes Ollama the natural first choice for any PHP team evaluating AI integration options.
Beyond Laravel: Symfony and Slim
The same integration patterns apply in Symfony and Slim applications. Symfony’s HttpClient component makes the API calls with identical request/response structures; Symfony Messenger handles the async queue pattern. Slim applications (common for lightweight APIs) use Guzzle or any PSR-18 HTTP client to call Ollama directly. The AiService class can be made framework-agnostic by depending on a PSR-18 HTTP client interface rather than Laravel’s Http facade — this makes it portable across all PSR-compliant PHP frameworks. For WordPress, the native wp_remote_post function handles the HTTP call without any additional dependencies. The common thread across all PHP environments is the same simple JSON over HTTP pattern — Ollama’s API is deliberately straightforward to integrate with any HTTP client, in any language or framework, which is one of its most practical architectural strengths.
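A minimal sketch of that framework-agnostic shape follows. To stay runnable without Composer packages it uses a hypothetical JsonTransport interface as a stand-in for a PSR-18 client and request factory; PortableAiService and FakeTransport are likewise illustrative names, not part of any library.

```php
<?php
// Hypothetical transport abstraction standing in for a PSR-18 client plus
// request factory; any framework's HTTP client can be wrapped to satisfy it.
interface JsonTransport
{
    /** POST a JSON payload and return the decoded JSON response. */
    public function postJson(string $url, array $payload): array;
}

final class PortableAiService
{
    public function __construct(
        private JsonTransport $transport,
        private string $host = 'http://localhost:11434',
        private string $model = 'llama3.2',
    ) {}

    public function chat(string $message): string
    {
        $data = $this->transport->postJson("{$this->host}/api/chat", [
            'model' => $this->model,
            'messages' => [['role' => 'user', 'content' => $message]],
            'stream' => false,
        ]);
        return $data['message']['content'] ?? '';
    }
}

// In-memory transport for tests: records the URL and returns a canned reply.
final class FakeTransport implements JsonTransport
{
    public array $calls = [];

    public function postJson(string $url, array $payload): array
    {
        $this->calls[] = $url;
        return ['message' => ['content' => 'pong']];
    }
}

$fake = new FakeTransport();
$service = new PortableAiService($fake);
echo $service->chat('ping'), PHP_EOL; // prints pong
```

In a Symfony or Slim project the transport would be a thin adapter around HttpClient or Guzzle, while the service class itself stays unchanged.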
The PHP ecosystem’s broad reach means that Ollama-powered AI features built with the patterns in this article can be deployed across the widest possible range of hosting environments — from shared hosting running WordPress to enterprise Laravel deployments on Kubernetes — with minimal adaptation. That portability, combined with the zero-API-cost and privacy advantages of local inference, makes the Ollama + PHP combination one of the most pragmatic and widely-applicable local AI integration options available to web developers today — a combination that will remain practical and relevant for the foreseeable future regardless of how the broader AI landscape evolves.
Structured Output in PHP
For AI tasks that need typed results rather than free-form text, pass a JSON schema in the format field so the model’s reply is constrained to that shape and can be decoded directly:
public function classifyDocument(string $text): array
{
$schema = [
'type' => 'object',
'properties' => [
'category' => ['type' => 'string',
'enum' => ['invoice','contract','report','email','other']],
'confidence' => ['type' => 'string', 'enum' => ['high','medium','low']],
],
'required' => ['category','confidence'],
];
$response = Http::timeout($this->timeout)->post("{$this->host}/api/chat", [
'model' => $this->model,
'messages' => [['role'=>'user','content'=>'Classify: '.substr($text,0,1000)]],
'format' => $schema,
'stream' => false,
'options' => ['temperature' => 0],
]);
// Fall back to an empty array if the reply is missing or not valid JSON.
return json_decode($response->json('message.content') ?? '', true) ?? [];
}
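Even with a schema in place, it is prudent to check the decoded result before acting on it, since a reply can still come back empty or truncated. The validator below is a small sketch for the category/confidence schema used above; the function name isValidClassification is hypothetical.

```php
<?php
// Hypothetical validator for the decoded classification result.
function isValidClassification(mixed $data): bool
{
    $categories = ['invoice', 'contract', 'report', 'email', 'other'];
    $levels = ['high', 'medium', 'low'];

    // Strict in_array() checks so that null or unexpected types never pass.
    return is_array($data)
        && in_array($data['category'] ?? null, $categories, true)
        && in_array($data['confidence'] ?? null, $levels, true);
}

$decoded = json_decode('{"category":"invoice","confidence":"high"}', true);
var_dump(isValidClassification($decoded));                      // bool(true)
var_dump(isValidClassification(json_decode('not json', true))); // bool(false)
```

A caller might retry once or fall back to the 'other' category when the check fails, depending on how the classification is used downstream.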
PHP + Ollama: Summary
The combination of PHP’s ubiquity, Laravel’s developer experience, and Ollama’s local inference creates a practical, low-friction path to AI features for the enormous installed base of PHP web applications. The integration requires no new infrastructure beyond Ollama itself, follows familiar Laravel patterns for service classes and background jobs, and works across the full range of PHP hosting environments. For teams already running PHP applications who want to add AI capabilities without cloud API costs or data privacy concerns, this is the stack to reach for first.
From Prototype to Production in PHP
The progression from a working prototype to a production-ready PHP AI integration is well-defined: start with synchronous service calls in a controller to verify quality, move inference to a queue job to unblock web workers, add caching to avoid redundant calls for repeated inputs, add structured output for tasks that need typed responses, and instrument with logging to monitor usage and latency. Each step is incremental and follows Laravel conventions that PHP developers already know. The local Ollama backend means the cost per request is compute time rather than API billing, which changes the economics significantly for high-volume use cases where cloud AI would become expensive at scale.
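For the instrumentation step, one lightweight option is to wrap each inference call in a timing helper. The sketch below assumes nothing about the logging backend: timedCall is a hypothetical function that accepts any callable as its logger, and the stubbed closure stands in for a real call such as $ai->summarise($text).

```php
<?php
// Hypothetical wrapper: run any inference callable and log how long it took.
function timedCall(callable $fn, callable $log): mixed
{
    $start = hrtime(true); // monotonic clock, in nanoseconds
    try {
        return $fn();
    } finally {
        // Runs whether the call returned or threw, so failures are timed too.
        $ms = (hrtime(true) - $start) / 1e6;
        $log(sprintf('ai_call duration_ms=%.1f', $ms));
    }
}

$lines = [];
$result = timedCall(
    fn () => 'stubbed summary', // stands in for $ai->summarise($text)
    function (string $line) use (&$lines) { $lines[] = $line; }
);

echo $result, PHP_EOL;
echo $lines[0], PHP_EOL;
```

In a Laravel application the logger argument could simply forward each line to Log::info().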
One practical detail worth noting: PHP’s synchronous execution model means you should always set max_execution_time generously for inference endpoints — 120 seconds is a reasonable floor for local models, and PHP-FPM’s request_terminate_timeout should match or exceed it to avoid the pool killing a long-running inference request mid-stream. With those limits in place, local Ollama inference in PHP behaves reliably under production load.