Process high volumes of LLM requests efficiently using batch operations. Batch processing is ideal for offline workloads where immediate responses aren't required.
Batch processing allows you to:
- Submit multiple requests at once
- Process them asynchronously
- Retrieve results later
- Save costs (often 50% cheaper than real-time)
- Handle large-scale operations
Note: Batch processing support varies by provider. Both AnthropicClient and OpenAIClient implement batches; Gemini does not.
Clients implementing batch operations use the LLMBatchClient interface:
<?php
use Soukicz\Llm\Client\LLMBatchClient;

interface LLMBatchClient {
    public function createBatch(array $requests): string;
    public function retrieveBatch(string $batchId): ?array;
    public function getCode(): string;
}

createBatch() takes an array of LLMRequest objects keyed by your custom ID and returns the provider's batch ID. retrieveBatch() returns null while the batch is still in progress. Once finished, it returns an array mapping each custom ID to the response text content.
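To make this contract concrete, here is a minimal sketch of an in-memory stand-in that follows the same method shapes. InMemoryBatchClient and its $pollsUntilReady parameter are hypothetical and not part of the library. It deliberately does not implement LLMBatchClient, so it runs without the package installed, and it accepts plain prompt strings where the real clients expect LLMRequest objects.

```php
<?php
// Hypothetical test double: mirrors the createBatch()/retrieveBatch() contract
// described above, but is NOT part of the library.
final class InMemoryBatchClient
{
    /** @var array<string, array{requests: array<string, string>, polls: int}> */
    private array $batches = [];

    public function __construct(private int $pollsUntilReady = 2)
    {
    }

    /** @param array<string, string> $requests prompts keyed by custom ID */
    public function createBatch(array $requests): string
    {
        $batchId = 'batch-' . (count($this->batches) + 1);
        $this->batches[$batchId] = ['requests' => $requests, 'polls' => 0];

        return $batchId;
    }

    /** @return array<string, string>|null null while the batch is "in progress" */
    public function retrieveBatch(string $batchId): ?array
    {
        if (!isset($this->batches[$batchId])) {
            throw new \RuntimeException("Unknown batch: $batchId");
        }

        $this->batches[$batchId]['polls']++;
        if ($this->batches[$batchId]['polls'] < $this->pollsUntilReady) {
            return null;
        }

        // Echo each prompt back as its "response", keeping the custom IDs
        return array_map(
            fn (string $prompt): string => "response to: $prompt",
            $this->batches[$batchId]['requests']
        );
    }
}

$fake = new InMemoryBatchClient(pollsUntilReady: 2);
$fakeBatchId = $fake->createBatch(['doc-1' => 'Summarize document 1']);

var_dump($fake->retrieveBatch($fakeBatchId)); // first poll: still in progress
var_dump($fake->retrieveBatch($fakeBatchId)); // second poll: results keyed by custom ID
```

A stand-in like this can be handy for unit-testing polling code without calling a provider.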
<?php
use Soukicz\Llm\Client\LLMBatchClient;
use Soukicz\Llm\Client\OpenAI\OpenAIClient;
use Soukicz\Llm\Client\OpenAI\Model\GPT5;
use Soukicz\Llm\LLMConversation;
use Soukicz\Llm\LLMRequest;
use Soukicz\Llm\Message\LLMMessage;

/** @var LLMBatchClient $client */
$client = new OpenAIClient('sk-xxxxx', 'org-xxxxx');

// Prepare multiple requests, keyed by a custom ID of your choice
$requests = [];
for ($i = 0; $i < 1000; $i++) {
    $requests["document-$i"] = new LLMRequest(
        model: new GPT5(GPT5::VERSION_2025_08_07),
        conversation: new LLMConversation([
            LLMMessage::createFromUserString("Summarize document $i")
        ])
    );
}

// Submit batch
$batchId = $client->createBatch($requests);
echo "Batch created: $batchId\n";

<?php
// Returns null while the batch is in progress,
// or an array of [custom ID => response text] when finished
$results = $client->retrieveBatch($batchId);
if ($results !== null) {
    foreach ($results as $customId => $text) {
        echo "$customId: $text\n";
    }
}

<?php
use Soukicz\Llm\Client\OpenAI\OpenAIClient;
use Soukicz\Llm\Client\OpenAI\Model\GPT5;
use Soukicz\Llm\LLMConversation;
use Soukicz\Llm\LLMRequest;
use Soukicz\Llm\Message\LLMMessage;

$client = new OpenAIClient('sk-xxxxx', 'org-xxxxx');

// Prepare batch of classification tasks
$texts = [
    'review-1' => 'This product is amazing!',
    'review-2' => 'Terrible service, would not recommend.',
    'review-3' => 'It\'s okay, nothing special.',
    // ... 1000s more
];
$requests = array_map(
    fn($text) => new LLMRequest(
        model: new GPT5(GPT5::VERSION_2025_08_07),
        conversation: new LLMConversation([
            LLMMessage::createFromUserString("Classify sentiment (positive/negative/neutral): $text")
        ])
    ),
    $texts
); // array_map preserves the string keys, which become custom IDs

// Submit batch
$batchId = $client->createBatch($requests);

// Poll until complete
do {
    sleep(60); // Wait 1 minute
    $results = $client->retrieveBatch($batchId);
} while ($results === null);

// Process batch results: custom ID => response text
foreach ($results as $customId => $text) {
    echo "$customId: $text\n";
}

The same code works with AnthropicClient; only the client and model classes change.
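The do/while loop above polls forever if the batch never finishes. A minimal sketch of a bounded variant follows. waitForBatch() and all of its parameters are hypothetical names, not library API. It takes a callable rather than a client so that it runs on its own, and it measures elapsed time by adding up the sleep intervals rather than reading a clock. The default timeout of 86400 seconds is an assumption chosen to match a 24h completion window.

```php
<?php
/**
 * Hypothetical helper: polls $retrieve until it returns a non-null result
 * or $timeoutSeconds of sleeping have accumulated.
 *
 * @param callable(): ?array $retrieve e.g. fn () => $client->retrieveBatch($batchId)
 * @param callable(int): void|null $sleep injectable for tests; defaults to sleep()
 */
function waitForBatch(
    callable $retrieve,
    int $timeoutSeconds = 86400,
    int $intervalSeconds = 60,
    ?callable $sleep = null
): array {
    $sleep ??= static function (int $seconds): void {
        sleep($seconds);
    };
    $waited = 0;

    while (true) {
        $results = $retrieve();
        if ($results !== null) {
            return $results; // batch finished
        }
        if ($waited >= $timeoutSeconds) {
            throw new \RuntimeException("Batch not finished after {$waited}s");
        }
        $sleep($intervalSeconds);
        $waited += $intervalSeconds;
    }
}

// Usage with a real client (assumes $client and $batchId from the example above):
// $results = waitForBatch(fn () => $client->retrieveBatch($batchId));
```

Passing the sleep function in as a parameter keeps the helper testable without real delays.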
Use async operations for efficient polling:
<?php
use React\EventLoop\Loop;

$batchId = $client->createBatch($requests);

// Check every 60 seconds; keep the returned timer so the callback can cancel it
$timer = Loop::addPeriodicTimer(60, function () use ($client, $batchId, &$timer) {
    $batch = $client->retrieveBatch($batchId);
    if ($batch !== null) {
        // Batch is available, process results
        processResults($batch); // processResults() is your own handler
        Loop::cancelTimer($timer);
    }
});
Loop::run();

Process large datasets:
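<?php
// Classify 100k customer reviews
// loadReviews(), createClassificationRequest() and waitForBatches() are
// application-defined helpers, not part of the library
$reviews = loadReviews(); // 100,000 reviews
// preserve_keys keeps the review keys, so custom IDs stay unique across batches
$batches = array_chunk($reviews, 1000, preserve_keys: true); // Batch size of 1000
$batchIds = [];
foreach ($batches as $batchReviews) {
    $requests = array_map(
        fn($review) => createClassificationRequest($review),
        $batchReviews
    );
    $batchIds[] = $client->createBatch($requests);
}
// Wait for all batches to complete
waitForBatches($batchIds);

Generate content at scale: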
<?php
// Generate product descriptions for 10k products
// ($client and $model are assumed to be set up as in the earlier examples)
$products = loadProducts();
$requests = array_map(
    fn($product) => new LLMRequest(
        model: $model,
        conversation: new LLMConversation([
            LLMMessage::createFromUserString("Write a compelling product description for: {$product->name}")
        ])
    ),
    $products
); // array_map preserves the keys of $products, which become custom IDs
$batchId = $client->createBatch($requests);

Batch translate documents:
<?php
// Translate 1000 documents to 5 languages
$documents = loadDocuments();
$languages = ['es', 'fr', 'de', 'it', 'pt'];
$requests = [];
foreach ($documents as $docId => $doc) {
    foreach ($languages as $lang) {
        // Key each request by document and language so results can be matched back
        $requests["{$docId}-{$lang}"] = new LLMRequest(
            model: $model,
            conversation: new LLMConversation([
                LLMMessage::createFromUserString("Translate to $lang: {$doc->content}")
            ])
        );
    }
}
$batchId = $client->createBatch($requests);

- Batch sizing - Keep batches at 1000-10000 requests for optimal processing
- Polling interval - Poll every 60-300 seconds, not more frequently
- Error handling - Handle failed batches gracefully
- Cost monitoring - Track batch costs across operations
- Result storage - Save results immediately after retrieval
- Timeout handling - Set reasonable timeouts for batch completion
- Rate limits - Respect provider rate limits on batch creation
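As an illustration of the "Result storage" point, the sketch below writes results to disk as soon as they are retrieved. saveBatchResults(), the directory layout, and the JSON file format are assumptions for illustration only; the library does not prescribe how results are stored.

```php
<?php
/**
 * Hypothetical helper: writes batch results to a JSON file as soon as they
 * are retrieved, so a later crash does not lose them.
 *
 * @param array<string, string> $results custom ID => response text
 * @return string path of the written file
 */
function saveBatchResults(string $directory, string $batchId, array $results): string
{
    if (!is_dir($directory) && !mkdir($directory, 0775, true) && !is_dir($directory)) {
        throw new \RuntimeException("Cannot create directory: $directory");
    }

    // Keep only filename-safe characters from the batch ID
    $safeId = preg_replace('/[^A-Za-z0-9_.-]/', '_', $batchId);
    $path = rtrim($directory, '/') . "/$safeId.json";

    $json = json_encode(
        $results,
        JSON_THROW_ON_ERROR | JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE
    );

    // Write to a temporary file first, then rename, so readers never see a partial file
    $tmp = $path . '.tmp';
    if (file_put_contents($tmp, $json) === false || !rename($tmp, $path)) {
        throw new \RuntimeException("Cannot write results to: $path");
    }

    return $path;
}

// Usage (assumes $results came from retrieveBatch($batchId)):
// $path = saveBatchResults(__DIR__ . '/batch-results', $batchId, $results);
```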
There is no dedicated batch exception class. Batch creation fails with a Guzzle HTTP exception (e.g. GuzzleHttp\Exception\ClientException), and retrieveBatch() throws a \RuntimeException when the batch itself failed (e.g. OpenAI produced only an error file) or returned an unexpected status:
<?php
use GuzzleHttp\Exception\GuzzleException;

$batchIds = [];
try {
    $batchIds[] = $client->createBatch($requests);
} catch (GuzzleException $e) {
    // Handle batch creation error (invalid request, rate limit, ...)
    echo "Failed to create batch: " . $e->getMessage();

    // Retry with smaller batches; preserve_keys keeps the custom IDs
    $smallerBatches = array_chunk($requests, 500, preserve_keys: true);
    foreach ($smallerBatches as $batch) {
        $batchIds[] = $client->createBatch($batch);
    }
}

// Retrieve batch results
foreach ($batchIds as $batchId) {
    try {
        $results = $client->retrieveBatch($batchId);
        if ($results !== null) {
            processResults($results); // custom ID => response text
        }
    } catch (\RuntimeException $e) {
        echo "Batch $batchId failed: " . $e->getMessage();
    }
}

Batch processing typically offers 50% cost savings:
<?php
// Real-time: $0.01 per request × 10,000 = $100
$realTimeCost = 10000 * 0.01;
// Batch: $0.005 per request × 10,000 = $50
$batchCost = 10000 * 0.005;
echo "Savings: $" . ($realTimeCost - $batchCost); // $50

- ✅ OpenAI - OpenAIClient implements LLMBatchClient (uploads a JSONL file via /files and creates a /batches job with a 24h completion window)
- ✅ Anthropic - AnthropicClient implements LLMBatchClient (uses the /v1/messages/batches API)
- ❌ Google Gemini - Not supported by GeminiClient
- ⚠️ OpenAI-compatible - OpenAICompatibleClient inherits the batch methods, but the provider must support the OpenAI files and batches endpoints

- Custom IDs are the array keys you pass to createBatch() and they identify each result returned by retrieveBatch().
- retrieveBatch() extracts only the text content of each response. Tool calls, structured output and other content types are not decoded.
- For OpenAI, when a completed batch produced only an error file, retrieveBatch() throws a \RuntimeException with the error details (or returns an empty array when the batch is older than 3 days).
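Since results come back keyed by custom ID, they can be joined to the original inputs with plain array functions. The sketch below is one way to do that. reconcileResults() is a hypothetical helper, and whether an ID can actually be absent from a finished batch depends on the provider, so the "missing" check is a defensive assumption rather than documented behavior.

```php
<?php
/**
 * Hypothetical helper: pairs each input with its result by custom ID and
 * reports the IDs that have no result.
 *
 * @param array<string, string> $inputs  custom ID => original input
 * @param array<string, string> $results custom ID => response text
 * @return array{matched: array<string, array{input: string, output: string}>, missing: list<int|string>}
 */
function reconcileResults(array $inputs, array $results): array
{
    $matched = [];
    // array_intersect_key keeps only inputs whose custom ID also appears in $results
    foreach (array_intersect_key($inputs, $results) as $customId => $input) {
        $matched[$customId] = ['input' => $input, 'output' => $results[$customId]];
    }

    return [
        'matched' => $matched,
        // array_diff_key keeps inputs whose custom ID has no result
        'missing' => array_keys(array_diff_key($inputs, $results)),
    ];
}

$report = reconcileResults(
    ['review-1' => 'This product is amazing!', 'review-2' => 'Terrible service.'],
    ['review-1' => 'positive']
);
print_r($report['missing']); // lists the custom IDs without a result
```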
- Latency - Results may take minutes to hours
- No streaming - Batch responses don't support streaming
- No cancellation - Some providers don't allow batch cancellation
- Result expiration - Results may expire after 24-48 hours
- Size limits - Maximum batch size varies by provider
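Because the maximum batch size varies by provider, large jobs often end up split across several batches, as in the 100k-review example, which calls an undefined waitForBatches(). One possible shape for such a helper is sketched below. waitForAllBatches(), its parameters, and the round-based limit are assumptions for illustration, not library API. It takes a callable instead of a client so that it runs without the package installed.

```php
<?php
/**
 * Hypothetical helper: polls every pending batch once per round until all
 * have finished, and returns the results grouped by batch ID.
 *
 * @param callable(string): ?array $retrieve e.g. fn (string $id) => $client->retrieveBatch($id)
 * @param list<string> $batchIds
 * @param callable(int): void|null $sleep injectable for tests; defaults to sleep()
 * @return array<string, array<string, string>> batch ID => (custom ID => response text)
 */
function waitForAllBatches(
    callable $retrieve,
    array $batchIds,
    int $intervalSeconds = 60,
    int $maxRounds = 1440,
    ?callable $sleep = null
): array {
    $sleep ??= static function (int $seconds): void {
        sleep($seconds);
    };
    $pending = $batchIds;
    $finished = [];

    for ($round = 0; $round < $maxRounds; $round++) {
        foreach ($pending as $index => $batchId) {
            $results = $retrieve($batchId);
            if ($results !== null) {
                $finished[$batchId] = $results;
                unset($pending[$index]); // stop polling finished batches
            }
        }
        if ($pending === []) {
            return $finished;
        }
        $sleep($intervalSeconds);
    }

    throw new \RuntimeException(count($pending) . ' batch(es) still pending');
}

// Usage (assumes $client and $batchIds from the large-dataset example):
// $all = waitForAllBatches(fn (string $id) => $client->retrieveBatch($id), $batchIds);
```

With the defaults, 1440 rounds of 60 seconds bound the wait at roughly 24 hours.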
- Configuration Guide - Request configuration
- Provider Documentation - Provider-specific batch features