Streaming
Get real-time token-by-token responses using Server-Sent Events (SSE).
How It Works
Set stream: true in the request body. The server responds with Content-Type: text/event-stream and sends each delta of newly generated tokens as its own SSE event. The stream ends with a final data: [DONE] event.
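As a rough illustration of what this looks like at the HTTP level, the sketch below builds a streaming request with only the Python standard library. The helper names build_stream_request and iter_sse_lines are hypothetical. The Bearer authorization header is an assumption based on the OpenAI SDK being used against this base URL, and the Accept header is optional courtesy rather than a documented requirement.

```python
import json
import urllib.request

# Base URL taken from the SDK examples in this document.
BASE_URL = "https://api.aipilotads.com/v1"


def build_stream_request(api_key, model, messages):
    """Build (but do not send) a chat completion request with streaming on."""
    body = {"model": model, "messages": messages, "stream": True}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumption: same Bearer scheme the OpenAI SDK sends.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Accept": "text/event-stream",
        },
        method="POST",
    )


def iter_sse_lines(request):
    """Send the request and yield decoded lines as the server emits them."""
    with urllib.request.urlopen(request) as response:
        for raw_line in response:  # the response object iterates line by line
            yield raw_line.decode("utf-8").rstrip("\r\n")
```

Calling iter_sse_lines(build_stream_request("KEY", "gpt-5.5", messages)) would yield the raw data: lines one at a time, so output can be shown before the full response has been generated.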
SSE Chunk Format
text/event-stream
data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"Hello"}}]}
data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":" world"}}]}
data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"!"},"finish_reason":"stop"}],"usage":{"total_tokens":75}}
data: [DONE]
Examples
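For clients that read the stream without an SDK, the chunk format shown above can be parsed by hand. The following is a minimal sketch, assuming each event arrives as a single data: line carrying either a JSON object or the [DONE] sentinel. The function name collect_stream_text is hypothetical, and the sample lines are copied from the chunk format in this document.

```python
import json


def collect_stream_text(lines):
    """Join delta content from SSE lines, stopping at the [DONE] sentinel."""
    parts = []
    usage = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end of stream
        chunk = json.loads(payload)
        usage = chunk.get("usage") or usage  # usage appears on the last chunk
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts), usage


sample = [
    'data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"Hello"}}]}',
    'data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":" world"}}]}',
    'data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"!"},'
    '"finish_reason":"stop"}],"usage":{"total_tokens":75}}',
    "data: [DONE]",
]

text, usage = collect_stream_text(sample)
print(text)   # Hello world!
print(usage)  # {'total_tokens': 75}
```

In a real client the function would usually print or forward each delta as it arrives rather than collecting everything first. Collecting is used here only to keep the sketch easy to check.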
Python
from openai import OpenAI

client = OpenAI(api_key="KEY", base_url="https://api.aipilotads.com/v1")

# stream=True returns an iterator of chunks instead of a single response
stream = client.chat.completions.create(
    model="gpt-5.5",
    messages=[{"role": "user", "content": "Write a poem"}],
    stream=True,
)

for chunk in stream:
    # Skip chunks that carry no choices or no text in the delta
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
JavaScript
import OpenAI from 'openai';

const client = new OpenAI({ apiKey: 'KEY', baseURL: 'https://api.aipilotads.com/v1' });

// Top-level await requires an ES module
const stream = await client.chat.completions.create({
  model: 'gpt-5.5',
  messages: [{ role: 'user', content: 'Write a poem' }],
  stream: true,
});

for await (const chunk of stream) {
  // Write each delta as it arrives; chunks without content write nothing
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
Works with both endpoints
Streaming is supported on /v1/chat/completions and /v1/messages.