Streaming

Get real-time token-by-token responses using Server-Sent Events (SSE).

How It Works

Set stream: true in the request body. The server responds with Content-Type text/event-stream, and each SSE event carries a delta of newly generated tokens. The stream ends with a data: [DONE] line.
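For readers who are not using an SDK, the request side can be sketched with the Python standard library alone. This is a minimal sketch, not the service's reference client: the function name build_stream_request is hypothetical, and the Bearer-style Authorization header is an assumption based on what the OpenAI SDK sends. The base URL, model name, and endpoint path are the ones used elsewhere on this page.

```python
import json
import urllib.request

# Base URL taken from the SDK examples on this page.
BASE_URL = "https://api.aipilotads.com/v1"


def build_stream_request(api_key, prompt, model="gpt-5.5"):
    """Build (but do not send) a streaming chat completion request.

    Hypothetical helper for illustration only.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,  # ask for a text/event-stream response
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "text/event-stream",
            # Assumption: Bearer auth, as sent by the OpenAI SDK.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_stream_request("KEY", "Write a poem")
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen and iterating over the response object yields the stream one line at a time, which is enough to process data: lines as they arrive.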

SSE Chunk Format

text/event-stream
data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"Hello"}}]}

data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":" world"}}]}

data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"!"},"finish_reason":"stop"}],"usage":{"total_tokens":75}}

data: [DONE]
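
The chunk format above can be consumed without an SDK. The following is a minimal sketch of one way to fold the data: lines into the final text, assuming every payload other than [DONE] is a JSON object shaped like the sample. The function name collect_stream is hypothetical and not part of any client library.

```python
import json


def collect_stream(lines):
    """Fold SSE lines into (text, finish_reason, usage)."""
    parts, finish_reason, usage = [], None, None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        usage = chunk.get("usage") or usage
        for choice in chunk.get("choices", []):
            # Each delta holds only the newly generated text.
            parts.append(choice.get("delta", {}).get("content") or "")
            finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(parts), finish_reason, usage


# The sample stream shown above, one line per list entry.
sample = [
    'data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"Hello"}}]}',
    '',
    'data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":" world"}}]}',
    '',
    'data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"!"},'
    '"finish_reason":"stop"}],"usage":{"total_tokens":75}}',
    '',
    'data: [DONE]',
]

text, reason, usage = collect_stream(sample)
print(text)    # Hello world!
print(reason)  # stop
print(usage)   # {'total_tokens': 75}
```

Accumulating deltas in a list and joining once at the end avoids repeated string concatenation on long responses.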

Examples

Python
from openai import OpenAI
client = OpenAI(api_key="KEY", base_url="https://api.aipilotads.com/v1")

stream = client.chat.completions.create(
    model="gpt-5.5",
    messages=[{"role": "user", "content": "Write a poem"}],
    stream=True
)
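# Print each content delta as soon as it arrives.
for chunk in stream:
    # Some chunks may carry no choices or an empty delta; skip those.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

JavaScript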
import OpenAI from 'openai';
const client = new OpenAI({ apiKey: 'KEY', baseURL: 'https://api.aipilotads.com/v1' });
const stream = await client.chat.completions.create({
  model: 'gpt-5.5',
  messages: [{ role: 'user', content: 'Write a poem' }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

Works with both endpoints

Streaming is supported on /v1/chat/completions and /v1/messages.