GPT Realtime 2 API

Overview

Users can create an API key in Settings -> API Keys and call the Realtime endpoints from their own server or trusted client.

Send the key with either header:

Authorization: Bearer sk-gptr2-xxx

or:

x-api-key: sk-gptr2-xxx

The API key belongs to the user who created it. Usage is deducted from that user's credit balance.
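
The two header styles above can be produced by one small helper. This is a minimal sketch: `buildAuthHeaders` and `AuthStyle` are hypothetical names chosen for illustration, not part of the API.

```typescript
// Hypothetical helper (not part of the API): builds one of the two
// documented authentication headers for a given key.
type AuthStyle = 'bearer' | 'x-api-key';

function buildAuthHeaders(
  apiKey: string,
  style: AuthStyle = 'bearer',
): Record<string, string> {
  if (apiKey.trim() === '') {
    throw new Error('API key is required');
  }
  return style === 'bearer'
    ? { Authorization: `Bearer ${apiKey}` }
    : { 'x-api-key': apiKey };
}
```

The result can be spread into the `headers` option of a `fetch` call alongside any other headers the request needs.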

Models

UI model	API `model` value	Endpoint	Transport	Credit burn
Starter Voice	`starter-voice`	`POST /api/realtime2/pipeline`	STT + LLM + TTS pipeline	`1 unit/sec`
GPT Realtime Mini	`realtime-mini`	`POST /api/realtime2/session` and `POST /api/realtime2/charge`	OpenAI Realtime WebRTC	`4 units/sec`
GPT Realtime 2	`realtime-pro`	`POST /api/realtime2/session` and `POST /api/realtime2/charge`	OpenAI Realtime WebRTC	`16 units/sec`

Voices currently supported by all three models:

marin, cedar, coral, sage, ash, verse

Starter Voice

Starter Voice is the simplest endpoint to call: a single multipart/form-data POST that carries either text or an audio file.

Text input

curl "$APP_URL/api/realtime2/pipeline" \
  -H "Authorization: Bearer $GPTREALTIME2_API_KEY" \
  -F "model=starter-voice" \
  -F "voice=marin" \
  -F "text=Tell me one useful idea in 20 seconds." \
  -F "instructions=You are concise and natural."

Response:

{
  "transcript": "",
  "assistantText": "One useful idea is to write down the next physical action...",
  "audioBase64": "...",
  "audioContentType": "audio/mpeg",
  "model": "gpt-4o-mini",
  "modelId": "starter-voice",
  "voice": "marin"
}
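
One way to play the returned audio is to wrap `audioBase64` in a data URL. The sketch below assumes the response shape shown in the example above; `PipelineResponse` and `toAudioDataUrl` are hypothetical names, not part of the API.

```typescript
// Response shape inferred from the example response above.
interface PipelineResponse {
  transcript: string;
  assistantText: string;
  audioBase64: string;
  audioContentType: string;
  model: string;
  modelId: string;
  voice: string;
}

// Hypothetical helper: combines the content type and the Base64 payload
// into a data URL that an audio element can use as its source.
function toAudioDataUrl(
  res: Pick<PipelineResponse, 'audioBase64' | 'audioContentType'>,
): string {
  return `data:${res.audioContentType};base64,${res.audioBase64}`;
}
```

In a browser, the result could be passed to `new Audio(url)`. For long replies a Blob may be a better fit than a data URL, since the whole payload otherwise lives in one string.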

Audio input

curl "$APP_URL/api/realtime2/pipeline" \
  -H "Authorization: Bearer $GPTREALTIME2_API_KEY" \
  -F "model=starter-voice" \
  -F "voice=marin" \
  -F "durationSeconds=8" \
  -F "audio=@./voice.webm;type=audio/webm" \
  -F "instructions=Answer briefly."

Optional fields:

Field	Description
`text`	User text. Use this instead of `audio`, or as a fallback.
`audio`	User audio file. Max 25 MB.
`durationSeconds`	Input audio duration, used for credit estimation.
`messages`	JSON string of recent messages, for example `[{"role":"user","text":"Hi"}]`.
`instructions`	System behavior instructions. Max 4,000 characters.
`voice`	One of the supported voices. Defaults to `marin`.
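
The field rules in the table can be checked before the request is sent. This is a minimal sketch for text input only: `buildPipelineFields` is a hypothetical helper, and the `assistant` role is an assumption, since the table only shows a `user` message.

```typescript
const SUPPORTED_VOICES = ['marin', 'cedar', 'coral', 'sage', 'ash', 'verse'] as const;
type Voice = (typeof SUPPORTED_VOICES)[number];

// Assumption: 'assistant' is an accepted role; the source only shows 'user'.
interface PipelineMessage {
  role: 'user' | 'assistant';
  text: string;
}

// Documented limit for the instructions field.
// Note: string length counts UTF-16 code units, which may differ from
// how the server counts characters.
const MAX_INSTRUCTIONS_LENGTH = 4000;

// Hypothetical helper: returns the text form fields for a pipeline call.
function buildPipelineFields(options: {
  text: string;
  voice?: Voice;
  instructions?: string;
  messages?: PipelineMessage[];
}): Record<string, string> {
  const fields: Record<string, string> = {
    model: 'starter-voice',
    voice: options.voice ?? 'marin', // documented default
    text: options.text,
  };
  if (options.instructions !== undefined) {
    if (options.instructions.length > MAX_INSTRUCTIONS_LENGTH) {
      throw new Error(`instructions exceed ${MAX_INSTRUCTIONS_LENGTH} characters`);
    }
    fields['instructions'] = options.instructions;
  }
  if (options.messages !== undefined) {
    // The messages field is sent as a JSON string, not as nested form data.
    fields['messages'] = JSON.stringify(options.messages);
  }
  return fields;
}
```

Each entry can then be appended to a `FormData` object, with the `audio` file appended separately when audio input is used.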

GPT Realtime Mini

Realtime Mini uses WebRTC. Your client creates an SDP offer, sends it to the session endpoint, receives an SDP answer, then sets it as the remote description.

// Base URL and API key, read from the environment.
const appUrl = process.env.APP_URL!;
const apiKey = process.env.GPTREALTIME2_API_KEY!;

// Create the peer connection and request bidirectional audio.
const pc = new RTCPeerConnection();
pc.addTransceiver('audio', { direction: 'sendrecv' });

// Generate the SDP offer and apply it as the local description.
const offer = await pc.createOffer();
await pc.setLocalDescription(offer);

// Send the offer to the session endpoint.
const response = await fetch(`${appUrl}/api/realtime2/session`, {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${apiKey}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'realtime-mini',
    voice: 'marin',
    instructions: 'You are concise and natural.',
    sdp: offer.sdp,
  }),
});

if (!response.ok) {
  throw new Error(await response.text());
}

// The response body is the SDP answer; the call ID arrives in a header.
// Keep callId: the charge endpoint needs it later.
const answerSdp = await response.text();
const callId = response.headers.get('x-openai-call-id') || '';
await pc.setRemoteDescription({ type: 'answer', sdp: answerSdp });

After a realtime response completes, call the charge endpoint, passing the callId read from the x-openai-call-id header of the session response:

curl "$APP_URL/api/realtime2/charge" \
  -H "Authorization: Bearer $GPTREALTIME2_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"realtime-mini","callId":"call_abc123"}'

The current implementation bills every charge call as one minimum realtime window of 120 seconds.
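
As a worked example, the cost of one charge call can be estimated from the per-second rates in the Models table. This sketch assumes a charge call deducts exactly rate times window; the source does not spell out the formula, and `unitsPerChargeCall` is a hypothetical name.

```typescript
// Per-second rates from the Models table.
const UNITS_PER_SECOND = {
  'realtime-mini': 4,
  'realtime-pro': 16,
} as const;

type RealtimeModel = keyof typeof UNITS_PER_SECOND;

// Minimum realtime window charged per charge call.
const MIN_WINDOW_SECONDS = 120;

// Assumption: one charge call deducts (units per second) x (window seconds).
function unitsPerChargeCall(model: RealtimeModel): number {
  return UNITS_PER_SECOND[model] * MIN_WINDOW_SECONDS;
}

console.log(unitsPerChargeCall('realtime-mini')); // 4 * 120 = 480
console.log(unitsPerChargeCall('realtime-pro')); // 16 * 120 = 1920
```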

GPT Realtime 2

GPT Realtime 2 uses the same WebRTC flow as Mini. Change only the model value to realtime-pro; appUrl, apiKey, and offer are the same as in the Mini example:

const response = await fetch(`${appUrl}/api/realtime2/session`, {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${apiKey}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'realtime-pro',
    voice: 'marin',
    instructions: 'You are a premium realtime voice assistant.',
    sdp: offer.sdp,
  }),
});

Charge the same way:

curl "$APP_URL/api/realtime2/charge" \
  -H "Authorization: Bearer $GPTREALTIME2_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"realtime-pro","callId":"call_abc123"}'
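
The charge request body is the same for both realtime models, so it can be built in one place. This is a minimal sketch with a hypothetical `buildChargeBody` helper. It rejects an empty call ID because the session example falls back to an empty string when the x-openai-call-id header is missing.

```typescript
type ChargeModel = 'realtime-mini' | 'realtime-pro';

// Hypothetical helper: serializes the JSON body for POST /api/realtime2/charge.
function buildChargeBody(model: ChargeModel, callId: string): string {
  if (callId.trim() === '') {
    throw new Error('callId is missing; read it from the x-openai-call-id header');
  }
  return JSON.stringify({ model, callId });
}
```

The returned string can be used as the `body` of a `fetch` call with `Content-Type: application/json`, mirroring the curl commands above.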

Errors

Common error responses:

HTTP	Code	Meaning
`401`	`invalid_api_key`	The key is wrong or deleted.
`401`	`unauthorized`	No valid login session and no API key.
`402`	`insufficient_credits`	The user does not have enough credits.
`400`	`unsupported_model`	The endpoint does not support the selected model.
`400`	`invalid_sdp`	`realtime-mini` and `realtime-pro` require a valid SDP offer.
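
Client code can map these codes to readable messages. This sketch uses a hypothetical `describeApiError` helper with meanings paraphrased from the table. How the code is carried in the response body is not documented here, so the helper takes it as an argument rather than parsing a response.

```typescript
// Codes from the error table, with meanings paraphrased from it.
const ERROR_MEANINGS: Record<string, string> = {
  invalid_api_key: 'The key is wrong or deleted.',
  unauthorized: 'No valid login session and no API key.',
  insufficient_credits: 'The user does not have enough credits.',
  unsupported_model: 'The endpoint does not support the selected model.',
  invalid_sdp: 'A valid SDP offer is required.',
};

// Hypothetical helper: formats an HTTP status and error code for logs or UI.
function describeApiError(status: number, code: string): string {
  const meaning = ERROR_MEANINGS[code] ?? 'Unknown error.';
  return `${status} ${code}: ${meaning}`;
}
```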
