GPT Realtime 2 API

Use a user API key to call the GPT Realtime 2 voice APIs.

Overview

Users can create an API key in Settings -> API Keys, then call the Realtime endpoints from their own server or a trusted client.

Send the key with either header:

Authorization: Bearer sk-gptr2-xxx

or:

x-api-key: sk-gptr2-xxx

The API key belongs to the user who created it. Usage is deducted from that user's credit balance.
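
The two header forms can be wrapped in a small helper. The following TypeScript is a minimal sketch, not part of the API: `buildAuthHeaders` and `AuthScheme` are hypothetical names chosen for illustration.

```typescript
// Hypothetical helper: returns one of the two documented auth headers.
type AuthScheme = 'bearer' | 'x-api-key';

function buildAuthHeaders(
  apiKey: string,
  scheme: AuthScheme = 'bearer',
): Record<string, string> {
  if (!apiKey) {
    throw new Error('API key is required');
  }
  // Both forms carry the same key; only the header name and prefix differ.
  if (scheme === 'bearer') {
    return { Authorization: `Bearer ${apiKey}` };
  }
  return { 'x-api-key': apiKey };
}
```

Either result can be passed as the `headers` option of a `fetch` call.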

Models

| UI model | API model value | Endpoint | Transport | Credit burn |
| --- | --- | --- | --- | --- |
| Starter Voice | starter-voice | POST /api/realtime2/pipeline | STT + LLM + TTS pipeline | 1 unit/sec |
| GPT Realtime Mini | realtime-mini | POST /api/realtime2/session and POST /api/realtime2/charge | OpenAI Realtime WebRTC | 4 units/sec |
| GPT Realtime 2 | realtime-pro | POST /api/realtime2/session and POST /api/realtime2/charge | OpenAI Realtime WebRTC | 16 units/sec |

Voices currently supported by all three models:

marin, cedar, coral, sage, ash, verse

Starter Voice

Starter Voice is the simplest endpoint to call: send text or an audio file in a single multipart/form-data request.

Text input

curl "$APP_URL/api/realtime2/pipeline" \
  -H "Authorization: Bearer $GPTREALTIME2_API_KEY" \
  -F "model=starter-voice" \
  -F "voice=marin" \
  -F "text=Tell me one useful idea in 20 seconds." \
  -F "instructions=You are concise and natural."

Response:

{
  "transcript": "",
  "assistantText": "One useful idea is to write down the next physical action...",
  "audioBase64": "...",
  "audioContentType": "audio/mpeg",
  "model": "gpt-4o-mini",
  "modelId": "starter-voice",
  "voice": "marin"
}
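
The audio in the response arrives as a base64 string, with its MIME type in `audioContentType`. A minimal sketch of decoding it into raw bytes follows. `PipelineResponse` and `decodePipelineAudio` are hypothetical names, the interface only mirrors the sample response above, and the file-extension mapping is an assumption, since `audio/mpeg` is the only content type the sample shows.

```typescript
// Shape mirrors the sample response above; the interface name is hypothetical.
interface PipelineResponse {
  transcript: string;
  assistantText: string;
  audioBase64: string;
  audioContentType: string;
  model: string;
  modelId: string;
  voice: string;
}

// Decode the base64 audio payload into raw bytes.
function decodePipelineAudio(
  res: Pick<PipelineResponse, 'audioBase64' | 'audioContentType'>,
): { bytes: Uint8Array; extension: string } {
  const binary = atob(res.audioBase64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  // Assumption: only audio/mpeg is shown in the sample; anything else maps to "bin".
  const extension = res.audioContentType === 'audio/mpeg' ? 'mp3' : 'bin';
  return { bytes, extension };
}
```

In Node.js the bytes can then be written to disk with `fs.writeFileSync`; in a browser they can be wrapped in a `Blob` for playback.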

Audio input

curl "$APP_URL/api/realtime2/pipeline" \
  -H "Authorization: Bearer $GPTREALTIME2_API_KEY" \
  -F "model=starter-voice" \
  -F "voice=marin" \
  -F "durationSeconds=8" \
  -F "audio=@./voice.webm;type=audio/webm" \
  -F "instructions=Answer briefly."

Optional fields:

| Field | Description |
| --- | --- |
| text | User text. Use this instead of audio, or as fallback. |
| audio | User audio file. Max 25 MB. |
| durationSeconds | Input audio duration, used for credit estimation. |
| messages | JSON string of recent messages, for example [{"role":"user","text":"Hi"}]. |
| instructions | System behavior instructions. Max 4,000 characters. |
| voice | One of the supported voices. Defaults to marin. |
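
The fields above can be assembled into a multipart body with the standard `FormData` API. This is a sketch under assumptions: `PipelineOptions` and `buildPipelineForm` are hypothetical names, the client-side checks are a convenience and may not match what the server enforces, and the `voice.webm` filename is only a placeholder.

```typescript
// Hypothetical option bag covering the documented pipeline fields.
interface PipelineOptions {
  text?: string;
  audio?: Blob;
  durationSeconds?: number;
  messages?: { role: string; text: string }[];
  instructions?: string;
  voice?: string;
}

const SUPPORTED_VOICES = ['marin', 'cedar', 'coral', 'sage', 'ash', 'verse'];
const MAX_INSTRUCTIONS_LENGTH = 4000;

function buildPipelineForm(opts: PipelineOptions): FormData {
  // Assumption: at least one of text or audio must be present.
  if (!opts.text && !opts.audio) {
    throw new Error('Provide text or audio');
  }
  const voice = opts.voice ?? 'marin'; // documented default
  if (!SUPPORTED_VOICES.includes(voice)) {
    throw new Error(`Unsupported voice: ${voice}`);
  }
  if (opts.instructions && opts.instructions.length > MAX_INSTRUCTIONS_LENGTH) {
    throw new Error('instructions exceed 4,000 characters');
  }

  const form = new FormData();
  form.append('model', 'starter-voice');
  form.append('voice', voice);
  if (opts.text) form.append('text', opts.text);
  if (opts.audio) form.append('audio', opts.audio, 'voice.webm'); // placeholder filename
  if (opts.durationSeconds !== undefined) {
    form.append('durationSeconds', String(opts.durationSeconds));
  }
  // messages is sent as a JSON string, as the field description states.
  if (opts.messages) form.append('messages', JSON.stringify(opts.messages));
  if (opts.instructions) form.append('instructions', opts.instructions);
  return form;
}
```

Pass the result as the `body` of a `fetch` POST to `/api/realtime2/pipeline`, and let the runtime set the multipart `Content-Type` header with its boundary.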

GPT Realtime Mini

Realtime Mini uses WebRTC. Your client creates an SDP offer, sends it to the session endpoint, receives an SDP answer, then sets it as the remote description.

const appUrl = process.env.APP_URL!;
const apiKey = process.env.GPTREALTIME2_API_KEY!;

// Create the peer connection with a bidirectional audio transceiver.
const pc = new RTCPeerConnection();
pc.addTransceiver('audio', { direction: 'sendrecv' });

// Create the SDP offer and set it as the local description.
const offer = await pc.createOffer();
await pc.setLocalDescription(offer);

// Send the offer to the session endpoint.
const response = await fetch(`${appUrl}/api/realtime2/session`, {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${apiKey}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'realtime-mini',
    voice: 'marin',
    instructions: 'You are concise and natural.',
    sdp: offer.sdp,
  }),
});

if (!response.ok) {
  throw new Error(await response.text());
}

// The response body is the SDP answer; the call ID arrives in a response header.
const answerSdp = await response.text();
// Keep callId for the charge call that follows.
const callId = response.headers.get('x-openai-call-id') || '';
await pc.setRemoteDescription({ type: 'answer', sdp: answerSdp });

After a realtime response completes, call the charge endpoint with the callId taken from the x-openai-call-id header of the session response:

curl "$APP_URL/api/realtime2/charge" \
  -H "Authorization: Bearer $GPTREALTIME2_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"realtime-mini","callId":"call_abc123"}'

The current implementation charges the minimum realtime window of 120 seconds per charge call.
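
To get a rough sense of what one charge call costs, the per-second rates from the Models table can be multiplied by the 120-second window. This is an estimate under the assumption that the listed rate applies linearly to the whole window; the source does not state the resulting totals. `estimateChargeUnits` is a hypothetical helper.

```typescript
// Per-second rates taken from the Models table.
const UNITS_PER_SECOND: Record<string, number> = {
  'realtime-mini': 4,
  'realtime-pro': 16,
};

// Minimum realtime window billed per charge call.
const MIN_WINDOW_SECONDS = 120;

// Assumption: cost per charge call = rate * minimum window.
function estimateChargeUnits(model: string): number {
  const rate = UNITS_PER_SECOND[model];
  if (rate === undefined) {
    throw new Error(`No realtime rate known for model: ${model}`);
  }
  return rate * MIN_WINDOW_SECONDS;
}
```

Under that assumption, one charge call comes to 480 units for realtime-mini and 1,920 units for realtime-pro.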

GPT Realtime 2

GPT Realtime 2 uses the same WebRTC flow as Mini. Change only the model value:

const response = await fetch(`${appUrl}/api/realtime2/session`, {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${apiKey}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'realtime-pro',
    voice: 'marin',
    instructions: 'You are a premium realtime voice assistant.',
    sdp: offer.sdp,
  }),
});

Charge the same way:

curl "$APP_URL/api/realtime2/charge" \
  -H "Authorization: Bearer $GPTREALTIME2_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"realtime-pro","callId":"call_abc123"}'
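
The same charge request can be issued from TypeScript. The sketch below only assembles the request so it can be inspected before sending; `buildChargeRequest` and `ChargeRequest` are hypothetical names, and the body carries just the two fields shown in the curl example.

```typescript
// Hypothetical return shape; structurally compatible with fetch(url, init).
interface ChargeRequest {
  url: string;
  init: {
    method: string;
    headers: Record<string, string>;
    body: string;
  };
}

// Assemble (but do not send) the charge request for a realtime call.
function buildChargeRequest(
  appUrl: string,
  apiKey: string,
  model: 'realtime-mini' | 'realtime-pro',
  callId: string,
): ChargeRequest {
  return {
    url: `${appUrl}/api/realtime2/charge`,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      // Same two fields as the curl example: model and callId.
      body: JSON.stringify({ model, callId }),
    },
  };
}
```

To send it, destructure the result and call `fetch(url, init)`.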

Errors

Common error responses:

| HTTP | Code | Meaning |
| --- | --- | --- |
| 401 | invalid_api_key | The key is wrong or deleted. |
| 401 | unauthorized | No valid login session and no API key. |
| 402 | insufficient_credits | The user does not have enough credits. |
| 400 | unsupported_model | The endpoint does not support the selected model. |
| 400 | invalid_sdp | Realtime Mini/Pro require a valid SDP offer. |
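
A client may want to branch on these codes. The sketch below groups them by the kind of fix they call for. It is illustrative only: `classifyError` and `ErrorAction` are hypothetical names, and because the source does not say how the code is delivered in the response body, the helper takes the status and code as plain arguments.

```typescript
// Hypothetical grouping of the documented error codes by remedy.
type ErrorAction = 'check_credentials' | 'add_credits' | 'fix_request' | 'unknown';

function classifyError(status: number, code: string): ErrorAction {
  if (status === 401 && (code === 'invalid_api_key' || code === 'unauthorized')) {
    return 'check_credentials';
  }
  if (status === 402 && code === 'insufficient_credits') {
    return 'add_credits';
  }
  if (status === 400 && (code === 'unsupported_model' || code === 'invalid_sdp')) {
    return 'fix_request';
  }
  // Anything not listed in the table above is left unclassified.
  return 'unknown';
}
```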