Create completion

Creates a completion for the provided prompt. Set `stream: true` to receive the response as server-sent events (SSE).

Response headers

`Inference-Id`: Unique ID for this request. Include it when contacting support.
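A minimal sketch of pulling this header out of a response so it can be logged or quoted in a support request. It assumes the response headers are available as a plain mapping; the helper name `get_inference_id` is hypothetical and not part of the API.

```python
from typing import Mapping, Optional


def get_inference_id(headers: Mapping[str, str]) -> Optional[str]:
    """Return the Inference-Id header value, or None if it is absent."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    for name, value in headers.items():
        if name.lower() == "inference-id":
            return value
    return None
```

Logging the returned value next to every failed request makes it easy to hand the ID to support later.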

Error codes

| Status | Meaning |
| --- | --- |
| `402` | Insufficient credits |
| `429` | Rate limited |
| `504` | No backend available |
| `529` | No healthy backends are available for the requested model |
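The status codes above could be handled on the client roughly as follows. The meanings are copied from the table; treating `429`, `504`, and `529` as retryable is an assumption of this sketch, not something the reference states, and `describe_error` is a hypothetical helper.

```python
from typing import Tuple

# Meanings copied from the error code table.
STATUS_MEANINGS = {
    402: "Insufficient credits",
    429: "Rate limited",
    504: "No backend available",
    529: "No healthy backends are available for the requested model",
}

# Assumption: these codes are transient and worth retrying after a delay.
RETRYABLE = {429, 504, 529}


def describe_error(status: int) -> Tuple[str, bool]:
    """Return (meaning, should_retry) for a response status code."""
    meaning = STATUS_MEANINGS.get(status, "Undocumented status")
    return meaning, status in RETRYABLE
```

A `402` is deliberately left out of the retry set, since retrying cannot add credits.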

Authorization

BearerAuth

Authorization: Bearer <token>

API key passed as Bearer token

In: header
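As a rough illustration, the headers for a request might be assembled like this. `build_headers` is a hypothetical helper; the `Content-Type` value comes from the request body type documented for this endpoint.

```python
from typing import Dict


def build_headers(api_key: str) -> Dict[str, str]:
    """Build request headers that carry the API key as a Bearer token."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```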

Request Body

application/json

`model` (string, required)

The model to use for inference.

`prompt` (string | array<any>, required)

The prompt to complete. Can be a string, array of strings, or array of token IDs.
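A small sketch of checking the prompt's shape before serializing the two required fields. The helper names are hypothetical, and rejecting empty or mixed-type lists is an assumption of this sketch; the reference does not say how the server treats them.

```python
import json
from typing import Any


def is_valid_prompt(prompt: Any) -> bool:
    """True for a string, a list of strings, or a list of integer token IDs."""
    if isinstance(prompt, str):
        return True
    if not isinstance(prompt, list) or not prompt:
        return False  # assumption: empty lists are rejected
    all_strings = all(isinstance(p, str) for p in prompt)
    # type(p) is int excludes booleans, which are a subclass of int.
    all_token_ids = all(type(p) is int for p in prompt)
    return all_strings or all_token_ids


def build_body(model: str, prompt: Any) -> str:
    """Serialize the required fields as a JSON request body."""
    if not is_valid_prompt(prompt):
        raise TypeError("prompt must be a string, list of strings, or list of token IDs")
    return json.dumps({"model": model, "prompt": prompt})
```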

`temperature` (number, optional)

Sampling temperature. Higher values produce more random output.
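To make the effect concrete, here is a sketch of the usual way temperature is applied: logits are divided by the temperature before the softmax. Whether this API applies it exactly this way is an assumption; the function is illustrative only.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert logits to probabilities after scaling by 1 / temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum for numerical stability.
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

With the same logits, a lower temperature concentrates probability on the top token, while a higher one flattens the distribution.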

`max_tokens` (integer, optional)

Maximum number of tokens to generate.

`top_p` (number, optional)

Nucleus sampling probability mass threshold.
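Nucleus sampling is commonly described as keeping the smallest set of most likely tokens whose combined probability reaches `top_p`, then renormalizing. The sketch below illustrates that idea on a toy distribution; it is not taken from this API's implementation.

```python
from typing import Dict


def nucleus_filter(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the most likely tokens until their mass reaches top_p, then renormalize."""
    kept: Dict[str, float] = {}
    mass = 0.0
    # Walk tokens from most to least likely.
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = p
        mass += p
        if mass >= top_p:
            break
    return {token: p / mass for token, p in kept.items()}
```

Sampling then draws only from the returned tokens, so a lower `top_p` removes more of the unlikely tail.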

`frequency_penalty` (number, optional)

Penalizes repeated tokens based on frequency.

`presence_penalty` (number, optional)

Penalizes tokens that have already appeared.
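The difference between the two penalties is easiest to see side by side. The sketch below uses one common formulation from OpenAI-style APIs: the frequency penalty scales with how often a token has appeared, while the presence penalty is applied once if it has appeared at all. That this API uses the same formula is an assumption.

```python
from collections import Counter
from typing import Dict, List


def apply_penalties(
    logits: Dict[str, float],
    generated: List[str],
    frequency_penalty: float,
    presence_penalty: float,
) -> Dict[str, float]:
    """Lower the logits of tokens that already appear in the generated text."""
    counts = Counter(generated)
    adjusted = {}
    for token, logit in logits.items():
        c = counts[token]
        # Frequency term grows with the count; presence term is flat.
        presence = presence_penalty if c > 0 else 0.0
        adjusted[token] = logit - c * frequency_penalty - presence
    return adjusted
```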

`stop` (string | array<string>, optional)

Sequence(s) at which generation stops.
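The behaviour can be pictured as cutting the text at the earliest stop sequence. This sketch assumes the stop sequence itself is not included in the output, which the reference does not state; `truncate_at_stop` is a hypothetical client-side helper.

```python
from typing import List, Union


def truncate_at_stop(text: str, stop: Union[str, List[str]]) -> str:
    """Cut text at the earliest occurrence of any stop sequence."""
    sequences = [stop] if isinstance(stop, str) else list(stop)
    cut = len(text)
    for seq in sequences:
        # Skip empty sequences, which would match at position 0.
        idx = text.find(seq) if seq else -1
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]
```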

`stream` (boolean, optional)

Stream the response as SSE.
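A minimal sketch of reading an SSE stream on the client. It follows the generic event stream format (events separated by blank lines, payload carried on `data:` lines) and makes no assumption about what each payload contains, since the reference does not describe the chunk format. `iter_sse_data` is a hypothetical helper.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[5:]
            # A single leading space after the colon is not part of the value.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
        # Comment lines (starting with ":") and other fields are ignored.
```

Each yielded string can then be decoded by the caller, for example with `json.loads` if the payloads turn out to be JSON.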

`n` (integer, optional)

Number of completions to generate. Defaults to 1.

`user` (string, optional)

End-user identifier for abuse monitoring.

`[key: string]` (any, optional)

Response Body

`application/json`

curl -X POST "https://example.com/v1/completions" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "string",
    "prompt": "string"
  }'

{
  "id": "string",
  "object": "string",
  "created": 0,
  "model": "string",
  "choices": [
    {
      "index": 0,
      "text": "string",
      "finish_reason": "string"
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}
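A sketch of reading the fields shown in the response above. The `Completion` dataclass and `parse_completion` helper are hypothetical; only the JSON field names come from the example response.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Completion:
    id: str
    model: str
    texts: List[str]
    total_tokens: int


def parse_completion(body: str) -> Completion:
    """Extract the completion texts and token usage from a response body."""
    data = json.loads(body)
    # Order choices by their index so texts line up with the n requested completions.
    choices = sorted(data["choices"], key=lambda c: c["index"])
    return Completion(
        id=data["id"],
        model=data["model"],
        texts=[c["text"] for c in choices],
        total_tokens=data["usage"]["total_tokens"],
    )
```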
