AkashML

Anthropic SDK

AkashML exposes an Anthropic-shaped Messages API over its open-source model catalog. You can point any Anthropic-compatible client, including the official anthropic SDK and Claude Code, at AkashML by setting a custom base URL.

Base URL and authentication

The Anthropic-compatible endpoints are served from:

https://api.akashml.com/anthropic

Authentication uses the same Bearer-token scheme as the OpenAI-compatible API. Pass your AkashML API key in the Authorization header:

Authorization: Bearer YOUR_API_KEY

Create and manage keys under Settings → API Keys.

Model IDs and the -- alias

Claude Code rejects model identifiers that contain /, so AkashML aliases slashes with -- on the Anthropic endpoints. To target an upstream model whose ID is MiniMaxAI/MiniMax-M2.5, request the alias MiniMaxAI--MiniMax-M2.5:

| Upstream ID | Anthropic-endpoint alias |
|-------------|--------------------------|
| MiniMaxAI/MiniMax-M2.5 | MiniMaxAI--MiniMax-M2.5 |
| meta-llama/Llama-3.3-70B-Instruct | meta-llama--Llama-3.3-70B-Instruct |

The OpenAI-compatible endpoints (/v1/*) continue to accept slashed IDs unchanged.

You can list the aliased models from GET /anthropic/v1/models.
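Because the alias is a plain character substitution, a client can derive it locally from an upstream ID. The following is a minimal sketch, assuming the mapping is exactly "/" to "--" as in the table above. The helper names to_alias and from_alias are illustrative and not part of any AkashML SDK.

```python
def to_alias(upstream_id: str) -> str:
    """Convert an upstream model ID to its Anthropic-endpoint alias."""
    return upstream_id.replace("/", "--")


def from_alias(alias: str) -> str:
    """Reverse the mapping.

    Assumption: upstream IDs never contain "--" themselves, otherwise
    the reverse direction would be ambiguous.
    """
    return alias.replace("--", "/")


print(to_alias("MiniMaxAI/MiniMax-M2.5"))  # MiniMaxAI--MiniMax-M2.5
```

When in doubt, the list returned by GET /anthropic/v1/models remains the authoritative source for valid aliases.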

Sending a message

The Messages endpoint is POST /anthropic/v1/messages. Required fields: model, messages, and max_tokens.

curl
curl https://api.akashml.com/anthropic/v1/messages \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "MiniMaxAI--MiniMax-M2.5",
    "max_tokens": 256,
    "messages": [
      { "role": "user", "content": "Hello!" }
    ]
  }'

Using the official Anthropic Python SDK:

Python (anthropic SDK)
from anthropic import Anthropic

client = Anthropic(
    api_key="YOUR_API_KEY",
    base_url="https://api.akashml.com/anthropic",
)

response = client.messages.create(
    model="MiniMaxAI--MiniMax-M2.5",
    max_tokens=256,
    messages=[
        {"role": "user", "content": "Hello!"},
    ],
)

print(response.content[0].text)

Streaming

Set stream: true to receive Server-Sent Events. The response content type switches to text/event-stream and Anthropic-shaped events are emitted:

| Event | Purpose |
|-------|---------|
| message_start | Initial message envelope with id, model, role, usage. |
| content_block_start | Start of a content block (text, tool_use, or thinking). |
| content_block_delta | Incremental content (text delta, input_json_delta, thinking_delta). |
| content_block_stop | End of a content block. |
| message_delta | Running stop_reason and usage updates. |
| message_stop | Terminal event. |
| ping | Keep-alive. |
| error | Terminal error event. |
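To show how these events fit together on the client side, here is a rough sketch that parses Server-Sent Events lines and concatenates the text deltas. It assumes the payload layout follows Anthropic's Messages streaming format (a content_block_delta carries delta.type == "text_delta" and delta.text); this page does not spell out the field layout, so treat that as an assumption. The helper names and the sample lines are made up for illustration.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, data) pairs from already-decoded SSE lines."""
    event, data_lines = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates one event.
            if event is not None and data_lines:
                yield event, json.loads("\n".join(data_lines))
            event, data_lines = None, []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
    # Flush a trailing event if the stream ended without a blank line.
    if event is not None and data_lines:
        yield event, json.loads("\n".join(data_lines))


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate text deltas until message_stop; raise on an error event."""
    parts = []
    for event, data in iter_sse_events(lines):
        if event == "error":
            # Assumed shape: {"type": "error", "error": {"type", "message"}}
            raise RuntimeError(data.get("error", {}).get("message", "stream error"))
        if event == "content_block_delta":
            delta = data.get("delta", {})
            if delta.get("type") == "text_delta":  # assumed field names
                parts.append(delta.get("text", ""))
        elif event == "message_stop":
            break
    return "".join(parts)


# Hypothetical stream, hand-written for this example.
sample = [
    "event: message_start",
    'data: {"type": "message_start", "message": {"id": "msg_1", "role": "assistant"}}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hel"}}',
    "",
    "event: ping",
    'data: {"type": "ping"}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "lo"}}',
    "",
    "event: message_stop",
    'data: {"type": "message_stop"}',
    "",
]
print(collect_text(sample))  # Hello
```

In practice the official anthropic SDK handles this parsing for you; the sketch is only meant to make the event sequence concrete.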

Thinking mode

Models that support extended reasoning accept a thinking block:

{
  "thinking": {
    "type": "enabled",
    "budget_tokens": 1024
  }
}

When enabled, the response includes thinking content blocks before the final text block.
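Since thinking blocks come before the text block, a client that reads only the first content block would pick up reasoning rather than the answer. The sketch below builds a request body with the thinking field and then separates blocks by type. It assumes a thinking block stores its reasoning under the key "thinking", as in Anthropic's Messages API, which this page does not confirm for AkashML. The function names and the sample response content are hypothetical.

```python
from typing import List, Optional, Tuple


def build_request(model: str, prompt: str, max_tokens: int,
                  thinking_budget: Optional[int] = None) -> dict:
    """Build a Messages request body, optionally enabling thinking."""
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    if thinking_budget is not None:
        body["thinking"] = {"type": "enabled", "budget_tokens": thinking_budget}
    return body


def split_blocks(content: List[dict]) -> Tuple[str, str]:
    """Return (thinking, text) by selecting content blocks on their type."""
    # Assumption: thinking blocks keep their reasoning under "thinking".
    thinking = [b.get("thinking", "") for b in content if b.get("type") == "thinking"]
    text = [b.get("text", "") for b in content if b.get("type") == "text"]
    return "".join(thinking), "".join(text)


body = build_request("MiniMaxAI--MiniMax-M2.5", "Hello!", 2048, thinking_budget=1024)

# Hypothetical response content, hand-written for this example.
response_content = [
    {"type": "thinking", "thinking": "The user greeted me."},
    {"type": "text", "text": "Hello!"},
]
reasoning, answer = split_blocks(response_content)
print(answer)  # Hello!
```

Selecting blocks by type rather than by position keeps the same code working whether or not thinking is enabled.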

Errors

All non-2xx responses use { type: "error", error: { type, message } }. Status-to-error.type mapping:

| Status | error.type |
|--------|------------|
| 400 | invalid_request_error |
| 401 | authentication_error |
| 403 | permission_error |
| 404 | not_found_error |
| 413 | request_too_large |
| 429 | rate_limit_error (response includes a Retry-After header) |
| 500 | api_error |
| 503 / 529 | overloaded_error |
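One way a client might act on this mapping is to read the error envelope and back off on rate-limit and overload responses. The sketch below is an assumption-laden example rather than AkashML's recommended policy: the set of retryable statuses, the backoff constants, and the function names are all choices made for illustration, and Retry-After is assumed to be given in seconds.

```python
import json
from typing import Mapping, Optional, Tuple

# Assumption: retrying only rate-limit and overload statuses is a client policy choice.
RETRYABLE = {429, 503, 529}


def parse_error(body: str) -> Tuple[str, str]:
    """Extract (error.type, error.message) from the error envelope."""
    try:
        err = json.loads(body).get("error", {})
        return err.get("type", "unknown"), err.get("message", "")
    except (ValueError, AttributeError):
        # Body was not JSON, or not shaped like the documented envelope.
        return "unknown", body


def retry_delay(status: int, headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 30.0) -> Optional[float]:
    """Return seconds to wait before retrying, or None if not retryable."""
    if status not in RETRYABLE:
        return None
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return min(float(retry_after), cap)
        except ValueError:
            pass  # The HTTP-date form of Retry-After is not handled in this sketch.
    # Fall back to capped exponential backoff.
    return min(base * (2 ** attempt), cap)


print(parse_error('{"type": "error", "error": {"type": "rate_limit_error", "message": "slow down"}}'))
print(retry_delay(429, {"Retry-After": "2"}, attempt=0))  # 2.0
print(retry_delay(400, {}, attempt=0))  # None
```

Note that the headers argument here is a plain dict, so the lookup is case-sensitive; most HTTP client libraries expose a case-insensitive mapping instead.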

Create a message (Anthropic shape) POST

Anthropic-compatible Messages endpoint. Internally translates to the upstream OpenAI-compatible inference path.

**Streaming.** When `stream: true` is set, the response content type switches to `text/event-stream` and Anthropic-shaped events are emitted. Event types:

- `message_start`: message envelope with id, model, role, usage.
- `content_block_start`: start of a content block (text, tool_use, thinking).
- `content_block_delta`: incremental content (text delta, input_json_delta, thinking_delta).
- `content_block_stop`: end of a content block.
- `message_delta`: running stop_reason / usage updates.
- `message_stop`: terminal event.
- `ping`: keep-alive.
- `error`: terminal error event (see error type mapping below).

**Model ID aliasing.** Slashes in upstream model IDs are aliased with `--`: pass `anthropic--claude-3-5-sonnet` to target `anthropic/claude-3-5-sonnet`.

**Error response shape.** All non-2xx responses use `{ type: "error", error: { type, message } }`. Status → `error.type` mapping:

| Status | `error.type` |
|--------|---------------|
| `400` | `invalid_request_error` |
| `401` | `authentication_error` |
| `403` | `permission_error` |
| `404` | `not_found_error` |
| `413` | `request_too_large` |
| `429` | `rate_limit_error` (response includes `Retry-After` header) |
| `500` | `api_error` |
| `503` / `529` | `overloaded_error` |

List models GET

Lists the currently available models with pricing and capability metadata.
