NexLLM API Reference: Endpoints and Base URL Guide

NexLLM provides a unified, OpenAI-compatible REST API that lets you access GPT, Claude, and Gemini models through a single base URL. You can drop NexLLM into any project already using the OpenAI SDK — just swap the base URL and your NexLLM API key, and your existing code keeps working without any other changes.

Base URL

Every request you make goes to the following base URL:

https://www.nexllm.ai/v1

To use NexLLM with the OpenAI SDK, set base_url="https://www.nexllm.ai/v1" and pass your NexLLM API key as api_key. No other configuration changes are required.

Supported Endpoints

The table below lists every endpoint NexLLM exposes. All paths are relative to the base URL above.

Endpoint	Method & Path	Description
Chat Completions	`POST /v1/chat/completions`	Generate conversational responses, with streaming support
Text Completions	`POST /v1/completions`	Legacy text completion for prompt-based generation
Embeddings	`POST /v1/embeddings`	Convert text to vector embeddings
Image Generation	`POST /v1/images/generations`	Generate images from text prompts
Image Editing	`POST /v1/images/edits`	Edit or modify images
Speech-to-Text	`POST /v1/audio/transcriptions`	Transcribe audio to text
Text-to-Speech	`POST /v1/audio/speech`	Convert text to spoken audio
Rerank	`POST /v1/rerank`	Rerank documents by relevance
Responses API	`POST /v1/responses`	OpenAI Responses API-compatible format
Realtime	`GET /v1/realtime` (WebSocket)	Real-time interactions
Models	`GET /v1/models`	List available models
Claude Messages	`POST /v1/messages`	Claude native format (Anthropic Messages API)

Native Format Support

In addition to the OpenAI-compatible endpoints above, NexLLM also supports provider-native request formats:

Claude native format — Send requests to POST /v1/messages using the x-api-key: sk-... header and anthropic-version: 2023-06-01 header. This lets you use the Anthropic Messages API format directly, without converting to OpenAI-style payloads.

Gemini native format — Send requests to GET /v1beta/models/{model}:generateContent?key=sk-... using the standard Gemini query-parameter authentication. Replace {model} with the model name (e.g. gemini-2.5-flash).

Explore the API

Authentication

Learn how to authenticate your requests using the Authorization header or native provider headers.

Chat Completions

Generate conversational AI responses with full streaming support.

Embeddings

Convert text into vector embeddings for semantic search and RAG pipelines.

Image Generation

Generate and edit images from natural language prompts.

Audio

Transcribe audio files to text or synthesize speech from text.

Models

Retrieve the full list of models available to your API key.

​Base URL

​Supported Endpoints

​Native Format Support

​Explore the API

Authentication

Chat Completions

Embeddings

Image Generation

Audio

Models

Base URL

Supported Endpoints

Native Format Support

Explore the API