Skip to main content
NexLLM provides a unified, OpenAI-compatible REST API that lets you access GPT, Claude, and Gemini models through a single base URL. You can drop NexLLM into any project already using the OpenAI SDK — just swap the base URL and your NexLLM API key, and your existing code keeps working without any other changes.

Base URL

Every request you make goes to the following base URL:
https://www.nexllm.ai/v1
To use NexLLM with the OpenAI SDK, set base_url="https://www.nexllm.ai/v1" and pass your NexLLM API key as api_key. No other configuration changes are required.

Supported Endpoints

The table below lists every endpoint NexLLM exposes. All paths are relative to the base URL above.
EndpointMethod & PathDescription
Chat CompletionsPOST /v1/chat/completionsGenerate conversational responses, with streaming support
Text CompletionsPOST /v1/completionsLegacy text completion for prompt-based generation
EmbeddingsPOST /v1/embeddingsConvert text to vector embeddings
Image GenerationPOST /v1/images/generationsGenerate images from text prompts
Image EditingPOST /v1/images/editsEdit or modify images
Speech-to-TextPOST /v1/audio/transcriptionsTranscribe audio to text
Text-to-SpeechPOST /v1/audio/speechConvert text to spoken audio
RerankPOST /v1/rerankRerank documents by relevance
Responses APIPOST /v1/responsesOpenAI Responses API-compatible format
RealtimeGET /v1/realtime (WebSocket)Real-time interactions
ModelsGET /v1/modelsList available models
Claude MessagesPOST /v1/messagesClaude native format (Anthropic Messages API)

Native Format Support

In addition to the OpenAI-compatible endpoints above, NexLLM also supports provider-native request formats:
Claude native format — Send requests to POST /v1/messages using the x-api-key: sk-... header and anthropic-version: 2023-06-01 header. This lets you use the Anthropic Messages API format directly, without converting to OpenAI-style payloads.
Gemini native format — Send requests to GET /v1beta/models/{model}:generateContent?key=sk-... using the standard Gemini query-parameter authentication. Replace {model} with the model name (e.g. gemini-2.5-flash).

Explore the API

Authentication

Learn how to authenticate your requests using the Authorization header or native provider headers.

Chat Completions

Generate conversational AI responses with full streaming support.

Embeddings

Convert text into vector embeddings for semantic search and RAG pipelines.

Image Generation

Generate and edit images from natural language prompts.

Audio

Transcribe audio files to text or synthesize speech from text.

Models

Retrieve the full list of models available to your API key.