NexLLM Quickstart: Make Your First API Call in Minutes

NexLLM is fully compatible with the OpenAI API schema. To start using it, replace the OpenAI base_url in your existing code or client with https://www.nexllm.ai/v1 and use your NexLLM API key as the api_key. That’s the only change required — no new SDKs, no new request formats.

Not ready to write code yet? The built-in Playground lets you chat with any available model directly in your browser. It’s a great way to explore model behaviour and verify that your token is working before you integrate the API into your application.

Option 1: Use the Playground

The Playground is a no-code, in-browser testing tool that lets you interact with models immediately after generating an API key.

Open the Playground

Select a model

Use the model selector in the bottom-right corner to choose the model you want to test.

Select a channel group

Choose the channel group you want to use for the request. This determines which upstream configuration and pricing tier your request is routed through.

Send a message

Type your message in the input box at the bottom of the page and click Send. The model’s response appears in the conversation area above.

Option 2: Use curl

You can make API calls directly from your terminal. Copy the command below and run it, either replacing $NEXLLM_TOKEN with your API key or exporting NEXLLM_TOKEN as an environment variable in your shell first:

curl https://www.nexllm.ai/v1/chat/completions \
  -H "Authorization: Bearer $NEXLLM_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "aws/claude-haiku-4-5",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Write a short welcome message for a new user."}
    ],
    "max_tokens": 100
  }'

The response is returned as JSON in the standard OpenAI chat completions format.
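For readers who prefer to stay in Python without installing an SDK, the same request can be sketched with the standard library alone. This is a minimal illustration, not an official client: the helper names (build_chat_request, extract_reply, send) are hypothetical, and SAMPLE_BODY is a made-up payload in the OpenAI chat completions shape rather than captured NexLLM output. The script builds the request and parses the sample locally; nothing goes over the network unless you call send yourself.

```python
import json
import urllib.request

BASE_URL = "https://www.nexllm.ai/v1"


def build_chat_request(api_key: str, user_message: str) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl command."""
    payload = {
        "model": "aws/claude-haiku-4-5",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "max_tokens": 100,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_reply(raw_body: str) -> str:
    """Return the assistant text from the first choice of a chat completion."""
    data = json.loads(raw_body)
    return data["choices"][0]["message"]["content"]


def send(request: urllib.request.Request) -> str:
    """Send the request and return the assistant text (requires network access)."""
    with urllib.request.urlopen(request, timeout=30) as resp:
        return extract_reply(resp.read().decode("utf-8"))


# Illustrative response body; every value here is invented for the example.
SAMPLE_BODY = json.dumps({
    "id": "chatcmpl-example",
    "object": "chat.completion",
    "model": "aws/claude-haiku-4-5",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Welcome aboard!"},
            "finish_reason": "stop",
        }
    ],
})

request = build_chat_request(
    "sk-placeholder", "Write a short welcome message for a new user."
)
print(request.get_method(), request.full_url)
print(extract_reply(SAMPLE_BODY))
```

To make a real call, pass your own key to build_chat_request and hand the result to send.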

Option 3: Use an SDK or native client

For application code, use the OpenAI Python SDK or send requests in each provider’s native format. The example below uses the OpenAI Python SDK.

from openai import OpenAI

client = OpenAI(
    api_key="sk-xxxxxxxxxxxxxxxx",  # Your NexLLM API key
    base_url="https://www.nexllm.ai/v1"
)

response = client.chat.completions.create(
    model="aws/claude-haiku-4-5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a short welcome message for a new user."}
    ],
    max_tokens=100
)

print(response.choices[0].message.content)
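In application code you will usually want to keep the key out of the source file. The sketch below is one way to do that, assuming the key is stored in an environment variable named NEXLLM_TOKEN (the name is borrowed from the curl example; NexLLM does not mandate it). The helper names load_nexllm_key and client_kwargs are hypothetical.

```python
import os

NEXLLM_BASE_URL = "https://www.nexllm.ai/v1"


def load_nexllm_key(env_var: str = "NEXLLM_TOKEN") -> str:
    """Read the API key from the environment and fail early if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} to your NexLLM API key before running.")
    return key


def client_kwargs() -> dict:
    """Keyword arguments that can be passed to the OpenAI client constructor."""
    return {"api_key": load_nexllm_key(), "base_url": NEXLLM_BASE_URL}
```

With these helpers, the client from the example above can be created as OpenAI(**client_kwargs()), and a missing key produces a clear error before any request is sent.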

The examples above cover the most common use case — chat completions. NexLLM also supports embeddings, image generation, audio transcription, text-to-speech, reranking, and more. See the API Reference for the full list of supported endpoints and their request schemas.
