Bring your own endpoint

BYOE (bring your own endpoint) lets you plug your own OpenAI-compatible deployment, such as a self-hosted vLLM, an internal gateway, or a private cloud endpoint, into BharatRouter's routing and use it as a fallback step in a chain. Because the inference runs on your infrastructure, BYOE steps are priced at ₹0. An endpoint is referenced in a chain as byoe:<slug>, and you can register more than one per model.

BYOE vs BYOK

Both are "bring your own", and both ride your infra/account at ₹0 — but they bring different things. Use BYOK when the model already runs on a provider we support and you just want to supply your own key; use BYOE when the model runs on a deployment you host.

| | BYOK (bring your own key) | BYOE (bring your own endpoint) |
|---|---|---|
| You supply | A provider API key | A URL to a deployment you run |
| Where it runs | The provider's infra | Your infra |
| Provider must be | One BharatRouter already supports | Any OpenAI-compatible endpoint |
| Admission gate | Live key verification on save | Automated OpenAI-format compliance test |
| Referenced as | provider (e.g. krutrim) or provider/model-id | byoe:<slug> |
| How many | One saved key per provider | Multiple endpoints per model |
| Price | ₹0 (provider bills you) | ₹0 (your own infra) |
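
The two reference formats can be told apart mechanically. The following is a minimal Python sketch, not BharatRouter code: the names ProviderRef and parse_provider_ref are hypothetical, and the only facts taken from this page are the byoe:<slug> form and the provider or provider/model-id form.

```python
from dataclasses import dataclass
from typing import Optional

BYOE_PREFIX = "byoe:"


@dataclass(frozen=True)
class ProviderRef:
    """Hypothetical parsed form of a step's provider string."""
    kind: str                       # "byoe" or "provider"
    name: str                       # endpoint slug, or provider name
    model_id: Optional[str] = None  # set only for the provider/model-id form


def parse_provider_ref(ref: str) -> ProviderRef:
    """Split a provider string into its parts, rejecting malformed input."""
    if ref.startswith(BYOE_PREFIX):
        slug = ref[len(BYOE_PREFIX):]
        if not slug:
            raise ValueError("byoe reference is missing its slug")
        return ProviderRef("byoe", slug)
    # Split on the first "/" only, so a model id may itself contain slashes.
    provider, sep, model_id = ref.partition("/")
    if not provider or (sep and not model_id):
        raise ValueError(f"malformed provider reference: {ref!r}")
    return ProviderRef("provider", provider, model_id or None)
```

With this sketch, parse_provider_ref("byoe:my-vllm-llama") yields kind "byoe" with name "my-vllm-llama", while parse_provider_ref("krutrim") yields kind "provider" with no model id.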

See BYOK for using your own provider keys.

Register an endpoint

Registration runs an automated compliance test inline and saves the endpoint only if it passes. Owners and admins only.

POST /me/endpoints
{
  "name": "my-vllm-llama",
  "model": "llama-3.1-8b-instruct",      // a catalog chat model
  "base_url": "https://llm.mycorp.com/v1",
  "upstream_model": "meta-llama/Llama-3.1-8B-Instruct",
  "key": "sk-internal-...",              // optional — omit for a keyless endpoint
  "residency": "india"
}
→ { "ok": true, "endpoint": { "id": 7, "slug": "my-vllm-llama",
      "provider_ref": "byoe:my-vllm-llama", "model": "llama-3.1-8b-instruct" },
    "compliance": { "pass": true, ... } }

Use the returned provider_ref (byoe:<slug>) as the provider of a step when you save a fallback chain or build a collection.
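
A registration call might be built from Python roughly as follows. This is a sketch under stated assumptions: the API_BASE host is a placeholder, Bearer-token authentication is assumed rather than documented here, and the helper names are hypothetical. The body fields and the response fields (ok, compliance.pass, endpoint.provider_ref) are the ones shown in the example above; how a failed registration is reported is an assumption.

```python
import json
import urllib.request
from typing import Optional

API_BASE = "https://api.bharatrouter.example"  # assumption: placeholder host


def build_register_request(token: str, name: str, model: str, base_url: str,
                           upstream_model: str, residency: str,
                           key: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST /me/endpoints request."""
    body = {
        "name": name,
        "model": model,
        "base_url": base_url,
        "upstream_model": upstream_model,
        "residency": residency,
    }
    if key is not None:
        body["key"] = key  # omitted entirely for a keyless endpoint
    return urllib.request.Request(
        API_BASE + "/me/endpoints",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumption: auth scheme
        },
        method="POST",
    )


def provider_ref_from(response: dict) -> str:
    """Return the byoe:<slug> reference from a successful registration reply."""
    # Assumption: a failed registration reports ok or compliance.pass as false.
    if not response.get("ok") or not response.get("compliance", {}).get("pass"):
        raise RuntimeError("endpoint was not saved: compliance test did not pass")
    return response["endpoint"]["provider_ref"]


# Sending the request (not executed here):
#   with urllib.request.urlopen(request, timeout=30) as resp:
#       provider_ref = provider_ref_from(json.load(resp))
```

Keeping request construction separate from sending makes the payload easy to inspect before anything is written.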

Test before you save

POST /me/endpoints/test runs the compliance check against an unsaved config and returns the result without writing anything — useful for dialing in base_url and upstream_model. Any member may test.
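
One way to use the dry-run endpoint is to try several candidate configs and keep the first that passes. The sketch below is illustrative only: first_passing_config and the post callable are hypothetical, and it assumes the test reply carries a compliance.pass flag like the registration reply does, which this page does not confirm.

```python
from typing import Callable, Iterable, Optional

# (path, json_body) -> parsed JSON reply; supply your own HTTP client here.
PostFn = Callable[[str, dict], dict]


def first_passing_config(candidates: Iterable[dict], post: PostFn) -> Optional[dict]:
    """Return the first unsaved config that passes the compliance test."""
    for config in candidates:
        result = post("/me/endpoints/test", config)
        # Assumption: the reply exposes compliance.pass as a boolean.
        if result.get("compliance", {}).get("pass") is True:
            return config
    return None
```

Because nothing is written by the test call, the loop can be re-run freely while adjusting base_url and upstream_model.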

The compliance test

An endpoint is admitted to live routing only after it proves it speaks the chat-completions wire format. The test (12-second timeout) checks:

| Check | Requirement |
|---|---|
| Chat | Required: a non-stream call returns choices[0].message.content. |
| Usage | Required: the reply carries numeric usage.prompt_tokens and usage.completion_tokens. |
| Stream | Required: a streamed call yields SSE deltas and ends with [DONE]. |
| Stream usage | Advisory: whether streamed replies include a usage block (sets supports_stream_usage). |
| Error shape | Advisory: a bad request returns 4xx with an { error } object. |
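
To check a deployment locally before submitting it, the checks in the table can be approximated against responses you have already captured. This is a rough Python sketch and not BharatRouter's actual test: the function names are hypothetical, content is assumed to be a string, and the precise rules the real check applies may differ.

```python
import json
from typing import Any, Tuple


def _is_number(value: Any) -> bool:
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_chat(reply: Any) -> bool:
    """Chat (required): choices[0].message.content is present as a string."""
    try:
        return isinstance(reply["choices"][0]["message"]["content"], str)
    except (KeyError, IndexError, TypeError):
        return False


def check_usage(reply: Any) -> bool:
    """Usage (required): prompt_tokens and completion_tokens are numeric."""
    usage = reply.get("usage") if isinstance(reply, dict) else None
    if not isinstance(usage, dict):
        return False
    return all(_is_number(usage.get(k)) for k in ("prompt_tokens", "completion_tokens"))


def check_stream(sse_body: str) -> Tuple[bool, bool]:
    """Return (stream_ok, has_stream_usage) for a raw SSE response body."""
    payloads = [line[len("data:"):].strip()
                for line in sse_body.splitlines() if line.startswith("data:")]
    # The stream must contain at least one chunk and end with [DONE].
    if len(payloads) < 2 or payloads[-1] != "[DONE]":
        return False, False
    saw_delta = saw_usage = False
    for raw in payloads[:-1]:
        try:
            chunk = json.loads(raw)
        except json.JSONDecodeError:
            return False, False
        if not isinstance(chunk, dict):
            return False, False
        choices = chunk.get("choices")
        if (isinstance(choices, list) and choices
                and isinstance(choices[0], dict) and "delta" in choices[0]):
            saw_delta = True
        if isinstance(chunk.get("usage"), dict):
            saw_usage = True
    return saw_delta, saw_delta and saw_usage


def check_error_shape(status: int, body: Any) -> bool:
    """Error shape (advisory): 4xx status with an object containing "error"."""
    return 400 <= status < 500 and isinstance(body, dict) and "error" in body


def is_admitted(chat_ok: bool, usage_ok: bool, stream_ok: bool) -> bool:
    """Only the three required checks gate admission; advisory ones do not."""
    return chat_ok and usage_ok and stream_ok
```

The sketch works on captured responses only, so the 12-second timeout would need to be enforced by whatever client fetches them.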

Re-run the check on a saved endpoint with POST /me/endpoints/:id/retest (for example after it comes back online). Only endpoints whose compliance is pass are used for routing; others are stored but skipped.

Security

Manage

| Endpoint | What it does |
|---|---|
| GET /me/endpoints | List your endpoints (keys masked, with compliance status). |
| POST /me/endpoints | Register + compliance-test inline (owner/admin). |
| POST /me/endpoints/test | Compliance-test an unsaved config (no write). |
| POST /me/endpoints/:id/retest | Re-run compliance on a saved endpoint (owner/admin). |
| DELETE /me/endpoints/:id | Remove an endpoint (owner/admin). |

From agents (MCP)

Four tools are available over MCP. list_endpoints and test_endpoint are read-only (no write); register_endpoint and remove_endpoint are write operations and require user_confirmed: true.
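
An agent-side guard for the confirmation rule could look like the sketch below. The JSON-RPC envelope with method tools/call and params.name / params.arguments follows the MCP specification; everything else is an assumption: build_tool_call is a hypothetical helper, user_confirmed is assumed to travel inside the tool arguments, and the per-tool argument names are not documented on this page.

```python
from typing import Optional

READ_TOOLS = {"list_endpoints", "test_endpoint"}
WRITE_TOOLS = {"register_endpoint", "remove_endpoint"}


def build_tool_call(name: str, arguments: Optional[dict] = None,
                    user_confirmed: bool = False, request_id: int = 1) -> dict:
    """Build an MCP tools/call message, refusing unconfirmed write tools."""
    if name not in READ_TOOLS | WRITE_TOOLS:
        raise ValueError(f"unknown BYOE tool: {name!r}")
    args = dict(arguments or {})  # copy so the caller's dict is not mutated
    if name in WRITE_TOOLS:
        if not user_confirmed:
            raise PermissionError(f"{name} is a write tool and needs user confirmation")
        # Assumption: the flag is passed as a tool argument.
        args["user_confirmed"] = True
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": args},
    }
```

Raising before the message is built keeps an agent from sending a write call that the user has not explicitly approved.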