BYOE lets you plug your own OpenAI-compatible deployment into BharatRouter's
routing — a self-hosted vLLM, an internal gateway, a private cloud endpoint — and use it as
a fallback step in a chain. Because the inference runs on
your infrastructure, BYOE steps are priced at ₹0. An endpoint is referenced
in a chain as byoe:<slug>, and you can register more than one per model.
BYOK and BYOE are both "bring your own", and both run on your infra or account at ₹0, but they bring different things. Use BYOK when the model already runs on a provider we support and you just want to supply your own key; use BYOE when the model runs on a deployment you host.
| | BYOK (bring your own key) | BYOE (bring your own endpoint) |
|---|---|---|
| You supply | A provider API key | A URL to a deployment you run |
| Where it runs | The provider's infra | Your infra |
| Provider must be | One BharatRouter already supports | Any OpenAI-compatible endpoint |
| Admission gate | Live key verification on save | Automated OpenAI-format compliance test |
| Referenced as | provider (e.g. krutrim) or provider/model-id | byoe:<slug> |
| How many | One saved key per provider | Multiple endpoints per model |
| Price | ₹0 (provider bills you) | ₹0 (your own infra) |
See BYOK for using your own provider keys.
Registration runs an automated compliance test inline and saves the endpoint only if it passes. Owners and admins only.
POST /me/endpoints
{
"name": "my-vllm-llama",
"model": "llama-3.1-8b-instruct", // a catalog chat model
"base_url": "https://llm.mycorp.com/v1",
"upstream_model": "meta-llama/Llama-3.1-8B-Instruct",
"key": "sk-internal-...", // optional — omit for a keyless endpoint
"residency": "india"
}
→ { "ok": true, "endpoint": { "id": 7, "slug": "my-vllm-llama",
"provider_ref": "byoe:my-vllm-llama", "model": "llama-3.1-8b-instruct" },
"compliance": { "pass": true, ... } }
Use the returned provider_ref (byoe:<slug>) as the provider of a step when you save a fallback chain or build a collection.
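
The registration flow above can be sketched in Python as follows. This is a minimal illustration, not an official client: `API_BASE`, the bearer-token header, and the helper names are assumptions that do not appear in this page, and the failure branch only guesses at how a non-passing response might look. Only the request fields and the `provider_ref` field come from the example above.

```python
import json
import urllib.request

# Assumption: placeholder host and bearer-token auth; neither is documented here.
API_BASE = "https://api.example.invalid"


def build_endpoint_payload(name, model, base_url, upstream_model, residency, key=None):
    """Assemble the POST /me/endpoints body; key is omitted for a keyless endpoint."""
    payload = {
        "name": name,
        "model": model,
        "base_url": base_url,
        "upstream_model": upstream_model,
        "residency": residency,
    }
    if key is not None:
        payload["key"] = key
    return payload


def provider_ref_from_response(body):
    """Return the byoe:<slug> reference, or raise if the endpoint was not admitted."""
    # Assumption: a rejected registration lacks ok/compliance.pass set to true.
    if not body.get("ok") or not body.get("compliance", {}).get("pass"):
        raise ValueError("endpoint was not saved: compliance did not pass")
    return body["endpoint"]["provider_ref"]


def register_endpoint(payload, token):
    """Send the registration request (needs network access and real credentials)."""
    req = urllib.request.Request(
        f"{API_BASE}/me/endpoints",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # hypothetical auth scheme
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

The value returned by `provider_ref_from_response` is what you would place in the provider field of a chain step.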
POST /me/endpoints/test runs the compliance check against an unsaved config
and returns the result without writing anything — useful for dialing in base_url
and upstream_model. Any member may test.
An endpoint is admitted to live routing only after it proves it speaks the chat-completions wire format. The test (12-second timeout) checks:
| Check | Requirement |
|---|---|
| Chat | Required — a non-stream call returns choices[0].message.content. |
| Usage | Required — the reply carries numeric usage.prompt_tokens and usage.completion_tokens. |
| Stream | Required — a streamed call yields SSE deltas and ends with [DONE]. |
| Stream usage | Advisory — whether streamed replies include a usage block (sets supports_stream_usage). |
| Error shape | Advisory — a bad request returns 4xx with an { error } object. |
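
To preview the three required checks against responses you have already captured from your own deployment, a rough local approximation might look like the sketch below. The function names are hypothetical and the exact rules BharatRouter applies are not documented here, so treat a local pass as a hint rather than a guarantee of admission.

```python
import json


def check_chat(reply):
    """Chat: a non-stream reply must expose choices[0].message.content as text."""
    try:
        return isinstance(reply["choices"][0]["message"]["content"], str)
    except (KeyError, IndexError, TypeError):
        return False


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_usage(reply):
    """Usage: prompt_tokens and completion_tokens must both be numeric."""
    usage = reply.get("usage") if isinstance(reply, dict) else None
    if not isinstance(usage, dict):
        return False
    return _is_number(usage.get("prompt_tokens")) and _is_number(
        usage.get("completion_tokens")
    )


def check_stream(sse_text):
    """Stream: at least one JSON delta chunk, with [DONE] as the last data event."""
    events = [
        line[len("data:"):].strip()
        for line in sse_text.splitlines()
        if line.startswith("data:")
    ]
    if not events or events[-1] != "[DONE]":
        return False
    deltas = 0
    for event in events[:-1]:
        try:
            chunk = json.loads(event)
        except json.JSONDecodeError:
            return False
        choices = chunk.get("choices") if isinstance(chunk, dict) else None
        if choices and isinstance(choices[0], dict) and "delta" in choices[0]:
            deltas += 1
    return deltas > 0
```

Feed `check_chat` and `check_usage` the parsed JSON body of a non-stream call, and `check_stream` the raw text of a streamed call.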
Re-run the check on a saved endpoint with POST /me/endpoints/:id/retest (for
example after it comes back online). Only endpoints whose compliance status is pass are
used for routing; the rest are stored but skipped.
base_url must be a public http(s) endpoint. Loopback, link-local, private (RFC1918), multicast, and ULA addresses are rejected, as are cluster-internal hosts (.local, .internal, .svc, .cluster.local).
| Endpoint | What it does |
|---|---|
| GET /me/endpoints | List your endpoints (keys masked, with compliance status). |
| POST /me/endpoints | Register + compliance-test inline (owner/admin). |
| POST /me/endpoints/test | Compliance-test an unsaved config (no write). |
| POST /me/endpoints/:id/retest | Re-run compliance on a saved endpoint (owner/admin). |
| DELETE /me/endpoints/:id | Remove an endpoint (owner/admin). |
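
Before calling POST /me/endpoints or POST /me/endpoints/test, it may help to screen base_url locally against the restrictions described earlier (public http(s) only; no loopback, link-local, private, multicast, ULA, or cluster-internal hosts). The sketch below is a best-effort approximation using Python's standard library. The function name, the `localhost` rule, and the suffix-matching logic are assumptions, and DNS names are not resolved, so the server-side check remains authoritative.

```python
import ipaddress
from urllib.parse import urlsplit

# Suffixes named in this page; how the service matches them is an assumption.
BLOCKED_SUFFIXES = (".local", ".internal", ".svc", ".cluster.local")


def looks_public(base_url):
    """Best-effort local check that base_url is a public http(s) URL."""
    try:
        parts = urlsplit(base_url)
    except ValueError:
        return False  # malformed URL, e.g. a broken IPv6 literal
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    host = parts.hostname.lower().rstrip(".")
    # Assumption: "localhost" is treated like a loopback address.
    if host == "localhost" or host.endswith(BLOCKED_SUFFIXES):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return True  # a DNS name; what it resolves to is not checked here
    # is_private covers RFC1918 ranges and IPv6 ULA (fc00::/7).
    return not (
        ip.is_loopback or ip.is_link_local or ip.is_private or ip.is_multicast
    )
```

For example, `looks_public("https://llm.mycorp.com/v1")` passes this screen, while `looks_public("http://10.0.0.5/v1")` does not.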
Over MCP: list_endpoints and test_endpoint
(read/no-write), plus register_endpoint and remove_endpoint (write —
require user_confirmed: true).