Bring your own endpoint

Bring your own endpoint (BYOE) lets you plug your own OpenAI-compatible deployment (a self-hosted vLLM, an internal gateway, a private cloud endpoint) into BharatRouter's routing and use it as a step in a fallback chain. Because the inference runs on your infrastructure, BYOE steps are priced at ₹0. An endpoint is referenced in a chain as byoe:<slug>, and you can register more than one per model.

BYOE vs BYOK

Both are "bring your own", and both run on your own infrastructure or account at ₹0, but they bring different things. Use BYOK when the model already runs on a provider we support and you just want to supply your own key; use BYOE when the model runs on a deployment you host.

	BYOK — bring your own key	BYOE — bring your own endpoint
You supply	A provider API key	A URL to a deployment you run
Where it runs	The provider's infra	Your infra
Provider must be	One BharatRouter already supports	Any OpenAI-compatible endpoint
Admission gate	Live key verification on save	Automated OpenAI-format compliance test
Referenced as	`provider` (e.g. `krutrim`) or `provider/model-id`	`byoe:<slug>`
How many	One saved key per provider	Multiple endpoints per model
Price	₹0 (provider bills you)	₹0 (your own infra)

See BYOK for using your own provider keys.
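The two reference forms in the table can be told apart mechanically. The following is a minimal sketch of such a parser; the `ProviderRef` type and `parse_provider_ref` function are hypothetical names introduced for illustration and are not part of the BharatRouter API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProviderRef:
    kind: str                       # "byoe" for byoe:<slug>, otherwise "provider"
    name: str                       # endpoint slug or provider name
    model_id: Optional[str] = None  # set only for the provider/model-id form


def parse_provider_ref(ref: str) -> ProviderRef:
    """Split a chain step's provider string into its parts (illustrative only)."""
    if ref.startswith("byoe:"):
        slug = ref[len("byoe:"):]
        if not slug:
            raise ValueError("byoe: reference is missing a slug")
        return ProviderRef("byoe", slug)
    # Split on the first "/" only, so a model id may itself contain slashes.
    provider, sep, model_id = ref.partition("/")
    if not provider:
        raise ValueError("provider reference is empty")
    return ProviderRef("provider", provider, model_id if sep else None)
```

For example, `parse_provider_ref("byoe:my-vllm-llama")` yields a `byoe` reference with slug `my-vllm-llama`, while `parse_provider_ref("krutrim")` yields a plain provider reference with no model id.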

Register an endpoint

Registration runs an automated compliance test inline and saves the endpoint only if it passes. Only owners and admins can register endpoints.

POST /me/endpoints
{
  "name": "my-vllm-llama",
  "model": "llama-3.1-8b-instruct",      // a catalog chat model
  "base_url": "https://llm.mycorp.com/v1",
  "upstream_model": "meta-llama/Llama-3.1-8B-Instruct",
  "key": "sk-internal-...",              // optional — omit for a keyless endpoint
  "residency": "india"
}
→ { "ok": true, "endpoint": { "id": 7, "slug": "my-vllm-llama",
      "provider_ref": "byoe:my-vllm-llama", "model": "llama-3.1-8b-instruct" },
    "compliance": { "pass": true, ... } }

Use the returned provider_ref (byoe:<slug>) as the provider of a step when you save a fallback chain or build a collection.
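As a rough sketch of the client side, the helpers below build the request body shown above and pull provider_ref out of a successful response. The function names are hypothetical, treating residency as a required field is an assumption based on the sample request, and the shape of a failed response is not documented here, so the second helper simply returns None for anything that is not an explicit pass.

```python
from typing import Optional


def build_endpoint_payload(
    name: str,
    model: str,
    base_url: str,
    upstream_model: str,
    residency: str,
    key: Optional[str] = None,
) -> dict:
    """Build the JSON body for POST /me/endpoints (field names from the sample above)."""
    payload = {
        "name": name,
        "model": model,                    # a catalog chat model
        "base_url": base_url,
        "upstream_model": upstream_model,
        "residency": residency,
    }
    if key:
        # Omit the field entirely for a keyless endpoint.
        payload["key"] = key
    return payload


def provider_ref_if_admitted(response: dict) -> Optional[str]:
    """Return provider_ref when the endpoint was saved and compliance passed."""
    compliance = response.get("compliance") or {}
    if response.get("ok") and compliance.get("pass"):
        return response["endpoint"]["provider_ref"]
    return None  # assumption: anything else means the endpoint was not admitted
```

Sending the payload is then an ordinary authenticated POST to /me/endpoints; the API host and authentication scheme are not specified in this section, so they are left out of the sketch.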

Test before you save

POST /me/endpoints/test runs the compliance check against an unsaved config and returns the result without writing anything — useful for dialing in base_url and upstream_model. Any member may test.

The compliance test

An endpoint is admitted to live routing only after it proves it speaks the chat-completions wire format. The test (12-second timeout) checks:

Check	Requirement
Chat	Required — a non-stream call returns `choices[0].message.content`.
Usage	Required — the reply carries numeric `usage.prompt_tokens` and `usage.completion_tokens`.
Stream	Required — a streamed call yields SSE deltas and ends with `[DONE]`.
Stream usage	Advisory — whether streamed replies include a usage block (sets `supports_stream_usage`).
Error shape	Advisory — a bad request returns 4xx with an `{ error }` object.

Re-run the check on a saved endpoint with POST /me/endpoints/:id/retest (for example, after it comes back online). Only endpoints whose compliance status is pass are used for routing; the others are stored but skipped.
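To see what the required checks amount to, here is a minimal offline sketch that applies them to an already-captured non-stream reply and an already-captured SSE body. It is not BharatRouter's actual test: the function names are hypothetical, and treating content as a string and ignoring non-JSON data lines are assumptions. It can still be handy for checking your own deployment's output before calling POST /me/endpoints/test.

```python
import json


def _is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_chat(body) -> bool:
    """Chat: the reply exposes choices[0].message.content."""
    try:
        content = body["choices"][0]["message"]["content"]
    except (KeyError, IndexError, TypeError):
        return False
    return isinstance(content, str)  # assumption: content must be a string


def check_usage(body) -> bool:
    """Usage: numeric usage.prompt_tokens and usage.completion_tokens."""
    usage = body.get("usage") if isinstance(body, dict) else None
    if not isinstance(usage, dict):
        return False
    return all(_is_number(usage.get(k)) for k in ("prompt_tokens", "completion_tokens"))


def check_stream(sse_text: str) -> dict:
    """Stream: SSE deltas ending with [DONE]; also reports whether a usage block appeared."""
    data = [ln[len("data:"):].strip() for ln in sse_text.splitlines() if ln.startswith("data:")]
    if not data or data[-1] != "[DONE]":
        return {"stream": False, "stream_usage": False}
    saw_delta = saw_usage = False
    for payload in data[:-1]:
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            continue  # assumption: skip lines that are not JSON chunks
        if not isinstance(chunk, dict):
            continue
        choices = chunk.get("choices")
        if isinstance(choices, list) and choices and isinstance(choices[0], dict):
            saw_delta = saw_delta or "delta" in choices[0]
        saw_usage = saw_usage or isinstance(chunk.get("usage"), dict)
    return {"stream": saw_delta, "stream_usage": saw_usage}


def required_checks_pass(chat_body, sse_text: str) -> bool:
    """Only the three required checks gate admission; the advisory ones do not."""
    return check_chat(chat_body) and check_usage(chat_body) and check_stream(sse_text)["stream"]
```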

Security

SSRF guard. base_url must be a public http(s) endpoint. Loopback, link-local, private (RFC1918), multicast, and ULA addresses are rejected, as are cluster-internal hostnames (.local, .internal, .svc, .cluster.local).
Encrypted keys. Keys are encrypted at rest with AES-256-GCM; keyless endpoints store no key. Saved keys are only ever returned masked.
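If you want to pre-check a base_url before submitting it, the rules above can be approximated with the standard library. This is a sketch under assumptions rather than the guard BharatRouter runs: `is_allowed_base_url` is a hypothetical name, rejecting the literal hostname localhost is an assumption, and the check looks only at the URL text. A hostname that resolves to a private address would pass here, whereas a real guard would presumably also inspect the resolved address.

```python
import ipaddress
from urllib.parse import urlsplit

# Hostname suffixes listed as cluster-internal in the text above.
BLOCKED_SUFFIXES = (".local", ".internal", ".svc", ".cluster.local")


def is_allowed_base_url(base_url: str) -> bool:
    """Return True if base_url looks like a public http(s) endpoint."""
    try:
        parts = urlsplit(base_url)
        host = (parts.hostname or "").lower().rstrip(".")
    except ValueError:
        return False  # malformed URL, e.g. a broken IPv6 literal
    if parts.scheme not in ("http", "https"):
        return False
    if not host or host == "localhost":  # assumption: treat localhost as loopback
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: apply the hostname suffix rules only.
        return not host.endswith(BLOCKED_SUFFIXES)
    # is_private covers RFC1918 and the IPv6 ULA range fc00::/7.
    return not (ip.is_loopback or ip.is_link_local or ip.is_private or ip.is_multicast)
```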

Manage

Endpoint	What it does
`GET /me/endpoints`	List your endpoints (keys masked, with compliance status).
`POST /me/endpoints`	Register + compliance-test inline (owner/admin).
`POST /me/endpoints/test`	Compliance-test an unsaved config (no write).
`POST /me/endpoints/:id/retest`	Re-run compliance on a saved endpoint (owner/admin).
`DELETE /me/endpoints/:id`	Remove an endpoint (owner/admin).

From agents (MCP)

Over MCP, agents can call list_endpoints and test_endpoint (read-only, nothing is written), plus register_endpoint and remove_endpoint (write operations that require user_confirmed: true).
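For orientation, an MCP tool invocation is a JSON-RPC `tools/call` request. The sketch below builds one and refuses to build a write call unless the caller has confirmed. The helper name is hypothetical, and two things are assumptions not stated above: that user_confirmed travels as a tool argument, and that the other argument names mirror the REST fields.

```python
WRITE_TOOLS = {"register_endpoint", "remove_endpoint"}


def mcp_tool_call(request_id: int, tool: str, arguments: dict, user_confirmed: bool = False) -> dict:
    """Build a JSON-RPC 2.0 tools/call request for one of the endpoint tools."""
    args = dict(arguments)  # copy so the caller's dict is not mutated
    if tool in WRITE_TOOLS:
        if not user_confirmed:
            raise PermissionError(f"{tool} is a write tool and needs explicit user confirmation")
        # Assumption: the confirmation flag is passed alongside the other arguments.
        args["user_confirmed"] = True
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": args},
    }
```

Keeping the confirmation check in the client means an agent cannot emit a register or remove call by accident; the server-side requirement still applies either way.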