Monitoring turns a collection into a watched routing chain. When a collection is monitored, the gateway runs a tiny server-side canary against every step on a schedule — roughly every 5 minutes — and records whether each step is up and how fast it answered. That history drives the per-step uptime and latency shown on the collection page, and feeds the alert thresholds you configure.
Free during beta. Canaries run gateway-internal — a single one-token ping per step — and never debit your wallet.
Toggle monitoring for a collection you own (owner/admin). The first canary runs immediately so the page isn't empty; scheduled checks follow.
POST /me/collections/:slug/monitor
{ "monitored": true } // omit or false to turn off
→ { "ok": true, "monitored": true,
"note": "Monitoring on — first canary running now; scheduled checks follow." }

Each canary picks the same route a real request would and sends a one-token ping to the step's model/provider, recording up/down and latency.

GET /me/collections/:slug/health returns, per step, the uptime and p95 latency over a window (default 7 days; ?days= accepts 1–90) and the last canary time. Any member can read it.
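As a rough illustration, the monitor toggle and the health read could be wrapped in a small client helper like the one below. This is a sketch under stated assumptions: the base URL `https://gateway.example` and the Bearer-token `Authorization` header are placeholders that do not come from this page, so substitute whatever your gateway actually uses. The helpers only build the requests; they do not send them.

```python
import json
import urllib.request

# Assumptions (not from the docs above): BASE_URL and the Bearer-token header
# are placeholders; replace them with your gateway's real host and auth scheme.
BASE_URL = "https://gateway.example"


def monitor_request(slug: str, monitored: bool, token: str) -> urllib.request.Request:
    """Build (but do not send) POST /me/collections/:slug/monitor."""
    body = json.dumps({"monitored": monitored}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/me/collections/{slug}/monitor",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


def health_request(slug: str, token: str, days: int = 7) -> urllib.request.Request:
    """Build (but do not send) GET /me/collections/:slug/health?days=N."""
    if not 1 <= days <= 90:
        raise ValueError("days must be between 1 and 90")
    return urllib.request.Request(
        f"{BASE_URL}/me/collections/{slug}/health?days={days}",
        method="GET",
        headers={"Authorization": f"Bearer {token}"},
    )
```

To actually send one of these, pass it to `urllib.request.urlopen(req)` and decode the JSON response body.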
GET /me/collections/india-first-llama/health?days=7
→ { "monitored": true,
"health": [
{ "model": "llama-3.1-8b-instruct", "provider": "vllm",
"uptime": 0.998, "p95_ms": 820, "checks": 2014,
"last_check": "2026-06-15T09:35:00Z" },
{ "model": "llama-3.1-8b-instruct", "provider": "krutrim",
"uptime": 1.0, "p95_ms": 540, "checks": 2014, "last_check": "..." }
] }

uptime is a fraction 0–1 (successful canaries ÷ total) and p95_ms is the 95th-percentile latency over the window; both are null until the first checks land.

Run a check on demand with POST /me/collections/:slug/check, which canaries every step now, evaluates alerts, and returns fresh health.
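To make the two health metrics concrete, here is a minimal sketch of how they could be computed from raw canary results. The page does not say which percentile method the gateway uses; nearest-rank is assumed here purely for illustration, and the function names are hypothetical.

```python
from typing import Optional


def uptime(results: list[bool]) -> Optional[float]:
    """Successful canaries divided by total; None until the first check lands."""
    if not results:
        return None
    return sum(results) / len(results)


def p95_nearest_rank(latencies_ms: list[float]) -> Optional[float]:
    """95th percentile using the nearest-rank method (an assumed definition)."""
    if not latencies_ms:
        return None
    ordered = sorted(latencies_ms)
    # 1-based rank = ceil(0.95 * n), computed in integers to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]
```

For example, 499 successful canaries out of 500 give an uptime of 0.998, and with 100 latency samples the nearest-rank p95 is the 95th smallest value.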
An alert fires when a metric reaches or exceeds your threshold over a recent window. On breach, BharatRouter sends an email and/or POSTs a webhook, then stays quiet for that alert for 24 hours (dedupe). Owners/admins manage alerts; each alert must supply notify_email and/or webhook_url.
| Metric | Threshold | Fires when |
|---|---|---|
| error_rate | fraction 0–1 | failed canaries ÷ total over the window ≥ threshold. |
| latency_p95 | ms | p95 latency over the window ≥ threshold. |
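The firing rule and the 24-hour quiet period can be sketched as follows. This is an illustration of the rules stated above, not the gateway's implementation: the function names are hypothetical, the nearest-rank p95 is an assumption, and treating an empty window as "no breach" is also an assumption.

```python
from datetime import datetime, timedelta
from typing import Optional

DEDUPE = timedelta(hours=24)  # quiet period after an alert fires


def breached(metric: str, threshold: float,
             ok: list[bool], latencies_ms: list[float]) -> bool:
    """Return True when the metric over the window is >= threshold."""
    if metric == "error_rate":
        if not ok:
            return False  # assumption: no canaries in the window means no breach
        value = ok.count(False) / len(ok)
    elif metric == "latency_p95":
        if not latencies_ms:
            return False
        ordered = sorted(latencies_ms)
        # Nearest-rank p95 (assumed method): 1-based rank = ceil(0.95 * n).
        value = ordered[(95 * len(ordered) + 99) // 100 - 1]
    else:
        raise ValueError(f"unknown metric: {metric}")
    return value >= threshold


def should_notify(is_breach: bool, last_fired: Optional[datetime],
                  now: datetime) -> bool:
    """Notify on breach unless this alert already fired within the last 24 hours."""
    if not is_breach:
        return False
    return last_fired is None or now - last_fired >= DEDUPE
```

With 22 failed canaries out of 100 and a threshold of 0.1, `breached("error_rate", 0.1, ...)` is true; a second breach two hours after the first notification is suppressed by `should_notify`.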
POST /me/collections/:slug/alerts
{
"metric": "error_rate", // or "latency_p95"
"threshold": 0.1, // error_rate: 0–1 | latency_p95: ms
"window_min": 60, // window in minutes, 5–1440 (default 60)
"notify_email": "oncall@acme.in",
"webhook_url": "https://hooks.acme.in/br" // optional, public http(s) only
}
→ { "ok": true, "alert": { "id": 3, "metric": "error_rate", "threshold": 0.1,
"window_min": 60, "notify_email": "oncall@acme.in", "webhook_url": null } }

The webhook_url must be a public http(s) endpoint: the same SSRF guard as BYOE rejects loopback, private and cluster-internal hosts. On breach the gateway POSTs a compact JSON event you can route to Slack, Discord or a custom ingest:
POST <your webhook>
Content-Type: application/json

{ "event": "alert", "collection": "india-first-llama", "name": "India-first Llama",
"metric": "error_rate", "threshold": 0.1,
"breach": "error rate 22% ≥ 10% (last 60m)",
"at": "2026-06-15T09:40:00Z",
"text": "⚠ BharatRouter: \"India-first Llama\" — error rate 22% ≥ 10% (last 60m)" }

| Endpoint | What it does |
|---|---|
| GET /me/collections/:slug/alerts | List the collection's alerts. |
| POST /me/collections/:slug/alerts | Create an alert (owner/admin). |
| DELETE /me/collections/:slug/alerts/:id | Remove an alert (owner/admin). |
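On the receiving side, a webhook handler might parse the alert event and reshape it for a chat tool. The sketch below assumes the event always carries the fields shown in the sample payload, which this page does not guarantee; the function names are hypothetical. Slack incoming webhooks read a `text` field and Discord webhooks read `content`, so the event's ready-made `text` can be forwarded as is.

```python
import json

# Fields taken from the sample event above; treating all of them as required
# is an assumption made for this sketch.
REQUIRED = ("event", "collection", "metric", "threshold", "breach", "at", "text")


def parse_alert_event(raw: bytes) -> dict:
    """Decode a webhook body and check it looks like the sample alert event."""
    event = json.loads(raw)
    if not isinstance(event, dict) or event.get("event") != "alert":
        raise ValueError("not an alert event")
    missing = [key for key in REQUIRED if key not in event]
    if missing:
        raise ValueError(f"alert event is missing fields: {missing}")
    return event


def to_chat_payload(event: dict, target: str) -> dict:
    """Build the JSON body for a Slack or Discord webhook from an alert event."""
    if target == "slack":
        return {"text": event["text"]}
    if target == "discord":
        return {"content": event["text"]}
    raise ValueError(f"unsupported target: {target}")
```

A custom ingest would typically skip `to_chat_payload` and store the parsed event directly, keyed on `collection`, `metric` and `at`.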
The same monitoring and alert operations are available over MCP for agents:
| Tool | What it does |
|---|---|
| get_collection_health | Per-step 7-day uptime, p95 latency and last-canary time (read-only). |
| set_monitoring | Turn monitoring on/off (write — user_confirmed: true). |
| run_monitor_check | Canary every step now and return fresh health (write — user_confirmed: true). |
| set_monitor_alert | Add an alert on error_rate or latency_p95 (write — user_confirmed: true). |
| remove_monitor_alert | Remove an alert by id (write — user_confirmed: true). |
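An agent calling these tools would send a standard MCP `tools/call` request over JSON-RPC 2.0. The sketch below shows one way to build that message while refusing write tools unless the user has confirmed. The argument names `slug` and `monitored`, and the placement of `user_confirmed` inside `arguments`, are assumptions; check the tool schemas your MCP server advertises.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids

# Tools the table above marks as writes.
WRITE_TOOLS = {
    "set_monitoring",
    "run_monitor_check",
    "set_monitor_alert",
    "remove_monitor_alert",
}


def mcp_tool_call(name: str, arguments: dict, user_confirmed: bool = False) -> str:
    """Serialize an MCP tools/call request; write tools need user confirmation."""
    args = dict(arguments)
    if name in WRITE_TOOLS:
        if not user_confirmed:
            raise PermissionError(f"{name} is a write tool; confirm with the user first")
        # Assumption: the flag travels inside the tool arguments.
        args["user_confirmed"] = True
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": args},
    })
```

For instance, `mcp_tool_call("get_collection_health", {"slug": "india-first-llama"})` needs no confirmation, while `set_monitoring` raises `PermissionError` unless `user_confirmed=True` is passed.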