Control Plane Operations
This guide covers day-2 admin workflows backed by control-plane endpoints in model_gateway/src/server.rs: workers, tokenizers, WASM modules, parser utilities, and cache/load operations.
Before you begin
- Completed Control Plane Auth
- Set an admin bearer token (JWT or API key), for example:
export ADMIN_TOKEN=super-secret-key
Auth Header
Use the same header for all control-plane calls:
Authorization: Bearer ${ADMIN_TOKEN}
The control-plane middleware requires the admin role for these operations.
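To avoid repeating the header in every command, the calls in this guide can be wrapped in a small shell function. This is a minimal sketch, not part of the gateway: `cp_curl` and `GATEWAY_URL` are hypothetical names introduced here, and the base URL defaults to the `http://localhost:30000` address used in the examples.

```shell
# Hypothetical helper: wraps curl so every call carries the admin bearer token.
# GATEWAY_URL is an assumed variable name; override it if the gateway listens elsewhere.
GATEWAY_URL="${GATEWAY_URL:-http://localhost:30000}"

cp_curl() {
  # Fail early instead of sending an empty bearer token.
  if [ -z "${ADMIN_TOKEN:-}" ]; then
    echo "ADMIN_TOKEN is not set" >&2
    return 1
  fi
  local path="$1"
  shift
  # Remaining arguments are passed to curl unchanged (for example -X DELETE or -d '...').
  curl -sS -H "Authorization: Bearer ${ADMIN_TOKEN}" "$@" "${GATEWAY_URL}${path}"
}
```

With this in place, `cp_curl /v1/tokenizers` issues an authenticated GET, and extra curl options follow the path, as in `cp_curl /v1/tokenizers/<tokenizer_id> -X DELETE`.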
1. Worker Management
Create worker:
curl http://localhost:30000/workers \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
  -d '{
    "url": "grpc://localhost:50051",
    "model_id": "meta-llama/Llama-3.1-8B-Instruct",
    "worker_type": "regular",
    "runtime": "sglang"
  }'
List workers:
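The listing command is not reproduced in this guide. A plausible form, assuming the collection is served by a GET on the same `/workers` path used for creation (an inference from the other worker routes, to be verified against `model_gateway/src/server.rs`), is sketched below. `list_workers` and `GATEWAY_URL` are hypothetical names.

```shell
# Assumption: GET /workers returns the registered workers.
# The path is inferred from the create and per-ID routes, not confirmed by this guide.
list_workers() {
  curl -sS "${GATEWAY_URL:-http://localhost:30000}/workers" \
    -H "Authorization: Bearer ${ADMIN_TOKEN}"
}
```

Calling `list_workers` after `ADMIN_TOKEN` is exported sends the request.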
Get/update/delete by worker ID:
curl http://localhost:30000/workers/<worker_id> \
  -H "Authorization: Bearer ${ADMIN_TOKEN}"

curl -X PUT http://localhost:30000/workers/<worker_id> \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
  -d '{"priority": 100}'

curl -X DELETE http://localhost:30000/workers/<worker_id> \
  -H "Authorization: Bearer ${ADMIN_TOKEN}"
2. Tokenizer Registry
Add tokenizer:
curl -X POST http://localhost:30000/v1/tokenizers \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
  -d '{
    "name": "llama3-main",
    "source": "meta-llama/Llama-3.1-8B-Instruct"
  }'
List/get/status/delete:
curl http://localhost:30000/v1/tokenizers \
  -H "Authorization: Bearer ${ADMIN_TOKEN}"

curl http://localhost:30000/v1/tokenizers/<tokenizer_id> \
  -H "Authorization: Bearer ${ADMIN_TOKEN}"

curl http://localhost:30000/v1/tokenizers/<tokenizer_id>/status \
  -H "Authorization: Bearer ${ADMIN_TOKEN}"

curl -X DELETE http://localhost:30000/v1/tokenizers/<tokenizer_id> \
  -H "Authorization: Bearer ${ADMIN_TOKEN}"
3. WASM Module Management
Enable WASM support at startup:
Register module:
curl -X POST http://localhost:30000/wasm \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
  -d '{
    "modules": [
      {
        "name": "audit-middleware",
        "file_path": "/opt/wasm/audit.wasm",
        "module_type": "Middleware",
        "attach_points": [{"Middleware": "OnRequest"}]
      }
    ]
  }'
List/remove modules:
curl http://localhost:30000/wasm \
  -H "Authorization: Bearer ${ADMIN_TOKEN}"

curl -X DELETE http://localhost:30000/wasm/<module_uuid> \
  -H "Authorization: Bearer ${ADMIN_TOKEN}"
4. Parser Utilities
Function call parsing:
curl -X POST http://localhost:30000/parse/function_call \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
  -d '{
    "text": "{\"name\":\"get_weather\",\"arguments\":{\"city\":\"SF\"}}",
    "tool_call_parser": "json",
    "tools": []
  }'
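The `text` field above is itself JSON embedded in a JSON string, so its quotes must be escaped. The sketch below builds the same payload from raw model output instead of escaping it by hand. `json_escape` and `parse_function_call` are hypothetical helpers introduced for illustration, and the escaping covers only backslashes and double quotes (not newlines or other control characters).

```shell
# Hypothetical helper: escape backslashes and double quotes so a single-line
# string can be embedded inside a JSON string literal.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Raw model output to be parsed for tool calls.
raw='{"name":"get_weather","arguments":{"city":"SF"}}'

# Request body with the same fields as the example above.
payload=$(printf '{"text": "%s", "tool_call_parser": "json", "tools": []}' \
  "$(json_escape "$raw")")

# Sends a prepared payload to the parser endpoint shown in this section.
parse_function_call() {
  curl -sS -X POST http://localhost:30000/parse/function_call \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer ${ADMIN_TOKEN}" \
    -d "$1"
}
```

Running `parse_function_call "$payload"` then submits the request.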
Reasoning separation:
curl -X POST http://localhost:30000/parse/reasoning \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
  -d '{
    "text": "<think>internal</think>answer",
    "reasoning_parser": "deepseek_r1"
  }'
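To make the intended split concrete, the following sketch reproduces it locally for the sample text using shell parameter expansion. It only illustrates the idea of separating a `<think>` span from the final answer. It is not the gateway's parser, and the shape of the endpoint's JSON response is not documented in this guide.

```shell
text='<think>internal</think>answer'

# Drop everything up to and including the opening tag.
rest="${text#*<think>}"
# Keep what precedes the closing tag: the reasoning span.
reasoning="${rest%%</think>*}"
# Keep what follows the closing tag: the user-visible answer.
answer="${text#*</think>}"

printf 'reasoning=%s\nanswer=%s\n' "$reasoning" "$answer"
```

For this input the two parts are `internal` and `answer`.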
5. Cache and Load Operations
Flush worker KV caches:
Inspect worker loads: