Anthropic Messages API
SMG supports the Anthropic Messages API (/v1/messages), enabling applications to use Claude models through the gateway. Both HTTP proxy mode (forwarding to Anthropic's API) and gRPC mode (routing to local inference backends) are supported.
Endpoint
POST /v1/messages
Creates a message. For a streaming response, set "stream": true in the request body.
Request Example
curl http://localhost:30000/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "What is the meaning of life?"}
    ]
  }'
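The same request can be issued from application code. The following is a minimal sketch using only the Python standard library; the helper names `build_messages_request` and `send` are hypothetical and not part of SMG, and the gateway address is assumed to be the `localhost:30000` used in the curl example.

```python
import json
import urllib.request


def build_messages_request(base_url, api_key, model, messages,
                           max_tokens=1024, stream=False):
    """Build (but do not send) a POST request for the /v1/messages endpoint."""
    body = {"model": model, "max_tokens": max_tokens, "messages": messages}
    if stream:
        body["stream"] = True
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send a non-streaming request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (requires a running gateway, so it is left commented out):
# import os
# req = build_messages_request(
#     "http://localhost:30000",
#     os.environ["ANTHROPIC_API_KEY"],
#     "claude-sonnet-4-20250514",
#     [{"role": "user", "content": "What is the meaning of life?"}],
# )
# print(send(req))
```

Keeping request construction separate from sending makes the headers and body easy to inspect or log before anything reaches the gateway.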
Streaming
To receive responses as Server-Sent Events, set "stream": true:
curl http://localhost:30000/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "What is the meaning of life?"}
    ],
    "stream": true
  }'
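A client consuming the stream has to split it into events and pick out the text deltas. The sketch below shows one way to do that in Python. It assumes the gateway emits the event names and payload shapes of Anthropic's documented streaming format (`message_start`, `content_block_delta` with a `text_delta`, `message_stop`); the sample stream is an abbreviated illustration, not captured SMG output, and the function names are hypothetical.

```python
import json


def iter_sse_events(lines):
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event_name, data_parts = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield event_name or "message", json.loads("\n".join(data_parts))
            event_name, data_parts = None, []
        elif line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data_parts.append(value[1:] if value.startswith(" ") else value)
    if data_parts:
        # Flush a final event that was not followed by a blank line.
        yield event_name or "message", json.loads("\n".join(data_parts))


def collect_text(events):
    """Concatenate the text carried by content_block_delta events."""
    parts = []
    for name, data in events:
        delta = data.get("delta", {})
        if name == "content_block_delta" and delta.get("type") == "text_delta":
            parts.append(delta["text"])
    return "".join(parts)


# Abbreviated, hand-written sample in the assumed event format.
SAMPLE_STREAM = [
    "event: message_start",
    'data: {"type": "message_start", "message": {"role": "assistant"}}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello"}}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": ", world"}}',
    "",
    "event: message_stop",
    'data: {"type": "message_stop"}',
    "",
]

print(collect_text(iter_sse_events(SAMPLE_STREAM)))  # prints "Hello, world"
```

In a real client, `lines` would be the decoded lines of the HTTP response body rather than a list.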
gRPC Backend
The Messages API works with gRPC backends such as SGLang and vLLM. When routing to a gRPC backend, SMG translates the Anthropic message format to the backend's native format and translates the response back.
Note
When using the Messages API with gRPC backends, SMG handles format translation automatically. The backend receives requests in its native format.
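To make the idea of format translation concrete, here is a purely illustrative Python sketch. It is not SMG's implementation. The target format, a flat list of role/content chat messages, is a hypothetical stand-in for a backend's native format, and the function names are invented for this example. It relies only on two properties of the Anthropic request format: `system` is a top-level field, and `content` may be either a string or a list of typed blocks.

```python
def _text_of(content):
    """Return plain text from a string or from a list of typed content blocks."""
    if isinstance(content, str):
        return content
    # Keep only text blocks; other block types are ignored in this sketch.
    return "".join(block["text"] for block in content if block.get("type") == "text")


def to_chat_messages(anthropic_request):
    """Flatten an Anthropic-style request into role/content chat messages."""
    chat = []
    system = anthropic_request.get("system")
    if system:
        # The top-level system prompt becomes a leading system message.
        chat.append({"role": "system", "content": _text_of(system)})
    for message in anthropic_request["messages"]:
        chat.append({"role": message["role"], "content": _text_of(message["content"])})
    return chat
```

A real translation layer also has to handle tool calls, images, and sampling parameters, and has to map the backend's output back into a Messages API response; none of that is shown here.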
Connection Modes
| Mode | Backend | Description |
|---|---|---|
| HTTP (proxy) | Anthropic API | Forward requests to api.anthropic.com |
| gRPC | SGLang/vLLM | Translate and route to local inference |
Features
- Streaming and non-streaming responses
- Tool use (via MCP integration)
- Extended thinking
- Multi-turn conversations
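Two of the features above are driven entirely by the request body. The sketch below builds such bodies in Python under stated assumptions: the Messages API is stateless, so a multi-turn conversation resends the whole history on each request, and extended thinking is requested with the `thinking` field from Anthropic's API documentation. The helper names are hypothetical, the budget checks reflect Anthropic's documented constraints as understood here, and whether a given gRPC backend honors `thinking` is not confirmed by this page.

```python
def append_turn(messages, role, content):
    """Return a new history with one more turn appended."""
    if role not in ("user", "assistant"):
        raise ValueError("role must be 'user' or 'assistant'")
    return messages + [{"role": role, "content": content}]


def with_thinking(body, budget_tokens):
    """Return a copy of the request body with extended thinking enabled."""
    # Assumed constraints: the budget is at least 1024 tokens and
    # strictly smaller than max_tokens.
    if budget_tokens < 1024:
        raise ValueError("budget_tokens must be at least 1024")
    if budget_tokens >= body["max_tokens"]:
        raise ValueError("budget_tokens must be less than max_tokens")
    return {**body, "thinking": {"type": "enabled", "budget_tokens": budget_tokens}}


# Build a three-turn conversation, then request extended thinking for it.
history = append_turn([], "user", "What is the meaning of life?")
history = append_turn(history, "assistant", "That depends on who you ask.")
history = append_turn(history, "user", "Give me one philosopher's answer.")

body = with_thinking(
    {"model": "claude-sonnet-4-20250514", "max_tokens": 4096, "messages": history},
    budget_tokens=2048,
)
```

Both helpers return new objects instead of mutating their inputs, so the same history can be reused for requests with and without thinking.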