Deployment
This guide shows how to deploy Arch directly using Docker without the archgw CLI, including basic runtime checks for routing and health monitoring.
Docker Deployment
Below is a minimal, production-ready example showing how to deploy the Arch Docker image directly and run basic runtime checks. Adjust image names, tags, and the arch_config.yaml path to match your environment.
Note
You must pass every environment variable referenced in your arch_config.yaml file to the container.
For arch_config.yaml, you can use any sample configuration defined earlier in the documentation. For example, you can try the LLM Routing sample config.
Docker Compose Setup
Create a docker-compose.yml file with the following configuration:
# docker-compose.yml
services:
  archgw:
    image: katanemo/archgw:0.3.16
    container_name: archgw
    ports:
      - "10000:10000" # ingress (client -> arch)
      - "12000:12000" # egress (arch -> upstream/llm proxy)
    volumes:
      - ./arch_config.yaml:/app/arch_config.yaml:ro
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY:?error}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:?error}
      - MODEL_SERVER_PORT=51000
Starting the Stack
Start the services from the directory containing docker-compose.yml and arch_config.yaml:
# Set required environment variables and start services
OPENAI_API_KEY=xxx ANTHROPIC_API_KEY=yyy docker compose up -d
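Because the compose file uses the ${VAR:?error} form, Compose refuses to start when a key is unset or empty. A pre-flight check in the calling shell can report every missing variable at once instead of one per attempt. The sketch below is one possible way to do that; require_env is a hypothetical helper name and is not part of archgw or Docker.

```shell
# Hypothetical helper (not part of archgw): report every variable in the
# argument list that is unset or empty, and fail if any is missing.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup via eval keeps this portable across POSIX shells.
    # Pass plain variable names only.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

With the keys exported in the current shell, it could be chained in front of the start command, for example: require_env OPENAI_API_KEY ANTHROPIC_API_KEY && docker compose up -d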
Check container health and logs:
docker compose ps
docker compose logs -f archgw
Runtime Tests
Perform basic runtime tests to verify routing and functionality.
Gateway Smoke Test
Test the chat completion endpoint with automatic routing:
# Request handled by the gateway. 'model: "none"' lets Arch decide routing
curl --silent --header 'Content-Type: application/json' \
  --data '{"messages":[{"role":"user","content":"tell me a joke"}], "model":"none"}' \
  http://localhost:12000/v1/chat/completions | jq .model
Expected output:
"gpt-4o-2024-08-06"
Model-Based Routing
Test explicit provider and model routing:
curl -s -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Explain quantum computing"}], "model":"anthropic/claude-3-5-sonnet-20241022"}' \
  http://localhost:12000/v1/chat/completions | jq .model
Expected output:
"claude-3-5-sonnet-20241022"
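To turn the expected-output comparison into something a script can act on, the reported model can be checked against the expected one. This is a minimal sketch; expect_model is a hypothetical helper, and it assumes the model name was extracted with jq -r so that it carries no surrounding quotes.

```shell
# Hypothetical helper: compare the model the gateway reported with the
# model the test expects. Arguments: <expected> <actual>.
expect_model() {
  expected=$1
  actual=$2
  if [ "$actual" = "$expected" ]; then
    echo "ok: routed to $actual"
  else
    echo "mismatch: expected $expected, got $actual" >&2
    return 1
  fi
}
```

For the request above, the second argument would be the output of the same curl command piped through jq -r .model, and the first would be claude-3-5-sonnet-20241022.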
Troubleshooting
Common Issues and Solutions
- Environment Variables
Ensure all environment variables (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.) used by arch_config.yaml are set before starting services.
- TLS/Connection Errors
If you encounter TLS or connection errors to upstream providers:
  - Check DNS resolution
  - Verify proxy settings
  - Confirm correct protocol and port in your arch_config endpoints
- Verbose Logging
To enable more detailed logs for debugging:
  - Run archgw with a higher component log level
  - See the Observability guide for logging and monitoring details
  - Rebuild the image if required with updated log configuration
- CI/Automated Checks
For continuous integration or automated testing, you can use the curl commands above as health checks in your deployment pipeline.
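In a pipeline the container may still be starting when the first check runs, so a health check usually needs a retry loop around the curl command. The following is a rough sketch under that assumption; retry and smoke_check are hypothetical names, and the attempt count and delay are placeholders to tune for your environment.

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Arguments: <attempts> <delay-seconds> <command...>
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -ge "$attempts" ]; then
      return 1
    fi
    i=$((i + 1))
    sleep "$delay"
  done
}

# Mirrors the gateway smoke test: succeeds only when the HTTP request
# succeeds and the response carries a non-null .model field.
smoke_check() {
  curl -sf -H 'Content-Type: application/json' \
    -d '{"messages":[{"role":"user","content":"ping"}], "model":"none"}' \
    http://localhost:12000/v1/chat/completions | jq -e '.model' >/dev/null
}
```

A pipeline step could then call retry 10 3 smoke_check and, on failure, print docker compose logs archgw before exiting with a non-zero status.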