Deployment

This guide shows how to deploy Arch directly using Docker without the archgw CLI, including basic runtime checks for routing and health monitoring.

Docker Deployment

Below is a minimal example, suitable as a starting point for production, showing how to deploy the Arch Docker image directly and run basic runtime checks. Adjust image names, tags, and the arch_config.yaml path to match your environment.

Note

You will need to pass all required environment variables that are referenced in your arch_config.yaml file.

For arch_config.yaml, you can use any sample configuration defined earlier in the documentation. For example, you can try the LLM Routing sample config.
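To give a sense of the shape of such a file, here is an illustrative sketch. The keys below are assumptions for this sketch, not a schema reference; copy the LLM Routing sample from the documentation for a working configuration.

```yaml
# arch_config.yaml -- illustrative shape only. Treat the LLM Routing
# sample config in the docs as authoritative; the keys below are
# assumptions made for this sketch.
version: v0.1.0

llm_providers:
  # Environment variables referenced here must be set when the
  # container starts (see the Note above).
  - model: openai/gpt-4o
    access_key: $OPENAI_API_KEY
    default: true

  - model: anthropic/claude-3-5-sonnet-20241022
    access_key: $ANTHROPIC_API_KEY
```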

Docker Compose Setup

Create a docker-compose.yml file with the following configuration:

# docker-compose.yml
services:
  archgw:
    image: katanemo/archgw:0.3.16
    container_name: archgw
    ports:
      - "10000:10000" # ingress (client -> arch)
      - "12000:12000" # egress (arch -> upstream/llm proxy)
    volumes:
      - ./arch_config.yaml:/app/arch_config.yaml:ro
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY:?OPENAI_API_KEY is required}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:?ANTHROPIC_API_KEY is required}
      - MODEL_SERVER_PORT=51000

Starting the Stack

Start the services from the directory containing docker-compose.yml and arch_config.yaml:

# Set required environment variables and start services
OPENAI_API_KEY=xxx ANTHROPIC_API_KEY=yyy docker compose up -d
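When a `:?` variable is unset, `docker compose up` fails with a single terse error. A small pre-flight script can report every missing variable at once. A sketch, assuming the variable list matches the compose file above (edit it to match your arch_config.yaml):

```shell
#!/bin/sh
# preflight.sh: verify that every environment variable referenced by
# arch_config.yaml is set before starting the stack.
# REQUIRED_VARS is an assumption; edit it to match your config.
REQUIRED_VARS="OPENAI_API_KEY ANTHROPIC_API_KEY"

missing=""
for var in $REQUIRED_VARS; do
  # Indirect expansion: read the value of the variable named in $var.
  eval "val=\${$var:-}"
  [ -n "$val" ] || missing="$missing $var"
done

if [ -n "$missing" ]; then
  echo "Missing required variables:$missing" >&2
else
  echo "All required variables are set"
fi
```

Run it before `docker compose up -d`; a non-empty "Missing" line tells you exactly which keys still need exporting.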

Check container health and logs:

docker compose ps
docker compose logs -f archgw

Runtime Tests

Run the following requests to verify that the gateway is up and routing traffic to the configured providers.

Gateway Smoke Test

Test the chat completion endpoint with automatic routing:

# Request handled by the gateway. 'model: "none"' lets Arch decide routing
curl --silent --header 'Content-Type: application/json' \
  --data '{"messages":[{"role":"user","content":"tell me a joke"}], "model":"none"}' \
  http://localhost:12000/v1/chat/completions | jq .model

Expected output (the exact model name depends on the routing configured in your arch_config.yaml):

"gpt-4o-2024-08-06"

Model-Based Routing

Test explicit provider and model routing:

curl -s -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Explain quantum computing"}], "model":"anthropic/claude-3-5-sonnet-20241022"}' \
  http://localhost:12000/v1/chat/completions | jq .model

Expected output:

"claude-3-5-sonnet-20241022"

Troubleshooting

Common Issues and Solutions

Environment Variables

Ensure all environment variables (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.) used by arch_config.yaml are set before starting services.

TLS/Connection Errors

If you encounter TLS or connection errors to upstream providers:

  • Check DNS resolution

  • Verify proxy settings

  • Confirm correct protocol and port in your arch_config endpoints
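The DNS step in the checklist above can be scripted. A sketch that checks resolution for the upstream provider hosts from the host machine (the hostnames are assumptions; substitute the endpoints from your arch_config.yaml):

```shell
#!/bin/sh
# Check that upstream provider hostnames resolve. A DNS failure here
# usually explains TLS/connection errors in the gateway logs.
# UPSTREAM_HOSTS is an assumption; use the endpoints from arch_config.yaml.
UPSTREAM_HOSTS="api.openai.com api.anthropic.com"

failures=0
for host in $UPSTREAM_HOSTS; do
  # getent exercises the system resolver only; TLS/proxy issues need
  # separate checks (e.g. curl -v against the endpoint).
  if getent hosts "$host" > /dev/null 2>&1; then
    echo "$host: DNS ok"
  else
    echo "$host: DNS lookup FAILED" >&2
    failures=$((failures + 1))
  fi
done
echo "checked hosts, $failures failure(s)"
```

If the host machine resolves fine but the gateway still fails, re-run the lookup from inside the container (for example via `docker compose exec archgw sh`, if the image ships a shell) to rule out container-level DNS or proxy configuration.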

Verbose Logging

To enable more detailed logs for debugging:

  • Run archgw with a higher component log level

  • See the Observability guide for logging and monitoring details

  • Rebuild the image with the updated log configuration if required

CI/Automated Checks

For continuous integration or automated testing, you can use the curl commands above as health checks in your deployment pipeline.
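For example, a pipeline step could gate a rollout on the smoke test above, retrying until the gateway answers or a retry budget is exhausted. A sketch, assuming the gateway is reachable at localhost:12000 (GATEWAY_URL and the retry budget are illustrative defaults):

```shell
#!/bin/sh
# ci_healthcheck.sh: wrap the smoke test in a retry loop for use as a
# deployment-pipeline gate. GATEWAY_URL is an assumption for this sketch.
GATEWAY_URL="${GATEWAY_URL:-http://localhost:12000}"

check_gateway() {
  # -f makes curl fail on HTTP errors, so only a 2xx counts as healthy.
  curl -sf --max-time 10 \
    -H 'Content-Type: application/json' \
    -d '{"messages":[{"role":"user","content":"ping"}],"model":"none"}' \
    "$GATEWAY_URL/v1/chat/completions" > /dev/null
}

wait_for_gateway() {
  retries="${1:-30}"
  attempt=1
  while [ "$attempt" -le "$retries" ]; do
    if check_gateway; then
      echo "gateway healthy after $attempt attempt(s)"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep 2
  done
  echo "gateway failed health check after $retries attempts" >&2
  return 1
}

# In a pipeline step, gate the rollout on the check, e.g.:
#   wait_for_gateway 30 || exit 1
```

Call `wait_for_gateway` after `docker compose up -d`; a non-zero exit status fails the pipeline step before traffic is shifted to the new deployment.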