Configuration Reference
The following is a complete reference for the prompt-config.yml file
that controls the behavior of a single instance of
the Arch gateway. We’ve kept things simple (fewer than 80 lines) and held off on exposing additional functionality
(e.g., supporting push observability stats, managing prompt endpoints as a virtual cluster, exposing more load-balancing
options). We believe that simple things should be simple, so we offer good defaults that let
developers spend more of their time building features unique to their AI experience.
version: v0.1

listener:
  address: 0.0.0.0 # or 127.0.0.1
  port: 10000
  # Defines how Arch should parse the content from an application/json or text/plain Content-Type in the HTTP request
  message_format: huggingface
  common_tls_context: # If you configure port 443, you'll need to update the listener with your TLS certificates
    tls_certificates:
      - certificate_chain:
          filename: /etc/certs/cert.pem
        private_key:
          filename: /etc/certs/key.pem

# Arch round-robin load balances across the endpoints, managed via the cluster subsystem.
endpoints:
  app_server:
    # value can be an IP address or a hostname with port
    # this can also be a list of endpoints for load balancing
    # for example endpoint: [ ip1:port, ip2:port ]
    endpoint: 127.0.0.1:80
    # max time to wait for a connection to be established
    connect_timeout: 0.005s

  mistral_local:
    endpoint: 127.0.0.1:8001

  error_target:
    endpoint: error_target_1

# Centralized way to manage LLMs: keys, retry logic, failover, and limits
llm_providers:
  - name: OpenAI
    provider: openai
    access_key: $OPENAI_API_KEY
    model: gpt-4o
    default: true
    stream: true
    rate_limits:
      selector: # optional headers, to add rate limiting based on HTTP headers like JWT tokens or API keys
        http_header:
          name: Authorization
          value: "" # Empty value means each separate value has a separate limit
      limit:
        tokens: 100000 # Tokens per unit
        unit: minute

  - name: Mistral8x7b
    provider: mistral
    access_key: $MISTRAL_API_KEY
    model: mistral-8x7b

  - name: MistralLocal7b
    provider: local
    model: mistral-7b-instruct
    endpoint: mistral_local

# provides a way to override default settings for the Arch system
overrides:
  # By default Arch uses an NLI + embedding approach to match an incoming prompt to a prompt target.
  # The intent matching threshold defaults to 0.80; you can override it here if you would like
  prompt_target_intent_matching_threshold: 0.60

# default system prompt used by all prompt targets
system_prompt: You are a network assistant that just offers facts; not advice on manufacturers or purchasing decisions.

prompt_guards:
  input_guards:
    jailbreak:
      on_exception:
        message: Looks like you're curious about my abilities, but I can only provide assistance within my programmed parameters.

prompt_targets:
  - name: information_extraction
    default: true
    description: handle all scenarios that are question and answer in nature. Like summarization, information extraction, etc.
    endpoint:
      name: app_server
      path: /agent/summary
    # Arch uses the default LLM and treats the response from the endpoint as the prompt to send to the LLM
    auto_llm_dispatch_on_response: true
    # override system prompt for this prompt target
    system_prompt: You are a helpful information extraction assistant. Use the information that is provided to you.

  - name: reboot_network_device
    description: Reboot a specific network device
    endpoint:
      name: app_server
      path: /agent/action
    parameters:
      - name: device_id
        type: str
        description: Identifier of the network device to reboot.
        required: true
      - name: confirmation
        type: bool
        description: Confirmation flag to proceed with reboot.
        default: false
        enum: [true, false]

error_target:
  endpoint:
    name: error_target_1
    path: /error

tracing:
  # sampling rate. Note that by default Arch works with OpenTelemetry-compatible tracing.
  sampling_rate: 0.1
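
The comments under app_server in the reference note that endpoint may also be given as a list for load balancing. A minimal sketch of that form follows, assuming the inline list syntax shown in that comment is accepted as written. The two addresses are placeholders and do not come from the reference; they are quoted only as a precaution, since some YAML parsers handle unquoted host:port values inside brackets inconsistently.

```yaml
endpoints:
  app_server:
    # Placeholder backends (hypothetical); requests would be spread
    # round-robin across them, per the comment in the reference.
    endpoint: [ "10.0.0.11:8080", "10.0.0.12:8080" ]
    # max time to wait for a connection to be established
    connect_timeout: 0.005s
```

Prompt targets keep referring to the endpoint by its name (app_server), so nothing else in the file should need to change under this assumption.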
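
The rate_limits block in the reference keys its limit on the Authorization header, and its comment mentions API keys as another option. As a sketch under that reading, the fragment below reuses only keys that appear in the reference and swaps in a hypothetical header name (x-api-key) and a hypothetical budget. Whether Arch matches header names case-insensitively is not stated in the reference.

```yaml
llm_providers:
  - name: OpenAI
    provider: openai
    access_key: $OPENAI_API_KEY
    model: gpt-4o
    default: true
    rate_limits:
      selector:
        http_header:
          name: x-api-key # hypothetical header carrying a per-client API key
          value: "" # empty value: each distinct key gets its own limit
      limit:
        tokens: 50000 # hypothetical budget, in tokens per unit
        unit: minute
```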