RAG Apps
This section describes how Arch helps you build faster, smarter, and more accurate Retrieval-Augmented Generation (RAG) applications, including fast and accurate RAG in multi-turn conversational scenarios.
What is Retrieval-Augmented Generation (RAG)?
RAG applications combine retrieval-based methods with generative AI models to provide more accurate, contextually relevant, and reliable outputs. These applications leverage external data sources to augment the capabilities of Large Language Models (LLMs), enabling them to retrieve and integrate specific information rather than relying solely on the LLM’s internal knowledge.
Parameter Extraction for RAG
To build RAG applications, you can configure prompt targets with parameters, enabling Arch to capture critical information in a structured way for processing. This approach improves the retrieval quality and speed of your application. By extracting parameters from the conversation, you can pull the appropriate chunks from a vector database or SQL-like data store to enhance accuracy. With Arch, you can streamline data retrieval and processing to build more efficient and precise RAG applications.
Step 1: Define Prompt Targets
prompt_targets:
  - name: get_device_statistics
    description: Retrieve and present the relevant data based on the specified devices and time range
    path: /agent/device_summary
    parameters:
      - name: device_ids
        type: list
        description: A list of device identifiers (IDs) to retrieve statistics for
        required: true
      - name: time_range
        type: int
        description: The number of days in the past over which to retrieve device statistics
        required: false
        default: 7
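With this configuration in place, Arch extracts the parameters from the user's prompt and forwards them to the configured path as a JSON request body. For example, a prompt such as "Show me stats for devices d-100 and d-200 over the last 14 days" would produce a request along these lines (the device IDs and values here are purely illustrative):

POST /agent/device_summary
{
  "device_ids": ["d-100", "d-200"],
  "time_range": 14
}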
Step 2: Process Request Parameters in Flask
Once the prompt targets are configured as above, handling those parameters in your application is straightforward. The Flask endpoint below validates the extracted parameters and returns device statistics:
from flask import Flask, request, jsonify

app = Flask(__name__)


@app.route("/agent/device_summary", methods=["POST"])
def get_device_summary():
    """
    Endpoint to retrieve device statistics based on device IDs and an optional time range.
    """
    data = request.get_json()

    # Validate 'device_ids' parameter
    device_ids = data.get("device_ids")
    if not device_ids or not isinstance(device_ids, list):
        return (
            jsonify({"error": "'device_ids' parameter is required and must be a list"}),
            400,
        )

    # Validate 'time_range' parameter (optional, defaults to 7)
    time_range = data.get("time_range", 7)
    if not isinstance(time_range, int):
        return jsonify({"error": "'time_range' must be an integer"}), 400

    # Simulate retrieving statistics for the given device IDs and time range
    # In a real application, you would query your database or external service here
    statistics = []
    for device_id in device_ids:
        # Placeholder for actual data retrieval
        stats = {
            "device_id": device_id,
            "time_range": f"Last {time_range} days",
            "data": f"Statistics data for device {device_id} over the last {time_range} days.",
        }
        statistics.append(stats)

    response = {"statistics": statistics}

    return jsonify(response), 200


if __name__ == "__main__":
    app.run(debug=True)
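In a real deployment, the placeholder loop above is where the extracted parameters drive your retrieval step, for example by scoping a vector-store or SQL query to the requested device_ids and time_range. The sketch below shows that idea; search_chunks is a hypothetical helper standing in for whatever data store you use:

from datetime import datetime, timedelta, timezone


def search_chunks(device_id, since):
    """Hypothetical helper: query your vector database or SQL-like store for
    chunks about `device_id` created after `since`."""
    raise NotImplementedError("Replace with your vector-store or SQL query")


def retrieve_statistics(device_ids, time_range):
    """Use the parameters extracted by Arch to scope retrieval before generation."""
    since = datetime.now(timezone.utc) - timedelta(days=time_range)
    results = []
    for device_id in device_ids:
        # Only chunks matching the requested device and time window are pulled,
        # keeping the context passed to the LLM small and relevant.
        results.append({"device_id": device_id, "chunks": search_chunks(device_id, since)})
    return results

Because Arch has already validated and structured the parameters, the retrieval code stays a simple, testable function rather than a prompt-parsing exercise.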
Multi-Turn RAG (Follow-up Questions)
Developers often struggle to efficiently handle follow-up or clarification questions. Specifically, when users ask for changes or additions to previous responses, developers have to re-write prompts using LLMs with precise prompt engineering techniques. This process is slow, manual, error-prone, and adds significant latency to the user experience.
Arch is highly capable of accurately detecting and processing prompts in multi-turn scenarios, so that you can build fast and accurate RAG apps in minutes. For additional details on how to build multi-turn RAG applications, please refer to our multi-turn docs.
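As an illustration (hypothetical prompts and values, not Arch's exact wire format), a follow-up question can be resolved against the earlier turn and forwarded to the same prompt target with updated parameters, while your endpoint code stays unchanged:

Turn 1 (user): "Show me stats for devices d-100 and d-200 for the last week."
  body sent to /agent/device_summary: {"device_ids": ["d-100", "d-200"], "time_range": 7}

Turn 2 (user): "Actually, make that the last 30 days."
  body sent to /agent/device_summary: {"device_ids": ["d-100", "d-200"], "time_range": 30}

In this illustration, the second turn never repeats the device IDs; resolving them from the earlier turn is exactly the kind of follow-up handling Arch takes off your hands, so your endpoint simply receives complete parameters again.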