|
2 | 2 | conversation: |
3 | 3 | - eval_id: basic_introduction |
4 | 4 | eval_query: Hi! |
5 | | - eval_types: [response_eval:accuracy] |
6 | | - expected_response: Hi! I'm the Red Hat OpenShift Lightspeed Intelligent Assistant, and I'm here to guide you through installing OpenShift using the Assisted Installer. |
7 | | - description: Basic greeting test using keyword matching for reliability (avoids LLM judge flapping) |
| 5 | + eval_types: [response_eval:intent] |
| 6 | + expected_intent: A basic greeting that indicates willingness to help with installing OpenShift |
8 | 7 |
|
9 | 8 | - conversation_group: basic_cluster_request_conv |
10 | 9 | conversation: |
|
56 | 55 | conversation: |
57 | 56 | - eval_id: create_eval_test_sno |
58 | 57 | eval_query: create a new single node cluster named eval-test-singlenode-uniq-cluster-name, running on version 4.19.7 with the x86_64 CPU architecture, configured under the base domain example.com, using the provided SSH key "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQCmeaBFhSJ/MLECmqUaKweRgo10ABpwdvJ7v76qLYfP0pzfzYsF3hGP/fH5OQfHi9pTbWynjaEcPHVfaTaFWHvyMtv8PEMUIDgQPWlBSYzb+3AgQ5AsChhzTJCYnRdmCdzENlV+azgtb3mVfXiyCfjxhyy3QAV4hRrMaVtJGuUQfQ== example@example.com". |
59 | | - eval_types: [tool_eval, response_eval:sub-string, response_eval:accuracy, action_eval] |
60 | | - expected_tool_calls: |
61 | | - - - tool_name: create_cluster |
62 | | - arguments: |
63 | | - name: "eval-test-singlenode-uniq-cluster-name" |
64 | | - version: "4\\.19\\.7" |
65 | | - base_domain: "example\\.com" |
66 | | - single_node: "(?i:true)" |
67 | | - cpu_architecture: "x86_64" |
68 | | - ssh_public_key: 'ssh-rsa\s+[A-Za-z0-9+/]+[=]{0,3}(\s+.+)?\s*' |
69 | | - platform: "none" |
70 | | - eval_verify_script: ../scripts/cluster_created_check.sh |
| 58 | + eval_types: [response_eval:sub-string, response_eval:accuracy, action_eval] |
| 59 | + eval_verify_script: ../scripts/verify_create_eval_test_sno.sh |
71 | 60 | expected_keywords: ["eval-test-singlenode-uniq-cluster-name", "ID", "Discovery ISO", "download", "cluster"] |
72 | 61 | expected_response: I have created a cluster with name eval-test-singlenode-uniq-cluster-name. Next, you'll need to download the Discovery ISO, then boot your hosts with it. Would you like me to get the Discovery ISO download URL? |
73 | 62 | - eval_id: get_iso_eval_test_sno |
|
84 | 73 | cleanup_script: ../scripts/delete_cluster.sh |
85 | 74 | conversation: |
86 | 75 | - eval_id: create_eval_test_multinode |
87 | | - eval_types: [tool_eval, response_eval:accuracy, response_eval:sub-string, action_eval] |
| 76 | + eval_types: [response_eval:accuracy, response_eval:sub-string, action_eval] |
88 | 77 | eval_query: Create a multi-node cluster named 'eval-test-multinode-uniq-cluster-name' with OpenShift 4.18.22 and domain test.local and with the x86_64 CPU architecture. |
89 | | - expected_tool_calls: |
90 | | - - - tool_name: create_cluster |
91 | | - arguments: |
92 | | - name: "eval-test-multinode-uniq-cluster-name" |
93 | | - version: "4\\.18\\.22" |
94 | | - base_domain: "test\\.local" |
95 | | - single_node: "(?i:false)" |
96 | | - cpu_architecture: "x86_64" |
97 | | - ssh_public_key: "" |
98 | | - platform: "baremetal" |
99 | | - eval_verify_script: ../scripts/cluster_created_check.sh |
| 78 | + eval_verify_script: ../scripts/verify_create_eval_test_multinode.sh |
100 | 79 | expected_keywords: ["eval-test-multinode-uniq-cluster-name", "ID", "Discovery ISO", "cluster"] |
101 | 80 | expected_response: I have created a cluster with name eval-test-multinode-uniq-cluster-name. Before downloading the Discovery ISO, would you like to configure static network configuration for the hosts? If not, the cluster will use Dynamic Host Configuration Protocol (DHCP) for host networking configuration by default. |
102 | 81 | - eval_id: set_ssh_key_eval_test_ssh |
|
135 | 114 | - eval_id: cluster_info_tool_call |
136 | 115 | eval_query: Give me details about cluster named 'abc123abc' |
137 | 116 | description: Test handling for non-existent cluster name |
138 | | - eval_types: [tool_eval, response_eval:accuracy] |
139 | | - expected_response: Retrieval failed for cluster 'abc123abc' because the resource was not found. |
| 117 | + eval_types: [tool_eval, response_eval:intent] |
| 118 | + expected_intent: An explanation indicating that a cluster with the name abc123abc does not exist |
140 | 119 | expected_tool_calls: |
141 | 120 | - - tool_name: list_clusters |
142 | 121 | arguments: {} |
|
161 | 140 | conversation: |
162 | 141 | - eval_id: refuse_azure_request |
163 | 142 | eval_query: Can you create a cluster on Azure? |
164 | | - eval_types: [response_eval:accuracy] |
165 | | - expected_response: Sorry, I can't create clusters directly on public clouds like Azure. I am designed to assist with on-premise OpenShift installations using the Assisted Installer. For installing OpenShift on Azure, you can explore Red Hat OpenShift Service on Azure or use the standard OpenShift Installer. |
| 143 | + eval_types: [response_eval:intent] |
| 144 | + expected_intent: An explanation that indicates clusters cannot be created on public clouds and a reminder that the scope of this installer is on-premise installations |
166 | 145 | description: Refuse out-of-scope request to create clusters on Azure; clarify on-prem scope and Assisted Installer |
167 | 146 |
|
168 | 147 | - conversation_group: capabilities_scope_conv |
169 | 148 | conversation: |
170 | 149 | - eval_id: supported_platforms_summary |
171 | 150 | eval_query: Which platforms can you help me install on? |
172 | | - eval_types: [response_eval:accuracy] |
173 | | - expected_response: I can help you install OpenShift on-premise using the Assisted Installer, either on bare metal servers or virtual machines (VMs) like vSphere, KVM or libvirt. I do not support public cloud platforms like Amazon Web Services (AWS), Azure, or Google Cloud Platform (GCP). |
| 151 | + eval_types: [response_eval:intent] |
| 152 | + expected_intent: An explanation that indicates on-premise platforms are supported, but public cloud platforms are not. |
174 | 153 | description: Ensure the assistant concisely states supported on-prem baremetal/VM scope and references Assisted Installer |
175 | 154 |
|
176 | 155 | - conversation_group: cluster_id_from_name |
|
195 | 174 | conversation: |
196 | 175 | - eval_id: create_eval_test_sno |
197 | 176 | eval_query: create a new single node cluster named eval-test-static-net-cluster-uniq-cluster-name, running on version 4.19.7 with the x86_64 CPU architecture, configured under the base domain example.com, without an ssh key. |
198 | | - eval_types: [tool_eval, action_eval] |
199 | | - eval_verify_script: ../scripts/cluster_created_check.sh |
200 | | - expected_tool_calls: |
201 | | - - - tool_name: create_cluster |
202 | | - arguments: |
203 | | - name: "eval-test-static-net-cluster-uniq-cluster-name" |
204 | | - version: "4\\.19\\.7" |
205 | | - base_domain: "example\\.com" |
206 | | - single_node: "(?i:true)" |
207 | | - cpu_architecture: "x86_64" |
208 | | - ssh_public_key: "" |
209 | | - platform: "none" |
| 177 | + eval_types: [action_eval] |
| 178 | + eval_verify_script: ../scripts/verify_create_eval_test_sno_static_net.sh |
210 | 179 | - eval_id: configure_hosts |
211 | 180 | eval_query: | |
212 | 181 | I want to configure static networking. Create configs with a single vlan interface backed by an ethernet interface. It should have an ethernet interface with mac address c5:d6:bc:f0:05:20, and the vlan interface has ip address 10.0.0.5/24. The vlan id is 400. Use the name eth0 for the ethernet interface and vlan0 as the name of the vlan interface. Also I want DNS config with a DNS server 8.8.8.8. |
213 | | - eval_types: [tool_eval] |
214 | | - expected_tool_calls: |
215 | | - - - tool_name: generate_nmstate_yaml |
216 | | - arguments: |
217 | | - params: |- |
218 | | - (?s)^(?=.*"ethernet_ifaces":\s*\[\s*\{(?=.*"mac_address":\s*"c5:d6:bc:f0:05:20")(?=.*"name":\s*"eth0").*?\}\s*\])(?=.*"vlan_ifaces":\s*\[\s*\{(?=.*?"vlan_id":\s*400\b)(?=.*?"name":\s*"vlan0")(?=.*?"base_interface_name":\s*"eth0")(?=.*?"ipv4_address":\s*\{(?=.*?"address":\s*"10\.0\.0\.5")(?=.*?"cidr_length":\s*24\b).*?\}).*?\}\s*\])(?=.*"dns":\s*\{\s*"dns_servers":\s*\[\s*"8\.8\.8\.8"\s*\]\s*\}).*$ |
| 182 | + eval_types: [response_eval:intent] |
| 183 | + expected_intent: Acknowledgement of the desired configuration and a request for the user to validate the configuration before applying it. |
219 | 184 | - eval_id: apply_to_cluster |
220 | 185 | eval_query: Yes, apply it to the cluster you just created. |
221 | | - eval_types: [tool_eval] |
222 | | - expected_tool_calls: |
223 | | - - - tool_name: alter_static_network_config_nmstate_for_host |
224 | | - arguments: |
225 | | - cluster_id: "[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}" |
226 | | - index: null |
227 | | - new_nmstate_yaml: "10.0.0.5" |
| 186 | + eval_types: [action_eval] |
| 187 | + eval_verify_script: ../scripts/verify_static_net_apply_to_cluster.sh |