feature: Optimize the placeholder resolution logic in plugin configuration #3383
AYue-94 wants to merge 1 commit into alibaba:main
…ation(alibaba#3382) When the kubernetes secret cannot be found, keep the placeholder unchanged Signed-off-by: ayue <ericyu0421@163.com>
johnlanni left a comment:
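I don't recommend this change. It turns a control-plane risk into a data-plane risk: for example, if a secret is deleted by mistake, an incorrect configuration gets pushed out and requests start failing with 403.
I suggest handling this in the configuration delivery process instead: the control path should make sure the referenced configuration items exist ahead of time. Combine that with log monitoring so you can step in promptly when the corresponding error appears.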
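One way to picture the pre-delivery check suggested above is a small validation pass that lists every secret reference the cluster cannot satisfy. This is a minimal Python sketch, not Higress code: the placeholder shape is inferred from the examples in this thread, and `missing_secret_refs` and the in-memory `secrets` mapping are hypothetical names used only for illustration.

```python
import re

# Placeholder shape taken from the examples in this thread:
#   ${secret.<namespace>/<secret-name>.<key>}
# Assumption: the secret name itself contains no dot.
PLACEHOLDER = re.compile(r"\$\{secret\.([^/}]+)/([^.}]+)\.([^}]+)\}")


def missing_secret_refs(config_text, secrets):
    """Return the placeholders in config_text that secrets cannot satisfy.

    secrets is a hypothetical in-memory stand-in for the cluster:
    {(namespace, secret_name): {key: value}}.
    """
    missing = []
    for match in PLACEHOLDER.finditer(config_text):
        namespace, name, key = match.groups()
        if key not in secrets.get((namespace, name), {}):
            missing.append(match.group(0))
    return missing


if __name__ == "__main__":
    secrets = {("higress-system", "llm"): {"aliyun-token": "example-value"}}
    config = (
        "- ${secret.higress-system/llm.aliyun-token}\n"
        "- ${secret.higress-system/llm.aliyun-token-invalid}\n"
    )
    # Lists the references that would need fixing before delivery.
    print(missing_secret_refs(config, secrets))
```

A check like this could run wherever configuration is submitted, so that a reference to a missing key is rejected before it ever reaches the gateway.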
The problem is that a single misconfigured secret stops placeholder replacement for the entire plugin, so every route that depends on ai-proxy or key-auth starts failing at once. That blast radius is too large. Existence can be checked at configuration time, but as you said, if a secret is deleted by mistake, the whole plugin's placeholders again go unreplaced.
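@johnlanni For example, in the case below a single nonexistent placeholder stops every placeholder in the ai-proxy plugin from being replaced. The expectation is that the valid placeholders are still replaced normally.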
Step 1: configure the AI proxy correctly, using an existing secret, ${secret.higress-system/llm.aliyun-token}:
$ kubectl get wasmplugin ai-proxy.internal -n higress-system -oyaml
apiVersion: extensions.higress.io/v1alpha1
kind: WasmPlugin
metadata:
  name: ai-proxy.internal
  namespace: higress-system
spec:
  defaultConfig:
    providers:
    - apiTokens:
      - ${secret.higress-system/llm.aliyun-token}
      failover:
        enabled: false
      id: aliyun
      openaiCustomUrl: https://dashscope.aliyuncs.com/compatible-mode/v1
      openaiExtraCustomUrls: []
      retryOnFailure:
        enabled: false
      type: openai
  defaultConfigDisable: false
  failStrategy: FAIL_OPEN
  imagePullPolicy: UNSPECIFIED_POLICY
  matchRules:
  - config:
      activeProviderId: aliyun
    configDisable: false
    service:
    - llm-aliyun.internal.dns
  phase: UNSPECIFIED_PHASE
  priority: 100
  url: oci://higress-registry.cn-hangzhou.cr.aliyuncs.com/plugins/ai-proxy:2.0.0
Get the Envoy configuration:
$ kubectl exec higress-gateway-865ffb6bc4-2qhxs -n higress-system -- curl -s http://127.0.0.1:15000/config_dump | yq eval -P - > envoy.yaml
{"_rules_":[{"_match_service_":["llm-aliyun.internal.dns"],"activeProviderId":"aliyun"}],"providers":[{"apiTokens":["<correct secret value>"],"failover":{"enabled":false},"id":"aliyun","openaiCustomUrl":"https://dashscope.aliyuncs.com/compatible-mode/v1","openaiExtraCustomUrls":[],"retryOnFailure":{"enabled":false},"type":"openai"}]}
Step 2: add another AI provider, configured with a nonexistent secret placeholder, ${secret.higress-system/llm.aliyun-token-invalid}.
higress-core logs an error:
error ingress Failed to process template for config higress-system/ai-proxy.internal: failed to get secret value for higress-system/llm.aliyun-token-invalid: key aliyun-token-invalid not found in secret higress-system/llm
The WasmPlugin resource itself looks normal:
$ kubectl get wasmplugin ai-proxy.internal -n higress-system -oyaml
apiVersion: extensions.higress.io/v1alpha1
kind: WasmPlugin
metadata:
  name: ai-proxy.internal
  namespace: higress-system
spec:
  defaultConfig:
    providers:
    - apiTokens:
      - ${secret.higress-system/llm.aliyun-token}
      failover:
        enabled: false
      id: aliyun
      openaiCustomUrl: https://dashscope.aliyuncs.com/compatible-mode/v1
      openaiExtraCustomUrls: []
      retryOnFailure:
        enabled: false
      type: openai
    - apiTokens:
      - ${secret.higress-system/llm.aliyun-token-invalid}
      failover:
        enabled: false
      id: aliyun-invalid
      openaiCustomUrl: https://dashscope.aliyuncs.com/compatible-mode/v1
      openaiExtraCustomUrls: []
      retryOnFailure:
        enabled: false
      type: openai
  defaultConfigDisable: false
  failStrategy: FAIL_OPEN
  imagePullPolicy: UNSPECIFIED_POLICY
  matchRules:
  - config:
      activeProviderId: aliyun-invalid
    configDisable: false
    service:
    - llm-aliyun-invalid.internal.dns
  - config:
      activeProviderId: aliyun
    configDisable: false
    service:
    - llm-aliyun.internal.dns
  phase: UNSPECIFIED_PHASE
  priority: 100
  url: oci://higress-registry.cn-hangzhou.cr.aliyuncs.com/plugins/ai-proxy:2.0.0
Check the Envoy configuration. Every value is still a placeholder, whereas the expectation is that at least the correctly configured ${secret.higress-system/llm.aliyun-token} is replaced:
$ kubectl exec higress-gateway-865ffb6bc4-2qhxs -n higress-system -- curl -s http://127.0.0.1:15000/config_dump | yq eval -P - > envoy.yaml
{"_rules_":[{"_match_service_":["llm-aliyun-invalid.internal.dns"],"activeProviderId":"aliyun-invalid"},{"_match_service_":["llm-aliyun.internal.dns"],"activeProviderId":"aliyun"}],"providers":[{"apiTokens":["${secret.higress-system/llm.aliyun-token}"],"failover":{"enabled":false},"id":"aliyun","openaiCustomUrl":"https://dashscope.aliyuncs.com/compatible-mode/v1","openaiExtraCustomUrls":[],"retryOnFailure":{"enabled":false},"type":"openai"},{"apiTokens":["${secret.higress-system/llm.aliyun-token-invalid}"],"failover":{"enabled":false},"id":"aliyun-invalid","openaiCustomUrl":"https://dashscope.aliyuncs.com/compatible-mode/v1","openaiExtraCustomUrls":[],"retryOnFailure":{"enabled":false},"type":"openai"}]}
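Spotting leftover placeholders by eye in a long plugin configuration is error-prone, so a small script can help with the check above. The following is a rough Python sketch that walks a parsed plugin configuration (the JSON object shown above, not the full Envoy config dump) and reports every string that still contains `${secret.`. The function name `find_unresolved` and the JSONPath-like path format are illustrative assumptions, not part of Higress.

```python
import json


def find_unresolved(node, path="$"):
    """Recursively collect (path, value) pairs for strings that still
    contain a secret placeholder."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            hits += find_unresolved(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits += find_unresolved(value, f"{path}[{index}]")
    elif isinstance(node, str) and "${secret." in node:
        hits.append((path, node))
    return hits


if __name__ == "__main__":
    # A cut-down version of the plugin configuration from Step 2.
    raw = (
        '{"providers":['
        '{"apiTokens":["${secret.higress-system/llm.aliyun-token}"],"id":"aliyun"},'
        '{"apiTokens":["${secret.higress-system/llm.aliyun-token-invalid}"],"id":"aliyun-invalid"}'
        ']}'
    )
    for location, value in find_unresolved(json.loads(raw)):
        print(location, value)
```

An empty result means every secret reference in that plugin configuration was substituted.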
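That isn't quite the expected behavior. If secret variable substitution fails, the current configuration push should in theory be blocked.
@johnlanni If there is a restart and this configuration still isn't pushed, wouldn't that be a problem too?
@johnlanni Right, the controller can start normally.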
@AYue-94 The controller will definitely not block. What needs checking is whether the configuration push to the gateway is currently blocked. If nothing is pushed, a restarting gateway blocks as well and does not serve traffic based on an incorrect configuration.
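The fail-closed behavior described here (block the push when any substitution fails) could look roughly like the following. This is a minimal Python sketch under stated assumptions, not how Higress implements it: `GatedPublisher`, its `offer` method, and the in-memory `secrets` mapping are all hypothetical, and the placeholder pattern assumes secret names contain no dot.

```python
import re

# ${secret.<namespace>/<secret-name>.<key>}, as used in this thread.
PLACEHOLDER = re.compile(r"\$\{secret\.([^/}]+)/([^.}]+)\.([^}]+)\}")


class GatedPublisher:
    """Publishes a rendered config only when every placeholder resolved."""

    def __init__(self):
        self.published = None  # last config that was actually pushed

    def offer(self, config_text, secrets):
        """Try to render and publish; return True if the push went out."""

        def substitute(match):
            namespace, name, key = match.groups()
            # Raises KeyError when the secret or the key is missing.
            return secrets[(namespace, name)][key]

        try:
            rendered = PLACEHOLDER.sub(substitute, config_text)
        except KeyError:
            # Block the push: self.published keeps the previous config.
            return False
        self.published = rendered
        return True
```

With this shape, a bad reference never reaches the data plane, but a fresh instance with nothing published yet has no configuration to serve until the reference is fixed, which is the restart concern raised above.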

Ⅰ. Describe what this PR did
When the Kubernetes secret cannot be found, keep the ${} placeholder unchanged and continue to watch for secret changes as usual.
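To make the intended behavior concrete, here is a small Python sketch of per-placeholder resolution: each reference is looked up on its own, resolvable ones are replaced, and unresolvable ones are left exactly as written. It is an illustration only and does not reproduce the code in this PR; `resolve_placeholders` and the in-memory `secrets` mapping are hypothetical, and the pattern assumes secret names contain no dot.

```python
import re

# ${secret.<namespace>/<secret-name>.<key>}, as used in the linked issue.
PLACEHOLDER = re.compile(r"\$\{secret\.([^/}]+)/([^.}]+)\.([^}]+)\}")


def resolve_placeholders(config_text, secrets):
    """Replace each resolvable placeholder and keep the others unchanged.

    Returns (rendered_text, unresolved), where unresolved lists the
    placeholders that were left in place so they can be logged.
    """
    unresolved = []

    def substitute(match):
        namespace, name, key = match.groups()
        value = secrets.get((namespace, name), {}).get(key)
        if value is None:
            unresolved.append(match.group(0))
            return match.group(0)  # keep the placeholder as written
        return value

    return PLACEHOLDER.sub(substitute, config_text), unresolved


if __name__ == "__main__":
    secrets = {("higress-system", "llm"): {"aliyun-token": "example-value"}}
    config = (
        "- ${secret.higress-system/llm.aliyun-token}\n"
        "- ${secret.higress-system/llm.aliyun-token-invalid}\n"
    )
    rendered, unresolved = resolve_placeholders(config, secrets)
    print(rendered)
    print(unresolved)
```

Returning the unresolved list alongside the rendered text keeps the failure visible: the caller can still log an error or raise an alert for each missing secret while the valid references take effect.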
Ⅱ. Does this pull request fix one issue?
#3382
Ⅲ. Why don't you add test cases (unit test/integration test)?
Ⅳ. Describe how to verify it
unit test
Ⅴ. Special notes for reviews
Ⅵ. AI Coding Tool Usage Checklist (if applicable)
Please check all applicable items:
For new standalone features (e.g., new wasm plugin or golang-filter plugin):
design/ directory in the plugin folder
design/ directory
For regular updates/changes (not new plugins):
AI Coding Prompts (for regular updates)
AI Coding Summary