fix(security): sanitize secrets in user-provided logs before LLM injection #268
Adar5 wants to merge 4 commits into jenkinsci:main
Conversation
berviantoleo left a comment:
Don't mix with unrelated changes
Adar5 force-pushed from 38a0464 to 6451d3d
@berviantoleo Apologies for the confusion! I had a branch management issue on my end that pulled in some previous experimental commits. I have just force-pushed a clean version of the branch. The PR now contains only the sanitize_logs implementation as requested.
Verified the fix logic: sanitize_logs() is correctly called before log_context is interpolated into the prompt string, so secrets are redacted before reaching the LLM. One thing worth checking is the LOG_ANALYSIS_INSTRUCTION format; otherwise the change looks clean.
I prefer the producer to sanitize it, rather than the consumer. In addition, please provide either a unit or an integration test.
@Adar5 For the test, you can add a case in test_chat_service.py that verifies secrets in a user-pasted log are redacted before reaching the prompt string.
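A minimal sketch of the suggested test case. The names sanitize_logs and build_prompt come from this thread, but their signatures are assumptions, and a stand-in sanitizer with illustrative patterns is inlined here so the example runs on its own (the real one lives in api.tools.sanitizer):

```python
import re


# Stand-in for api.tools.sanitizer.sanitize_logs (patterns are illustrative
# assumptions, not the plugin's actual rules).
def sanitize_logs(text: str) -> str:
    text = re.sub(r"AKIA[0-9A-Z]{16}", "[REDACTED_AWS_KEY]", text)
    text = re.sub(r"(?i)(password\s*[:=]\s*)\S+", r"\1[REDACTED]", text)
    return text


# Stand-in for the prompt-building path: sanitization must happen
# before log_context is interpolated into the prompt string.
def build_prompt(log_context: str) -> str:
    return f"Analyze this Jenkins log:\n{sanitize_logs(log_context)}"


def test_secrets_redacted_before_prompt():
    poisoned = "password=hunter2 key AKIAABCDEFGHIJKLMNOP"
    prompt = build_prompt(poisoned)
    # Raw secrets must never reach the prompt string.
    assert "hunter2" not in prompt
    assert "AKIAABCDEFGHIJKLMNOP" not in prompt
    # Redaction markers should appear in their place.
    assert "[REDACTED]" in prompt
    assert "[REDACTED_AWS_KEY]" in prompt
```

In the real test_chat_service.py the test would import the actual sanitizer and service instead of these stand-ins.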
Agree with @berviantoleo on the producer-side sanitization.
@berviantoleo @sharma-sugurthi @Yugansh5013 Thanks for the excellent architectural feedback! Moving the sanitisation to the producer side in chat_service.py makes complete sense, so we don't leak raw data into other paths like _generate_search_query_from_logs(). I'll revert the change in prompt_builder.py, implement the sanitiser at the entry point in chat_service.py, and add the requested unit test. I'll push the updated commits shortly!
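The producer-side approach described above can be sketched as follows. The entry-point method name handle_user_logs and the stand-in sanitizer patterns are assumptions for illustration; only _generate_search_query_from_logs and chat_service.py are named in this thread:

```python
import re


# Stand-in for api.tools.sanitizer.sanitize_logs; patterns are illustrative.
def sanitize_logs(text: str) -> str:
    text = re.sub(r"AKIA[0-9A-Z]{16}", "[REDACTED_AWS_KEY]", text)
    return re.sub(r"(?i)(token\s*[:=]\s*)\S+", r"\1[REDACTED]", text)


class ChatService:
    def handle_user_logs(self, raw_logs: str) -> None:
        # Producer-side: sanitize once at the entry point, so every
        # downstream consumer (prompt building, search-query generation)
        # only ever sees redacted logs.
        clean = sanitize_logs(raw_logs)
        self._build_prompt(clean)
        self._generate_search_query_from_logs(clean)

    def _build_prompt(self, log_context: str) -> str:
        # Consumer: no longer responsible for sanitization.
        return f"Logs:\n{log_context}"

    def _generate_search_query_from_logs(self, log_context: str) -> str:
        # Another consumer that would have leaked raw logs under the
        # original consumer-side design.
        return log_context.splitlines()[0] if log_context else ""
```

The design point is that sanitizing at the single producer removes the need for every consumer to remember to call the sanitizer itself.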
Adar5 force-pushed from 6451d3d to 24d1afd
Description
Fixes #265
This PR addresses a security vulnerability where user-pasted Jenkins build logs were injected directly into the LLM prompt without sanitization. Although sanitize_logs() existed in the codebase, it was not being utilized in the production pipeline.
Changes
- Imported sanitize_logs from api.tools.sanitizer into prompt_builder.py.
- Modified build_prompt() to process log_context through the sanitizer before string interpolation.
Verification Results
I verified this fix locally by creating a test script that passed a "poisoned" log containing mock AWS keys and passwords.
Before fix: Secrets were visible in the generated prompt string.
After fix: All sensitive patterns were successfully replaced with [REDACTED] or [REDACTED_AWS_KEY].
Checklist
- sanitize_logs handles common secret patterns (AWS keys, passwords, tokens)
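For reference, a sanitizer covering the three pattern families named in the checklist (AWS keys, passwords, tokens) might look like the sketch below. The real api.tools.sanitizer implementation is not shown in this PR thread, so these regexes and the SECRET_PATTERNS name are assumptions; only the [REDACTED] and [REDACTED_AWS_KEY] markers come from the verification results above:

```python
import re

# Hypothetical pattern table; the real sanitizer's rules may differ.
SECRET_PATTERNS = [
    # AWS access key IDs: "AKIA" followed by 16 uppercase alphanumerics.
    (re.compile(r"AKIA[0-9A-Z]{16}"), "[REDACTED_AWS_KEY]"),
    # key=value / key: value style passwords and tokens.
    (re.compile(r"(?i)(password\s*[:=]\s*)\S+"), r"\1[REDACTED]"),
    (re.compile(r"(?i)(token\s*[:=]\s*)\S+"), r"\1[REDACTED]"),
]


def sanitize_logs(text: str) -> str:
    # Apply every pattern in turn; each substitution keeps the key
    # prefix (via the \1 backreference) and redacts only the value.
    for pattern, replacement in SECRET_PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Keeping the patterns in a table makes it easy to extend coverage (e.g. GitHub or Slack tokens) with one new entry and one test case each.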