@Yugansh5013 (Contributor)

Title: feat: Add Context Observer & Proactive Log Analysis for Build Failures

This PR introduces a "Proactive Troubleshooting" feature that detects when a user is viewing a failed build log and offers immediate AI analysis. It also includes backend optimizations to ensure the LLM prioritizes user logs over generic documentation.

Key Changes:

1. Frontend (Context Observer):

  • Implemented a custom hook useContextObserver that monitors the URL (checking for /console) and scroll position.
  • Added a ProactiveToast component that appears non-intrusively when a user pauses at the bottom of a build log.
  • Implemented log scraping logic that captures the last 5,000 characters of the .console-output element to provide context to the LLM.

2. Backend (Log Analysis Logic):

  • Log Detection: Updated chat_service.py with a regex pattern to detect when a message contains scraped logs.
  • Prompt Engineering: Introduced a dedicated LOG_ANALYSIS_INSTRUCTION in prompts.py.
  • Priority Logic: Updated prompt_builder.py to dynamically switch system instructions. If raw logs are detected, the system switches to an "Expert Debugger" persona that strictly prioritizes user logs over retrieved RAG documentation, preventing hallucinations based on irrelevant context (see the sketch after this list).
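
For illustration, here is a minimal sketch of this detection-and-switch flow. The sentinel markers, instruction text, and function name are placeholders for this example, not the exact implementation:

```python
import re

# Placeholder sentinel: assume the frontend wraps scraped console output
# in marker lines before sending it along with the user's question.
LOG_BLOCK_RE = re.compile(
    r"---BEGIN LOG---\n(?P<logs>.*?)\n---END LOG---", re.DOTALL
)

DEFAULT_INSTRUCTION = "You are a helpful Jenkins assistant."
LOG_ANALYSIS_INSTRUCTION = (
    "You are an expert build debugger. Base your answer strictly on the "
    "user's console log; ignore retrieved documentation that does not "
    "match the failure shown in the log."
)

def build_system_prompt(message: str) -> tuple[str, str]:
    """Return (system_instruction, log_context); switch persona when logs are present."""
    match = LOG_BLOCK_RE.search(message)
    if match:
        # Raw logs detected: activate the "Expert Debugger" persona.
        return LOG_ANALYSIS_INSTRUCTION, match.group("logs")
    return DEFAULT_INSTRUCTION, ""
```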

Testing done

Manual Integration Testing:
I created a test Jenkins job (test-fail-job) configured to execute the batch command exit 1 to simulate a failure.

  1. Observer Activation: Verified that the "Jenkins Assistant" toast appears only when scrolling to the bottom of the console page on a failed build.
  2. Context Injection: Clicked "Yes, analyze" and verified that the chat window opened with the input correctly pre-filled with the last 5,000 characters of the log.
  3. Response Accuracy: Verified that the Bot correctly identified the exit 1 error.
    • Before Backend Changes: The Bot would often quote unrelated RAG documents (e.g., Cppcheck warnings) found in the vector database.
    • After Backend Changes: With the new LOG_ANALYSIS_INSTRUCTION, the Bot correctly ignored the irrelevant documentation and identified the specific batch failure from the provided logs.

Fixes Issue #77

Screenshots:
[two screenshots attached]

Submitter checklist

  • Make sure you are opening from a topic/feature/bugfix branch (right side) and not your main branch!
  • Ensure that the pull request title represents the desired changelog entry
  • Please describe what you did
  • Link to relevant issues in GitHub or Jira
  • Link to relevant pull requests, esp. upstream and downstream changes
  • Ensure you have provided tests that demonstrate the feature works or the issue is fixed

Copilot AI review requested due to automatic review settings January 1, 2026 19:44
@Yugansh5013 Yugansh5013 requested a review from a team as a code owner January 1, 2026 19:44
Copilot AI left a comment

Pull request overview

This PR introduces a proactive troubleshooting feature that automatically detects when users view failed build logs and offers AI-powered log analysis. The implementation includes a frontend context observer that monitors user behavior and a backend system that intelligently prioritizes user-provided logs over generic documentation.

Key Changes:

  • Added a context-aware observer hook that detects console page visits and triggers a toast notification after users scroll to the bottom of build logs
  • Implemented log pattern detection and parsing in the backend to separate log data from user queries, enabling specialized log analysis prompts
  • Enhanced prompt engineering with a dedicated LOG_ANALYSIS_INSTRUCTION that prioritizes user logs over RAG-retrieved documentation

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 14 comments.

Summary per file:

  • frontend/src/utils/useContextObserver.ts: Custom hook that monitors URL and scroll position to detect when users are viewing console logs
  • frontend/src/components/Toast.tsx: Proactive toast component offering log analysis assistance
  • frontend/src/components/Chatbot.tsx: Integration of observer and toast, with log scraping functionality to extract console output
  • frontend/src/styles/styles.ts: Styling definitions for the toast notification component
  • chatbot-core/api/services/chat_service.py: Log detection via regex pattern and separation of logs from queries for specialized handling
  • chatbot-core/api/prompts/prompts.py: New LOG_ANALYSIS_INSTRUCTION prompt for expert log analysis persona
  • chatbot-core/api/prompts/prompt_builder.py: Dynamic prompt switching based on presence of log context
  • chatbot-core/rag/embedding/bm25_indexer.py: Added if __name__ == "__main__" guard to prevent unintended execution during imports


@cnu1812 (Member) commented Jan 2, 2026

There are some issues with this PR, so I cannot merge it yet: there is a critical security blocker and an architectural concern regarding RAG pipeline efficiency. Jenkins console logs are notorious for accidentally leaking secrets (API keys, passwords, internal IPs), and this PR currently scrapes the last 5,000 characters and sends them raw to the LLM.

If a user's log contains a line like + docker login -p <REAL_PASSWORD>, we are sending that password to the inference engine. Even with a local model this is bad practice, and if we swap the provider to OpenAI later, we have just leaked a secret. Please add a regex-based sanitization utility function in chat_service.py that runs before prompt construction. I need to see a unit test proving that a dummy API key in the log gets masked to ****.
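
To make the ask concrete, a minimal sketch of the kind of regex-based masking meant here (the patterns and names are illustrative only; real coverage would need to be broader):

```python
import re

# Illustrative patterns only; a production sanitizer needs a wider set.
SECRET_PATTERNS = [
    # docker login -p <password> / --password=<password>
    re.compile(r"(?i)(docker\s+login\b.*?(?:-p|--password)[ =])(\S+)"),
    # key=value assignments for common secret-ish names
    re.compile(r"(?i)\b((?:api[_-]?key|token|password|secret)\s*[:=]\s*)(\S+)"),
    # AWS-style access key IDs
    re.compile(r"\b(AKIA)[0-9A-Z]{16}\b"),
]

def sanitize_log(raw_log: str) -> str:
    """Mask likely secrets in scraped console output before prompt construction."""
    sanitized = raw_log
    for pattern in SECRET_PATTERNS:
        sanitized = pattern.sub(lambda m: m.group(1) + "****", sanitized)
    return sanitized
```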

Also, "With the new LOG_ANALYSIS_INSTRUCTION, the Bot correctly ignored the irrelevant documentation."

This implies that we are still performing a Vector Search using the 5,000 characters of logs as the query, but simply telling the LLM to ignore the bad results. Searching FAISS with a 5,000-char stack trace is computationally expensive and yields "garbage" results (noise).
So, If the "Expert Debugger" mode is active, we should either:

  • Skip RAG entirely: If the user just wants the log explained, we don't need to search the docs.
  • Generate a Query: Use the LLM to extract the error signature first, and use that small string to search the vector DB.
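
A rough sketch of the second option, assuming a simple llm.generate() / vector_store.similarity_search() interface (both hypothetical stand-ins for the project's actual model client and FAISS wrapper):

```python
# Hypothetical interfaces: `llm.generate(prompt) -> str` and
# `vector_store.similarity_search(query, k) -> list` stand in for the
# real model client and FAISS wrapper in this codebase.
SIGNATURE_PROMPT = (
    "Extract the single most specific error line or error code from the "
    "build log below. Reply with only that short string.\n\nLOG:\n{log}"
)

def retrieve_docs_for_log(llm, vector_store, raw_log: str, k: int = 4) -> list:
    """Distill a short error signature first, then search with it."""
    signature = llm.generate(SIGNATURE_PROMPT.format(log=raw_log)).strip()
    # Search FAISS with the short signature instead of the 5,000-char log.
    return vector_store.similarity_search(signature, k=k)
```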

@Yugansh5013 (Contributor, Author)

Update:

I have addressed the feedback with two key improvements:

  1. Sanitizer Implementation:

    • Created api/utils/sanitizer.py to automatically strip sensitive data (passwords, tokens, API keys) from logs before processing.
    • Added unit tests to verify that secrets are redacted correctly (see the test sketch after this list).
  2. Vector Search Optimization:

    • Enhanced the search mechanism to extract a concise "error signature" (e.g., specific error codes or messages) from logs using the LLM.
    • We now perform the vector search using this targeted signature rather than the entire raw log, significantly improving search relevance and efficiency.
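
For reference, the redaction tests look roughly like this (pytest style; sanitize_log follows the earlier sketch, and the exact function name and patterns in api/utils/sanitizer.py may differ):

```python
# Illustrative pytest checks; `sanitize_log` is the hypothetical utility
# sketched earlier, imported from the module added in this PR.
from api.utils.sanitizer import sanitize_log

def test_api_key_is_masked():
    log = "12:01:03 + export API_KEY=sk-dummy-1234567890\n12:01:04 Build failed"
    cleaned = sanitize_log(log)
    assert "sk-dummy-1234567890" not in cleaned
    assert "API_KEY=****" in cleaned

def test_docker_password_is_masked():
    log = "+ docker login -p SuperSecret registry.example.com"
    assert "SuperSecret" not in sanitize_log(log)
```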

Ready for review!

@cnu1812 (Member) commented Jan 3, 2026

I have one question regarding the new Error Signature step:

"extract a concise 'error signature'... from logs using the LLM"

Does this introduce a noticeable latency penalty? (i.e., Are we doing a full LLM round-trip just to get the search query before generating the final answer?)

  • If yes: Is it blocking the UI?
  • If no: (e.g., you are using a lighter regex/heuristic or a very small model call), then perfect.

Please post a quick "Before vs. After" timing benchmark (e.g., "Old: 2.5s -> New: 3.2s"). If the added delay is under 1-2 seconds, I think the accuracy gain is worth it.

@berviantoleo berviantoleo added the "enhancement" label Jan 3, 2026
@berviantoleo berviantoleo linked an issue Jan 3, 2026 that may be closed by this pull request
@Yugansh5013 (Contributor, Author)

> Does this introduce a noticeable latency penalty? (i.e., Are we doing a full LLM round-trip just to get the search query before generating the final answer?) [...] Please post a quick "Before vs. After" timing benchmark.

Valid point regarding the latency.

Clarification: The "Error Signature" step does introduce a full LLM round-trip. On my local development machine (a CPU-only laptop) this is noticeable, but it scales significantly better with proper hardware.

Benchmark (local CPU-only environment):

  • Penalty added: ~41s (time for the extra LLM inference step)
  • Total latency: ~135s (vs. ~95s baseline for standard queries)

Interpretation: While ~41s is blocking on a laptop CPU, this purely reflects the inference speed of the local model for that specific prompt. On a production setup with a GPU (or a faster provider), this extra step would likely add only 1-2 seconds.

Given that the previous regex-only approach often failed on complex or unstructured stack traces (leading to "garbage in, garbage out" retrieval), I believe the accuracy gain is worth the latency trade-off.

@cnu1812 (Member) commented Jan 5, 2026

A +41s penalty (bringing the total wait time to ~2 minutes on CPU) is a UX risk. Users might think the plugin has crashed.

So, please do the following:

  • UX check: Ensure your frontend ProactiveToast or loading state explicitly updates the text to say something like "Analyzing stack trace patterns..." during that specific extraction phase. The user needs to know why it's hanging.

  • New issue: Open a follow-up issue titled "[Performance] Optimize Error Signature Extraction".

@Yugansh5013 (Contributor, Author)

Agreed, a ~2 min wait on CPU is definitely a bad user experience; users will assume it hung.

  1. UX Check: I'll implement the regex check on the frontend side. If logs are detected, I'll force the loading state to explicitly say "Analyzing stack trace patterns..." so the user knows exactly what's happening during that pause.
  2. New Issue: I'll open the [Performance] Optimize Error Signature Extraction issue to track future optimizations (like using a smaller model or hybrid approach).

Quick Question:
To fix the build errors (specifically R0914: Too many local variables), I had to refactor the main get_chatbot_reply function in chat_service.py by extracting the log processing and file handling logic into helper functions (_process_log_input, etc.). Since this modifies the structure of the original code, are you okay with including this refactoring in this PR?
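
For clarity, the rough shape of that refactor is sketched below. Only _process_log_input is named above; the other helper names, signatures, and sentinel regex here are placeholders:

```python
import re

# Placeholder sentinel, reusing the marker convention sketched earlier.
LOG_BLOCK_RE = re.compile(r"---BEGIN LOG---\n(.*?)\n---END LOG---", re.DOTALL)

def _process_log_input(message: str) -> tuple[str, str]:
    """Split the scraped log block (if any) from the user's question."""
    match = LOG_BLOCK_RE.search(message)
    if not match:
        return message, ""
    return LOG_BLOCK_RE.sub("", message).strip(), match.group(1)

def _format_memory(history: list[tuple[str, str]]) -> str:
    """Render prior (user, bot) turns into a compact prompt section."""
    return "\n".join(f"User: {u}\nBot: {b}" for u, b in history)

def get_chatbot_reply(message, history, llm, build_prompt) -> str:
    # The main function now just orchestrates the extracted helpers,
    # which keeps the local-variable count under the R0914 threshold.
    query, log_context = _process_log_input(message)
    memory = _format_memory(history)
    return llm.generate(build_prompt(query, log_context, memory))
```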

Working on the UX changes now!

@cnu1812 (Member) commented Jan 5, 2026

Yes, absolutely. Since adding the Log Analysis logic likely pushed get_chatbot_reply over the complexity threshold, cleaning it up is part of this PR's scope. We definitely prefer extracting helper functions (_process_log_input, etc.) over just suppressing the linter warning!

@Yugansh5013 (Contributor, Author)

I've refactored get_chatbot_reply in chat_service.py by extracting the file handling and memory formatting logic into helper functions, which resolved the complexity warning.

I also implemented the context-aware UI loading state we discussed and fixed the resulting frontend test failures. The build should be green now.

Please let me know if there's anything else you'd like me to address, or if you have any further concerns!

Here are the screenshots of the new waiting messages.

Normal prompt:
[screenshot]

Prompt with logs:
[screenshot]

@cnu1812 (Member) commented Jan 6, 2026

I'm kinda occupied; give me some time to review.

@cnu1812 cnu1812 requested a review from berviantoleo January 16, 2026 06:18
@berviantoleo (Contributor) left a comment

Changing isLoading to loadingStatus is not necessary for some components.

Please keep isLoading and pass loadingStatus only to the components that require it. That way we can minimize the changes.

@Yugansh5013 Yugansh5013 force-pushed the feature/context-aware-UI branch from ae6b737 to 8bb7709 Compare January 16, 2026 18:08
@Yugansh5013 (Contributor, Author)

> Changing isLoading to loadingStatus is not necessary for some components. Please keep isLoading and pass loadingStatus only to the components that require it.

I've addressed the feedback from the review:

  • Architecture Update: I reverted the logic to keep isLoading (boolean) for control flow (disabling buttons, etc.) while using loadingStatus (string) exclusively for the status text, as requested.
  • New Component: I extracted the loading animation into a dedicated LoadingDots.tsx component.
  • Style Clean-up: I removed the inline styles (including the animation delays) from Messages.tsx and moved all specific dot styles into styles.ts.

All tests have been updated to reflect these changes and are passing.

@Yugansh5013 (Contributor, Author)

hey! @cnu1812 I have done the changes asked by @berviantoleo, please review this.

@berviantoleo berviantoleo requested a review from cnu1812 January 21, 2026 00:50
Labels

  • enhancement: For changelog: Minor enhancement. Use `major-rfe` for changes to be highlighted.

Development

Successfully merging this pull request may close the following issue:

[UX] Proactive "Context-Aware" Suggestions