@evanscastonguay evanscastonguay commented Jan 10, 2026

User description

Minimal Jira issue provider integration: issue provider abstraction (Jira/GitLab/GitHub), /similar_issue Jira support, ticket compliance Jira/GitLab path, Jira ADF parsing, embedding robustness, and unit tests. No deploy/values changes.


PR Type

Enhancement, Tests


Description

  • Add Jira issue provider support with abstraction layer for GitHub/GitLab/Jira

  • Implement /similar_issue tool for GitLab and Jira with vector DB integration

  • Add ticket compliance checking for Jira and GitLab issue providers

  • Support flexible embedding client with OpenAI-compatible endpoints

  • Improve GitLab provider robustness for issue handling and clone URLs


Diagram Walkthrough

flowchart LR
  A["Issue Provider Abstraction"] --> B["Jira Provider"]
  A --> C["GitHub Provider"]
  A --> D["GitLab Provider"]
  B --> E["/similar_issue Tool"]
  C --> E
  D --> E
  E --> F["Vector DB Integration"]
  F --> G["Pinecone/LanceDB/Qdrant"]
  E --> H["Embedding Client"]
  H --> I["OpenAI-compatible Endpoint"]
  J["Ticket Compliance Check"] --> B
  J --> C
  J --> D

File Walkthrough

Relevant files
Enhancement
11 files
ticket_utils.py
Add Jira ticket key extraction utility                                     
+16/-0   
__init__.py
Create issue provider module exports                                         
+16/-0   
base.py
Define abstract issue provider interface                                 
+46/-0   
github_issue_provider.py
Implement GitHub issue provider adapter                                   
+21/-0   
gitlab_issue_provider.py
Implement GitLab issue provider adapter                                   
+20/-0   
jira_issue_provider.py
Implement Jira issue provider with ADF parsing                     
+220/-0 
resolver.py
Add issue provider resolution and factory logic                   
+53/-0   
embedding_client.py
Add OpenAI-compatible embedding client                                     
+75/-0   
gitlab_provider.py
Improve GitLab provider issue handling and clone URL robustness
+92/-19 
pr_similar_issue.py
Refactor /similar_issue tool for multi-provider support   
+374/-124
ticket_pr_compliance_check.py
Add Jira and GitLab ticket compliance extraction                 
+128/-33
Tests
4 files
test_issue_provider_resolver.py
Add issue provider resolver unit tests                                     
+17/-0   
test_jira_issue_provider.py
Add Jira issue provider unit tests                                             
+94/-0   
test_similar_issue_helpers.py
Add /similar_issue helper function tests                                 
+49/-0   
test_ticket_pr_compliance_check.py
Add ticket compliance check integration tests                       
+113/-0 
Documentation
2 files
fetching_ticket_context.md
Document Jira issue provider configuration                             
+10/-0   
similar_issues.md
Document GitLab and Jira /similar_issue support                   
+69/-3   
Configuration changes
1 file
configuration.toml
Add Jira and embedding configuration options                         
+39/-0   

Evans Castonguay added 11 commits January 10, 2026 16:10
(cherry picked from commit bfc288d0bd7e8277c7dc8f7033e6526c8cc308e6)
(cherry picked from commit 5c0ed614536b1ef7e118716f2b567d8b9fe1e87f)
(cherry picked from commit bbaec74feeafb31cc5794c0a82332792fd8bccb6)
(cherry picked from commit f90435f36a800a71529bf8e4fa7917f2375510c0)
(cherry picked from commit 2317af75758cca9c37f3916b2fbef9b27374a7c1)
(cherry picked from commit ef4a8a7930cbd39515a3010fff2a1ca66f8dd4fa)
(cherry picked from commit 358e5974d49ec4676154e669e68744e2c23a6c9c)
(cherry picked from commit ea28ff7)
(cherry picked from commit 98769ce)
@qodo-free-for-open-source-projects
Contributor

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🔴
Token exposure in URL

Description: The access token is embedded directly into the clone URL string without proper
sanitization. If this URL is logged, displayed in error messages, or stored in version
control, the token could be exposed.
gitlab_provider.py [1030-1030]

Referred Code
    return f"{parsed.scheme}://oauth2:{access_token}@{netloc}{parsed.path}"
except Exception as exc:
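
One way to reduce the exposure described above is to strip the userinfo part of a URL before it reaches a log line or an error message. The following is a minimal sketch, not code from this PR: `redact_url_credentials` and the example host and token are hypothetical names chosen for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url_credentials(url: str, placeholder: str = "***") -> str:
    """Return `url` with any userinfo (user:password@) replaced by a placeholder."""
    parts = urlsplit(url)
    if parts.username is None and parts.password is None:
        return url  # no credentials embedded, nothing to hide
    host = parts.hostname or ""
    if ":" in host:  # IPv6 literal: urlsplit strips the brackets, so restore them
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, f"{placeholder}@{host}", parts.path, parts.query, parts.fragment))


# Hypothetical clone URL in the same shape as the one built in gitlab_provider.py.
clone_url = "https://oauth2:glpat-example-token@gitlab.example.com/group/project.git"
safe_url = redact_url_credentials(clone_url)
```

The real URL is still needed for the clone itself; only the copy passed to loggers or exceptions would be redacted.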
API key exposure risk

Description: The OpenAI API key is directly assigned from settings without validation or sanitization,
potentially exposing sensitive credentials in logs or error messages if the embedding
client or OpenAI library logs request details.
pr_similar_issue.py [468-470]

Referred Code
openai.api_key = get_settings().openai.key
res = openai.Embedding.create(input=list_to_encode, engine=self.embedding_model)
return [record['embedding'] for record in res['data']]
Credential exposure in headers

Description: Jira API credentials (email and token) are concatenated and base64-encoded without
validation, then included in HTTP headers. If these credentials are logged or exposed
through error messages, they could be compromised.
jira_issue_provider.py [91-92]

Referred Code
auth_token = base64.b64encode(f"{self.api_email}:{self.api_token}".encode("utf-8")).decode("utf-8")
request = urllib.request.Request(url)
API key in request headers

Description: The API key is included in the Authorization header without validation. If the requests
library or application logs HTTP headers, the bearer token could be exposed in logs.
embedding_client.py [35-35]

Referred Code
headers["Authorization"] = f"Bearer {self.api_key}"
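
If request headers are ever logged, the two header findings above (Basic auth for Jira, Bearer token for the embedding client) could be mitigated by logging a redacted copy. This is a rough sketch under that assumption; `redact_headers` and `SENSITIVE_HEADERS` are hypothetical helpers, not part of the PR.

```python
# Header names (lowercase) whose values should never appear in logs.
SENSITIVE_HEADERS = {"authorization", "proxy-authorization", "x-api-key"}


def redact_headers(headers: dict[str, str]) -> dict[str, str]:
    """Copy `headers`, keeping only the auth scheme of sensitive values."""
    redacted = {}
    for name, value in headers.items():
        if name.lower() in SENSITIVE_HEADERS:
            # Keep "Basic"/"Bearer" so the log still shows which scheme was used.
            scheme = value.split(" ", 1)[0] if " " in value else ""
            redacted[name] = f"{scheme} ***".strip()
        else:
            redacted[name] = value
    return redacted


headers = {"Authorization": "Bearer sk-example", "Content-Type": "application/json"}
loggable = redact_headers(headers)
```

The original `headers` dict is left untouched, so it can still be sent with the request.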
Missing SSL verification

Description: The Jira API request uses urllib.request.urlopen without certificate verification
configuration. This could allow man-in-the-middle attacks if the HTTPS connection is not
properly validated, potentially exposing API credentials and sensitive issue data.
jira_issue_provider.py [96-102]

Referred Code
    with urllib.request.urlopen(request, timeout=self.timeout_seconds) as response:
        payload = response.read().decode("utf-8")
        return json.loads(payload)
except Exception as exc:
    if not suppress_warning:
        get_logger().warning("Failed to fetch Jira issues", artifact={"error": str(exc), "url": url})
    return {}
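
For context, `urllib.request.urlopen` verifies HTTPS certificates by default on current Python versions, so the finding above is mostly about making that behavior explicit and hard to disable by accident. A minimal sketch of passing an explicit TLS context follows; `make_tls_context` and `fetch_json_bytes` are hypothetical names, and the PR's `_request_json` may be structured differently.

```python
import ssl
import urllib.request


def make_tls_context() -> ssl.SSLContext:
    """Build a context that verifies the certificate chain and the hostname."""
    # create_default_context() enables CERT_REQUIRED and hostname checking.
    return ssl.create_default_context()


def fetch_json_bytes(url: str, timeout: float = 10.0) -> bytes:
    """Fetch `url` with an explicit, verifying TLS context (not called here)."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout, context=make_tls_context()) as response:
        return response.read()


ctx = make_tls_context()
```

A custom CA bundle for a self-hosted Jira could be supplied via the `cafile` argument of `ssl.create_default_context` rather than by turning verification off.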
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Consistent Naming Conventions

Objective: All new variables, functions, and classes must follow the project's established naming
standards

Status: Passed

No Dead or Commented-Out Code

Objective: Keep the codebase clean by ensuring all submitted code is active and necessary

Status: Passed

Robust Error Handling

Objective: Ensure potential errors and edge cases are anticipated and handled gracefully throughout
the code

Status: Passed

When relevant, utilize early return

Objective: In a code snippet containing multiple logic conditions (such as 'if-else'), prefer an
early return on edge cases over deep nesting

Status: Passed

Single Responsibility for Functions

Objective: Each function should have a single, well-defined responsibility

Status:
Large function scope: The __init__ method in PRSimilarIssue handles multiple concerns, including initialization,
context resolution, embedding setup, and vector database configuration, which may violate
the single responsibility principle

Referred Code
def __init__(self, issue_url: str, ai_handler, args: list = None):
    self.issue_url = issue_url
    self.resource_url = issue_url.split('=')[-1] if issue_url else ""
    self.provider_name = get_settings().config.git_provider
    self.issue_provider_name = resolve_issue_provider_name(
        get_settings().get("CONFIG.ISSUE_PROVIDER", "auto"),
        self.provider_name,
    )
    self.supported = self.provider_name in ("github", "gitlab")
    self.git_provider = get_git_provider_with_context(self.resource_url)
    if not self.supported:
        return

    self._init_embedding_settings()
    self.repo_obj = None
    self.issue_iid = None
    self.project_path = None
    self.issue_context = False
    self.output_target = None
    self.issue_provider = None
    self.jira_keys = []


 ... (clipped 14 lines)
Compliance status legend 🟢 - Fully Compliant
🟡 - Partially Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-free-for-open-source-projects
Contributor

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
Possible issue
Fix logic for limiting ticket links

Fix the logic for limiting ticket links in
extract_ticket_links_from_pr_description. Move the check to the beginning of the
loop and break when the limit is reached to correctly enforce it.

pr_agent/tools/ticket_pr_compliance_check.py [58-86]

 def extract_ticket_links_from_pr_description(pr_description, repo_path, base_url_html='https://github.com'):
     """
     Extract all ticket links from PR description
     """
     ticket_links = set()
     try:
         # Use the updated pattern to find matches
         matches = ISSUE_LINK_PATTERN.findall(pr_description)
 
         for match in matches:
+            if len(ticket_links) >= 3:
+                get_logger().info("Found more than 3 tickets in PR description, limiting to 3.")
+                break
+
             if match[0]:  # Full URL match
                 ticket_links.add(match[0])
             elif match[1]:  # Shorthand notation match: owner/repo#issue_number
                 owner, repo, issue_number = match[2], match[3], match[4]
                 ticket_links.add(f'{base_url_html.strip("/")}/{owner}/{repo}/issues/{issue_number}')
             else:  # #123 format
                 issue_number = match[5][1:]  # remove #
                 if issue_number.isdigit() and len(issue_number) < 5 and repo_path:
                     ticket_links.add(f'{base_url_html.strip("/")}/{repo_path}/issues/{issue_number}')
 
-            if len(ticket_links) > 3:
-                get_logger().info(f"Too many tickets found in PR description: {len(ticket_links)}")
-                # Limit the number of tickets to 3
-                ticket_links = set(list(ticket_links)[:3])
     except Exception as e:
         get_logger().error(f"Error extracting tickets error= {e}",
                            artifact={"traceback": traceback.format_exc()})
 
     return list(ticket_links)
Suggestion importance [1-10]: 8

Why: The suggestion correctly identifies a logical flaw in how the number of ticket links is limited. The current implementation inside the loop is incorrect and does not guarantee the limit. The proposed change to check the count at the start of the loop and break is the correct way to enforce the limit, fixing a clear bug.
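
The check-then-break pattern from this suggestion can be sketched in isolation. The version below is a simplified illustration, not the project's function: `ISSUE_URL` is a hypothetical pattern that only matches full GitHub issue URLs, and an ordered list plus a seen-set is used so that the first three distinct links are kept deterministically (slicing a set, as in the removed lines, keeps an arbitrary three).

```python
import re

# Hypothetical, simplified pattern: full GitHub issue URLs only.
ISSUE_URL = re.compile(r"https://github\.com/[\w.-]+/[\w.-]+/issues/\d+")
MAX_TICKETS = 3


def first_ticket_links(text: str, limit: int = MAX_TICKETS) -> list[str]:
    """Return up to `limit` distinct issue links, in order of appearance."""
    links: list[str] = []
    seen: set[str] = set()
    for url in ISSUE_URL.findall(text):
        if len(links) >= limit:
            break  # limit reached: stop scanning instead of trimming afterwards
        if url not in seen:
            seen.add(url)
            links.append(url)
    return links


description = " ".join(f"https://github.com/acme/app/issues/{n}" for n in (1, 2, 2, 3, 4, 5))
links = first_ticket_links(description)
```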

Medium
Improve error handling for comment fetching

Improve error handling in get_issue_comments by explicitly checking if self.mr
exists before fetching merge request comments and by adding a try...except block
when fetching comments for a specific issue.

pr_agent/git_providers/gitlab_provider.py [927-933]

 def get_issue_comments(self, issue=None):
     if issue is None:
+        if not self.mr:
+            get_logger().warning("No merge request context to get comments from.")
+            return []
         try:
             return self.mr.notes.list(get_all=True)[::-1]
-        except Exception:
+        except Exception as e:
+            get_logger().error(f"Failed to get merge request comments: {e}")
             return []
-    return list(issue.notes.list(iterator=True))
+    try:
+        return list(issue.notes.list(iterator=True))
+    except Exception as e:
+        get_logger().error(f"Failed to get issue comments: {e}")
+        return []
Suggestion importance [1-10]: 7

Why: The suggestion correctly identifies two potential failure points in the get_issue_comments function and proposes adding more robust error handling. Explicitly checking for self.mr and adding a try/except block for issue comment fetching makes the function more resilient and improves debuggability.

Medium
High-level
Use a dedicated Jira library

The suggestion recommends replacing the manual urllib-based Jira API
implementation with a dedicated library like jira-python. This would simplify
the code and improve robustness by abstracting away authentication, endpoint
management, and data parsing.

Examples:

pr_agent/issue_providers/jira_issue_provider.py [82-102]
    def _request_json(self, path: str, params: dict, api_version: Optional[int] = None, suppress_warning: bool = False) -> dict:
        if not self.is_configured():
            get_logger().warning("Jira client is not configured; skipping issue fetch")
            return {}
        query = urllib.parse.urlencode(params)
        version = api_version or self.api_version
        url = f"{self.base_url}/rest/api/{version}/{path}"
        if query:
            url = f"{url}?{query}"
        auth_token = base64.b64encode(f"{self.api_email}:{self.api_token}".encode("utf-8")).decode("utf-8")

 ... (clipped 11 lines)

Solution Walkthrough:

Before:

# pr_agent/issue_providers/jira_issue_provider.py
class JiraIssueProvider(IssueProvider):
    def _request_json(self, path, params, ...):
        url = f"{self.base_url}/rest/api/{self.api_version}/{path}"
        if params:
            url = f"{url}?{urllib.parse.urlencode(params)}"

        auth_token = base64.b64encode(f"{self.api_email}:{self.api_token}".encode()).decode()
        request = urllib.request.Request(url)
        request.add_header("Authorization", f"Basic {auth_token}")

        with urllib.request.urlopen(request) as response:
            payload = response.read().decode("utf-8")
            return json.loads(payload)

    def get_issue(self, issue_id, ...):
        data = self._request_json(f"issue/{issue_id}", ...)
        return self._issue_from_payload(data)

After:

# pr_agent/issue_providers/jira_issue_provider.py
from jira import JIRA

class JiraIssueProvider(IssueProvider):
    def __init__(self, ...):
        # ...
        self.client = JIRA(
            server=self.base_url,
            basic_auth=(self.api_email, self.api_token)
        )

    def get_issue(self, issue_id, ...):
        jira_issue = self.client.issue(issue_id)
        return self._issue_from_jira_object(jira_issue)

    def _issue_from_jira_object(self, jira_issue):
        return Issue(
            key=jira_issue.key,
            title=jira_issue.fields.summary,
            ...
        )
Suggestion importance [1-10]: 7

Why: The suggestion correctly identifies that the Jira integration is built from scratch using urllib, and proposes a valid, more robust alternative using a dedicated library, which is a significant architectural improvement.

Medium
General
Improve error handling for API requests

Refactor the _request_json function to specifically handle
urllib.error.HTTPError and log the status code, improving debugging for Jira API
requests.

pr_agent/issue_providers/jira_issue_provider.py [82-102]

 def _request_json(self, path: str, params: dict, api_version: Optional[int] = None, suppress_warning: bool = False) -> dict:
     if not self.is_configured():
         get_logger().warning("Jira client is not configured; skipping issue fetch")
         return {}
     query = urllib.parse.urlencode(params)
     version = api_version or self.api_version
     url = f"{self.base_url}/rest/api/{version}/{path}"
     if query:
         url = f"{url}?{query}"
     auth_token = base64.b64encode(f"{self.api_email}:{self.api_token}".encode("utf-8")).decode("utf-8")
     request = urllib.request.Request(url)
     request.add_header("Authorization", f"Basic {auth_token}")
     request.add_header("Accept", "application/json")
     try:
         with urllib.request.urlopen(request, timeout=self.timeout_seconds) as response:
             payload = response.read().decode("utf-8")
             return json.loads(payload)
+    except urllib.error.HTTPError as exc:
+        if not suppress_warning:
+            get_logger().warning(
+                "Failed to fetch Jira issues due to HTTP error",
+                artifact={"error": str(exc), "status_code": exc.code, "url": url},
+            )
+        return {}
     except Exception as exc:
         if not suppress_warning:
             get_logger().warning("Failed to fetch Jira issues", artifact={"error": str(exc), "url": url})
         return {}
Suggestion importance [1-10]: 6

Why: The suggestion correctly points out that handling specific HTTP errors is better than a generic except Exception. This improves logging and debuggability for network requests to the Jira API, which is a valuable improvement for robustness.
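
The ordering of the `except` clauses matters here because `urllib.error.HTTPError` is a subclass of `urllib.error.URLError`, so the more specific handler has to come first. The following sketch shows that ordering on its own; `call_and_describe` and the example URL are hypothetical, and the returned dicts only stand in for the `artifact` payloads logged in the PR.

```python
import io
import urllib.error


def call_and_describe(fetch) -> dict:
    """Run `fetch()` and describe the outcome, most specific error first."""
    try:
        return {"kind": "ok", "data": fetch()}
    except urllib.error.HTTPError as exc:  # must precede URLError: it is a subclass
        return {"kind": "http", "status_code": exc.code, "error": str(exc)}
    except urllib.error.URLError as exc:  # DNS failures, refused connections, ...
        return {"kind": "network", "error": str(exc.reason)}
    except Exception as exc:  # anything else, e.g. invalid JSON
        return {"kind": "other", "error": str(exc)}


def raise_not_found():
    # Simulate a 404 from a hypothetical Jira endpoint without any network access.
    raise urllib.error.HTTPError(
        "https://jira.example.com/rest/api/3/issue/ABC-1", 404, "Not Found", None, io.BytesIO()
    )


artifact = call_and_describe(raise_not_found)
```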

Low
Optimize the individual embedding fallback logic

Optimize the fallback logic in _embed_texts_with_fallback by directly calling
the embedding client for individual texts, avoiding the overhead of using the
batch-oriented wrapper in a loop.

pr_agent/tools/pr_similar_issue.py [472-485]

 def _embed_texts_with_fallback(self, list_to_encode: list[str]) -> tuple[list[list[float]], list[int]]:
     try:
         return self._embed_texts(list_to_encode), list(range(len(list_to_encode)))
     except Exception:
         get_logger().error('Failed to embed entire list, embedding one by one...')
         embeds = []
         successful_indices = []
         for idx, text in enumerate(list_to_encode):
             try:
-                embeds.append(self._embed_texts([text])[0])
+                if self.embedding_client:
+                    embedding = self.embedding_client.embed([text])[0]
+                else:
+                    openai.api_key = get_settings().openai.key
+                    res = openai.Embedding.create(input=[text], engine=self.embedding_model)
+                    embedding = res['data'][0]['embedding']
+                embeds.append(embedding)
                 successful_indices.append(idx)
             except Exception:
                 get_logger().warning("Failed to embed text segment; skipping.", artifact={"index": idx})
         return embeds, successful_indices
Suggestion importance [1-10]: 4

Why: The suggestion correctly identifies a minor performance inefficiency in the error-handling path for embedding texts. While the proposed fix is valid, it introduces code duplication from the _embed_texts method, and the performance gain is likely minimal as it only affects a fallback scenario.
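
The batch-then-individual fallback under discussion can be exercised without any embedding service. The sketch below assumes a generic `embed` callable and a hypothetical `fake_embed` stand-in that rejects empty strings; it mirrors the shape of `_embed_texts_with_fallback` but is not the PR's code.

```python
from typing import Callable

Embedder = Callable[[list[str]], list[list[float]]]


def embed_with_fallback(texts: list[str], embed: Embedder) -> tuple[list[list[float]], list[int]]:
    """Embed as one batch; on failure retry per text and report which indices succeeded."""
    try:
        return embed(texts), list(range(len(texts)))
    except Exception:
        vectors: list[list[float]] = []
        ok_indices: list[int] = []
        for idx, text in enumerate(texts):
            try:
                vectors.append(embed([text])[0])
                ok_indices.append(idx)
            except Exception:
                continue  # skip only the text that cannot be embedded
        return vectors, ok_indices


def fake_embed(batch: list[str]) -> list[list[float]]:
    """Hypothetical embedder: fails on empty strings, else returns a 1-d vector."""
    if any(not text for text in batch):
        raise ValueError("empty input")
    return [[float(len(text))] for text in batch]


vectors, kept = embed_with_fallback(["ab", "", "abcd"], fake_embed)
```

Returning the surviving indices alongside the vectors lets the caller keep its metadata aligned with the embeddings that were actually produced.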

Low
Learned
best practice
Capture exception objects for logging

In _request_json, the exception is captured as 'exc' and referenced correctly.
In the _normalize_description method, however, there is a bare 'except
Exception:' that does not capture the exception object, so the failure cannot be
logged when needed.

pr_agent/issue_providers/jira_issue_provider.py [181-184]

 try:
-    with urllib.request.urlopen(request, timeout=self.timeout_seconds) as response:
-        payload = response.read().decode("utf-8")
-        return json.loads(payload)
-except Exception as exc:
-    if not suppress_warning:
-        get_logger().warning("Failed to fetch Jira issues", artifact={"error": str(exc), "url": url})
-    return {}
+    return str(description)
+except Exception as e:
+    get_logger().debug(f"Failed to convert description to string: {e}")
+    return ""
Suggestion importance [1-10]: 5

Why:
Relevant best practice - When catching exceptions in try-except blocks, always capture the exception object using 'as e' syntax (e.g., 'except TypeError as e:') to enable proper error logging and debugging. This prevents NameError when trying to reference the exception in the except block.

Low
  • Author self-review: I have reviewed the PR code suggestions, and addressed the relevant ones.
