Skip to content

fix(Federated Slack): Persist Document Set for Federated Connectors#5442

Merged
justin-tahara merged 2 commits intomainfrom
jtahara/fix-federated-slack-document-set-bug
Sep 17, 2025
Merged

fix(Federated Slack): Persist Document Set for Federated Connectors#5442
justin-tahara merged 2 commits intomainfrom
jtahara/fix-federated-slack-document-set-bug

Conversation

@justin-tahara
Copy link
Contributor

@justin-tahara justin-tahara commented Sep 17, 2025

Description

[Provide a brief description of the changes in this PR]
A customer was noticing that the Federated Slack Connector being the only connector in the document set would cause the document set to get deleted or disappear immediately. This is a bug in which we deleted any document set that was just a federated slack connector since it wouldn't have a connector_credential_pair. This PR aims to fix this and allow the Document Set to persist if it is only a Federated Slack Connector.

How Has This Been Tested?

[Describe the tests you ran to verify your changes]
Tested this locally as well as created tests to ensure that this is tracked and checked when we have any new Federated Connectors.

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

  • This PR should be backported (make sure to check that the backport attempt succeeds)
  • [Optional] Override Linear Check

Summary by cubic

Fixes deletion of document sets that only include Federated connectors (e.g., Federated Slack). These sets now persist without connector_credential_pairs; only truly empty sets are deleted.

  • Bug Fixes
    • Updated monitor_document_set_taskset to keep document sets alive if federated_connectors are present; delete only when both connector_credential_pairs and federated_connectors are empty.
    • Added unit tests for federated-only persistence and empty-set deletion.

@justin-tahara justin-tahara requested a review from a team as a code owner September 17, 2025 19:05
@vercel
Copy link

vercel bot commented Sep 17, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
internal-search Ready Ready Preview Comment Sep 17, 2025 8:06pm

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Summary

This PR fixes a critical bug in the document set cleanup logic where document sets containing only federated Slack connectors were being incorrectly deleted. The issue occurred because the existing cleanup code in monitor_document_set_taskset only checked for connector_credential_pairs when determining whether to preserve a document set, but federated connectors are stored separately in a federated_connectors field.

The fix adds a check for federated connectors alongside the existing connector-credential pair check. The implementation uses getattr() with a default empty list to defensively handle cases where the federated_connectors attribute might not exist on document set models. This approach ensures backward compatibility and graceful handling during feature rollouts.

The change modifies the deletion condition from a simple "no connector-credential pairs" check to "no connector-credential pairs AND no federated connectors", ensuring document sets remain active when they contain either type of connector. Comprehensive unit tests were added to verify both scenarios: document sets with only federated connectors (should be preserved) and document sets with no connectors at all (should be deleted).

This fix integrates well with Onyx's connector architecture, which supports multiple connector types including traditional connectors that create connector-credential pairs and newer federated connectors that operate differently. The solution maintains the existing cleanup behavior for empty document sets while properly supporting the federated connector model.

Confidence score: 4/5

  • This PR is safe to merge with minimal risk as it fixes a clear bug without introducing breaking changes
  • Score reflects a well-targeted fix with defensive programming and comprehensive test coverage
  • Pay close attention to the test file to ensure proper mock setup and edge case coverage

2 files reviewed, no comments

Edit Code Review Bot Settings | Greptile

Copy link
Contributor

@jessicasingh7 jessicasingh7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 2 files

Prompt for AI agents (all 1 issues)

Understand the root cause of the following 1 issues and fix them.


<file name="backend/onyx/background/celery/tasks/vespa/tasks.py">

<violation number="1" location="backend/onyx/background/celery/tasks/vespa/tasks.py:420">
Unnecessary getattr fallback; DocumentSet.federated_connectors exists. Use direct attribute access to avoid masking errors and to keep type safety.</violation>
</file>

React with 👍 or 👎 to teach cubic. Mention @cubic-dev-ai to give feedback, ask questions, or re-run the review.

has_connector_pairs = bool(document_set.connector_credential_pairs)
# Federated connectors should keep a document set alive even without cc pairs.
has_federated_connectors = bool(
getattr(document_set, "federated_connectors", [])
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Sep 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary getattr fallback; DocumentSet.federated_connectors exists. Use direct attribute access to avoid masking errors and to keep type safety.

Prompt for AI agents
Address the following comment on backend/onyx/background/celery/tasks/vespa/tasks.py at line 420:

<comment>Unnecessary getattr fallback; DocumentSet.federated_connectors exists. Use direct attribute access to avoid masking errors and to keep type safety.</comment>

<file context>
@@ -414,8 +414,14 @@ def monitor_document_set_taskset(
+        has_connector_pairs = bool(document_set.connector_credential_pairs)
+        # Federated connectors should keep a document set alive even without cc pairs.
+        has_federated_connectors = bool(
+            getattr(document_set, &quot;federated_connectors&quot;, [])
+        )
+
</file context>
Fix with Cubic

@justin-tahara justin-tahara merged commit da7dc33 into main Sep 17, 2025
52 of 54 checks passed
@justin-tahara justin-tahara deleted the jtahara/fix-federated-slack-document-set-bug branch September 17, 2025 20:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants