Skip to content

feat(slack bot): add federated search#5275

Merged
jessicasingh7 merged 58 commits intomainfrom
jessica/DAN-2328-support-for-slack-search
Sep 23, 2025
Merged

feat(slack bot): add federated search#5275
jessicasingh7 merged 58 commits intomainfrom
jessica/DAN-2328-support-for-slack-search

Conversation

@jessicasingh7
Copy link
Contributor

@jessicasingh7 jessicasingh7 commented Aug 27, 2025

Description

DAN-2328

How Has This Been Tested?

[Describe the tests you ran to verify your changes]

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

  • This PR should be backported (make sure to check that the backport attempt succeeds)
  • [Optional] Override Linear Check

Summary by cubic

Adds federated Slack search to the Slack bot using a user OAuth token, enabling public-channel search results in chat. This delivers the core of DAN-2328 for public channels; channel scoping and private/DM coverage will follow.

  • New Features

    • Added user_token to Slack bot (DB, models, API, UI) and server-side validation (requires xoxp- and Slack auth.test).
    • Slack OAuth now requests both bot scopes and user scopes; callback returns bot access_token and user_token.
    • Federated retrieval uses Slack search in bot context when an enabled Slack bot has a user_token.
    • Slack retrieval uses search.messages for public channels, improves logging, username fallback, and skips thread expansion with user tokens to avoid extra scopes.
    • New SlackSearch tool for the bot to fetch and return Slack snippets.
    • Small fixes: timezone handling in Slack blocks; typing/instance checks.
  • Migration

    • Run backend DB migrations.
    • In Slack: add user scopes search:read.public and channels:history, then reinstall to obtain a user OAuth token.
    • In Admin > Bots, add the optional “Slack User Token” (xoxp-). Without it, bot-context federated Slack search is disabled.

@vercel
Copy link

vercel bot commented Aug 27, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
internal-search Ready Ready Preview Comment Sep 23, 2025 1:33am

Copy link
Contributor

@Weves Weves left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jessicasingh7 overall looks great! Left a few initial comments, let me know if you have any questions about them.

@jessicasingh7 jessicasingh7 marked this pull request as ready for review September 23, 2025 00:23
@jessicasingh7 jessicasingh7 requested a review from a team as a code owner September 23, 2025 00:23
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Summary

This PR implements federated Slack search functionality for the Slack bot, enabling it to search through Slack messages directly and include those results in chat responses. The implementation adds support for user OAuth tokens (xoxp- prefixed) in addition to existing bot tokens, as user tokens provide broader search permissions required for accessing Slack's search API.

The changes span multiple layers of the application:

Database & Models: A new nullable user_token field is added to the SlackBot model with proper encryption, along with corresponding API models (SlackBotCreationRequest, SlackBotTokens) and validation functions that ensure user tokens follow the correct format and are valid with Slack's API.

Frontend Integration: The admin UI is updated to include an optional "Slack User Token" field in bot configuration forms, maintaining backward compatibility since not all bots require federated search capabilities.

Search Pipeline Integration: The core search system is enhanced with SlackContext awareness - a new model that captures channel type (public/private/DM), channel ID, and user ID. This context flows through the entire search pipeline from the Slack listener through chat processing to the search tools, enabling context-aware federated searches.

Federated Search Implementation: A new slack_search.py module implements the actual Slack search functionality using Slack's search.messages API. It includes sophisticated channel filtering logic (e.g., restricting private channel access in bot contexts), thread contextualization for better results, and proper conversion of Slack search results to the system's InferenceChunk format.

OAuth Flow Enhancement: The Slack OAuth connector is updated to handle both bot and user tokens correctly, extracting user tokens from the authed_user section of Slack's OAuth response rather than incorrectly using the same token for both purposes.

The architecture prioritizes Slack bot context over regular user-based federated retrieval, allowing bots to function autonomously while falling back to user-based search when appropriate. Channel type detection uses Slack's API rather than potentially unreliable event data, ensuring accurate permission handling.

Confidence score: 3/5

  • This PR introduces complex federated search functionality with multiple integration points that could have subtle permission or security issues
  • Score reflects the architectural complexity and potential for runtime issues with lambda closures and token validation edge cases
  • Pay close attention to backend/onyx/federated_connectors/federated_retrieval.py and the Slack search implementation files for variable capture bugs and permission logic

Context used:

Rule - Remove temporary debugging code before merging to production, especially tenant-specific debugging logs. (link)

32 files reviewed, 12 comments

Edit Code Review Bot Settings | Greptile

Args:
user_token: The user OAuth token to validate.
Returns:
None is valid and will return successfully.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Consider updating docstring - says 'None is valid and will return successfully' but should say 'Returns None if valid'

Suggested change
None is valid and will return successfully.
Returns None if valid.

try:
channel_info = web_client.conversations_info(channel=channel_id)
if channel_info.get("ok") and channel_info.get("channel"):
channel: dict[str, Any] = channel_info.get("channel", {})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: The variable assignment channel: dict[str, Any] = channel_info.get("channel", {}) is redundant since you're already checking channel_info.get("channel") on line 94

Comment on lines 149 to 152
if "not_allowed_token_type" in str(slack_error):
logger.error(
f"TOKEN TYPE ERROR: access_token starts with: {access_token[:10]}..."
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Logging the first 10 characters of access_token could potentially expose sensitive information in logs. Consider logging just the token type prefix (xoxp- vs xoxb-) instead.

Comment on lines +91 to +98
retrieval_function=lambda query: connector.search(
query,
{}, # Empty entities for Slack context
access_token=access_token,
limit=MAX_FEDERATED_CHUNKS,
slack_event_context=slack_context,
bot_token=tenant_slack_bot.bot_token,
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Lambda captures variables by reference. access_token and tenant_slack_bot.bot_token could change between lambda creation and execution, causing runtime errors. Use default parameters to capture by value: lambda query, token=access_token, bot_token=tenant_slack_bot.bot_token: ...

Suggested change
retrieval_function=lambda query: connector.search(
query,
{}, # Empty entities for Slack context
access_token=access_token,
limit=MAX_FEDERATED_CHUNKS,
slack_event_context=slack_context,
bot_token=tenant_slack_bot.bot_token,
),
retrieval_function=lambda query, token=access_token, bot_token=tenant_slack_bot.bot_token: connector.search(
query,
{}, # Empty entities for Slack context
access_token=token,
limit=MAX_FEDERATED_CHUNKS,
slack_event_context=slack_context,
bot_token=bot_token,
),

Comment on lines +90 to +92
# Slack-specific parameters
slack_event_context: SlackContext | None = None,
bot_token: str | None = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: The Slack-specific parameters break interface abstraction by adding connector-specific parameters to a generic interface. Consider using a more extensible approach like a generic context parameter or connector-specific method overloads.


# Ensure that the message is a new message of expected type
event_type = event.get("type")
event.get("channel_type")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

syntax: This line calls event.get("channel_type") but doesn't assign the result to anything. Either remove this line or assign it to a variable if it's needed.


return OAuthResult(
access_token=access_token,
access_token=user_token, # Bot token for bot operations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

syntax: Comment is misleading - this is actually returning the user token, not bot token as the comment suggests

Suggested change
access_token=user_token, # Bot token for bot operations
access_token=user_token, # User token for user operations


started_patches = [p.start() for p in patches]

self._setup_slack_api_mocks(started_patches[0], started_patches[0])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Bug: passing same mock (started_patches[0]) twice to _setup_slack_api_mocks. Second parameter should be different mock or removed.

Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cubic analysis

24 issues found across 33 files

Prompt for AI agents (all 24 issues)

Understand the root cause of the following 24 issues and fix them.


<file name="backend/onyx/connectors/slack/onyx_retry_handler.py">

<violation number="1" location="backend/onyx/connectors/slack/onyx_retry_handler.py:78">
Retry-After parsing uses only first character when header value is str, causing too-short backoff and repeated 429s.</violation>
</file>

<file name="backend/onyx/chat/process_message.py">

<violation number="1" location="backend/onyx/chat/process_message.py:606">
Passing slack_context causes SearchTool to log full SlackContext at info level (user_id/channel_id), risking PII exposure; prefer redaction or debug-level logging.</violation>
</file>

<file name="backend/onyx/federated_connectors/interfaces.py">

<violation number="1" location="backend/onyx/federated_connectors/interfaces.py:91">
Leaking Slack-specific type into the generic FederatedConnector.search signature reduces modularity; use a generic type (e.g., Any or a protocol) or a generic context dict instead.</violation>
</file>

<file name="backend/onyx/federated_connectors/slack/federated_connector.py">

<violation number="1" location="backend/onyx/federated_connectors/slack/federated_connector.py:187">
OAuthResult is assigning the user token to access_token, which mislabels tokens and will break downstream auth flows. Populate user_token with the user token and keep access_token for the bot token returned at the top level of the OAuth response.</violation>

<violation number="2" location="backend/onyx/federated_connectors/slack/federated_connector.py:241">
Avoid logging full entities at INFO; log minimal or redacted metadata to reduce potential leakage in logs.</violation>
</file>

<file name="backend/onyx/context/search/federated/slack_search.py">

<violation number="1" location="backend/onyx/context/search/federated/slack_search.py:60">
Calling conversations_info per match can be slow and hit rate limits; cache channel type by channel_id or prefetch to reduce API calls.</violation>

<violation number="2" location="backend/onyx/context/search/federated/slack_search.py:71">
On any exception during channel type lookup, all messages are filtered out, often resulting in zero results.</violation>

<violation number="3" location="backend/onyx/context/search/federated/slack_search.py:147">
Do not log entire API response objects; restrict logs to safe metadata (status, error).

(Based on your team&#39;s feedback about logging only safe metadata rather than raw responses that may include sensitive info.) [FEEDBACK_USED]</violation>

<violation number="4" location="backend/onyx/context/search/federated/slack_search.py:151">
Do not log any part of OAuth tokens; remove token prefix from logs.

(Based on your team&#39;s feedback about avoiding logging raw exception details/tokens and preferring safe metadata.) [FEEDBACK_USED]</violation>
</file>

<file name="backend/onyx/onyxbot/slack/utils.py">

<violation number="1" location="backend/onyx/onyxbot/slack/utils.py:87">
chat_postMessage result accessed with message_ts instead of ts, causing KeyError and retry-induced duplicate posts</violation>

<violation number="2" location="backend/onyx/onyxbot/slack/utils.py:101">
Private channel detection omits is_group; legacy private channels may be misclassified as UNKNOWN. Include is_group in the private-channel check.</violation>

<violation number="3" location="backend/onyx/onyxbot/slack/utils.py:113">
Avoid catching broad Exception; catch SlackApiError to handle Slack API errors specifically.</violation>

<violation number="4" location="backend/onyx/onyxbot/slack/utils.py:115">
Avoid logging the raw exception string; it may leak sensitive data. Log safe metadata or omit the exception value.

(Based on your team&#39;s feedback about avoiding logging raw exception strings that may include URLs with temporary auth tokens.) [FEEDBACK_USED]</violation>
</file>

<file name="backend/onyx/federated_connectors/models.py">

<violation number="1" location="backend/onyx/federated_connectors/models.py:43">
Mask sensitive access_token in model repr to prevent leaking tokens when objects are logged; add repr=False to Field.</violation>

<violation number="2" location="backend/onyx/federated_connectors/models.py:45">
Add repr=False to user_token Field to avoid exposing the user OAuth token in logs/prints.</violation>
</file>

<file name="backend/tests/external_dependency_unit/slack_bot/test_slack_bot_federated_search.py">

<violation number="1" location="backend/tests/external_dependency_unit/slack_bot/test_slack_bot_federated_search.py:248">
Add a test case where SlackBot has no user_token to validate the bot_token-only fallback and ensure federated search is correctly limited/disabled per requirements.</violation>

<violation number="2" location="backend/tests/external_dependency_unit/slack_bot/test_slack_bot_federated_search.py:290">
Avoid hardcoding embedding dimensionality in tests; derive from configuration or stub consumers to accept arbitrary length.</violation>

<violation number="3" location="backend/tests/external_dependency_unit/slack_bot/test_slack_bot_federated_search.py:529">
Docstring/expectations for DM behavior may be misleading; either update to reflect current implementation or add tests to cover optional inclusion of private channels when feasible. (According to linked Linear issue DAN-2328 optional DM behavior.)</violation>
</file>

<file name="backend/onyx/onyxbot/slack/listener.py">

<violation number="1" location="backend/onyx/onyxbot/slack/listener.py:870">
Avoid per-message Slack API calls for channel type; prefer using event-provided channel_type or caching to reduce latency and rate-limit risk.</violation>
</file>

<file name="backend/onyx/server/manage/validate_tokens.py">

<violation number="1" location="backend/onyx/server/manage/validate_tokens.py:71">
Add a timeout to the Slack API call to prevent hanging requests in case of network issues or Slack outages.</violation>
</file>

<file name="backend/onyx/utils/gpu_utils.py">

<violation number="1" location="backend/onyx/utils/gpu_utils.py:22">
Brittle disabled check: hard-codes port 9000; misses other ports, causing unnecessary requests when MODEL_SERVER_HOST is &#39;disabled&#39;.</violation>
</file>

<file name="backend/onyx/configs/constants.py">

<violation number="1" location="backend/onyx/configs/constants.py:10">
Overly strict user token prefix &quot;xoxp-&quot; causes validate_user_token to reject valid Slack rotated user tokens (e.g., &quot;xoxe.xoxp-...&quot;), blocking Slack bot setup per DAN-2328.</violation>
</file>

<file name=".github/workflows/pr-external-dependency-unit-tests.yml">

<violation number="1" location=".github/workflows/pr-external-dependency-unit-tests.yml:10">
Duplicate env key OPENAI_API_KEY is defined twice in the same env mapping; remove one to avoid confusion and potential YAML duplicate-key issues.</violation>

<violation number="2" location=".github/workflows/pr-external-dependency-unit-tests.yml:94">
Readiness check masks all failures by continuing on any non-zero status; only continue on actual timeout (exit code 124), fail otherwise.</violation>
</file>

Linked issue analysis

Linked issue: DAN-2328: Add support for slack search into the slack bot

Status Acceptance criteria Notes
If 'All Public Knowledge' is selected, then federated slack (if connected) should be used. No mapping from 'All Public Knowledge' selection to Slack retrieval
If 'Specific Document Sets' are selected, then federated slack (if assigned to any of the document sets) should be used. No document-set → Slack-channel selection used in retrieval
If 'Search Assistant' is selected, then federated slack (if available to that assistant) should be used. No assistant-type check enabling Slack federated retrieval found
If 'Non-Search' assistant is selected, then no federated search should be used. No logic disables federated retrieval for non-search assistants
Federated search should search all Public channels, but not include DMs / private channels. Channel filtering & public-channel detection implemented
If a document set is specified that specifies certain channels, then search those channels (okay to ignore private channels). No scoping of Slack search by document-set channel list implemented
Optional: If DMing the bot, then respond based off of all public channels + private channels + DMs that the user is a part of (including ephemeral responses). DM context handled via include_dm true and skiplists bypassed
Optional: Allow user to add the Slack bot to certain private channels to make them searchable. No UI/API to add bot to private channels found
Optional: If asking a question in a private channel, respond based off all public channels + that ONE specific private channel. Allowed_private_channel implemented to permit only that private channel
Admins/curators should be able to create document sets and attach specific Slack channels to them (instead of only 'All of Slack'). No document-set UI/API to attach specific Slack channels implemented
UI: In document set creation page, allow configuring channels (replace gear with 'All Channels' & popup to type channel names). No document-set UI changes to add channel configuration popup

React with 👍 or 👎 to teach cubic. Mention @cubic-dev-ai to give feedback, ask questions, or re-run the review.

ttl_ms: int | None = None

retry_after_value: list[str] | None = None
retry_after_value: list[str] | str | None = None
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Retry-After parsing uses only first character when header value is str, causing too-short backoff and repeated 429s.

Prompt for AI agents
Address the following comment on backend/onyx/connectors/slack/onyx_retry_handler.py at line 78:

<comment>Retry-After parsing uses only first character when header value is str, causing too-short backoff and repeated 429s.</comment>

<file context>
@@ -75,7 +75,7 @@ def prepare_for_next_attempt(
         ttl_ms: int | None = None
 
-        retry_after_value: list[str] | None = None
+        retry_after_value: list[str] | str | None = None
         retry_after_header_name: Optional[str] = None
         duration_s: float = 1.0  # seconds
</file context>

✅ Addressed in 1a2badb

additional_headers=custom_tool_additional_headers,
),
allowed_tool_ids=new_msg_req.allowed_tool_ids,
slack_context=new_msg_req.slack_context, # Pass Slack context from request
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing slack_context causes SearchTool to log full SlackContext at info level (user_id/channel_id), risking PII exposure; prefer redaction or debug-level logging.

Prompt for AI agents
Address the following comment on backend/onyx/chat/process_message.py at line 606:

<comment>Passing slack_context causes SearchTool to log full SlackContext at info level (user_id/channel_id), risking PII exposure; prefer redaction or debug-level logging.</comment>

<file context>
@@ -603,6 +603,7 @@ def stream_chat_message_objects(
                 additional_headers=custom_tool_additional_headers,
             ),
             allowed_tool_ids=new_msg_req.allowed_tool_ids,
+            slack_context=new_msg_req.slack_context,  # Pass Slack context from request
         )
 
</file context>
Fix with Cubic

access_token: str,
limit: int | None = None,
# Slack-specific parameters
slack_event_context: SlackContext | None = None,
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaking Slack-specific type into the generic FederatedConnector.search signature reduces modularity; use a generic type (e.g., Any or a protocol) or a generic context dict instead.

Prompt for AI agents
Address the following comment on backend/onyx/federated_connectors/interfaces.py at line 91:

<comment>Leaking Slack-specific type into the generic FederatedConnector.search signature reduces modularity; use a generic type (e.g., Any or a protocol) or a generic context dict instead.</comment>

<file context>
@@ -86,6 +87,9 @@ def search(
         access_token: str,
         limit: int | None = None,
+        # Slack-specific parameters
+        slack_event_context: SlackContext | None = None,
+        bot_token: str | None = None,
     ) -&gt; list[InferenceChunk]:
</file context>
Fix with Cubic

Returns:
Search results in SlackSearchResponse format
"""
logger.info(f"Slack federated search called with entities: {entities}")
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid logging full entities at INFO; log minimal or redacted metadata to reduce potential leakage in logs.

Prompt for AI agents
Address the following comment on backend/onyx/federated_connectors/slack/federated_connector.py at line 241:

<comment>Avoid logging full entities at INFO; log minimal or redacted metadata to reduce potential leakage in logs.</comment>

<file context>
@@ -229,9 +232,20 @@ def search(
         Returns:
             Search results in SlackSearchResponse format
         &quot;&quot;&quot;
+        logger.info(f&quot;Slack federated search called with entities: {entities}&quot;)
+
         with get_session_with_current_tenant() as db_session:
</file context>
Suggested change
logger.info(f"Slack federated search called with entities: {entities}")
logger.info("Slack federated search called", extra={"has_entities": bool(entities)})
Fix with Cubic

f"Skipping message from private channel {channel_id} "
f"(not the allowed private channel: {allowed_private_channel})"
)
return True
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On any exception during channel type lookup, all messages are filtered out, often resulting in zero results.

Prompt for AI agents
Address the following comment on backend/onyx/context/search/federated/slack_search.py at line 71:

<comment>On any exception during channel type lookup, all messages are filtered out, often resulting in zero results.</comment>

<file context>
@@ -39,6 +41,42 @@
+                        f&quot;Skipping message from private channel {channel_id} &quot;
+                        f&quot;(not the allowed private channel: {allowed_private_channel})&quot;
+                    )
+                    return True
+        except Exception as e:
+            logger.warning(
</file context>
Fix with Cubic


env:
# OpenAI
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate env key OPENAI_API_KEY is defined twice in the same env mapping; remove one to avoid confusion and potential YAML duplicate-key issues.

Prompt for AI agents
Address the following comment on .github/workflows/pr-external-dependency-unit-tests.yml at line 10:

<comment>Duplicate env key OPENAI_API_KEY is defined twice in the same env mapping; remove one to avoid confusion and potential YAML duplicate-key issues.</comment>

<file context>
@@ -6,6 +6,9 @@ on:
 
 env:
+  # OpenAI
+  OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
+  
   # AWS
</file context>

✅ Addressed in 1a2badb

# Use bot token if available (has full permissions), otherwise fall back to user token
token_to_use = bot_token or access_token
channel_client = WebClient(token=token_to_use)
channel_info = channel_client.conversations_info(channel=channel_id)
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling conversations_info per match can be slow and hit rate limits; cache channel type by channel_id or prefetch to reduce API calls.

Prompt for AI agents
Address the following comment on backend/onyx/context/search/federated/slack_search.py at line 60:

<comment>Calling conversations_info per match can be slow and hit rate limits; cache channel type by channel_id or prefetch to reduce API calls.</comment>

<file context>
@@ -39,6 +41,42 @@
+            # Use bot token if available (has full permissions), otherwise fall back to user token
+            token_to_use = bot_token or access_token
+            channel_client = WebClient(token=token_to_use)
+            channel_info = channel_client.conversations_info(channel=channel_id)
+
+            if isinstance(channel_info.data, dict) and not _is_public_channel(
</file context>
Fix with Cubic

logger.error(f"Slack API error in search_messages: {slack_error}")
logger.error(
f"Slack API error details: status={slack_error.response.status_code}, "
f"error={slack_error.response.get('error')}, response={slack_error.response}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not log entire API response objects; restrict logs to safe metadata (status, error). (Based on your team's feedback about logging only safe metadata rather than raw responses that may include sensitive info.)
Prompt for AI agents ~~~ Address the following comment on backend/onyx/context/search/federated/slack_search.py at line 147: Do not log entire API response objects; restrict logs to safe metadata (status, error). (Based on your team's feedback about logging only safe metadata rather than raw responses that may include sensitive info.) @@ -79,19 +139,36 @@ def query_slack( + logger.error(f"Slack API error in search_messages: {slack_error}") + logger.error( + f"Slack API error details: status={slack_error.response.status_code}, " + f"error={slack_error.response.get('error')}, response={slack_error.response}" + ) + if "not_allowed_token_type" in str(slack_error): ~~~
```suggestion f"error={slack_error.response.get('error')}" ```

)
if "not_allowed_token_type" in str(slack_error):
logger.error(
f"TOKEN TYPE ERROR: access_token starts with: {access_token[:10]}..."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not log any part of OAuth tokens; remove token prefix from logs. (Based on your team's feedback about avoiding logging raw exception details/tokens and preferring safe metadata.)
Prompt for AI agents ~~~ Address the following comment on backend/onyx/context/search/federated/slack_search.py at line 151: Do not log any part of OAuth tokens; remove token prefix from logs. (Based on your team's feedback about avoiding logging raw exception details/tokens and preferring safe metadata.) @@ -79,19 +139,36 @@ def query_slack( + ) + if "not_allowed_token_type" in str(slack_error): + logger.error( + f"TOKEN TYPE ERROR: access_token starts with: {access_token[:10]}..." + ) return [] ~~~
```suggestion "TOKEN TYPE ERROR: access_token type not allowed" ```


return OAuthResult(
access_token=access_token,
access_token=user_token, # Bot token for bot operations
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OAuthResult is assigning the user token to access_token, which mislabels tokens and will break downstream auth flows. Populate user_token with the user token and keep access_token for the bot token returned at the top level of the OAuth response.

Prompt for AI agents
Address the following comment on backend/onyx/federated_connectors/slack/federated_connector.py at line 187:

<comment>OAuthResult is assigning the user token to access_token, which mislabels tokens and will break downstream auth flows. Populate user_token with the user token and keep access_token for the bot token returned at the top level of the OAuth response.</comment>

<file context>
@@ -183,7 +184,7 @@ def callback(self, callback_data: dict[str, Any], redirect_uri: str) -&gt; OAuthRes
 
         return OAuthResult(
-            access_token=access_token,
+            access_token=user_token,  # Bot token for bot operations
             token_type=token_type,
             scope=scope,
</file context>
Fix with Cubic

@jessicasingh7
Copy link
Contributor Author

Slack bot- fed search

@jessicasingh7 jessicasingh7 merged commit 77466e1 into main Sep 23, 2025
57 of 60 checks passed
@jessicasingh7 jessicasingh7 deleted the jessica/DAN-2328-support-for-slack-search branch September 23, 2025 02:19
brijsiyag-meesho pushed a commit to brijsiyag-meesho/onyx that referenced this pull request Sep 23, 2025
Co-authored-by: Jessica Singh <jessicasingh@Mac.attlocal.net>
Co-authored-by: Jessica Singh <jessicasingh@mac.lan>
razvanMiu pushed a commit to eea/danswer that referenced this pull request Oct 16, 2025
Co-authored-by: Jessica Singh <jessicasingh@Mac.attlocal.net>
Co-authored-by: Jessica Singh <jessicasingh@mac.lan>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants