@aaronsteers aaronsteers commented Oct 10, 2025

fix(connector-builder-agents): Improve emoji detection in update_progress_log

Summary

Replaces the limited Unicode range check for emoji detection in the update_progress_log function with the emoji library. The previous implementation only detected emojis in the range 0x1F600-0x1F64F (the Emoticons block), so it missed many emojis used throughout the codebase, including 📍, ✅, ❌, 🤝, ⚠️, 🛠️, and others.

Changes:

  • Added emoji>=2.15.0,<3.0 dependency to pyproject.toml
  • Imported emoji module in tools.py
  • Replaced ord(message[0]) in range(0x1F600, 0x1F64F) with emoji.is_emoji(message[0])
  • Renamed parameter from emoji to emoji_char to avoid module name shadowing
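
For illustration, here is a minimal stdlib-only sketch of why the old check missed these emojis. The helper name old_is_emoji is hypothetical; only the range(0x1F600, 0x1F64F) expression is taken from the change described above.

```python
# Hypothetical reproduction of the previous check. Only the range
# expression comes from the PR description; the function name is made up.
def old_is_emoji(char: str) -> bool:
    return ord(char) in range(0x1F600, 0x1F64F)


# Base code points only (no variation selectors), so each sample is one character.
samples = [
    "\U0001F600",  # 😀 grinning face
    "\U0001F601",  # 😁 beaming face
    "\U0001F4CD",  # 📍 round pushpin
    "\u2705",      # ✅ check mark button
    "\u274C",      # ❌ cross mark
    "\U0001F91D",  # 🤝 handshake
    "\u26A0",      # ⚠ warning sign
    "\U0001F6E0",  # 🛠 hammer and wrench
    "\u23F3",      # ⏳ hourglass
]

detected = [s for s in samples if old_is_emoji(s)]
missed = [s for s in samples if not old_is_emoji(s)]

for s in samples:
    print(f"U+{ord(s):04X} {s} detected={old_is_emoji(s)}")
```

Because range() excludes its end value, the old expression also skipped U+1F64F (🙏), the last code point of the Emoticons block.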

Review & Testing Checklist for Human

  • Test emoji detection with various emoji types: Verify that emojis like 📍, ✅, ❌, 🤝, ⚠️, 🛠️, ⏳ are now properly detected and extracted from log messages
  • Check for regressions: Ensure existing emoji detection still works for emojis that were previously detected (😀, 😁, etc.)
  • Verify dependency compatibility: Confirm that the new emoji library doesn't conflict with existing dependencies in the environment

Notes


Link to Devin run: https://app.devin.ai/sessions/511dd8b4a1fb489fb0d8cca4af4c1b5f
Requested by: @aaronsteers

Summary by CodeRabbit

  • Bug Fixes

    • Progress logs now more reliably detect and extract leading emoji characters, trimming them from messages and using a consistent icon (falls back to a pin) for display.
    • Improved handling of user-provided custom emojis to prevent misclassification or missing icons.
  • Chores

    • Added an emoji support library to ensure consistent emoji detection and rendering.

…ress_log

- Add emoji library dependency (>=2.15.0,<3.0) to pyproject.toml
- Replace limited Unicode range check (0x1F600-0x1F64F) with emoji.is_emoji()
- Rename parameter from 'emoji' to 'emoji_char' to avoid shadowing module name
- Now properly detects all emoji types including 📍, ✅, ❌, 🤝, ⚠️, 🛠️, etc.

Fixes #120

Co-Authored-By: AJ Steers <[email protected]>

Original prompt from AJ Steers
@Devin - Can you implement this? <https://github.com/airbytehq/connector-builder-mcp/issues/120>
Thread URL: https://airbytehq-team.slack.com/archives/D089P0UPVT4/p1760119502949799?thread_ts=1760119502.949799

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions github-actions bot added the bug Something isn't working label Oct 10, 2025

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This Branch via MCP

To test the changes in this specific branch with an MCP client like Claude Desktop, use the following configuration:

{
  "mcpServers": {
    "connector-builder-mcp-dev": {
      "command": "uvx",
      "args": ["--from", "git+https://github.com/airbytehq/connector-builder-mcp.git@devin/1760120327-improve-emoji-detection", "connector-builder-mcp"]
    }
  }
}

Testing This Branch via CLI

You can test this version of the MCP Server using the following CLI snippet:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/connector-builder-mcp.git@devin/1760120327-improve-emoji-detection#egg=airbyte-connector-builder-mcp' --help

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poe <command> - Runs any poe command in the uv virtual environment
  • /poe build-connector prompt="Star Wars API" - Run the connector builder using the Star Wars API.

coderabbitai bot commented Oct 10, 2025

📝 Walkthrough

Walkthrough

Adds the emoji package dependency and renames the update_progress_log parameter from emoji to emoji_char. The function now uses emoji.is_emoji to detect and trim a leading emoji (one or two characters) from the message; imports and variable names are adjusted accordingly. No other control-flow changes.

Changes

Cohort / File(s) | Summary of changes
Dependency management: connector_builder_agents/pyproject.toml | Added dependency emoji>=2.15.0,<3.0 under [project] dependencies.
Progress log emoji handling: connector_builder_agents/src/tools.py | Imported emoji. Renamed parameter emoji → emoji_char. Use emoji.is_emoji to detect leading emoji (taking one or two chars), derive emoji_char from the leading emoji when present, trim it from the message, and default to "📍" when none supplied/valid. Updated log construction to use emoji_char.
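
The one-or-two-character detection summarized above could look roughly like the following sketch. It is not the merged code: split_leading_emoji and the stand-in predicate _stub_is_emoji are hypothetical names, and trying the two-character prefix before the one-character prefix is an assumption. The predicate is injectable so the sketch runs without third-party packages.

```python
from typing import Callable, Optional, Tuple

# Stand-in for emoji.is_emoji so this sketch has no third-party dependency.
# The set below is illustrative only, not the library's data.
_KNOWN = {
    "\U0001F4CD", "\u2705", "\u274C", "\U0001F91D", "\u23F3",
    "\u26A0", "\u26A0\uFE0F",          # warning sign, without and with VS16
    "\U0001F6E0", "\U0001F6E0\uFE0F",  # hammer and wrench, without and with VS16
}


def _stub_is_emoji(text: str) -> bool:
    return text in _KNOWN


def split_leading_emoji(
    message: str,
    is_emoji: Callable[[str], bool] = _stub_is_emoji,
) -> Tuple[Optional[str], str]:
    """Return (leading emoji or None, remaining message)."""
    # Assumption: check the two-character prefix first so that a base
    # character and its variation selector stay together.
    for width in (2, 1):
        prefix = message[:width]
        if len(prefix) == width and is_emoji(prefix):
            return prefix, message[width:].lstrip()
    return None, message


print(split_leading_emoji("\u26A0\uFE0F Rate limit reached"))
print(split_leading_emoji("\u2705 Stream validated"))
print(split_leading_emoji("No emoji here"))
```

With the emoji package installed, emoji.is_emoji can be passed as the is_emoji argument in place of the stub.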

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Caller
  participant Tools as tools.update_progress_log
  participant EmojiLib as emoji
  participant State as SessionState

  Caller->>Tools: update_progress_log(message, session_state, emoji_char?)
  alt emoji_char provided
    Tools->>EmojiLib: is_emoji(emoji_char[0])
    alt first char is emoji
      Tools->>EmojiLib: maybe is_emoji(emoji_char[:2]) (check 2-char emoji)
      Tools->>Tools: derive emoji_char (1 or 2 chars) and trim from message
      Tools->>State: Append log using derived emoji_char
    else first char not emoji
      Tools->>State: Append log using default "📍" or provided emoji_char
    end
  else no emoji_char
    Tools->>State: Append log with default "📍"
  end
  Tools-->>Caller: return

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check | ✅ Passed | The title concisely describes the primary change by indicating a fix to improve emoji detection in the update_progress_log function, uses the conventional prefix, and omits irrelevant details, making it clear and specific for reviewers.
Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 222c253 and 051121f.

📒 Files selected for processing (1)
  • connector_builder_agents/src/tools.py (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run Evals (Single Connector)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Test Connector Build (PokemonTGG)
  • GitHub Check: Test Connector Build (JSONPlaceholder)
🔇 Additional comments (3)
connector_builder_agents/src/tools.py (3)

9-9: LGTM!

The import of the emoji library is correctly placed and necessary for the new emoji detection logic.


194-194: LGTM!

Renaming the parameter from emoji to emoji_char correctly avoids shadowing the imported emoji module.


209-210: LGTM!

The fallback logic and usage of emoji_char in the update string are correct.


Comment @coderabbitai help to get the list of available commands and usage tips.

github-actions bot commented Oct 10, 2025

PyTest Results (Fast)

0 tests  ±0   0 ✅ ±0   0s ⏱️ ±0s
0 suites ±0   0 💤 ±0 
0 files   ±0   0 ❌ ±0 

Results for commit 051121f. ± Comparison against base commit 5fcf3d9.

♻️ This comment has been updated with latest results.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
connector_builder_agents/src/tools.py (2)

202-204: Consider using emoji.emoji_list() for multi-character emojis.

The current implementation only extracts message[0], which doesn't fully handle multi-character emoji sequences such as:

  • Emojis with variation selectors (e.g., ⚠️ is 2 characters)
  • Flag emojis (e.g., 🇺🇸 is 2 characters)
  • Emojis with skin tone modifiers

Consider this refactor to properly extract complete emoji sequences:

-    # Detect if the first character of message is an emoji:
-    if message and emoji.is_emoji(message[0]):
-        emoji_char, message = message[0], message[1:].lstrip()
+    # Detect if message starts with an emoji:
+    if message:
+        emojis = emoji.emoji_list(message)
+        if emojis and emojis[0]['match_start'] == 0:
+            emoji_char = emojis[0]['emoji']
+            message = message[emojis[0]['match_end']:].lstrip()

This ensures that variation selectors and compound emoji sequences are preserved in their entirety.
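
To make the multi-character point concrete, the stdlib-only sketch below shows that each of the three cases listed above is two Python characters long, so message[0] alone returns only part of the emoji. The sample strings are illustrative choices, not taken from the codebase.

```python
import unicodedata

# Illustrative samples, written as escapes so the code points are explicit.
samples = {
    "warning sign + variation selector": "\u26A0\uFE0F",
    "US flag (two regional indicators)": "\U0001F1FA\U0001F1F8",
    "thumbs up + skin tone modifier": "\U0001F44D\U0001F3FD",
}

for label, text in samples.items():
    # Look up the Unicode name of every code point in the sequence.
    names = [unicodedata.name(ch, "UNKNOWN") for ch in text]
    print(f"{label}: len={len(text)}, code points={names}")
```

Slicing off only the first character leaves the second code point (for example the variation selector) at the start of the remaining message.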


202-207: Consider clarifying emoji precedence logic.

The current implementation prioritizes emoji extracted from the message over the emoji_char parameter. While this works for current callers (none pass emoji_char explicitly), it might be counterintuitive for future usage.

Consider making the parameter take precedence:

     # Detect if message starts with an emoji:
-    if message and emoji.is_emoji(message[0]):
+    if emoji_char is None and message and emoji.is_emoji(message[0]):
         emoji_char, message = message[0], message[1:].lstrip()
 
     emoji_char = emoji_char or "📍"

This ensures explicit parameter values aren't silently overridden, making the function behavior more predictable.
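
A small runnable sketch of the suggested precedence, under stated assumptions: resolve_emoji, DEFAULT_EMOJI, and the stand-in predicate are hypothetical names used here for illustration, and only the first character is inspected, as in the diff above.

```python
from typing import Callable, Optional, Tuple

DEFAULT_EMOJI = "\U0001F4CD"  # 📍, the fallback mentioned in this PR

# Stand-in for emoji.is_emoji so the sketch runs without third-party packages.
_KNOWN = {"\u2705", "\u274C", "\U0001F4CD"}


def resolve_emoji(
    message: str,
    emoji_char: Optional[str] = None,
    is_emoji: Callable[[str], bool] = lambda ch: ch in _KNOWN,
) -> Tuple[str, str]:
    """Return (emoji to display, message text) with the parameter taking precedence."""
    # Only inspect the message when the caller did not pass an emoji.
    if emoji_char is None and message and is_emoji(message[0]):
        emoji_char, message = message[0], message[1:].lstrip()
    return emoji_char or DEFAULT_EMOJI, message


print(resolve_emoji("\u2705 Done"))                       # emoji taken from the message
print(resolve_emoji("\u2705 Done", emoji_char="\u274C"))  # explicit argument wins
print(resolve_emoji("plain text"))                        # falls back to the default
```

One consequence of this variant is that, when an explicit emoji_char is passed, a leading emoji in the message is left in place rather than stripped.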

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 160d5dc and d5d74d0.

📒 Files selected for processing (2)
  • connector_builder_agents/pyproject.toml (1 hunks)
  • connector_builder_agents/src/tools.py (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (3)
connector_builder_agents/src/tools.py (2)

9-9: LGTM!

The import statement is clean and straightforward.


194-194: LGTM! Good practice to avoid module shadowing.

Renaming the parameter from emoji to emoji_char prevents shadowing the imported emoji module, improving code clarity. Since no callers in the codebase use this parameter with keyword arguments, this is a safe refactor.

connector_builder_agents/pyproject.toml (1)

20-20: Approve emoji dependency: version 2.15.0 is the latest release and has no known security vulnerabilities.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d5d74d0 and 222c253.

📒 Files selected for processing (1)
  • connector_builder_agents/src/tools.py (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Test Connector Build (PokemonTGG)
  • GitHub Check: Test Connector Build (JSONPlaceholder)
  • GitHub Check: Run Evals (Single Connector)
🔇 Additional comments (2)
connector_builder_agents/src/tools.py (2)

9-9: LGTM: Clean import of emoji library.

The import is properly placed with other third-party imports and supports the improved emoji detection functionality.


194-194: LGTM: Parameter rename avoids module shadowing.

Renaming the parameter from emoji to emoji_char is a good practice that prevents shadowing the imported emoji module and improves code clarity.

- Check if first 1 or 2 characters are emojis and keep them together
- Properly strip emoji characters from message before logging
- Addresses feedback from @aaronsteers on PR #122

Co-Authored-By: AJ Steers <[email protected]>
@aaronsteers aaronsteers merged commit 20de60b into main Oct 10, 2025
17 checks passed
@aaronsteers aaronsteers deleted the devin/1760120327-improve-emoji-detection branch October 10, 2025 19:38