-
Notifications
You must be signed in to change notification settings - Fork 2
fix(connector-builder-agents): Improve emoji detection in update_progress_log #122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ress_log - Add emoji library dependency (>=2.15.0,<3.0) to pyproject.toml - Replace limited Unicode range check (0x1F600-0x1F64F) with emoji.is_emoji() - Rename parameter from 'emoji' to 'emoji_char' to avoid shadowing module name - Now properly detects all emoji types including 📍, ✅, ❌, 🤝,⚠️ , 🛠️, etc. Fixes #120 Co-Authored-By: AJ Steers <[email protected]>
Original prompt from AJ Steers
|
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This Branch via MCPTo test the changes in this specific branch with an MCP client like Claude Desktop, use the following configuration: {
"mcpServers": {
"connector-builder-mcp-dev": {
"command": "uvx",
"args": ["--from", "git+https://github.com/airbytehq/connector-builder-mcp.git@devin/1760120327-improve-emoji-detection", "connector-builder-mcp"]
}
}
} Testing This Branch via CLIYou can test this version of the MCP Server using the following CLI snippet: # Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/connector-builder-mcp.git@devin/1760120327-improve-emoji-detection#egg=airbyte-connector-builder-mcp' --help PR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
|
📝 WalkthroughWalkthroughAdds the Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Caller
participant Tools as tools.update_progress_log
participant EmojiLib as emoji
participant State as SessionState
Caller->>Tools: update_progress_log(message, session_state, emoji_char?)
alt emoji_char provided
Tools->>EmojiLib: is_emoji(emoji_char[0])
alt first char is emoji
Tools->>EmojiLib: maybe is_emoji(emoji_char[:2]) (check 2-char emoji)
Tools->>Tools: derive emoji_char (1 or 2 chars) and trim from message
Tools->>State: Append log using derived emoji_char
else first char not emoji
Tools->>State: Append log using default "📍" or provided emoji_char
end
else no emoji_char
Tools->>State: Append log with default "📍"
end
Tools-->>Caller: return
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
🔇 Additional comments (3)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
connector_builder_agents/src/tools.py (2)
202-204
: Consider usingemoji.emoji_list()
for multi-character emojis.The current implementation only extracts
message[0]
, which doesn't fully handle multi-character emoji sequences such as:
- Emojis with variation selectors (e.g.,
⚠️ is 2 characters)- Flag emojis (e.g., 🇺🇸 is 2 characters)
- Emojis with skin tone modifiers
Consider this refactor to properly extract complete emoji sequences:
- # Detect if the first character of message is an emoji: - if message and emoji.is_emoji(message[0]): - emoji_char, message = message[0], message[1:].lstrip() + # Detect if message starts with an emoji: + if message: + emojis = emoji.emoji_list(message) + if emojis and emojis[0]['match_start'] == 0: + emoji_char = emojis[0]['emoji'] + message = message[emojis[0]['match_end']:].lstrip()This ensures that variation selectors and compound emoji sequences are preserved in their entirety.
202-207
: Consider clarifying emoji precedence logic.The current implementation prioritizes emoji extracted from the message over the
emoji_char
parameter. While this works for current callers (none passemoji_char
explicitly), it might be counterintuitive for future usage.Consider making the parameter take precedence:
# Detect if message starts with an emoji: - if message and emoji.is_emoji(message[0]): + if emoji_char is None and message and emoji.is_emoji(message[0]): emoji_char, message = message[0], message[1:].lstrip() emoji_char = emoji_char or "📍"This ensures explicit parameter values aren't silently overridden, making the function behavior more predictable.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
connector_builder_agents/pyproject.toml
(1 hunks)connector_builder_agents/src/tools.py
(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pytest (Fast)
🔇 Additional comments (3)
connector_builder_agents/src/tools.py (2)
9-9
: LGTM!The import statement is clean and straightforward.
194-194
: LGTM! Good practice to avoid module shadowing.Renaming the parameter from
emoji
toemoji_char
prevents shadowing the importedemoji
module, improving code clarity. Since no callers in the codebase use this parameter with keyword arguments, this is a safe refactor.connector_builder_agents/pyproject.toml (1)
20-20
: Approve emoji dependency: version 2.15.0 is the latest release and has no known security vulnerabilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
connector_builder_agents/src/tools.py
(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Test Connector Build (PokemonTGG)
- GitHub Check: Test Connector Build (JSONPlaceholder)
- GitHub Check: Run Evals (Single Connector)
🔇 Additional comments (2)
connector_builder_agents/src/tools.py (2)
9-9
: LGTM: Clean import of emoji library.The import is properly placed with other third-party imports and supports the improved emoji detection functionality.
194-194
: LGTM: Parameter rename avoids module shadowing.Renaming the parameter from
emoji
toemoji_char
is a good practice that prevents shadowing the importedemoji
module and improves code clarity.
- Check if first 1 or 2 characters are emojis and keep them together - Properly strip emoji characters from message before logging - Addresses feedback from @aaronsteers on PR #122 Co-Authored-By: AJ Steers <[email protected]>
fix(connector-builder-agents): Improve emoji detection in update_progress_log
Summary
Replaces the limited Unicode range check for emoji detection in⚠️ , 🛠️, and others.
update_progress_log
function with the robustemoji
library. The previous implementation only detected emojis in the range 0x1F600-0x1F64F (emoticons block), missing many commonly used emojis throughout the codebase including 📍, ✅, ❌, 🤝,Changes:
emoji>=2.15.0,<3.0
dependency topyproject.toml
emoji
module intools.py
ord(message[0]) in range(0x1F600, 0x1F64F)
withemoji.is_emoji(message[0])
emoji
toemoji_char
to avoid module name shadowingReview & Testing Checklist for Human
Notes
Link to Devin run: https://app.devin.ai/sessions/511dd8b4a1fb489fb0d8cca4af4c1b5f
Requested by: @aaronsteers
Summary by CodeRabbit
Bug Fixes
Chores