-
Notifications
You must be signed in to change notification settings - Fork 63
feat: Improved WebSelect Handling: Added Combobox Support, Enhanced Element Detection, and Smarter Option Matching #679
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
1) Added Webselect for Input/Dropdown Combobox PR for issue Second-Hand-Friends#677 Fixes: 2) Finding Special Attributes Elements can fail because they are currently only selected using the name="..." attributes of the HTML elements. If it fails, ALSO fallback-handle selecting special attribute HTML elements by ID instead / additionally. (For example the "brands" Input/Combobox for Mens Shoes... 3) When trying to select a Value in a <select>, it does not only rely on the actual Option value (xxx in the example <options value="xxx">yyy</...>) but instead also on the displayed HTML value (i.e. yyy in above example). This improves UX because the User doesnt have to check the actual "value" of the Option but instead can check the displayed Value from the Browsers Display directly.
WalkthroughName-based XPath lookup now falls back to an ID-based XPath before failing. Text inputs with role="combobox" are routed to a new combobox selection path. Changes
Sequence Diagram(s)sequenceDiagram
participant Caller
participant Init as __init__.py
participant Mixin as web_scraping_mixin.py
participant DOM
Caller->>Init: set_special_attribute(name, value)
Init->>Init: XPath lookup by name (with timeout)
alt found by name
Init->>Init: inspect element type & role
else not found by name
Init->>Init: XPath lookup by ID (fallback)
alt found by ID
Init->>Init: inspect element type & role
else not found by ID
Init-->>Caller: raise TimeoutError (cannot find by Name or ID)
end
end
alt role == "combobox"
Init->>Mixin: web_select_combobox(selector, value)
Mixin->>DOM: locate input, clear input, send keys (typed value)
Mixin->>DOM: read aria-controls -> find dropdown
Mixin->>DOM: normalize option texts -> click matching <li> (scrollIntoView)
Mixin-->>Init: return dropdown element
else role == "select"
Init->>Mixin: web_select(selector, value)
Mixin->>DOM: try match by option.value -> else by trimmed visible text -> click option
Mixin-->>Init: success / TimeoutError if no match
else checkbox/text
Init->>DOM: set value directly
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (1)
🧰 Additional context used📓 Path-based instructions (1)src/kleinanzeigen_bot/**/*.py⚙️ CodeRabbit configuration file
Files:
🧬 Code graph analysis (1)src/kleinanzeigen_bot/__init__.py (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
🔇 Additional comments (7)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
src/kleinanzeigen_bot/__init__.py(2 hunks)src/kleinanzeigen_bot/utils/web_scraping_mixin.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CONTRIBUTING.md)
**/*.py: All Python files must start with SPDX license headers (FileCopyrightText, License-Identifier, ArtifactOfProjectHomePage)
Group imports as: standard library, third-party, then local, with blank lines between groups
Always use type hints for function parameters and return values
Use Any for complex types when necessary
Use Final for constants
Use typing.cast() when the type checker needs help
Use docstrings for complex functions and classes that need explanation; include examples for complex functions
Use comments to explain code logic and reasoning, especially for algorithms, business logic, and non-obvious decisions
Use structured logging via loggers.get_logger(), include context, and choose appropriate log levels
Use specific exception types and include meaningful error messages
Use pyright: ignore[reportAttributeAccessIssue] comments only for known type-checker issues
Files:
src/kleinanzeigen_bot/__init__.pysrc/kleinanzeigen_bot/utils/web_scraping_mixin.py
src/kleinanzeigen_bot/**/*.py
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Use absolute imports for project modules (e.g., from kleinanzeigen_bot import KleinanzeigenBot)
Files:
src/kleinanzeigen_bot/__init__.pysrc/kleinanzeigen_bot/utils/web_scraping_mixin.py
⚙️ CodeRabbit configuration file
src/kleinanzeigen_bot/**/*.py: CRITICAL RULES FOR KLEINANZEIGEN BOT:
- ALL code, comments, and text MUST be in English
- User-facing messages MUST use translation system (_()) function
- NEVER access live website in tests (bot detection risk)
- Use WebScrapingMixin for browser automation
- Handle TimeoutError for all web operations
- Use ensure() for critical validations
- Don't add features until explicitly needed
- Keep solutions simple and straightforward
- Use async/await for I/O operations
- Follow Pydantic model patterns
- Use proper error handling and logging
- Test business logic separately from web scraping
- Include SPDX license headers on all Python files
- Use type hints for all function parameters and return values
- Use structured logging with context
Files:
src/kleinanzeigen_bot/__init__.pysrc/kleinanzeigen_bot/utils/web_scraping_mixin.py
**/__init__.py
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Add module docstrings for packages (and complex modules) to document purpose and contents
Files:
src/kleinanzeigen_bot/__init__.py
src/**/*.py
📄 CodeRabbit inference engine (CONTRIBUTING.md)
src/**/*.py: All user-facing output (logs, prints, CLI help) must be written in English
Use the translation system for all output; never hardcode German or other non-English strings in code
Files:
src/kleinanzeigen_bot/__init__.pysrc/kleinanzeigen_bot/utils/web_scraping_mixin.py
🧬 Code graph analysis (1)
src/kleinanzeigen_bot/__init__.py (1)
src/kleinanzeigen_bot/utils/web_scraping_mixin.py (3)
web_find(655-698)By(71-77)web_select_combobox(884-921)
🪛 Ruff (0.14.4)
src/kleinanzeigen_bot/__init__.py
1068-1068: Avoid specifying long messages outside the exception class
(TRY003)
src/kleinanzeigen_bot/utils/web_scraping_mixin.py
884-884: Dynamically typed expressions (typing.Any) are disallowed in selected_value
(ANN401)
901-901: Avoid specifying long messages outside the exception class
(TRY003)
918-918: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: build
- GitHub Check: build (windows-latest, 3.10, false)
- GitHub Check: build (macos-latest, 3.14, true)
- GitHub Check: build (windows-latest, 3.14, true)
- GitHub Check: build (macos-15-intel, 3.10, false)
- GitHub Check: build (ubuntu-latest, 3.10, false)
- GitHub Check: build (macos-15-intel, 3.14, true)
🔇 Additional comments (3)
src/kleinanzeigen_bot/utils/web_scraping_mixin.py (1)
851-882: Good security practice with JSON escaping for JavaScript injection.The updated logic correctly tries matching by value first, then falls back to displayed text. Using
json.dumps()for safe JavaScript string escaping is the right approach to prevent injection issues.src/kleinanzeigen_bot/__init__.py (2)
1062-1068: Enhanced element detection with ID fallback is a good improvement.The fallback logic correctly attempts to find elements by ID when the XPath name search fails, improving robustness for elements without name attributes (like the "brands" input mentioned in the PR).
However, the static analysis tool flags the long error message at line 1068. Consider extracting this to a custom exception type or a constant for better maintainability, though this is a minor concern.
As per coding guidelines: "Use specific exception types and include meaningful error messages."
1078-1080: Clean integration of combobox handling.The new branch correctly detects text inputs with
role="combobox"and routes them to the appropriateweb_select_comboboxmethod. This aligns well with the PR objectives to support input/dropdown combobox interactions.
1cu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Combobox handling is heading the right way, but I’d block this until a couple of issues are addressed:
- Translation gate (tests/unit/test_translations.py) –
web_select_comboboxadds two plain-English log strings (“Combobox input field does not have 'aria-controls'…”, “No<li>options found…”). Because they’re not wrapped in_()/
absent fromresources/translations.de.yaml, the translation sweep fails across the entire Build matrix. Wrap the strings in_()and add the German entries (plus optionaltranslations.en.yamloverrides), then rerunpdm run utest. - Exception mismatch (src/kleinanzeigen_bot/init.py:1059-1086 & utils/web_scraping_mixin.py:884-921) – the new helper raises
AssertionErrorwhen the combobox can’t be resolved, but the caller only catchesTimeoutError. That
bubbles up raw assertions without the usual “failed to set special attribute” logging. Either convert the assertion sites toraise TimeoutError(...)(or useensure(..., exc=TimeoutError)), or broaden the caller’sexceptso combobox
failures produce consistent diagnostics. - Selection correctness (utils/web_scraping_mixin.py:904-917) – after typing the filter string, the helper blindly clicks the first
<li>. Kleinanzeigen comboboxes commonly return multiple suggestions even after full input, so this
can silently choose the wrong value. Please iterate the<li>nodes (or inspect theirdata-*attributes) and only click when the visible text equalsselected_value; otherwise raise. - Missing tests – the PR changes both the
<select>JS helper (label fallback) and introduces an entirely new combobox workflow, yet no tests were updated. Add targeted unit/coroutine tests so regressions in the JS selection logic
andweb_select_comboboxbehavior are caught before hitting CI.
Once these are fixed (translations, exception consistency, true matching, and tests), I’ll gladly take another look.
1) TimeoutError Exception instead of AssertionError 2) Added translations _() and changed translations.de.yaml 3) Improved <li> Selection for Combobox based on actual displayed value Ran PDM Lint / and PDM Formatting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)
src/kleinanzeigen_bot/resources/translations.de.yamlis excluded by none and included by none
📒 Files selected for processing (2)
src/kleinanzeigen_bot/__init__.py(2 hunks)src/kleinanzeigen_bot/utils/web_scraping_mixin.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CONTRIBUTING.md)
**/*.py: All Python files must start with SPDX license headers (FileCopyrightText, License-Identifier, ArtifactOfProjectHomePage)
Group imports as: standard library, third-party, then local, with blank lines between groups
Always use type hints for function parameters and return values
Use Any for complex types when necessary
Use Final for constants
Use typing.cast() when the type checker needs help
Use docstrings for complex functions and classes that need explanation; include examples for complex functions
Use comments to explain code logic and reasoning, especially for algorithms, business logic, and non-obvious decisions
Use structured logging via loggers.get_logger(), include context, and choose appropriate log levels
Use specific exception types and include meaningful error messages
Use pyright: ignore[reportAttributeAccessIssue] comments only for known type-checker issues
Files:
src/kleinanzeigen_bot/utils/web_scraping_mixin.pysrc/kleinanzeigen_bot/__init__.py
src/kleinanzeigen_bot/**/*.py
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Use absolute imports for project modules (e.g., from kleinanzeigen_bot import KleinanzeigenBot)
Files:
src/kleinanzeigen_bot/utils/web_scraping_mixin.pysrc/kleinanzeigen_bot/__init__.py
⚙️ CodeRabbit configuration file
src/kleinanzeigen_bot/**/*.py: CRITICAL RULES FOR KLEINANZEIGEN BOT:
- ALL code, comments, and text MUST be in English
- User-facing messages MUST use translation system (_()) function
- NEVER access live website in tests (bot detection risk)
- Use WebScrapingMixin for browser automation
- Handle TimeoutError for all web operations
- Use ensure() for critical validations
- Don't add features until explicitly needed
- Keep solutions simple and straightforward
- Use async/await for I/O operations
- Follow Pydantic model patterns
- Use proper error handling and logging
- Test business logic separately from web scraping
- Include SPDX license headers on all Python files
- Use type hints for all function parameters and return values
- Use structured logging with context
Files:
src/kleinanzeigen_bot/utils/web_scraping_mixin.pysrc/kleinanzeigen_bot/__init__.py
src/**/*.py
📄 CodeRabbit inference engine (CONTRIBUTING.md)
src/**/*.py: All user-facing output (logs, prints, CLI help) must be written in English
Use the translation system for all output; never hardcode German or other non-English strings in code
Files:
src/kleinanzeigen_bot/utils/web_scraping_mixin.pysrc/kleinanzeigen_bot/__init__.py
**/__init__.py
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Add module docstrings for packages (and complex modules) to document purpose and contents
Files:
src/kleinanzeigen_bot/__init__.py
🧬 Code graph analysis (1)
src/kleinanzeigen_bot/__init__.py (1)
src/kleinanzeigen_bot/utils/web_scraping_mixin.py (3)
web_find(655-698)By(71-77)web_select_combobox(884-944)
🪛 Ruff (0.14.4)
src/kleinanzeigen_bot/utils/web_scraping_mixin.py
884-884: Dynamically typed expressions (typing.Any) are disallowed in selected_value
(ANN401)
src/kleinanzeigen_bot/__init__.py
1068-1068: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: analyze
- GitHub Check: build (macos-latest, 3.14, true)
- GitHub Check: build (windows-latest, 3.14, true)
- GitHub Check: build (ubuntu-latest, 3.14, true)
- GitHub Check: build (windows-latest, 3.10, false)
- GitHub Check: build (macos-latest, 3.10, false)
- GitHub Check: build (macos-15-intel, 3.10, false)
- GitHub Check: build (macos-15-intel, 3.14, true)
- GitHub Check: build (ubuntu-latest, 3.10, false)
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #679 +/- ##
==========================================
+ Coverage 84.24% 84.67% +0.42%
==========================================
Files 20 20
Lines 2019 2042 +23
Branches 393 396 +3
==========================================
+ Hits 1701 1729 +28
+ Misses 182 177 -5
Partials 136 136
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
|
Added new unit tests covering the updated combobox selection. The patch below can be applied locally if pushes are disabled. From be37fd8516baf2db1ea508aa35041450160464f3 Mon Sep 17 00:00:00 2001
From: Jens <[email protected]>
Date: Fri, 14 Nov 2025 14:19:12 +0100
Subject: [PATCH] test: cover combobox selection
---
tests/unit/test_web_scraping_mixin.py | 56 +++++++++++++++++++++++++++
1 file changed, 56 insertions(+)
diff --git a/tests/unit/test_web_scraping_mixin.py b/tests/unit/test_web_scraping_mixin.py
index d205a3d..989e0a3 100644
--- a/tests/unit/test_web_scraping_mixin.py
+++ b/tests/unit/test_web_scraping_mixin.py
@@ -156,6 +156,62 @@ class TestWebScrapingErrorHandling:
with pytest.raises(Exception, match = "Cannot clear input"):
await web_scraper.web_input(By.ID, "test-id", "test text")
+ @pytest.mark.asyncio
+ async def test_web_select_combobox_missing_dropdown_options(self, web_scraper:WebScrapingMixin) -> None:
+ """Test combobox selection when aria-controls attribute is missing."""
+ input_field = AsyncMock(spec = Element)
+ input_field.attrs = {}
+ input_field.clear_input = AsyncMock()
+ input_field.send_keys = AsyncMock()
+ web_scraper.web_find = AsyncMock(return_value = input_field)
+
+ with pytest.raises(TimeoutError, match = "Cannot locate combobox dropdown options."):
+ await web_scraper.web_select_combobox(By.ID, "combo-id", "Option", timeout = 0.1)
+
+ input_field.clear_input.assert_awaited_once()
+ input_field.send_keys.assert_awaited_once_with("Option")
+
+ @pytest.mark.asyncio
+ async def test_web_select_combobox_selects_matching_option(self, web_scraper:WebScrapingMixin) -> None:
+ """Test combobox selection matches a visible <li> option."""
+ input_field = AsyncMock(spec = Element)
+ input_field.attrs = {"aria-controls": "dropdown-id"}
+ input_field.clear_input = AsyncMock()
+ input_field.send_keys = AsyncMock()
+
+ dropdown_elem = AsyncMock(spec = Element)
+ dropdown_elem.apply = AsyncMock(return_value = True)
+
+ web_scraper.web_find = AsyncMock(side_effect = [input_field, dropdown_elem])
+ web_scraper.web_sleep = AsyncMock()
+
+ result = await web_scraper.web_select_combobox(By.ID, "combo-id", "Visible Label")
+
+ assert result is dropdown_elem
+ input_field.clear_input.assert_awaited_once()
+ input_field.send_keys.assert_awaited_once_with("Visible Label")
+ dropdown_elem.apply.assert_awaited_once()
+ assert web_scraper.web_sleep.await_count == 2
+
+ @pytest.mark.asyncio
+ async def test_web_select_combobox_no_matching_option_raises(self, web_scraper:WebScrapingMixin) -> None:
+ """Test combobox selection raises when no <li> matches the entered text."""
+ input_field = AsyncMock(spec = Element)
+ input_field.attrs = {"aria-controls": "dropdown-id"}
+ input_field.clear_input = AsyncMock()
+ input_field.send_keys = AsyncMock()
+
+ dropdown_elem = AsyncMock(spec = Element)
+ dropdown_elem.apply = AsyncMock(return_value = False)
+
+ web_scraper.web_find = AsyncMock(side_effect = [input_field, dropdown_elem])
+ web_scraper.web_sleep = AsyncMock()
+
+ with pytest.raises(TimeoutError, match = "Cannot locate combobox dropdown options."):
+ await web_scraper.web_select_combobox(By.ID, "combo-id", "Missing Label")
+
+ dropdown_elem.apply.assert_awaited_once()
+
@pytest.mark.asyncio
async def test_web_open_timeout(self, web_scraper:WebScrapingMixin, mock_browser:AsyncMock) -> None:
"""Test page load timeout in web_open."""
--
2.50.1 (Apple Git-155) |
|
@1cu So hopefully the PR is complete now and can be merged into the official repo and finally into the release. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
tests/unit/test_web_scraping_mixin.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
tests/**/*.py
⚙️ CodeRabbit configuration file
tests/**/*.py: TESTING RULES:
- NEVER access live website in tests (bot detection risk)
- Use @patch for web operations in tests
- Use test fixtures for browser automation
- Test Pydantic models without web scraping
- Mock all web operations in tests
- Use pytest markers: unit, integration, smoke
- Unit tests: fast, isolated, no external dependencies
- Integration tests: use mocks, test with external dependencies
- Smoke tests: critical path, no mocks, no browser (NOT E2E tests)
- All test code must be in English
- Test observable behavior, not implementation
- Use fakes/dummies instead of mocks in smoke tests
- Focus on minimal health checks, not full user workflows
- Include SPDX license headers
- Use descriptive test names in English
Files:
tests/unit/test_web_scraping_mixin.py
🧬 Code graph analysis (1)
tests/unit/test_web_scraping_mixin.py (1)
src/kleinanzeigen_bot/utils/web_scraping_mixin.py (4)
web_find(655-698)web_select_combobox(884-944)By(71-77)web_sleep(781-785)
🪛 Ruff (0.14.4)
tests/unit/test_web_scraping_mixin.py
168-168: Pattern passed to match= contains metacharacters but is neither escaped nor raw
(RUF043)
210-210: Pattern passed to match= contains metacharacters but is neither escaped nor raw
(RUF043)
🔇 Additional comments (1)
tests/unit/test_web_scraping_mixin.py (1)
174-194: LGTM!This test correctly validates the success path: input field is cleared, text is sent, the dropdown element is located via
aria-controls, the JavaScript selection is applied, andweb_sleepis called twice as expected. The assertion onweb_sleep.await_count == 2properly verifies the production code's behavior.
|
@bjspi Please address the merge conflict and the passing of the timeout. When lints and tests pass I will approve this PR. |
Done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (2)
tests/unit/test_web_scraping_mixin.py (2)
178-181: Escape the period in regex match.Use a raw string so the trailing period is matched literally.
- with pytest.raises(TimeoutError, match="Cannot locate combobox dropdown options."): + with pytest.raises(TimeoutError, match=r"Cannot locate combobox dropdown options\."):
220-222: Escape the period in regex match (again).Apply the same fix here:
- with pytest.raises(TimeoutError, match="Cannot locate combobox dropdown options."): + with pytest.raises(TimeoutError, match=r"Cannot locate combobox dropdown options\."):
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)
src/kleinanzeigen_bot/resources/translations.de.yamlis excluded by none and included by none
📒 Files selected for processing (3)
src/kleinanzeigen_bot/__init__.py(2 hunks)src/kleinanzeigen_bot/utils/web_scraping_mixin.py(1 hunks)tests/unit/test_web_scraping_mixin.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
tests/**/*.py
⚙️ CodeRabbit configuration file
tests/**/*.py: TESTING RULES:
- NEVER access live website in tests (bot detection risk)
- Use @patch for web operations in tests
- Use test fixtures for browser automation
- Test Pydantic models without web scraping
- Mock all web operations in tests
- Use pytest markers: unit, integration, smoke
- Unit tests: fast, isolated, no external dependencies
- Integration tests: use mocks, test with external dependencies
- Smoke tests: critical path, no mocks, no browser (NOT E2E tests)
- All test code must be in English
- Test observable behavior, not implementation
- Use fakes/dummies instead of mocks in smoke tests
- Focus on minimal health checks, not full user workflows
- Include SPDX license headers
- Use descriptive test names in English
Files:
tests/unit/test_web_scraping_mixin.py
src/kleinanzeigen_bot/**/*.py
⚙️ CodeRabbit configuration file
src/kleinanzeigen_bot/**/*.py: CRITICAL RULES FOR KLEINANZEIGEN BOT:
- ALL code, comments, and text MUST be in English
- User-facing messages MUST use translation system (_()) function
- NEVER access live website in tests (bot detection risk)
- Use WebScrapingMixin for browser automation
- Handle TimeoutError for all web operations
- Use ensure() for critical validations
- Don't add features until explicitly needed
- Keep solutions simple and straightforward
- Use async/await for I/O operations
- Follow Pydantic model patterns
- Use proper error handling and logging
- Test business logic separately from web scraping
- Include SPDX license headers on all Python files
- Use type hints for all function parameters and return values
- Use structured logging with context
Files:
src/kleinanzeigen_bot/__init__.pysrc/kleinanzeigen_bot/utils/web_scraping_mixin.py
🧬 Code graph analysis (2)
tests/unit/test_web_scraping_mixin.py (1)
src/kleinanzeigen_bot/utils/web_scraping_mixin.py (4)
web_find(729-745)web_select_combobox(1002-1062)By(75-81)web_sleep(899-903)
src/kleinanzeigen_bot/__init__.py (1)
src/kleinanzeigen_bot/utils/web_scraping_mixin.py (3)
web_find(729-745)By(75-81)web_select_combobox(1002-1062)
🪛 Ruff (0.14.4)
tests/unit/test_web_scraping_mixin.py
178-178: Pattern passed to match= contains metacharacters but is neither escaped nor raw
(RUF043)
220-220: Pattern passed to match= contains metacharacters but is neither escaped nor raw
(RUF043)
src/kleinanzeigen_bot/__init__.py
1077-1077: Avoid specifying long messages outside the exception class
(TRY003)
src/kleinanzeigen_bot/utils/web_scraping_mixin.py
1002-1002: Dynamically typed expressions (typing.Any) are disallowed in selected_value
(ANN401)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: build (macos-15-intel, 3.14, true)
- GitHub Check: build (windows-latest, 3.14, true)
- GitHub Check: build (windows-latest, 3.10, false)
- GitHub Check: build (macos-15-intel, 3.10, false)
- GitHub Check: analyze
🔇 Additional comments (2)
src/kleinanzeigen_bot/__init__.py (1)
1087-1090: Combobox routing looks good.Detection by type="text" and role="combobox" and delegating to web_select_combobox aligns with the new helper.
tests/unit/test_web_scraping_mixin.py (1)
185-205: Async combobox selection test is solid.Covers typing, matching, return value, and pauses; mocks are isolated and fast.
This commit addresses all review comments and CodeRabbit suggestions: - Fix combobox type annotation (Any -> str | int) to resolve ANN401 - Update docstring to accurately describe combobox behavior - Add distinct error messages for different failure scenarios - Honor timeout parameter in web_select (pass to web_find) - Add exception handling to convert JS errors to TimeoutError - Localize error message in __set_special_attributes - Add test for special character handling (quotes, backslashes) - Update translations for all new error messages Changes improve error clarity, type safety, and maintainability. Co-authored-by: Claude <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/kleinanzeigen_bot/utils/web_scraping_mixin.py (1)
964-1004: web_select selection logic is solid; tighten types and docstring.The new two-phase JS selection (by
option.valuethen by trimmed label), combined with JSON escaping and normalization to a localized TimeoutError, is a good, maintainable improvement and correctly honors the caller’s timeout.Two small polish points:
selected_value: Anycan likely be narrowed tostr | int(mirroringweb_input/web_select_comboboxand avoiding ANN401-style “Any” usage).- The docstring still advertises
UnexpectedTagNameException, which is never raised here; consider either removing that or explicitly checking the tag name and raising a custom error.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)
src/kleinanzeigen_bot/resources/translations.de.yamlis excluded by none and included by none
📒 Files selected for processing (3)
src/kleinanzeigen_bot/__init__.py(2 hunks)src/kleinanzeigen_bot/utils/web_scraping_mixin.py(1 hunks)tests/unit/test_web_scraping_mixin.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
src/kleinanzeigen_bot/**/*.py
⚙️ CodeRabbit configuration file
src/kleinanzeigen_bot/**/*.py: CRITICAL RULES FOR KLEINANZEIGEN BOT:
- ALL code, comments, and text MUST be in English
- User-facing messages MUST use translation system (_()) function
- NEVER access live website in tests (bot detection risk)
- Use WebScrapingMixin for browser automation
- Handle TimeoutError for all web operations
- Use ensure() for critical validations
- Don't add features until explicitly needed
- Keep solutions simple and straightforward
- Use async/await for I/O operations
- Follow Pydantic model patterns
- Use proper error handling and logging
- Test business logic separately from web scraping
- Include SPDX license headers on all Python files
- Use type hints for all function parameters and return values
- Use structured logging with context
Files:
src/kleinanzeigen_bot/__init__.pysrc/kleinanzeigen_bot/utils/web_scraping_mixin.py
tests/**/*.py
⚙️ CodeRabbit configuration file
tests/**/*.py: TESTING RULES:
- NEVER access live website in tests (bot detection risk)
- Use @patch for web operations in tests
- Use test fixtures for browser automation
- Test Pydantic models without web scraping
- Mock all web operations in tests
- Use pytest markers: unit, integration, smoke
- Unit tests: fast, isolated, no external dependencies
- Integration tests: use mocks, test with external dependencies
- Smoke tests: critical path, no mocks, no browser (NOT E2E tests)
- All test code must be in English
- Test observable behavior, not implementation
- Use fakes/dummies instead of mocks in smoke tests
- Focus on minimal health checks, not full user workflows
- Include SPDX license headers
- Use descriptive test names in English
Files:
tests/unit/test_web_scraping_mixin.py
🧬 Code graph analysis (2)
src/kleinanzeigen_bot/__init__.py (1)
src/kleinanzeigen_bot/utils/web_scraping_mixin.py (3)
web_find(729-745)By(75-81)web_select_combobox(1006-1070)
tests/unit/test_web_scraping_mixin.py (1)
src/kleinanzeigen_bot/utils/web_scraping_mixin.py (4)
web_find(729-745)web_select_combobox(1006-1070)By(75-81)web_sleep(899-903)
🔇 Additional comments (1)
src/kleinanzeigen_bot/utils/web_scraping_mixin.py (1)
1006-1070: web_select_combobox implementation matches requirements and tests.
- Respects optional timeout, reusing the timeout for both input and dropdown lookups.
- Handles missing
aria-controlsand no-matching-- cases with concise, localized TimeoutErrors and appropriate logging.
- Uses JSON-escaped value and a normalization function in JS to robustly match displayed labels (whitespace/case tolerant), and returns the dropdown element after a second sleep.
No changes needed here from my side.
Addresses final review feedback: - Localize all error messages in __set_special_attributes using _() - Add web_sleep assertions to combobox tests for consistency - Add tests for web_select to improve patch coverage: - test_web_select_by_value: successful option selection - test_web_select_raises_on_missing_option: error handling These changes improve i18n compliance and increase test coverage from 69.23% to ensure proper exception handling in web_select. All 527 tests passing (+2 new tests).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
src/kleinanzeigen_bot/__init__.py(1 hunks)tests/unit/test_web_scraping_mixin.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
src/kleinanzeigen_bot/**/*.py
⚙️ CodeRabbit configuration file
src/kleinanzeigen_bot/**/*.py: CRITICAL RULES FOR KLEINANZEIGEN BOT:
- ALL code, comments, and text MUST be in English
- User-facing messages MUST use translation system (_()) function
- NEVER access live website in tests (bot detection risk)
- Use WebScrapingMixin for browser automation
- Handle TimeoutError for all web operations
- Use ensure() for critical validations
- Don't add features until explicitly needed
- Keep solutions simple and straightforward
- Use async/await for I/O operations
- Follow Pydantic model patterns
- Use proper error handling and logging
- Test business logic separately from web scraping
- Include SPDX license headers on all Python files
- Use type hints for all function parameters and return values
- Use structured logging with context
Files:
src/kleinanzeigen_bot/__init__.py
tests/**/*.py
⚙️ CodeRabbit configuration file
tests/**/*.py: TESTING RULES:
- NEVER access live website in tests (bot detection risk)
- Use @patch for web operations in tests
- Use test fixtures for browser automation
- Test Pydantic models without web scraping
- Mock all web operations in tests
- Use pytest markers: unit, integration, smoke
- Unit tests: fast, isolated, no external dependencies
- Integration tests: use mocks, test with external dependencies
- Smoke tests: critical path, no mocks, no browser (NOT E2E tests)
- All test code must be in English
- Test observable behavior, not implementation
- Use fakes/dummies instead of mocks in smoke tests
- Focus on minimal health checks, not full user workflows
- Include SPDX license headers
- Use descriptive test names in English
Files:
tests/unit/test_web_scraping_mixin.py
🧬 Code graph analysis (2)
src/kleinanzeigen_bot/__init__.py (1)
src/kleinanzeigen_bot/utils/web_scraping_mixin.py (6)
web_find(729-745)By(75-81)web_select(956-1004)web_click(636-646)web_select_combobox(1006-1070)web_input(852-863)
tests/unit/test_web_scraping_mixin.py (1)
src/kleinanzeigen_bot/utils/web_scraping_mixin.py (7)
web_find(729-745)web_sleep(899-903)web_select_combobox(1006-1070)By(75-81)web_check(589-634)web_await(560-587)web_select(956-1004)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: build (windows-latest, 3.14, true)
- GitHub Check: build (macos-15-intel, 3.14, true)
- GitHub Check: build (macos-15-intel, 3.10, false)
- GitHub Check: build (macos-latest, 3.10, false)
- GitHub Check: build (ubuntu-latest, 3.14, true)
- GitHub Check: analyze
- GitHub Check: build (ubuntu-latest, 3.10, false)
- GitHub Check: build (macos-latest, 3.14, true)
- GitHub Check: build (windows-latest, 3.10, false)
🔇 Additional comments (2)
src/kleinanzeigen_bot/__init__.py (1)
1068-1077: ID fallback and localized TimeoutError are aligned with guidelinesThe fallback from name-based XPath to ID lookup is straightforward and the raised TimeoutError now uses
_()with a short English message, which matches the project’s translation and simplicity rules. No changes needed here.tests/unit/test_web_scraping_mixin.py (1)
170-281: Combobox and select tests accurately cover the new behaviorsThe new async tests for
web_select_comboboxandweb_select:
- Exercise all key paths (missing
aria-controls, successful option click, no matching<li>, special-character labels, select-by-value success, and missing option error).- Correctly stub web operations (
web_find,web_sleep,web_check,web_await,Element.apply) without touching real browsers or networks.- Assert meaningful observables (sleep counts, sent text,
applycalls, and localized TimeoutError messages) while keeping the setup simple.This is a solid, maintainable test addition that mirrors the mixin’s logic well.
The exception handler at line 1094 previously logged "is not of kind radio button" which was misleading since the try block actually attempts select, checkbox, combobox, and text input. Changed to generic message "Failed to set attribute field '%s' via known input types." to accurately reflect the failure condition. Added German translation for the new message.
|
@bjspi Please confirm that the current state is still working for you. |
|
@Heavenfighter since I committed last, can you please approve this PR. |
ℹ️ Description
Added Webselect-Function for Input/Dropdown Combobox
PR for issue/missing feature #677
Fixes / Enhancements
Finding Special Attributes Elements can fail because they are currently only selected using the name="..." attributes of the HTML elements. If it fails, ALSO fallback-handle selecting special attribute HTML elements by ID instead / additionally. (For example the "brands" Input/Combobox for Mens Shoes...
When trying to select a Value in a , it does not only rely on the actual Option value (xxx in the example yyy</...>) but instead also on the displayed HTML value (i.e. yyy in above example). This improves UX because the User doesnt have to check the actual "value" of the Option but instead can check the displayed Value from the Browsers Display directly. Testcases for Webselect_Combobox were not added due to missing knowledge about Async Mocking properly. 📋 Changes Summary ✅ Fixes & Enhancements New WebSelect Functionality Improved Element Detection for Special Attributes Enhanced Option Matching Logic
This improves UX and test robustness — users no longer need to know the exact underlying value, as matching also works with the visible label shown in the browser.
🧩 Result
These updates make dropdown and combobox interactions more intuitive, resilient, and user-friendly across diverse HTML structures.
⚙️ Type of Change
Select the type(s) of change(s) included in this pull request:
✅ Checklist
Before requesting a review, confirm the following:
pdm run test).pdm run format).pdm run lint).By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Summary by CodeRabbit
Bug Fixes
New Features
Tests