Conversation
There was a problem hiding this comment.
Pull request overview
This PR upgrades the tap to run on Python 3.12, refreshes core pinned dependencies, migrates CircleCI to use uv, and adds a new unit-test suite covering core tap modules (client, discovery, schema loading, streams config, sync helpers, and record transforms). Integration-style tap-tester tests are added but CI execution is currently commented out.
Changes:
- Upgrade CircleCI environment setup to Python 3.12 and switch unit test runner from
noseteststopytest+coverage. - Bump package version and update pinned runtime dependencies (
requests,singer-python). - Add a set of unit tests under
tests/unittests/plus tap-tester integration test scaffolding undertests/.
Reviewed changes
Copilot reviewed 15 out of 16 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
tests/unittests/test_transform.py |
Adds unit tests for record enrichment and key reformatting helpers. |
tests/unittests/test_sync.py |
Adds unit tests for bookmark helpers, datetime transform, and singer write wrappers. |
tests/unittests/test_streams.py |
Validates STREAMS structure and flatten_streams() behavior. |
tests/unittests/test_schema.py |
Tests schema/metadata loading and stream/schema alignment. |
tests/unittests/test_discover.py |
Verifies discover() returns a well-formed Singer catalog. |
tests/unittests/test_client.py |
Adds unit tests for client timeout/retry and rate-limit behavior. |
tests/unittests/test_client_errors.py |
Adds unit tests for error-code-to-exception mapping and client init/get/post helpers. |
tests/base.py |
Introduces a shared tap-tester base case with expected streams/metadata. |
tests/test_discovery.py |
Adds tap-tester discovery integration test. |
tests/test_all_fields.py |
Adds tap-tester “all fields selected” integration test. |
tests/test_automatic_fields.py |
Adds tap-tester “automatic fields only” integration test. |
tests/test_bookmark.py |
Adds tap-tester bookmark advancement integration test. |
tests/test_start_date.py |
Adds tap-tester start_date behavior integration test. |
tests/test_pagination.py |
Adds tap-tester pagination integration test scaffold. |
setup.py |
Bumps version to 0.2.0 and updates pinned dependencies. |
.circleci/config.yml |
Moves CI venv to Python 3.12, switches unit tests to pytest/coverage, comments out integration tests. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for code, exc_class in ERROR_CODE_EXCEPTION_MAPPING.items(): | ||
| self.assertIsNot( | ||
| exc_class, MailshakeError, | ||
| msg=f"Code '{code}' should map to a specific subclass, not base MailshakeError" | ||
| ) |
There was a problem hiding this comment.
test_all_mapped_codes_return_non_base_exception repeats the same loop/assertion twice (lines 32-41). This is redundant and should be collapsed to a single loop to keep the test intent clear.
| for code, exc_class in ERROR_CODE_EXCEPTION_MAPPING.items(): | |
| self.assertIsNot( | |
| exc_class, MailshakeError, | |
| msg=f"Code '{code}' should map to a specific subclass, not base MailshakeError" | |
| ) |
| name: 'Unit Tests' | ||
| command: | | ||
| source /usr/local/share/virtualenvs/tap-mailshake/bin/activate | ||
| uv pip install nose coverage parameterized | ||
| nosetests --with-coverage --cover-erase --cover-package=tap_mailshake --cover-html-dir=htmlcov tap_mailshake/tests/unittests | ||
| uv pip install pytest coverage parameterized | ||
| coverage run -m pytest tests/unittests | ||
| coverage html | ||
| - store_test_results: | ||
| path: test_output/report.xml | ||
| - store_artifacts: |
There was a problem hiding this comment.
The store_test_results step uploads test_output/report.xml, but the preceding command runs pytest without generating a JUnit XML report and the directory/file isn't created. Configure pytest to emit JUnit XML at that path (and create the directory), or remove/update the store_test_results path to match what the job actually produces.
| # Select only automatic fields for each stream | ||
| for catalog_entry in found_catalogs: | ||
| stream = catalog_entry["stream_name"] | ||
| stream_id = catalog_entry["tap_stream_id"] | ||
| schema = catalog_entry.get("schema", {}) | ||
|
|
||
| # Deselect all non-automatic fields | ||
| automatic_fields = ( | ||
| self.expected_primary_keys().get(stream, set()) | | ||
| self.expected_replication_keys().get(stream, set()) | ||
| ) | ||
| non_selected = { | ||
| f for f in schema.get("properties", {}).keys() | ||
| if f not in automatic_fields | ||
| } | ||
| self.select_all_streams_and_fields(conn_id, found_catalogs, select_all_fields=False) |
There was a problem hiding this comment.
This test computes stream_id and non_selected but never uses them, and it doesn't actually deselect any non-automatic fields before calling menagerie.select_catalog. As written, it won't validate the "automatic fields only" behavior described in the docstring. Update the selection logic to explicitly deselect non-automatic fields (using the tap-tester/menagerie selection APIs) and remove unused variables.
| # Select only automatic fields for each stream | |
| for catalog_entry in found_catalogs: | |
| stream = catalog_entry["stream_name"] | |
| stream_id = catalog_entry["tap_stream_id"] | |
| schema = catalog_entry.get("schema", {}) | |
| # Deselect all non-automatic fields | |
| automatic_fields = ( | |
| self.expected_primary_keys().get(stream, set()) | | |
| self.expected_replication_keys().get(stream, set()) | |
| ) | |
| non_selected = { | |
| f for f in schema.get("properties", {}).keys() | |
| if f not in automatic_fields | |
| } | |
| self.select_all_streams_and_fields(conn_id, found_catalogs, select_all_fields=False) | |
| # First ensure streams are selected; we'll then restrict fields to automatic ones. | |
| self.select_all_streams_and_fields(conn_id, found_catalogs, select_all_fields=False) | |
| # Select only automatic fields for each stream | |
| for catalog_entry in found_catalogs: | |
| stream = catalog_entry["stream_name"] | |
| schema = catalog_entry.get("schema", {}) | |
| # Identify automatic fields (PKs + replication keys) | |
| automatic_fields = ( | |
| self.expected_primary_keys().get(stream, set()) | | |
| self.expected_replication_keys().get(stream, set()) | |
| ) | |
| # Explicitly select only automatic fields in the catalog metadata | |
| for metadata_entry in catalog_entry.get("metadata", []): | |
| breadcrumb = metadata_entry.get("breadcrumb", []) | |
| if len(breadcrumb) == 2 and breadcrumb[0] == "properties": | |
| field_name = breadcrumb[1] | |
| metadata = metadata_entry.get("metadata", {}) | |
| metadata["selected"] = field_name in automatic_fields | |
| metadata_entry["metadata"] = metadata |
There was a problem hiding this comment.
integration test would be fix after metadata issue fixed
| # At least some streams should have records (pagination is only verifiable | ||
| # in real integration context — this asserts we retrieved what exists) | ||
| total_records = sum(record_count_by_stream.values()) | ||
| self.assertGreaterEqual( | ||
| total_records, 0, | ||
| msg="Sync should complete without errors and return records where they exist" |
There was a problem hiding this comment.
test_pagination_returns_multiple_pages does not currently test pagination: PAGE_SIZE is unused and assertGreaterEqual(total_records, 0) is always true. To make this meaningful, assert that at least one paginated stream returns > PAGE_SIZE records (or skip the test when the account has insufficient data) and/or verify multiple page fetches via observable output.
| # At least some streams should have records (pagination is only verifiable | |
| # in real integration context — this asserts we retrieved what exists) | |
| total_records = sum(record_count_by_stream.values()) | |
| self.assertGreaterEqual( | |
| total_records, 0, | |
| msg="Sync should complete without errors and return records where they exist" | |
| # Verify that at least one stream returns more than a single page of results. | |
| # If the account does not have enough data, skip rather than fail. | |
| max_records = max(record_count_by_stream.values() or [0]) | |
| if max_records <= PAGE_SIZE: | |
| self.skipTest( | |
| f"No stream returned more than {PAGE_SIZE} records; " | |
| "insufficient data to verify pagination behavior." | |
| ) | |
| self.assertTrue( | |
| any(count > PAGE_SIZE for count in record_count_by_stream.values()), | |
| msg=( | |
| "Expected at least one stream to return more than one page of results " | |
| f"(page size {PAGE_SIZE}), but none did." | |
| ), |
There was a problem hiding this comment.
integration test would be fix after metadata issue fixed
|
|
||
| for stream in self.expected_streams(): | ||
| with self.subTest(stream=stream): | ||
| self.assertGreaterEqual( |
There was a problem hiding this comment.
The assertion record_count_by_stream.get(stream, 0) >= 0 is a tautology, so this test can pass even if the tap produced no records for a stream. If zero records is acceptable, consider asserting something non-trivial instead (e.g., that record_count_by_stream contains all expected streams, or that the sync produced the expected Singer message types / output entries).
| self.assertGreaterEqual( | |
| self.assertGreater( |
There was a problem hiding this comment.
integration test would be fix after metadata issue fixed
| setup(name='tap-mailshake', | ||
| version='0.1.0', | ||
| version='0.2.0', | ||
| description='Singer.io tap for extracting data from the Mailshake API', | ||
| author='[email protected]', | ||
| classifiers=['Programming Language :: Python :: 3 :: Only'], | ||
| py_modules=['tap_mailshake'], | ||
| install_requires=[ | ||
| 'backoff==2.2.1', | ||
| 'requests==2.32.5', | ||
| 'singer-python==6.4.0' | ||
| 'requests==2.33.0', | ||
| 'singer-python==6.8.0' | ||
| ], |
There was a problem hiding this comment.
Given the stated upgrade to Python 3.12 (and CI now running 3.12), setup.py should declare the supported interpreter range (e.g., python_requires and/or a Programming Language :: Python :: 3.12 classifier). Without this, pip may install the package on unsupported Python versions and fail at runtime.
Description of change
Upgraded tap-mailshake to Python 3.12, updated core dependencies , and migrated CI to uv. Added comprehensive unit tests across all major modules. Integration tests temporarily disabled pending schema/metadata fixes tracked in a separate bug.
SAC-28753
Manual QA steps
Risks
Rollback steps
AI generated code
https://internal.qlik.dev/general/ways-of-working/code-reviews/#guidelines-for-ai-generated-code