Sac 28753/upgrade python dependencies by atttiwari · Pull Request #17 · singer-io/tap-mailshake

atttiwari · 2026-03-31T09:24:06Z

Description of change

Upgraded tap-mailshake to Python 3.12, updated core dependencies , and migrated CI to uv. Added comprehensive unit tests across all major modules. Integration tests temporarily disabled pending schema/metadata fixes tracked in a separate bug.
SAC-28753

Manual QA steps

Discovery: running
Sync: running
Unit Tests: running
Integration test: not running

Risks

Rollback steps

revert this branch

AI generated code

https://internal.qlik.dev/general/ways-of-working/code-reviews/#guidelines-for-ai-generated-code

this PR has been written with the help of GitHub Copilot or another generative AI tool

…ns tests

…ed setup.py version

Copilot

Pull request overview

This PR upgrades the tap to run on Python 3.12, refreshes core pinned dependencies, migrates CircleCI to use uv, and adds a new unit-test suite covering core tap modules (client, discovery, schema loading, streams config, sync helpers, and record transforms). Integration-style tap-tester tests are added but CI execution is currently commented out.

Changes:

Upgrade CircleCI environment setup to Python 3.12 and switch unit test runner from nosetests to pytest + coverage.
Bump package version and update pinned runtime dependencies (requests, singer-python).
Add a set of unit tests under tests/unittests/ plus tap-tester integration test scaffolding under tests/.

Reviewed changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
`tests/unittests/test_transform.py`	Adds unit tests for record enrichment and key reformatting helpers.
`tests/unittests/test_sync.py`	Adds unit tests for bookmark helpers, datetime transform, and singer write wrappers.
`tests/unittests/test_streams.py`	Validates `STREAMS` structure and `flatten_streams()` behavior.
`tests/unittests/test_schema.py`	Tests schema/metadata loading and stream/schema alignment.
`tests/unittests/test_discover.py`	Verifies `discover()` returns a well-formed Singer catalog.
`tests/unittests/test_client.py`	Adds unit tests for client timeout/retry and rate-limit behavior.
`tests/unittests/test_client_errors.py`	Adds unit tests for error-code-to-exception mapping and client init/get/post helpers.
`tests/base.py`	Introduces a shared tap-tester base case with expected streams/metadata.
`tests/test_discovery.py`	Adds tap-tester discovery integration test.
`tests/test_all_fields.py`	Adds tap-tester “all fields selected” integration test.
`tests/test_automatic_fields.py`	Adds tap-tester “automatic fields only” integration test.
`tests/test_bookmark.py`	Adds tap-tester bookmark advancement integration test.
`tests/test_start_date.py`	Adds tap-tester start_date behavior integration test.
`tests/test_pagination.py`	Adds tap-tester pagination integration test scaffold.
`setup.py`	Bumps version to `0.2.0` and updates pinned dependencies.
`.circleci/config.yml`	Moves CI venv to Python 3.12, switches unit tests to pytest/coverage, comments out integration tests.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-31T09:29:49Z

tests/unittests/test_client_errors.py

+        for code, exc_class in ERROR_CODE_EXCEPTION_MAPPING.items():
+            self.assertIsNot(
+                exc_class, MailshakeError,
+                msg=f"Code '{code}' should map to a specific subclass, not base MailshakeError"
+            )


test_all_mapped_codes_return_non_base_exception repeats the same loop/assertion twice (lines 32-41). This is redundant and should be collapsed to a single loop to keep the test intent clear.

Suggested change

for code, exc_class in ERROR_CODE_EXCEPTION_MAPPING.items():

self.assertIsNot(

exc_class, MailshakeError,

msg=f"Code '{code}' should map to a specific subclass, not base MailshakeError"

)

Copilot · 2026-03-31T09:29:49Z

.circleci/config.yml

          name: 'Unit Tests'
          command: |
            source /usr/local/share/virtualenvs/tap-mailshake/bin/activate
-            uv pip install nose coverage parameterized
-            nosetests --with-coverage --cover-erase --cover-package=tap_mailshake --cover-html-dir=htmlcov tap_mailshake/tests/unittests
+            uv pip install pytest coverage parameterized
+            coverage run -m pytest tests/unittests
            coverage html
      - store_test_results:
          path: test_output/report.xml
      - store_artifacts:


The store_test_results step uploads test_output/report.xml, but the preceding command runs pytest without generating a JUnit XML report and the directory/file isn't created. Configure pytest to emit JUnit XML at that path (and create the directory), or remove/update the store_test_results path to match what the job actually produces.

not required

Copilot · 2026-03-31T09:29:50Z

tests/test_automatic_fields.py

+        # Select only automatic fields for each stream
+        for catalog_entry in found_catalogs:
+            stream = catalog_entry["stream_name"]
+            stream_id = catalog_entry["tap_stream_id"]
+            schema = catalog_entry.get("schema", {})
+
+            # Deselect all non-automatic fields
+            automatic_fields = (
+                self.expected_primary_keys().get(stream, set()) |
+                self.expected_replication_keys().get(stream, set())
+            )
+            non_selected = {
+                f for f in schema.get("properties", {}).keys()
+                if f not in automatic_fields
+            }
+            self.select_all_streams_and_fields(conn_id, found_catalogs, select_all_fields=False)


This test computes stream_id and non_selected but never uses them, and it doesn't actually deselect any non-automatic fields before calling menagerie.select_catalog. As written, it won't validate the "automatic fields only" behavior described in the docstring. Update the selection logic to explicitly deselect non-automatic fields (using the tap-tester/menagerie selection APIs) and remove unused variables.

Suggested change

# Select only automatic fields for each stream

for catalog_entry in found_catalogs:

stream = catalog_entry["stream_name"]

stream_id = catalog_entry["tap_stream_id"]

schema = catalog_entry.get("schema", {})

# Deselect all non-automatic fields

automatic_fields = (

self.expected_primary_keys().get(stream, set()) |

self.expected_replication_keys().get(stream, set())

)

non_selected = {

f for f in schema.get("properties", {}).keys()

if f not in automatic_fields

}

self.select_all_streams_and_fields(conn_id, found_catalogs, select_all_fields=False)

# First ensure streams are selected; we'll then restrict fields to automatic ones.

self.select_all_streams_and_fields(conn_id, found_catalogs, select_all_fields=False)

# Select only automatic fields for each stream

for catalog_entry in found_catalogs:

stream = catalog_entry["stream_name"]

schema = catalog_entry.get("schema", {})

# Identify automatic fields (PKs + replication keys)

automatic_fields = (

self.expected_primary_keys().get(stream, set()) |

self.expected_replication_keys().get(stream, set())

)

# Explicitly select only automatic fields in the catalog metadata

for metadata_entry in catalog_entry.get("metadata", []):

breadcrumb = metadata_entry.get("breadcrumb", [])

if len(breadcrumb) == 2 and breadcrumb[0] == "properties":

field_name = breadcrumb[1]

metadata = metadata_entry.get("metadata", {})

metadata["selected"] = field_name in automatic_fields

metadata_entry["metadata"] = metadata

integration test would be fix after metadata issue fixed

Copilot · 2026-03-31T09:29:50Z

tests/test_pagination.py

+        # At least some streams should have records (pagination is only verifiable
+        # in real integration context — this asserts we retrieved what exists)
+        total_records = sum(record_count_by_stream.values())
+        self.assertGreaterEqual(
+            total_records, 0,
+            msg="Sync should complete without errors and return records where they exist"


test_pagination_returns_multiple_pages does not currently test pagination: PAGE_SIZE is unused and assertGreaterEqual(total_records, 0) is always true. To make this meaningful, assert that at least one paginated stream returns > PAGE_SIZE records (or skip the test when the account has insufficient data) and/or verify multiple page fetches via observable output.

Suggested change

# At least some streams should have records (pagination is only verifiable

# in real integration context — this asserts we retrieved what exists)

total_records = sum(record_count_by_stream.values())

self.assertGreaterEqual(

total_records, 0,

msg="Sync should complete without errors and return records where they exist"

# Verify that at least one stream returns more than a single page of results.

# If the account does not have enough data, skip rather than fail.

max_records = max(record_count_by_stream.values() or [0])

if max_records <= PAGE_SIZE:

self.skipTest(

f"No stream returned more than {PAGE_SIZE} records; "

"insufficient data to verify pagination behavior."

)

self.assertTrue(

any(count > PAGE_SIZE for count in record_count_by_stream.values()),

msg=(

"Expected at least one stream to return more than one page of results "

f"(page size {PAGE_SIZE}), but none did."

),

integration test would be fix after metadata issue fixed

Copilot · 2026-03-31T09:29:50Z

tests/test_all_fields.py

+
+        for stream in self.expected_streams():
+            with self.subTest(stream=stream):
+                self.assertGreaterEqual(


The assertion record_count_by_stream.get(stream, 0) >= 0 is a tautology, so this test can pass even if the tap produced no records for a stream. If zero records is acceptable, consider asserting something non-trivial instead (e.g., that record_count_by_stream contains all expected streams, or that the sync produced the expected Singer message types / output entries).

Suggested change

self.assertGreaterEqual(

self.assertGreater(

integration test would be fix after metadata issue fixed

Copilot · 2026-03-31T09:29:50Z

setup.py

 setup(name='tap-mailshake',
-      version='0.1.0',
+      version='0.2.0',
      description='Singer.io tap for extracting data from the Mailshake API',
      author='[email protected]',
      classifiers=['Programming Language :: Python :: 3 :: Only'],
      py_modules=['tap_mailshake'],
      install_requires=[
          'backoff==2.2.1',
-          'requests==2.32.5',
-          'singer-python==6.4.0'
+          'requests==2.33.0',
+          'singer-python==6.8.0'
      ],


Given the stated upgrade to Python 3.12 (and CI now running 3.12), setup.py should declare the supported interpreter range (e.g., python_requires and/or a Programming Language :: Python :: 3.12 classifier). Without this, pip may install the package on unsupported Python versions and fail at runtime.

not required

atttiwari added 3 commits March 30, 2026 14:38

SAC-28753: Upgrade modules in setup.py, created unittests, integratio…

94e9337

…ns tests

SAC-28753: Upgraded python version in .circleci/config.yml and upgrad…

99face4

…ed setup.py version

SAC-28753: disabeling integration tests will fix in 30507 later

9f05ac7

atttiwari requested a review from Copilot March 31, 2026 09:24

Copilot started reviewing on behalf of atttiwari March 31, 2026 09:25 View session

Copilot AI reviewed Mar 31, 2026

View reviewed changes

SAC-28753: Updated changelog.md and addressed copilit suggestions

2e9bfd7

atttiwari requested a review from RushiT0122 April 1, 2026 10:55

-        # At least some streams should have records (pagination is only verifiable
-        # in real integration context — this asserts we retrieved what exists)
-        total_records = sum(record_count_by_stream.values())
-        self.assertGreaterEqual(
-            total_records, 0,
-            msg="Sync should complete without errors and return records where they exist"
+        # Verify that at least one stream returns more than a single page of results.
+        # If the account does not have enough data, skip rather than fail.
+        max_records = max(record_count_by_stream.values() or [0])
+        if max_records <= PAGE_SIZE:
+            self.skipTest(
+                f"No stream returned more than {PAGE_SIZE} records; "
+                "insufficient data to verify pagination behavior."
+            )
+        self.assertTrue(
+            any(count > PAGE_SIZE for count in record_count_by_stream.values()),
+            msg=(
+                "Expected at least one stream to return more than one page of results "
+                f"(page size {PAGE_SIZE}), but none did."
+            ),

Conversation

atttiwari commented Mar 31, 2026

Description of change

Manual QA steps

Risks

Rollback steps

AI generated code

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

atttiwari Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

atttiwari Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

atttiwari Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

atttiwari Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

atttiwari Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

atttiwari Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants