Conversation
Pull Request Overview
This PR introduces a new QuickBooks source connector for the Estuary Flow platform. The connector implements OAuth2 authentication, supports 28 different QuickBooks entity types, and provides both backfill and incremental sync capabilities.
Key changes:
- Implementation of QuickBooks API connector with OAuth2 authentication
- Support for 28 QuickBooks entities including Accounts, Invoices, Customers, etc.
- Backfill and incremental data sync with configurable window sizes
- Snapshot testing setup for connector spec validation
Reviewed Changes
Copilot reviewed 39 out of 41 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| source_quickbooks/__init__.py | Main connector class implementing spec, discover, validate, and open methods |
| source_quickbooks/models.py | Data models for all 28 QuickBooks entities and endpoint configuration |
| source_quickbooks/api.py | API interaction logic for querying, backfilling, and fetching entities |
| source_quickbooks/resources.py | Resource configuration and credential validation logic |
| tests/test_snapshots.py | Snapshot test for connector spec validation |
| test.flow.yaml | Flow configuration file for testing all resource bindings |
| acmeCo/*.schema.yaml | JSON schemas for all QuickBooks entity collections |
| pyproject.toml | Project dependencies and configuration |
```python
try:
    await anext(
        query_entity(
            Account, EPOCH, datetime.now(tz=UTC), http, config.realm_id, log
        )
    )
except HTTPError as err:
    msg = f"Encountered error validating credentials.\n\n{err.message}"
    if err.code == 401:
        msg = f"Invalid credentials. Please confirm the provided credentials are correct.\n\n{err.message}"

    raise ValidationError([msg])
```
The validate_credentials function only catches HTTPError but doesn't handle the case where the query returns zero results. The anext() call will raise StopAsyncIteration if the Account query returns no results, which would not be caught and would propagate as an unhandled exception rather than a ValidationError.
```python
        inc=common.ResourceState.Incremental(cursor=cutoff),
    ),
    initial_config=ResourceConfig(
        name=resource.resource_name, interval=timedelta(minutes=5)
```
The default interval of 5 minutes is hardcoded here and duplicated across all resources in test.flow.yaml (PT5M). Consider defining this as a constant (e.g., DEFAULT_RESOURCE_INTERVAL) to ensure consistency and ease of maintenance.
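A sketch of the suggested constant (the name `DEFAULT_RESOURCE_INTERVAL` comes from the comment above; the helper and its location are illustrative):

```python
from datetime import timedelta

# Suggested shared constant, replacing the hardcoded timedelta(minutes=5)
# repeated per resource. The helper is hypothetical; it renders the same
# value in the ISO 8601 duration form used in test.flow.yaml.
DEFAULT_RESOURCE_INTERVAL = timedelta(minutes=5)

def iso8601_duration(td: timedelta) -> str:
    # Render a whole-minute interval the way Flow specs express it.
    return f"PT{int(td.total_seconds()) // 60}M"

assert iso8601_duration(DEFAULT_RESOURCE_INTERVAL) == "PT5M"
```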
```python
from estuary_cdk.capture.common import (
    BaseDocument,
    ResourceConfig,
    ResourceState,
)
```
Import of 'ResourceConfig' is not used.
```python
def dt_to_ts(dt: AwareDatetime) -> str:
    return dt.isoformat(timespec="seconds").replace("+00:00", "Z")
```
I'm adding source-specific timestamp handling logic to the standard utility function dt_to_ts, which might be counterintuitive. Let me know if you'd like it changed.
As long as this transformation is generally useful and how we'd want to transform all datetimes to strings in source-quickbooks, I think it's fine.
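For reference, the transformation under discussion renders an aware UTC datetime in the Zulu form rather than with a `+00:00` offset:

```python
from datetime import datetime, timezone

def dt_to_ts(dt: datetime) -> str:
    # Same body as the function quoted above.
    return dt.isoformat(timespec="seconds").replace("+00:00", "Z")

assert dt_to_ts(datetime(2025, 1, 2, 3, 4, 5, tzinfo=timezone.utc)) == "2025-01-02T03:04:05Z"
```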
Force-pushed 17bc996 to 5e35804
Alex-Bair left a comment:
Had a question about the operators used in query_entity.
Also, the connector will need to be added to the CI process in .github/workflows/python.yaml so it gets built and deployed.
Force-pushed 5e35804 to 4ad8a6a
Alex-Bair left a comment:
After our discussion this morning, I had a few more comments around the backfill strategy & what I think caused those 401/403 errors you mentioned.
LMK if anything isn't clear or you have questions about any of my comments.
```python
},
accessTokenHeaders={
    "Content-Type": "application/x-www-form-urlencoded",
    "Authorization": r"Basic {{#basicauth}}{{{ client_id }}}:{{{ client_secret }}}{{/basicauth}}",
```
Nice job figuring out how to use a template to place the client id and secret in an Authorization header during the access token request! This template is used by the access-token.ts Supabase function to fetch the initial set of tokens when users press the "AUTHENTICATE WITH INTUIT" button in the UI. However, this template does not affect how token exchange is performed within the connector, and I think that may explain the 401/403 errors you were seeing during testing.
The TokenSource._fetch_oauth2_token function is where the headers & body for token exchange requests are constructed and used. Right now, connectors cannot control whether the client id and secret are used in an Authorization header or in the body; that's hardcoded based on the OAuth2 credentials class. Since the RotatingOAuth2Credentials always puts the client id and secret in the request body, I'd anticipate the token exchange in TokenSource._fetch_oauth2_token is failing for source-quickbooks since Intuit expects the client id and secret to be base64 encoded in an Authorization header.
What we want is to allow connectors to choose whether the client id and secret gets placed in an Authorization header or the request body within TokenSource._fetch_oauth2_token regardless of what type of OAuth2 credentials class is used. Could you take a crack at figuring out how to do that? This would be a relatively contained change that'd be a good introduction to working in the CDK.
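A rough sketch of what that CDK change could look like (the enum, function name, and signature are hypothetical illustrations, not the CDK's actual API): let the credentials class declare where the client id and secret belong, and branch on that when constructing the token exchange request.

```python
import base64
from enum import Enum

class ClientCredentialsLocation(Enum):
    # Hypothetical: a connector-selectable setting rather than behavior
    # hardcoded to the OAuth2 credentials class.
    REQUEST_BODY = "body"
    AUTHORIZATION_HEADER = "header"

def build_token_request(
    client_id: str,
    client_secret: str,
    refresh_token: str,
    location: ClientCredentialsLocation,
) -> tuple[dict, dict]:
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    body = {"grant_type": "refresh_token", "refresh_token": refresh_token}
    if location is ClientCredentialsLocation.AUTHORIZATION_HEADER:
        # Intuit expects "Basic base64(client_id:client_secret)".
        creds = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
        headers["Authorization"] = f"Basic {creds}"
    else:
        body["client_id"] = client_id
        body["client_secret"] = client_secret
    return headers, body

headers, body = build_token_request(
    "id", "secret", "rt", ClientCredentialsLocation.AUTHORIZATION_HEADER
)
assert headers["Authorization"] == "Basic aWQ6c2VjcmV0"
assert "client_id" not in body
```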
```python
async def backfill_entity(
    model: Type[T],
    http: HTTPSession,
    realm_id: str,
    window_size: timedelta,
    log: Logger,
    page: PageCursor,
    cutoff: LogCursor,
) -> AsyncGenerator[T | PageCursor, None]:
    assert isinstance(page, str)
    assert isinstance(cutoff, datetime)

    page_ts = datetime.fromisoformat(page)
    max_date_to_fetch = min(page_ts + window_size, cutoff)

    log.info(
        "Initiating backfill fetch",
        {
            "page_ts": page_ts,
            "max_date_to_fetch": max_date_to_fetch,
            "window_size": window_size,
            "cutoff": cutoff,
        },
    )

    if page_ts >= cutoff:
        return

    async for item in query_entity(
        model, page_ts, max_date_to_fetch, http, realm_id, log
    ):
        yield item

    yield max_date_to_fetch.isoformat(timespec="seconds")
```
What's the motivation for using a window_size during the backfill? Based on query_entity, it looks like we can both filter and sort the results returned from the API, meaning we could request all entities between some start and the cutoff and treat the items from query_entity as a stream of ordered results.
Instead of potentially requiring users to fiddle with the window_size, I'd prefer if we used a backfill strategy like the one in source-klaviyo-native's backfill_incremental_resources. Klaviyo's API is similar; we can filter and sort results so the connector receives results in ascending order of some cursor field. The connector iterates through the results, yielding them and checkpointing when it's safe to do so (i.e. we've captured all results at that cursor value and earlier). And if the backfill_incremental_resources function runs for more than 5 minutes, it exits after its next checkpoint, then the CDK will re-invoke it with the most recent checkpoint passed in as page.
For additional context, we've usually used date windows when the results from the API aren't sorted & it's unlikely the fetch_page function could process all results in a reasonable time frame. We use date windows to split the backfill range into smaller, more manageable chunks, then checkpoint each chunk as it's completed.
If the API supports sorting results in ascending order and there's not some other API limitation that makes date windows more appealing, we usually try to avoid using date windows.
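Sketched with plain integers as cursors (the names and event tuples are illustrative, not the CDK's real interface), the suggested cursor-ordered strategy looks roughly like:

```python
def backfill_ordered(records, start, cutoff):
    # records: (cursor, payload) pairs already sorted ascending by cursor,
    # as the API would return them when filtering and sorting server-side.
    last_cursor = start
    for cursor, payload in records:
        if cursor >= cutoff:
            break
        if cursor > last_cursor:
            # Safe to checkpoint: every record at last_cursor or earlier
            # has already been yielded.
            yield ("CHECKPOINT", last_cursor)
            last_cursor = cursor
        yield ("DOC", payload)
    yield ("CHECKPOINT", last_cursor)

events = list(backfill_ordered([(1, "a"), (1, "b"), (2, "c")], 0, 10))
```

Note the checkpoint only advances once every record sharing a cursor value has been emitted, which is what makes resuming from `page` safe.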
Force-pushed 01afe58 to 5aa95c5
```python
# Though technically a numerical value, realm IDs tend to be large enough that precision loss
# can alter what's actually used.
# This value is presented to end users as their company ID
realm_id: str = Field(
    description="ID for the Company to Request Data From",
    title="Company ID",
    json_schema_extra={"order": 0},
)
```
I tried to parse this from QuickBooks' OAuth flow, but it looks like the CDK doesn't support reading URL parameters from redirect URI requests yet.
Force-pushed eb1f1b3 to 52cb681
Alex-Bair left a comment:
LGTM after:
1. Merging #3503 next week and rebasing this PR after that one is merged. We'll want this PR to end up having the source-quickbooks change without all of the CDK changes that'll get merged with the other PR.
2. Adding a config.yaml (or just some mock credentials directly in test.flow.yaml) so CI checks run & pass.
```yaml
- python
- "-m"
- source_quickbooks
config: config.yaml
```
It looks like there's no config.yaml file with some mock credentials, and that's causing the checks in CI to fail.
```python
        yield item

    yield cutoff.isoformat(timespec="seconds")
```
At the end of backfill_entity, instead of:
- yielding the cutoff
- reinvoking backfill_entity with the cutoff as the page argument
- immediately returning

we could just return. Both signal that the backfill is finished, but the latter has fewer steps.
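As a toy illustration of the "just return" option (a sync generator with integer timestamps; the names are hypothetical, not the connector's), returning directly ends the backfill without the extra yield/re-invoke round trip:

```python
def backfill_done_by_return(items, cutoff):
    # items: (timestamp, payload) pairs. Once we reach the cutoff we simply
    # return instead of yielding the cutoff as one more page cursor that a
    # follow-up invocation would immediately discard.
    for ts, payload in items:
        if ts >= cutoff:
            return
        yield payload

assert list(backfill_done_by_return([(1, "a"), (5, "b")], 5)) == ["a"]
```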
Force-pushed 4fde7db to c73f4be
Force-pushed c73f4be to 4c76882
Description:
This PR adds a new connector for the QuickBooks accounting service. The collected objects are
Closes #3437
Workflow steps:
(How does one use this feature, and how has it changed)
Documentation links affected:
A documentation page will have to be prepared.
Notes for reviewers: