[Issue #6980] ADR for SGM Data Flow #7683


Open

mdragon wants to merge 3 commits into main from mdragon/6980-sgm-data-flow-adr

+97 −56

Collaborator

mdragon commented Dec 24, 2025

Summary

Fixes #6980

Adds an ADR describing the ways we've identified we'll access GS data directly from SGM


          ADR for SGM Data Flow

d7f8067

mdragon requested review from ErinPattisonNava, andycochran, btabaska, chouinar, doug-s-nava, glenblosser-nava, hao10282025-sudo, jakobpederson, joshtonava, kkrug, myduong-navapbc, nathan-stricker, prasnava and widal001 as code owners

December 24, 2025 21:24


          Links go in Summary not README

075aee7

mdragon force-pushed the mdragon/6980-sgm-data-flow-adr branch from 64ed70f to e27ee12 Compare

December 24, 2025 21:40


          Didn't realize we were tracking some ADRs in a second file. Summary is checked by the linter so we know that one's up-to-date

460f78f

mdragon force-pushed the mdragon/6980-sgm-data-flow-adr branch from e27ee12 to 460f78f Compare

December 24, 2025 21:41

Collaborator Author

mdragon commented Dec 24, 2025

Not sure why the check is failing. The file it's mad about is linked in the Summary, just like all the other pages Brandon added along with it...

doug-s-nava reviewed


documentation/wiki/product/decisions/adr/2025-12-24-sgm-data-flow.md


		## Context and Problem Statement

		Given the different approach for modernization of GrantSolutions we need an ongoing interoperability with the existing system, not a strangler, rebuild and replace pattern. To allow for that we need to have strategies for accessing data across the existing and modernized system that were not already designed and built for the Simpler Grants.gov work. How will we allow for bi-directional, near real-time, data flow between GrantSolutions (GS) and Simpler Grants Management (SGM).

Collaborator

doug-s-nava Dec 30, 2025

Assuming this last sentence is a question

How will we allow for bi-directional, near real-time, data flow between GrantSolutions (GS) and Simpler Grants Management (SGM)?

documentation/wiki/product/decisions/adr/2025-12-24-sgm-data-flow.md

+              ## Decision Drivers
+              - Optimize User Experience
+                - User's should be able to move between systems as seamlessly as possible

Collaborator

doug-s-nava Dec 30, 2025

Suggested change

      
              - User's should be able to move between systems as seamlessly as possible
          
              - Users should be able to move between systems as seamlessly as possible

documentation/wiki/product/decisions/adr/2025-12-24-sgm-data-flow.md

+              - Optimize User Experience
+                - User's should be able to move between systems as seamlessly as possible
+                - Workflows or processes should move from system to system as needed without long delays or user intervention

Collaborator

doug-s-nava Dec 30, 2025

Does this mean the workflows themselves should move, as in a workflow defined in Simpler should be accessible from GrantSolutions? Or is it more that, as part of a user's "workflow", they should be able to track data between the two systems? "Workflow" is a tricky term, so I think this needs some clarification.

documentation/wiki/product/decisions/adr/2025-12-24-sgm-data-flow.md


		### Bulk data copy on a scheduled basis

		This is the approach we took on Simpler Grants.gov. Hourly, we Extract, Load, and Transform (ELT) all of the table data we need from Grants.gov's database into Simpler Grants.gov's database. Whenever possible we only pull records updated since the last run to minimize data volume and the associated load on the existing system. We do not currently send data back to Grants.gov's database but in the SGM work that would be a requirement as well. We would modify our existing processes to support bi-directional data transfer. We could also consider improving the existing code base to allow it to run more frequently without collision and add more filtering to avoid when we fetch rows that we didn't see their FK records and so we fail to create the records (we could just only process something if we've already seen the parent record this run).

Collaborator

doug-s-nava Dec 30, 2025

What are FK records here?
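
For readers following the thread: FK here refers to foreign keys, i.e. a child row that points at a parent row which the current run may not have loaded. The quoted paragraph's idea of pulling only rows updated since the last run, and processing a child only when its parent has been seen, could be sketched roughly as below. This is an illustration under assumptions, not the project's actual ELT code: the `Parent` and `Child` shapes, the `updated_at` field name, and `plan_incremental_load` are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Parent:
    # Hypothetical parent row, e.g. a top-level record in the source system.
    id: int
    updated_at: datetime


@dataclass(frozen=True)
class Child:
    # Hypothetical child row; parent_id is the foreign key to Parent.id.
    id: int
    parent_id: int
    updated_at: datetime


def plan_incremental_load(parents, children, last_run, already_loaded_parent_ids):
    """Split changed rows into parents to load, children to load, and children to defer.

    A child is loadable only if its parent was loaded in an earlier run
    or is part of this run's changed parents.
    """
    # Only rows touched since the previous run are considered at all.
    changed_parents = [p for p in parents if p.updated_at > last_run]
    changed_children = [c for c in children if c.updated_at > last_run]

    # Parents that will exist in the destination once this run finishes.
    available = set(already_loaded_parent_ids) | {p.id for p in changed_parents}

    loadable = [c for c in changed_children if c.parent_id in available]
    deferred = [c for c in changed_children if c.parent_id not in available]
    return changed_parents, loadable, deferred
```

Deferred children are returned rather than dropped so that a later run, or a follow-up fetch of the missing parent, can pick them up.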

chouinar reviewed


documentation/wiki/product/decisions/adr/2025-12-24-sgm-data-flow.md


		### Bulk data copy on a scheduled basis

		This is the approach we took on Simpler Grants.gov. Hourly, we Extract, Load, and Transform (ELT) all of the table data we need from Grants.gov's database into Simpler Grants.gov's database. Whenever possible we only pull records updated since the last run to minimize data volume and the associated load on the existing system. We do not currently send data back to Grants.gov's database but in the SGM work that would be a requirement as well. We would modify our existing processes to support bi-directional data transfer. We could also consider improving the existing code base to allow it to run more frequently without collision and add more filtering to avoid when we fetch rows that we didn't see their FK records and so we fail to create the records (we could just only process something if we've already seen the parent record this run).

Collaborator

chouinar Jan 5, 2026

I'd probably leave out the detail about filtering based on the foreign key issue. While that is an annoying issue, it's probably too specific to call out here.

I will say that our current approach is heavily reliant on Grants.gov's method of storing data in their DB (having created/updated timestamps). It wouldn't be possible to replicate if SGM doesn't have that working properly.

Also, bidirectional might not work well at all with our current approach. Writing back could put us in an endless loop: we write an update to legacy, which now has an updated timestamp, so we pull it and process that update, and now we have an update to write back to legacy, and so on.

Batch processing is likely something we'll want where timeliness isn't a concern, but we might need to start at least partially fresh depending on how SGM works.
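
One common way to break the write-back loop described above is to record who made each change and have the pull step skip rows that the sync job itself wrote. The sketch below illustrates that idea only; it assumes the legacy tables could carry something like an `updated_by` column, which has not been confirmed for GS, and `SYNC_ACTOR`, `LegacyRow`, `rows_to_pull`, and `write_back` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical marker the sync job stamps on every row it writes to legacy.
SYNC_ACTOR = "sgm-sync"


@dataclass
class LegacyRow:
    # Hypothetical legacy row; assumes the legacy schema can record the last writer.
    id: int
    value: str
    updated_at: datetime
    updated_by: str


def rows_to_pull(legacy_rows, last_run):
    """Rows changed since last_run, excluding changes the sync job made itself."""
    return [
        row
        for row in legacy_rows
        if row.updated_at > last_run and row.updated_by != SYNC_ACTOR
    ]


def write_back(row, new_value, now):
    """Write an update to legacy and stamp it as sync-originated."""
    row.value = new_value
    row.updated_at = now
    row.updated_by = SYNC_ACTOR
```

With this in place, a write-back bumps `updated_at` but the next pull ignores the row until a real user or process in the legacy system changes it again. If the legacy schema cannot record the writer, a different approach would be needed, such as comparing content hashes before writing.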

documentation/wiki/product/decisions/adr/2025-12-24-sgm-data-flow.md

Comment on lines +68 to +70

		- Cons
		- Still learning what data has existing APIs that will make this possible

Collaborator

chouinar Jan 5, 2026

What happens if a call fails? A con of using APIs might be that we end up in a weird state because an API call failed, but we don't have a way to pipe that through. There's also some complexity in identifying that two records across systems represent the same thing: opportunities have 3 different IDs (legacy integers, our UUIDs, and opportunity number, which isn't unique but is often what other systems use). Even if we keep IDs in sync, those IDs might not even exist to connect records (a legacy integer opportunity ID can't be created anywhere BUT Grants.gov, adding an entire dependency there).
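
To make the identifier problem concrete, here is a minimal sketch of a crosswalk between the three IDs mentioned above. It is illustrative only: `OpportunityLink` and `Crosswalk` are hypothetical names, and the sketch assumes the legacy integer ID may be absent until the legacy system assigns one, and that a lookup by opportunity number can return more than one match because the number is not unique.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional
from uuid import UUID


@dataclass(frozen=True)
class OpportunityLink:
    # Hypothetical link between the identifiers for one opportunity.
    simpler_id: UUID
    opportunity_number: str
    legacy_id: Optional[int] = None  # None until the legacy system assigns an ID


class Crosswalk:
    """In-memory index of opportunity identifiers across systems."""

    def __init__(self) -> None:
        self._by_legacy: Dict[int, OpportunityLink] = {}
        self._by_number: Dict[str, List[OpportunityLink]] = {}

    def add(self, link: OpportunityLink) -> None:
        if link.legacy_id is not None:
            self._by_legacy[link.legacy_id] = link
        # Opportunity numbers are not unique, so keep every match.
        self._by_number.setdefault(link.opportunity_number, []).append(link)

    def find_by_legacy_id(self, legacy_id: int) -> Optional[OpportunityLink]:
        return self._by_legacy.get(legacy_id)

    def find_by_number(self, number: str) -> List[OpportunityLink]:
        # Callers must handle zero, one, or several candidates.
        return list(self._by_number.get(number, []))
```

Returning a list from `find_by_number` pushes the ambiguity to the caller instead of silently picking one record, which is the failure mode the comment above warns about.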

documentation/wiki/product/decisions/adr/2025-12-24-sgm-data-flow.md


		### Call GS APIs directly as needed, store additional data points in SGM (without duplicating existing data)

		We won't be able to always mutate the GS data model as quickly as we'd want to iterate on SGM. In those cases we would store new fields in the Simpler DB, with the identifier of the record in GS. This would allow the data from both systems to be pulled together either in the API layer or in the FE via 2 API calls depending on whether we're wrapping API calls to GS in the Simpler API.

Collaborator

chouinar Jan 5, 2026

I'd say we should keep the frontend from being aware of SGM as much as possible. Since the problem is largely a data issue, having data be "merged" in the API itself makes more sense. It also means that whether something lives only in our system or across two systems, the frontend (and user experience) doesn't really need to be aware of that.
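
A rough sketch of what merging in the API layer could look like, so the frontend makes a single call and never sees which system a field came from. Everything here is an assumption for illustration: `build_merged_record` is a hypothetical helper, `fetch_gs_record` stands in for whatever client ends up calling the GS API, and `sgm_extensions` stands in for the extra fields stored in the Simpler DB keyed by the GS record identifier.

```python
from typing import Any, Callable, Dict, Mapping


def build_merged_record(
    gs_id: str,
    fetch_gs_record: Callable[[str], Mapping[str, Any]],
    sgm_extensions: Mapping[str, Mapping[str, Any]],
) -> Dict[str, Any]:
    """Combine a GS record with SGM-only fields stored against the same GS identifier."""
    gs_record = dict(fetch_gs_record(gs_id))
    extra = dict(sgm_extensions.get(gs_id, {}))

    # The ADR option stores additional data points without duplicating GS data,
    # so a field present in both places is treated as an error here.
    overlap = set(gs_record) & set(extra)
    if overlap:
        raise ValueError(f"fields stored in both systems: {sorted(overlap)}")

    return {**gs_record, **extra}
```

Passing the fetch function in as a parameter keeps the sketch independent of any real GS client and makes the merge easy to test with a stub.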


Reviewers

chouinar left review comments

doug-s-nava left review comments

andycochran: awaiting requested review (code owner)

btabaska: awaiting requested review (code owner)

joshtonava: awaiting requested review (code owner)

myduong-navapbc: awaiting requested review (code owner)

widal001: awaiting requested review (code owner)

prasnava: awaiting requested review (code owner)

jakobpederson: awaiting requested review (code owner)

ErinPattisonNava: awaiting requested review (code owner)

glenblosser-nava: awaiting requested review (code owner)

nathan-stricker: awaiting requested review (code owner)

hao10282025-sudo: awaiting requested review (code owner)

kkrug: awaiting requested review (code owner)

At least 1 approving review is required to merge this pull request.

Labels

None yet