
@mdragon mdragon commented Dec 24, 2025

Summary

Fixes #6980

Adds an ADR describing the ways we've identified to access GS data directly from SGM.

@mdragon mdragon force-pushed the mdragon/6980-sgm-data-flow-adr branch from 64ed70f to e27ee12 Compare December 24, 2025 21:40
…s checked by the linter so we know that one's up-to-date
@mdragon mdragon force-pushed the mdragon/6980-sgm-data-flow-adr branch from e27ee12 to 460f78f Compare December 24, 2025 21:41

mdragon commented Dec 24, 2025

Not sure why the check is failing. The file it's mad about is linked in the Summary, just like all the other pages Brandon added along with it...


## Context and Problem Statement

Given the different approach to modernizing GrantSolutions, we need ongoing interoperability with the existing system, not a strangler, rebuild-and-replace pattern. To allow for that, we need strategies for accessing data across the existing and modernized systems that were not already designed and built for the Simpler Grants.gov work. How will we allow for bi-directional, near real-time, data flow between GrantSolutions (GS) and Simpler Grants Management (SGM).
Collaborator

Assuming this last sentence is a question

How will we allow for bi-directional, near real-time, data flow between GrantSolutions (GS) and Simpler Grants Management (SGM)?

## Decision Drivers

- Optimize User Experience
- User's should be able to move between systems as seamlessly as possible
Collaborator

Suggested change
- User's should be able to move between systems as seamlessly as possible
- Users should be able to move between systems as seamlessly as possible


- Optimize User Experience
- User's should be able to move between systems as seamlessly as possible
- Workflows or processes should move from system to system as needed without long delays or user intervention
Collaborator

Does this mean the workflows themselves should move, as in a workflow defined in Simpler should be accessible from GrantSolutions? Or is it more that, as part of a user's "workflow", they should be able to track data between the two systems? "Workflow" is a tricky term, so I think this needs some clarification.


### Bulk data copy on a scheduled basis

This is the approach we took on Simpler Grants.gov. Hourly, we Extract, Load, and Transform (ELT) all of the table data we need from Grants.gov's database into Simpler Grants.gov's database. Whenever possible, we pull only records updated since the last run to minimize data volume and the associated load on the existing system. We do not currently send data back to Grants.gov's database, but in the SGM work that would be a requirement as well, so we would modify our existing processes to support bi-directional data transfer. We could also consider improving the existing code base so it can run more frequently without collisions, and adding more filtering to avoid cases where we fetch rows whose FK records we have not yet seen and so fail to create the records (we could process a row only if we've already seen its parent record in this run).
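
To make the "only pull what changed, and only process a row once its parent has been seen" idea concrete, here is a minimal sketch. It is not the existing Simpler Grants.gov ELT code: `Row`, `split_changed_rows`, `last_run`, and `known_ids` are hypothetical names, and it assumes every source row carries a reliable `updated_at` timestamp and at most one parent reference.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Row:
    id: int
    parent_id: Optional[int]  # reference to a parent row; None for top-level rows
    updated_at: datetime


def split_changed_rows(
    source_rows: List[Row], last_run: datetime, known_ids: Set[int]
) -> Tuple[List[Row], List[Row]]:
    """Return (loadable, deferred) for rows updated after last_run.

    A row is loadable when it has no parent, or when its parent is already in
    the destination (known_ids) or was accepted earlier in this same run.
    """
    # Incremental pull: ignore anything not updated since the previous run.
    changed = sorted(
        (row for row in source_rows if row.updated_at > last_run),
        key=lambda row: row.updated_at,
    )
    seen = set(known_ids)
    loadable: List[Row] = []
    deferred: List[Row] = []
    for row in changed:
        if row.parent_id is None or row.parent_id in seen:
            loadable.append(row)
            seen.add(row.id)
        else:
            deferred.append(row)  # parent not seen yet; leave for a later run
    return loadable, deferred
```

In this sketch deferred rows are returned rather than dropped, so a caller could retry them on the next run instead of losing them when the "last run" marker advances. Rows are handled in `updated_at` order, which means a child updated before its parent in the same batch would also be deferred.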
Collaborator

what are FK records here?


Collaborator

I'd probably leave out the detail about filtering based on the foreign key issue. While that is an annoying issue, it's probably too specific to call out here.

I will say that our current approach is heavily reliant on grants.gov's method of storing data in their DB (having created/updated timestamps); it wouldn't be possible to replicate if SGM doesn't have that working properly.

Also, bidirectional might not work well at all with our current approach: writing back could put us in an endless loop (we write an update to legacy, which now has an updated timestamp, so we pull it and process that update, and now we have an update to write back to legacy ...).

Batch processing is likely something we'll want where timeliness isn't a concern, but we might need to start at least partially fresh depending on how SGM works.
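
One possible way to break the write-back loop described above is to skip any write that would not change the destination, so an echoed update never produces a new updated timestamp. The sketch below is only an illustration under that assumption: `apply_if_changed` is a hypothetical helper, and plain dictionaries stand in for the legacy and modernized tables.

```python
from typing import Dict


def apply_if_changed(store: Dict[str, dict], record_id: str, payload: dict) -> bool:
    """Write payload into store only if it differs; return True if a write happened."""
    if store.get(record_id) == payload:
        # Nothing new: leave the row untouched so it gets no new updated timestamp
        # and is not picked up again by the next sync in the other direction.
        return False
    store[record_id] = dict(payload)
    return True


# Stand-ins for the two databases (hypothetical data).
legacy = {"opp-1": {"title": "Old title"}}
modern = {"opp-1": {"title": "New title"}}

wrote_to_legacy = apply_if_changed(legacy, "opp-1", modern["opp-1"])  # real change
echoed_to_modern = apply_if_changed(modern, "opp-1", legacy["opp-1"])  # echo, skipped
```

Here the first call writes the change to the legacy side and the second call is a no-op, so the cycle stops after one hop. This does not address conflicting edits made in both systems between runs, which would need a separate rule.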

Comment on lines +68 to +70
- **Cons**
- Still learning what data has existing APIs that will make this possible

Collaborator

What happens if a call fails? A con of using APIs might be that we end up in a weird state because an API call failed and we don't have a way to pipe that through. There's also some complexity in identifying that two records across systems represent the same thing: opportunities have 3 different IDs (legacy integers, our UUIDs, and the opportunity number, which isn't unique but is often what other systems use). Even if we keep IDs in sync, those IDs might not even exist to connect records (a legacy integer opportunity ID can't be created anywhere BUT grants.gov, adding an entire dependency there).
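
To illustrate the identifier problem, here is a rough sketch of a lookup table that links the three kinds of opportunity IDs mentioned above. `OpportunityIds` and `IdCrosswalk` are hypothetical names, not existing code, and the sketch assumes our UUID is the only identifier that is always present.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional
from uuid import UUID, uuid4


@dataclass(frozen=True)
class OpportunityIds:
    uuid: UUID                # our identifier, assumed always present
    legacy_id: Optional[int]  # assigned only by the legacy system; may be absent
    opportunity_number: str   # human-facing and not guaranteed unique


class IdCrosswalk:
    def __init__(self) -> None:
        self._by_uuid: Dict[UUID, OpportunityIds] = {}
        self._uuid_by_legacy_id: Dict[int, UUID] = {}

    def register(self, ids: OpportunityIds) -> None:
        self._by_uuid[ids.uuid] = ids
        if ids.legacy_id is not None:
            self._uuid_by_legacy_id[ids.legacy_id] = ids.uuid

    def find_by_legacy_id(self, legacy_id: int) -> Optional[OpportunityIds]:
        uuid = self._uuid_by_legacy_id.get(legacy_id)
        return self._by_uuid.get(uuid) if uuid is not None else None

    def find_by_number(self, number: str) -> List[OpportunityIds]:
        # Returns a list on purpose: opportunity numbers can collide.
        return [
            ids for ids in self._by_uuid.values() if ids.opportunity_number == number
        ]


crosswalk = IdCrosswalk()
crosswalk.register(OpportunityIds(uuid4(), 101, "ABC-25-001"))
crosswalk.register(OpportunityIds(uuid4(), None, "ABC-25-001"))  # no legacy ID yet
```

The second record shows the dependency called out above: until the legacy system assigns an integer ID, that record can only be reached by UUID or by a non-unique opportunity number.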


### Call GS APIs directly as needed, store additional data points in SGM (without duplicating existing data)

We won't always be able to mutate the GS data model as quickly as we'd want to iterate on SGM. In those cases we would store new fields in the Simpler DB, along with the identifier of the record in GS. This would allow the data from both systems to be pulled together either in the API layer or in the FE via 2 API calls, depending on whether we're wrapping API calls to GS in the Simpler API.
Collaborator

I'd say we should keep the frontend from being aware of SGM as much as possible. Since the problem is largely a data issue, having data be "merged" in the API itself makes more sense. It also means that whether something is only in our system or spread across two systems, the frontend (and user experience) doesn't really need to be aware of that.
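
As a rough sketch of what merging in the API layer could look like, the function below combines a record fetched from GS with extra fields stored on our side, keyed by the GS identifier. Everything here is an assumption for illustration: `get_merged_record`, `fetch_gs_record`, and `sgm_extra_fields` are hypothetical names, the GS call is passed in as a plain function, and no real GS API is used.

```python
from typing import Callable, Dict


def get_merged_record(
    gs_id: str,
    fetch_gs_record: Callable[[str], dict],
    sgm_extra_fields: Dict[str, dict],
) -> dict:
    """Return one response combining GS data with fields stored only in our DB."""
    gs_record = fetch_gs_record(gs_id)
    extras = sgm_extra_fields.get(gs_id, {})
    # The ADR option says not to duplicate existing data, so treat overlap as a bug.
    overlap = set(gs_record) & set(extras)
    if overlap:
        raise ValueError(f"SGM fields duplicate GS fields: {sorted(overlap)}")
    return {**gs_record, **extras}


def fake_fetch_gs_record(gs_id: str) -> dict:
    # Stand-in for a real call to a GS API (hypothetical data).
    return {"id": gs_id, "title": "Example award"}


merged = get_merged_record(
    "gs-1", fake_fetch_gs_record, {"gs-1": {"risk_rating": "low"}}
)
```

With this shape the frontend makes a single call and receives one object, whether or not any extra fields exist for that record.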

Development

Successfully merging this pull request may close these issues.

ADR for managing data flow between SGM and GS

4 participants