
Conversation

brontolosone
Contributor

Closes getodk/central#1373

What has been done to verify that this works as intended?

Manual testing.

Why is this the best possible solution? Were any other approaches considered?

Batching is used because it gives the user a progress indicator, which is helpful when there is a lot to process and/or their DB is slow. They'll continue to receive visual feedback that nothing is stuck.
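As a sketch, the batched backfill loop looks roughly like this (all names here are illustrative stand-ins and the database call is stubbed out; the real migration invokes a SQL function per batch):

```javascript
// Sketch of the batched backfill pattern (hypothetical, not the PR's code).
// Each iteration caches at most BATCH_SIZE submissions and reports progress,
// so the operator sees movement even on a slow database.
const BATCH_SIZE = 1000;

// Stand-in for the per-batch DB call; here we simulate a backlog of
// 2500 uncached submissions instead of hitting a real database.
let remaining = 2500;
const processBatch = (size) => {
  const processed = Math.min(remaining, size);
  remaining -= processed;
  return processed; // number of rows cached in this batch
};

const progress = [];
const backfill = () => {
  let total = 0;
  for (;;) {
    const processed = processBatch(BATCH_SIZE);
    if (processed === 0) break; // nothing left to cache
    total += processed;
    progress.push(`cached ${total} submissions so far`);
  }
  return total;
};

const total = backfill();
console.log(`done: ${total} submissions cached`);
```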

How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?

N/A

Does this change require updates to the API documentation? If so, please update docs/api.yaml as part of this PR.

N/A

Before submitting this PR, please make sure you have:

  • run make test and confirmed all checks still pass OR confirm CircleCI build passes
  • verified that any code from external sources is properly credited in comments or that everything is internally sourced

@brontolosone brontolosone force-pushed the 1373_backfill-submission-geo-cache branch from 8cdb999 to d17210a Compare September 28, 2025 09:56
@brontolosone brontolosone force-pushed the 1373_backfill-submission-geo-cache branch from d17210a to 4091c60 Compare September 28, 2025 10:03
@brontolosone brontolosone force-pushed the 1373_backfill-submission-geo-cache branch from 4091c60 to 2790e4d Compare September 28, 2025 14:00
@brontolosone brontolosone marked this pull request as ready for review September 28, 2025 14:05
const path = require('path');


function getSqlFiles(upOrDown) {

what do you think of moving this function into a common file? it is a duplicate of what's in 20250927-01-geoextracts.js



const up = async (db) => {
  const BATCH_SIZE = 1000;

Why do we need to do batching during migration/upgrade-time? Server would be down in any case so I don't see any problem populating the cache in one go.

We can keep the changes in cache_all_submission_geo to support batching; that would be handy down the line when we might want to run the function at run-time.

DROP FUNCTION IF EXISTS "public"."cache_all_submission_geo"() CASCADE;

--- create: cache_all_submission_geo(only_default_path boolean, batchsize integer) ---
CREATE FUNCTION "public"."cache_all_submission_geo"(only_default_path boolean = false, batchsize integer = 9223372036854775807)
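Given that signature, the function could presumably be invoked like this (a sketch of assumed usage, not code from this PR):

```sql
-- Hypothetical calls based on the signature above.
SELECT cache_all_submission_geo();             -- defaults: all paths, effectively unbounded batch
SELECT cache_all_submission_geo(false, 1000);  -- process at most 1000 submissions in this call
```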

can we set the default value of batchsize to NULL? I am thinking about cases where the number of selected rows is greater than 9223372036854775807. From the docs:

LIMIT ALL is the same as omitting the LIMIT clause, as is LIMIT with a NULL argument.
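A minimal illustration of the cited behavior (the table name here is illustrative, not from the migration):

```sql
-- LIMIT NULL behaves like omitting LIMIT entirely, so a NULL default for
-- batchsize would mean "process everything" without a bigint sentinel.
SELECT "id" FROM "submissions" LIMIT NULL;  -- returns all rows, same as no LIMIT clause
```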



Development

Successfully merging this pull request may close these issues: Backfill submission geo cache.