Skip to content

Conversation

@quinnylee
Copy link
Contributor

@quinnylee quinnylee commented Jun 4, 2025

See issue #54

Attempting to add functionality for CLI subsetting
This does not work right now :(

  • added --location as a CLI flag, and made it required for users who don't input a gage or latlon (aka use a catid). Turns out catids are duplicated across different hydrofabrics (I found the same catid number in Hawaii and in Maine). For this same reason I changed the naming convention for output folders identified by catids to be prefixed by conus- or hi-, on the off chance that someone does research in both areas
  • un-hardcoded references to the CONUS hydrofabric/pickle graph
  • added a new coordinate transform to ESRI:102007 (Hawaii Albers)
  • Did not change anything in source_validation.py since Hawaii (and Alaska, Puerto Rico) hydrofabrics aren't in the community hydrofabric s3 bucket. I got around this by manually putting my download of hi_nextgen.gpkg into .ngiab/hydrofabric/v2.2

@quinnylee quinnylee changed the title support for hawaii support for hawaii (WIP) Jun 4, 2025
@quinnylee
Copy link
Contributor Author

Fixed CLI subsetting: turns out the CONUS hydrofabric and the HI hydrofabric are slightly different, so I made a new hawaii-specific template.sql and triggers.sql. Also the CRS is different

Added forcing generation for HI to CLI: added the appropriate NWM 3.0 S3 URL. Fun fact: Hawaii forcing data is only available from 01/01/1994-12/31/2013, a little quirk that is never mentioned in the S3 documentation. Also, AORC does not cover Hawaii, so including the flag --source aorc when trying to find forcings for a Hawaii location raises an error.

@JoshCu
Copy link
Member

JoshCu commented Jun 4, 2025

I had totally missed that the retrospective covers hawaii, PR and alaska too, https://noaa-nwm-retrospective-3-0-pds.s3.amazonaws.com/index.html#PR/zarr/forcing.zarr/

I'd been messing about trying to get era5 data to work.

I think I'm going to and modify the hydrofabrics to make them uniform too which should make this code easier.

@quinnylee
Copy link
Contributor Author

quinnylee commented Jun 4, 2025

That would be pretty cool to only have one template and trigger file across all regions...

Everything is in a different crs, so this latest commit transforms our hawaii gpkgs from the Hawaii Albers Conic (ESRI 102007) to a more widely accepted crs (EPSG 5070, Albers Conic), which is required by nextgen

So now the --run flag in the CLI technically works (or doesn't raise errors), except in a few weird cases where it can't seem to find the right polygon for a gage, like gage-16717000 -- no clue what is wrong with this one.

@quinnylee

This comment was marked as resolved.

@JoshCu
Copy link
Member

JoshCu commented Jun 4, 2025

Turns out my subsetting is all broken, see this odd incident image

I'd guess some of the assumptions I make in the conus hydrofabric about gage to catchment associations don't translate

@quinnylee

This comment was marked as duplicate.

@quinnylee
Copy link
Contributor Author

quinnylee commented Jun 5, 2025

I made a silly mistake that caused the subsetting function to look at the conus hydrofabric instead of hawaii. These look a lot better now:
Screenshot 2025-06-05 110443
Screenshot 2025-06-05 110616
Screenshot 2025-06-05 110750

@quinnylee
Copy link
Contributor Author

quinnylee commented Jun 9, 2025

pmtiles here https://alabama.box.com/s/t27756wfdlyfoyd9jc239yc2lmwk6fsm

As far as I know, TEEHR doesn't support anything outside of CONUS -- their USGS to NWM3.0 crosswalk appears to only be for CONUS. Other than that, everything else in the NGIAB workflow seems to function properly.

I did the thing again where I hardcoded references to the PMTiles and the light/dark style jsons, which will have to change before a merge. Those references are in

  • lines 12-14 of index.html
  • lines 53 and 56 of main.js
  • line 11 of light-style.json
  • line 11 of dark-style.json

@quinnylee quinnylee marked this pull request as ready for review June 9, 2025 15:56
@quinnylee quinnylee changed the title support for hawaii (WIP) support for hawaii Jun 17, 2025
Copy link
Member

@JoshCu JoshCu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good work! if you could update those debug statements to use the printf syntax, Ill take a look at getting ngen to accept the hawaii coordinate system. I'll see if I can fix the geopackage to match conus too.

for reference about those debug messages https://coralogix.com/blog/python-logging-best-practices-tips/
usually it's an insignificant difference but in some of those graph functions it might make a difference?

Do not construct log messages eagerly
Clients can construct log messages in two ways: eagerly and lazily.

The client constructs the log message and passes it on to the logging method, e.g., logger.debug(f'Entering method Foo: {x=}, {y=}').
This approach offers formatting flexibility via f-strings and the format() method, but it involves the eager construction of log messages, i.e., before the logging statements are deemed as active.
The client provides a printf-style message format string (as a msg argument) and the values (as a args argument) to construct the log message to the logging method, e.g., logger.debug('Entering method %s: x=%d, y=%f', 'Foo', x, y). After the logging statement is deemed as active, the logger constructs the log message using the string formatting operator %.
This approach relies on an older and quirky string formatting feature of Python but it involves the lazy construction of log messages.
While both approaches result in the same outcome, they exhibit different performance characteristics due to the eagerness and laziness of message construction.

For example, on a typical laptop, a million inactive invocations of logger.debug('Test message {0}'.format(t)) takes 2197ms while a million inactive invocations of logger.debug('Test message %s', t) takes 1111ms when t is a list of four integers. In the case of a million active invocations, the first approach takes 11061ms and the second approach took 10149ms. A savings of 9–50% of the time taken for logging!

So, the second (lazy) approach is more performant than the first (eager) approach in cases of both inactive and active logging statements. Further, the gains would be larger when the message construction is non-trivial, e.g., use of many arguments, conversion of complex arguments to strings.

logger.debug(f"old geometry: {shapely_geometry}")
return new_geometry

def convert_gpkg_to_5070(gpkg):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might be able to fix this by updating https://github.com/CIROH-UA/ngen/blob/ngiab/include/geopackage/proj.hpp
but as long as this function is called after we calculate the forcings then it shouldn't matter how much we mangle the geometry, it's not actually used anywhere in ngen or t-route

)

if location == "hi":
convert_gpkg_to_5070(paths.geopackage_path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be a problem if the forcings are reprocessed without subsetting again. I'll see if I can update ngen to accept the hawaii coordinates.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got you boss
CIROH-UA/ngen#13

@JoshCu
Copy link
Member

JoshCu commented Jun 18, 2025

don't worry about those conflicts for now, I can resolve them when it gets merged

quinnylee and others added 4 commits June 18, 2025 13:41
remove references to conus_q and hawaii_q

Co-authored-by: Josh Cunningham <[email protected]>
@quinnylee
Copy link
Contributor Author

just for clarity, printf syntax refers to something like this?
logging.error("Python version: %s", sys.version)

@JoshCu
Copy link
Member

JoshCu commented Oct 10, 2025

latest oconus updates https://github.com/CIROH-UA/NGIAB_data_preprocess/tree/superconus

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants