…modernize-rebuild-scripts * origin/modernize-rebuild-scripts: Format code and sort imports
|
In working on the first few scripts, |
…ipts * origin/main: Format code and sort imports Reorganize rockd schema code Removed unused Rockd subsystem Break apart Rockd migrations and validate Updated Rockd migrations Basic migration is at least planned out Use submodule version of tile utils Updated ordering of migration checks returning tilejson if ANY lines polygons or points are available updating postgrest view permissions updating map ingest endpoint permissions updated sources postgrest endpoit Format code and sort imports add sources to postgrest Format code and sort imports fixing api v3
* stratigraphy-ingestion: (190 commits) Basic loading works Updated logging utils Refactor column units preparation Starting point for column ingestion Improve lithologies All tests pass Added failing tests for lithology ingestion Successfully integrate lithologies basic lithology tests pass Basic tests of lithology matching Starting points for database inserts Added basic database file Started working with metadata Start managing units table add a no-op cli Updated some typer dependencies Updated pyproject toml file Format code and sort imports Updated tileserver for paleogepgraphy layers Remove .idea project files from tracking ...
|
OK, I finished a round of updates on several of the scripts.
In general, I streamlined them and removed the Python loops that performed row-by-row updates (which was mostly the original approach).
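To illustrate the difference, here is a minimal sketch using an in-memory SQLite database. The table and column names are simplified stand-ins chosen for illustration, not the actual Macrostrat schema, and the real scripts run against PostgreSQL rather than SQLite.

```python
import sqlite3

# Illustrative tables only: the columns are assumptions, not the real schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE units (id INTEGER PRIMARY KEY, strat_name_id INTEGER);
    CREATE TABLE lookup_units (unit_id INTEGER PRIMARY KEY, strat_name_id INTEGER);
    INSERT INTO units VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO lookup_units (unit_id) VALUES (1), (2), (3);
    """
)


def update_row_by_row(conn: sqlite3.Connection) -> None:
    """Original style: one UPDATE statement per row, driven by a Python loop."""
    rows = conn.execute("SELECT id, strat_name_id FROM units").fetchall()
    for unit_id, strat_name_id in rows:
        conn.execute(
            "UPDATE lookup_units SET strat_name_id = ? WHERE unit_id = ?",
            (strat_name_id, unit_id),
        )


def update_set_based(conn: sqlite3.Connection) -> None:
    """Streamlined style: a single statement updates every row at once."""
    conn.execute(
        """
        UPDATE lookup_units
        SET strat_name_id = (
            SELECT u.strat_name_id FROM units u WHERE u.id = lookup_units.unit_id
        )
        """
    )


update_set_based(conn)
rows = conn.execute(
    "SELECT unit_id, strat_name_id FROM lookup_units ORDER BY unit_id"
).fetchall()
```

Both functions leave the table in the same state; the set-based version lets the database do the work in one statement instead of one round trip per row.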
|
* main: Format code and sort imports Add new commits to submodules Format code and sort imports updated image ext to .jpg. need to update photo url based on parameter inputs Format code and sort imports made the convert endpoint match CheckinData from the rockd create-edit-checkin route Format code and sort imports updating convert endpoint to accept all planar orientations in an observation
7bd3ea8 to 4baf4ae
|
@amyfromandi already found an error in the new version. It's small, but it emphasizes that we want to proceed carefully here. #256
…rostrat/macrostrat into modernize-rebuild-scripts * 'modernize-rebuild-scripts' of https://github.com/UW-Macrostrat/macrostrat: Format code and sort imports
|
FYI: v2 |
|
Since there is no |
|
Using these queries to diff the |
|
I tried optimizing the |
|
I refactored the rebuild scripts into the command-line interface. Below is an example of how to access the scripts:
I ran all of the rebuild scripts in staging, and below are the diffs. I need to review why autocomplete has a negative diff. I also wonder whether the scripts with a diff of -1 or -2 could be due to #231 (comment).
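For reference, a row-count diff of this kind could be computed roughly as follows. This is only a sketch under stated assumptions: it uses two throwaway in-memory SQLite databases with invented row counts, and the helper names (`count_rows`, `row_count_diff`, `make_db`) are hypothetical rather than part of the rebuild scripts.

```python
import sqlite3


def count_rows(conn: sqlite3.Connection, tables: list[str]) -> dict[str, int]:
    """Return the row count of each table (names come from a fixed list, not user input)."""
    return {
        table: conn.execute(f'SELECT count(*) FROM "{table}"').fetchone()[0]
        for table in tables
    }


def row_count_diff(old: dict[str, int], new: dict[str, int]) -> dict[str, int]:
    """New count minus old count per table; a negative value means rows were lost."""
    return {t: new.get(t, 0) - old.get(t, 0) for t in sorted(set(old) | set(new))}


def make_db(n_autocomplete: int, n_stats: int) -> sqlite3.Connection:
    """Build a toy database; the row counts are invented for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE autocomplete (id INTEGER)")
    conn.execute("CREATE TABLE stats (id INTEGER)")
    conn.executemany(
        "INSERT INTO autocomplete VALUES (?)", [(i,) for i in range(n_autocomplete)]
    )
    conn.executemany("INSERT INTO stats VALUES (?)", [(i,) for i in range(n_stats)])
    return conn


tables = ["autocomplete", "stats"]
old_counts = count_rows(make_db(5, 3), tables)  # stands in for the pre-rebuild state
new_counts = count_rows(make_db(4, 3), tables)  # stands in for the post-rebuild state
diff = row_count_diff(old_counts, new_counts)
```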
|
We don't have to do this now, but I expect that if we pre-validate the geometry column and pre-populate empty polygons where necessary, we can add a spatial index that will improve things. There are other ways to optimize this using topological relationships as well.
|
Also, it would be nice if there was a |
Just executing |
|
I found that the query below (from the |
Some proposed changes to how the code is wrapped together.
Take a look at https://github.com/UW-Macrostrat/macrostrat/blob/main/py-modules/map-integration/macrostrat/map_integration/process/__init__.py for an idea of the proposed structure (this is the root of the map scripts that have already been ported over).
Delete the old rebuild scripts from the v1 directory or move to an archive if you haven't already done that.
}

@cli.command()
I'd use the name="all" argument here to maintain parallelism with the old version (macrostrat rebuild all is easy to remember)
# ---------------------------------------------------------------------------

class Autocomplete:
It looks like most/all of these scripts could just be functions rather than classes, which would be simpler and allow easier integration with Typer
# ---------------------------------------------------------------------------
# Shared helpers (from lookup_units.py)
Maybe put these in a utils file? They are kind of less important overall.
UnitBoundaries,
)

return {
Consider making all of these scripts CLI commands in their own right, so that Typer's semantics can be used and arguments can be added easily to each.
The scripts command can be retained to run all of them in sequence.
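A minimal sketch of the structure suggested in this review, assuming Typer is installed. The step names are borrowed from the script list, but the function bodies and the `dry_run` option are placeholders invented for illustration, not the PR's actual code.

```python
import typer
from typer.testing import CliRunner

cli = typer.Typer(help="Rebuild database tools", no_args_is_help=True)


def autocomplete(dry_run: bool = False) -> None:
    """Rebuild the autocomplete table (placeholder body)."""
    typer.echo(f"autocomplete (dry_run={dry_run})")


def stats(dry_run: bool = False) -> None:
    """Rebuild the stats table (placeholder body)."""
    typer.echo(f"stats (dry_run={dry_run})")


# Plain functions instead of classes: each one is registered as its own
# command, so Typer derives options (here --dry-run) from the signature.
STEPS = {"autocomplete": autocomplete, "stats": stats}
for name, func in STEPS.items():
    cli.command(name=name)(func)


@cli.command(name="all")
def rebuild_all(dry_run: bool = False) -> None:
    """Run every rebuild step in sequence."""
    for func in STEPS.values():
        func(dry_run=dry_run)


# Quick check using Typer's test runner instead of a real shell.
runner = CliRunner()
result = runner.invoke(cli, ["all", "--dry-run"])
```

Because the steps are ordinary functions, the `all` command can call them directly, and each can later gain its own arguments without touching the others.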
from rich.console import Console
from typer import Option, Typer

cli = Typer(help="Rebuild database tools")
Use no_args_is_help=True
|
In reviewing the other scripts, I found some data parity issues that account for the 1-2 row count variances.
|
Ran into this same issue for strat_name_footprints. Fixed and now there is a 2k variance. This variance is because there are 3054 duplicate |
…he mariadb migration. this is fixed after the rebuild script is ran
Modernize stratigraphy rebuild scripts
macrostrat v1 rebuild <step> or macrostrat v1 rebuild all
Scripts
autocomplete
lookup_strat_names
lookup_unit_attrs_api
lookup_unit_intervals
lookup_units
pbdb_matches
stats
strat_name_footprints
unit_boundaries
Tasks for each script
Some of these scripts are outdated, so they may not work without some modification. But we should start with direct SQL translation where possible.
Many run lots of SQL and so might benefit from being run/tested against a local database, or on the cluster.
Our new macrostrat.database library should make the SQL a lot terser and easier to read.
Overall process
macrostrat schema and API results (number of rows, output structure)
v1 schema