-
Notifications
You must be signed in to change notification settings - Fork 2
Aws iw pipeline uplift #30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks really good to me! Thanks for working on this! I've put a few questions/comments in, but for the most part I think this is a great first go at adding capability to produce stac metadata.
One question I have is whether you think any of the functionality you've added needs tests, and whether you want to write these as part of this PR, or come back to them?
| config_path : Path, | ||
| make_folders : bool = True, | ||
| ): | ||
| """Download the required data for the RTC/opera and create a configuration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Click tends toward doc strings that match Unix command help utils, so we should probably try and follow that.
Specifically, this involves a slightly different formatting of help information for click arguments: https://click.palletsprojects.com/en/stable/documentation/#documenting-arguments
| raise FileNotFoundError(f"Metadata file not found: {self.file_path}") | ||
|
|
||
| # add product stac extension properties | ||
| self.item.properties['product:type'] = 'NRB' # or RTC ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we flag this somewhere as something we'll need to review in future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just add TODO in front?
|
|
||
| # remove polarizations we don't have from the required products | ||
| # e.g. don't try add HH if it did not exist in original source data | ||
| IGNORE_ASSETS = [f'_{p}.tif' for p in ['HH','HV','VV','VH'] if p not in self.item.properties['sar:polarizations']] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to simplify this logic to focus directly on only including files that are mentioned in sar:polarizations -- or will that miss something? There's a bit of double-negative going on here that might be easier to read if it's explicit inclusion rather than explicit exclusion. Not a big deal though, code is readable as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Caitlin, agree it could be nicer. My thinking is that it was nice to have all of the files explicitly listed in the REQUIRED_ASSET_FILETYPES, then remove those that don't exist for a given product. I can also implement an append to this list that happens here instead. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I misunderstood! I thought you could do something like INCLUDED_ASSETS = [f'_{p}.tif' for p in self.item.properties['sar:polarizations']], but missed that there are files other than polarisation that you're looking to include.
With your comment on the append, is the suggestion to remove the polarisations from the REQUIRED_ASSET_FILETYPES, and just append the identified polarisation files using something similar to my one line above? I think that would work, but I also now understand your logic better, so would be totally fine to leave it as is so the REQUIRED_ASSET_FILETYPES is complete.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah you were understanding my suggestion right. Cool, I think I will leave it as is so the REQUIRED_ASSET_FILETYPES is complete.
|
Thanks @caitlinadams. Re tests they are certainly needed for a lot of this. My thinking was they could come at a later date once things like STAC logic have been finalised. This was largely to get an updated version for Aman to test. My next PRs will focus on tests and better logging |
Yep, totally reasonable! In that case, I think the full docs update of the CLI could also wait (I've had this on my to-do list for other CLI functions for a while anyway!) For the others, they're mostly about readability/user experience, so happy for those changes to come in a future version if you think it makes sense. |
|
Sounds good. I think I might update the CLI docs now anyway while I incorporate the other suggestions |
|
@caitlinadams I think I have covered all the changes including the improved cli docs |
Major uplift to the aws pipeline. Key changes include:
collectionas a parameter that can be specified at the top level of the workflow