* Issues/91 (#92)
* added citation creation tests and functionality to subscriber and downloader
* added verbose option to create_citation_file command, previously hard coded
* updated changelog (whoops) and fixed regression test:
1. Issue where the citation file now downloaded affected the counts
2. Issue where the logic for determining if a file modified time was changing or not was picking up the new citation file which _always_ gets rewritten to update the 'last accessed' date.
* updated request to include exec_info in warning; fixed issue with params not being a dictionary caused errors
* changed a warning to debug for citation file. fixed test issues
* Enable debug logging during regression tests and set max parallel workflows to 2
* added output to pytest
* fixed test to only look for downloaded data files, not the citation file, due to 'random' CMR errors when creating a citation.
* added mock testing and retry on 503
* added 503 fixes
Co-authored-by: Frank Greguska <[email protected]>
* fixed issues where token was not propagated to CMR queries (#95)
* Misc fixes (#101)
* added ".tiff" to default extensions to address #100
* removed 'warning' message on not downloading all data to close #99
* updated help documentation for start/end times to close #79
* added version update, updates to CHANGELOG
* added token get,delete, refresh and list operations
* Revert "added token get,delete, refresh and list operations"
This reverts commit 15aba90.
* Update python-app.yml
* updated poetry version
Version matches build/test versions.
* Issues/98 (#107)
* added token get,delete, refresh and list operations
* Revert "added token get,delete, refresh and list operations"
This reverts commit 15aba90.
* added EDL (not cmr-token) based get, list,delete, refresh token
* updated token regression tests
* updates and tests for subscriber moving to EDL.
* marked tests as regression test
* Update subscriber/podaac_data_downloader.py
Co-authored-by: Frank Greguska <[email protected]>
* Update subscriber/podaac_data_subscriber.py
Co-authored-by: Frank Greguska <[email protected]>
* Update subscriber/podaac_access.py
Co-authored-by: Frank Greguska <[email protected]>
* Update subscriber/podaac_access.py
Co-authored-by: Frank Greguska <[email protected]>
* Update subscriber/podaac_access.py
Co-authored-by: Frank Greguska <[email protected]>
* added exec info to errors, cleaned up some log statements
Co-authored-by: Frank Greguska <[email protected]>
* Issues/109 (#111)
* Develop (#103)
* Update README.md
* Update podaac_data_downloader.py
Fixing for issues 109 - adding capability to download by granule-name
* Update Downloader.md
Fixed the help file
* added changelog entries, regression tests
* added poetry lock cleanup
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: sureshshsv <[email protected]>
Co-authored-by: sureshshsv <[email protected]>
* added README information and updates (#113)
* fixed pymock issues... again
* Extension regex (#121)
* extend -e option to handle regular expressions (#115)
* Develop into Main (1.12.0) (#114)
* extend -e option to handle regular expressions
formerly, -e could not handle PTM_\d+ extensions without the user explicitly
calling all of them.
---------
Co-authored-by: mike-gangl <[email protected]>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: sureshshsv <[email protected]>
Co-authored-by: sureshshsv <[email protected]>
* added documentation and tests for regex
* converted defaults to regexes, added gtiff test
---------
Co-authored-by: Peter Mao <[email protected]>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: sureshshsv <[email protected]>
Co-authored-by: sureshshsv <[email protected]>
* closes 118. retries was never hit because range is not end inclusive. (#119)
* closes 118. retries was never hit because range is not end inclusive.
* updated test to catch now-thrown exception
* added --dry-run option, docs, and test cases (#124)
* added --dry-run option, docs, and test cases
* Update subscriber/podaac_data_downloader.py
Added more elegant way of download limit application
Co-authored-by: Stepheny Perez <[email protected]>
---------
Co-authored-by: Stepheny Perez <[email protected]>
* Issues/70 (#117)
* added code for updating version
* added changelog
* moved version check into __main__ instead of on import of the module
* added sorting of releases from github to find latest release.
* added authenticated (optional) access to GitHub API to prevent rate limiting
* separate out auth/token regression tests
* Issues/127 (#128)
* added token sensitivity filter to remove tokens from CMR queries
* added changelog updates
* updated some lingering merge issues (huh?)
* updated regression test
* updated ubuntu versions
* removed 18.04 ubuntu from workflows/actions
* version and documentation updates (#130)
---------
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: sureshshsv <[email protected]>
Co-authored-by: sureshshsv <[email protected]>
Co-authored-by: Peter Mao <[email protected]>
Co-authored-by: Stepheny Perez <[email protected]>
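The #119 entry above says the retry limit was never hit because `range` is not end inclusive. A minimal sketch of that class of off-by-one, assuming a hypothetical `fetch_with_retries` helper and retry count (neither is taken from the repository):

```python
def fetch_with_retries(fetch, max_retries=3):
    """Call `fetch` until it succeeds; raise after `max_retries` failed attempts.

    Hypothetical helper for illustration only, not the project's code.
    """
    # range(1, max_retries + 1) yields 1..max_retries inclusive.
    # With range(1, max_retries) the loop would stop at max_retries - 1,
    # so the `attempt == max_retries` check below could never be true.
    for attempt in range(1, max_retries + 1):
        try:
            return fetch()
        except OSError as err:
            if attempt == max_retries:
                raise RuntimeError(f"giving up after {attempt} attempts") from err


calls = []


def always_fails():
    """Stand-in for a request that keeps failing (e.g. a simulated 503)."""
    calls.append(1)
    raise OSError("simulated 503")
```

With the inclusive upper bound, the final failure surfaces as an exception, which matches the commit note that a test had to be updated to catch a now-thrown exception.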
CHANGELOG.md
@@ -3,6 +3,14 @@ All notable changes to this project will be documented in this file.
 
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
 
+## 1.13.0
+### Added
+- Added `--dry-run` option to subscriber and downloader to view the files that _would_ be downloaded without actually downloading them. [102](https://github.com/podaac/data-subscriber/issues/102)
+- Added new feature allowing regex to be used in the `--extensions`/`-e` option. For example, using `-e PTM_\\d+` would match data files like `filename.PTM_1`, `filename.PTM_2` and `filename.PTM_10`, instead of specifying all possible combinations (`-e PTM_1, -e PTM_2, ..., -e PTM_10`) [115](https://github.com/podaac/data-subscriber/issues/115)
+- Added check for updated version [70](https://github.com/podaac/data-subscriber/issues/70)
+- Removed CMR Token from log messages [127](https://github.com/podaac/data-subscriber/issues/127)
+
+
 ## 1.12.0
 ### Fixed
 - Added EDL based token downloading, removing CMR tokens [98](https://github.com/podaac/data-subscriber/issues/98),
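The regex-based `-e` behaviour described in the 1.13.0 entry can be sketched roughly as follows. This is an illustration under assumptions: the function name `matches_extension` is hypothetical, and anchoring each pattern at the end of the file name is a guess at the intent, not the project's confirmed implementation.

```python
import re


def matches_extension(filename, extension_patterns):
    """Return True if `filename` ends with a match for any pattern.

    Hypothetical helper; each pattern is treated as a regular expression.
    """
    for pattern in extension_patterns:
        # Assumption: anchor at the end of the name so the pattern acts
        # like a suffix test rather than matching anywhere in the name.
        if re.search(f"(?:{pattern})$", filename):
            return True
    return False


# The defaults are written as regexes too, so the dot is escaped.
patterns = [r"\.nc", r"\.h5", r"PTM_\d+"]
```

A single `PTM_\d+` pattern then covers `filename.PTM_1` through `filename.PTM_10` without listing each suffix.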
 Cycle number for determining downloads. can be repeated for multiple cycles
 -sd STARTDATE, --start-date STARTDATE
-The ISO date time before which data should be retrieved. For Example, --start-date 2021-01-14T00:00:00Z
+The ISO date time after which data should be retrieved. For Example, --start-date 2021-01-14T00:00:00Z
 -ed ENDDATE, --end-date ENDDATE
-The ISO date time after which data should be retrieved. For Example, --end-date 2021-01-14T00:00:00Z
--f, --force
-Flag to force downloading files that are listed in CMR query, even if the file exists and checksum matches
+The ISO date time before which data should be retrieved. For Example, --end-date 2021-01-14T00:00:00Z
+-f, --force Flag to force downloading files that are listed in CMR query, even if the file exists and checksum matches
 -b BBOX, --bounds BBOX
-The bounding rectangle to filter result in. Format is W Longitude,S Latitude,E Longitude,N Latitude without
-spaces. Due to an issue with parsing arguments, to use this command, please use the -b="-180,-90,180,90" syntax
-when calling from the command line. Default: "-180,-90,180,90".
+The bounding rectangle to filter result in. Format is W Longitude,S Latitude,E Longitude,N Latitude without spaces. Due to an issue with parsing arguments, to use this command, please use the -b="-180,-90,180,90" syntax when calling from the command line.
+Default: "-180,-90,180,90".
 -dc Flag to use cycle number for directory where data products will be downloaded.
 -dydoy Flag to use start time (Year/DOY) of downloaded data for directory where data products will be downloaded.
--dymd Flag to use start time (Year/Month/Day) of downloaded data for directory where data products will be
-downloaded.
+-dymd Flag to use start time (Year/Month/Day) of downloaded data for directory where data products will be downloaded.
 -dy Flag to use start time (Year) of downloaded data for directory where data products will be downloaded.
 --offset OFFSET Flag used to shift timestamp. Units are in hours, e.g. 10 or -10.
 -e EXTENSIONS, --extensions EXTENSIONS
-The extensions of products to download. Default is [.nc, .h5, .zip, .tar.gz]
--gr GRANULE, --granule-name GRANULE
-The name of the granule to download. Only one granule name can be specified. Script will download all files matching similar granule name sans extension.
+Regexps of extensions of products to download. Default is [.nc, .h5, .zip, .tar.gz, .tiff]
+-gr GRANULENAME, --granule-name GRANULENAME
+Flag to download specific granule from a collection. This parameter can only be used if you know the granule name. Only one granule name can be supplied
 --process PROCESS_CMD
 Processing command to run on each downloaded file (e.g., compression). Can be specified multiple times.
 --version Display script version information and exit.
 --verbose Verbose mode.
 -p PROVIDER, --provider PROVIDER
 Specify a provider for collection search. Default is POCLOUD.
---limit LIMIT Integer limit for number of granules to download. Useful in testing. Defaults to 2000
+--limit LIMIT Integer limit for number of granules to download. Useful in testing. Defaults to no limit.
+--dry-run Search and identify files to download, but do not actually download them
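How `--limit` (now defaulting to no limit) and `--dry-run` could interact is sketched below. The function `process_granules` and its parameters are hypothetical names chosen for illustration; the actual downloader may structure this differently.

```python
from itertools import islice


def process_granules(urls, download, limit=None, dry_run=False):
    """Select up to `limit` URLs and either download or just report them.

    Hypothetical sketch; `download` is any callable that takes one URL.
    """
    # islice with a stop of None passes every item through, which models
    # the "no limit" default without a special case.
    selected = list(islice(urls, limit))
    for url in selected:
        if dry_run:
            # Report the file that would be fetched, but skip the transfer.
            print(f"DRY-RUN: would download {url}")
        else:
            download(url)
    return selected
```

Passing `limit=None` straight to `islice` avoids a separate branch for the unlimited case.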
-Some collections have many files. To download a specific set of files, you can set the extensions on which downloads are filtered. By default, ".nc", ".h5", and ".zip" files are downloaded by default.
+Some collections have many files. To download a specific set of files, you can set the extensions on which downloads are filtered. By default, ".nc", ".h5", and ".zip" files are downloaded. The `-e` option is a regular expression check, so you can do advanced things like `-e PTM_\\d+` to match `PTM_` followed by one or more digits. This is useful when the ending of a file has no suffix and has a number (1-12 for PTM, in this example).
 
 ```
 -e EXTENSIONS, --extensions EXTENSIONS
-The extensions of products to download. Default is [.nc, .h5, .zip]
+Regexps of extensions of products to download. Default is [.nc, .h5, .zip, .tar.gz, .tiff]
 ```
 
 An example of the -e usage (note that the -e option is additive):
 Using the `--process` option, you can run a simple command against the just-downloaded file. This will take the format of "<command> <path/to/file>". This means you can run a command like `--process gzip` to gzip all downloaded files. We do not support more advanced processes at this time (piping, running a process on a directory, etc.).
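The "<command> <path/to/file>" form described above could be run along these lines. This is a sketch under assumptions: `run_process_command` is a hypothetical name, and splitting the command with `shlex` without a shell is one plausible reading of why piping is unsupported, not a statement about the project's code.

```python
import shlex
import subprocess
import sys


def run_process_command(command, file_path):
    """Run `<command> <path/to/file>` and return the exit code.

    Hypothetical helper for illustration only.
    """
    # shlex.split lets the command carry its own arguments (e.g. "gzip -k").
    # No shell is involved, so pipes and redirects are not interpreted.
    args = shlex.split(command) + [str(file_path)]
    return subprocess.run(args, check=False).returncode


# Stand-in command for demonstration: the current Python interpreter,
# exiting 0 only when the appended file name is the expected one.
check_cmd = shlex.join(
    [sys.executable, "-c", "import sys; sys.exit(0 if sys.argv[1] == 'data.nc' else 1)"]
)
```

Calling `run_process_command("gzip", path)` would then be the equivalent of `--process gzip` for a single downloaded file.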
 And then run the script. This should give you more verbose output on URL requests to CMR, tokens, etc.
 
+### OTHER OPTIONS
+
+The podaac downloader and subscriber make calls to GitHub to check for recent releases. Unauthenticated requests are limited to 60 per hour. If you start seeing errors like:
+```
+releases_json = {'documentation_url': 'https://docs.github.com/rest/overview/resources-in-the-rest-api#rate-limiting', 'message': "API... here's the good news: Authenticated requests get a higher rate limit. Check out the documentation for more details.)"}
+```
+You'll want to set the environment variable GITHUB_TOKEN to a GitHub personal access token, which allows for up to 5000 calls per hour. This requires a free GitHub account. Most users will not run into this issue.
+
 
 ### In need of Help?
 The PO.DAAC User Services Office is the primary point of contact for answering your questions concerning data and information held by the PO.DAAC. User Services staff members are knowledgeable about both the data ordering system and the data products themselves. We answer questions about data, route requests to other DAACs, and direct questions we cannot answer to the appropriate information source.
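For the release check described under OTHER OPTIONS, a rough sketch of an optionally authenticated request and of picking the newest release is shown below. The function names are hypothetical, the repository path is taken from the changelog links, and the version sorting assumes purely numeric dotted tags such as `1.13.0`; none of this is confirmed to match the project's implementation.

```python
import json
import os
import urllib.request

# Repository path inferred from the issue links in CHANGELOG.md.
RELEASES_URL = "https://api.github.com/repos/podaac/data-subscriber/releases"


def github_headers(token=None):
    """Build request headers, adding authentication only when a token is set."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


def version_key(tag):
    """Turn '1.13.0' or 'v1.13.0' into (1, 13, 0) so tags sort numerically.

    Assumption: tags contain only dot-separated integers.
    """
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def latest_tag(releases):
    """Return the highest version among release dicts with a 'tag_name' key."""
    return max((release["tag_name"] for release in releases), key=version_key)


def fetch_releases(url=RELEASES_URL):
    """Fetch the release list, using GITHUB_TOKEN from the environment if present."""
    request = urllib.request.Request(
        url, headers=github_headers(os.environ.get("GITHUB_TOKEN"))
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Sorting on a numeric tuple rather than the raw string matters because `"1.9.0"` compares greater than `"1.13.0"` as text.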