-
Notifications
You must be signed in to change notification settings - Fork 52
adding support for verbatim extension downloads #831
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request adds support for downloading DWCA verbatim extensions in GBIF occurrence data downloads. The feature allows users to specify additional extension tables to be included in DWCA format downloads.
Changes:
- Added
verbatim_extensionsparameter toocc_download()andocc_download_prep()functions - Updated documentation in vignette explaining the new feature with examples
- Added test coverage for the new functionality
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| R/occ_download.R | Added verbatim_extensions parameter to download functions and updated parameter documentation |
| R/download_predicate_dsl.R | Modified parse_predicates() to include verbatimExtensions in API payload with format validation |
| man/occ_download.Rd | Added documentation for the new verbatim_extensions parameter |
| vignettes/getting_occurrence_data.Rmd | Added section explaining how to use verbatim DWCA extensions with examples |
| tests/testthat/test-occ_download.R | Added test case for downloads with verbatim extensions |
| tests/testthat/test-download_parsing.R | Added test for verbatim_extensions parsing and updated existing tests to include the new parameter |
| tests/fixtures/occ_download_9.yml | Added fixture file for verbatim extensions test case |
| DESCRIPTION | Version bump from 3.8.4.3 to 3.8.4.4 |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| 'SIMPLE_CSV', or 'SPECIES_LIST'} | ||
| \item{verbatim_extensions}{(character vector) A character vector of verbatim | ||
| extensions to include in the download.} |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation for the verbatim_extensions parameter should mention that it only works with format="DWCA" and will be ignored for other formats. This helps users understand the parameter's constraints upfront.
| extensions to include in the download.} | |
| extensions to include in the download. Only applicable when \code{format = "DWCA"}; | |
| ignored for other formats.} |
| test_that("parse_predicates verbatim_extensions works", { | ||
| ve <- c("http://rs.tdwg.org/dwc/terms/MeasurementOrFact", | ||
| "http://rs.gbif.org/terms/1.0/Multimedia") | ||
|
|
||
| aa <- parse_predicates("john", "email", "and", "DWCA", ve, pred("taxonKey", 22)) | ||
|
|
||
| expect_is(aa, "list") | ||
| expect_named(aa, c("creator", "notification_address", "format", "verbatimExtensions", "predicate")) | ||
| expect_equal(unclass(aa$verbatimExtensions), ve) | ||
| }) |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing test coverage for the warning that should be triggered when verbatim_extensions is provided with a format other than "DWCA". Add a test case that verifies the warning message is issued when using verbatim_extensions with format="SIMPLE_CSV" or format="SPECIES_LIST".
| #' # unique(df$basisOfRecord) | ||
| #' # unique(df$taxonKey) | ||
| #' # unique(df$countryCode) | ||
| #' # sort(unique(df$year)) |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding an example demonstrating the use of verbatim_extensions parameter in the examples section. This would help users understand how to use this new feature.
| #' # sort(unique(df$year)) | |
| #' # sort(unique(df$year)) | |
| #' | |
| #' ## example using verbatim_extensions | |
| #' # x_verbatim <- occ_download( | |
| #' # pred("taxonKey", 9206251), | |
| #' # format = "SIMPLE_CSV", | |
| #' # verbatim_extensions = c("gbif:Multimedia") | |
| #' # ) | |
| #' # occ_download_meta(x_verbatim) |
| bbb <- occ_download( | ||
| pred("taxonKey",22), | ||
| format = "DWCA", | ||
| verbatim_extensions= |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a space after the equals sign on line 192 to match the formatting style used elsewhere in the codebase (e.g., lines 91, 111, 131, 151, 171 all have 'format = "..."').
| verbatim_extensions= | |
| verbatim_extensions = |
| if (!is.null(verbatim_extensions)) { | ||
| if(format != "DWCA") { | ||
| warning("verbatim_extensions can only be used with format = 'DWCA'. Ignoring verbatim_extensions.") | ||
| } | ||
| payload <- list( | ||
| creator = unbox(user), | ||
| notification_address = email, | ||
| format = unbox(format), | ||
| verbatimExtensions = verbatim_extensions, | ||
| predicate = list() | ||
| ) | ||
| } else { | ||
| payload <- list( | ||
| creator = unbox(user), | ||
| notification_address = email, | ||
| format = unbox(format), | ||
| predicate = list() | ||
| ) |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The warning says verbatim_extensions will be ignored when format is not "DWCA", but the code still includes verbatimExtensions in the payload regardless of the format. The conditional logic should only add verbatimExtensions to the payload when format equals "DWCA". Consider restructuring this to add verbatimExtensions to the payload only when both conditions are met: verbatim_extensions is not NULL AND format is "DWCA".
| if (!is.null(verbatim_extensions)) { | |
| if(format != "DWCA") { | |
| warning("verbatim_extensions can only be used with format = 'DWCA'. Ignoring verbatim_extensions.") | |
| } | |
| payload <- list( | |
| creator = unbox(user), | |
| notification_address = email, | |
| format = unbox(format), | |
| verbatimExtensions = verbatim_extensions, | |
| predicate = list() | |
| ) | |
| } else { | |
| payload <- list( | |
| creator = unbox(user), | |
| notification_address = email, | |
| format = unbox(format), | |
| predicate = list() | |
| ) | |
| payload <- list( | |
| creator = unbox(user), | |
| notification_address = email, | |
| format = unbox(format), | |
| predicate = list() | |
| ) | |
| if (!is.null(verbatim_extensions)) { | |
| if (format != "DWCA") { | |
| warning("verbatim_extensions can only be used with format = 'DWCA'. Ignoring verbatim_extensions.") | |
| } else { | |
| payload$verbatimExtensions <- verbatim_extensions | |
| } |
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
#829