
Conversation

@ldecicco-USGS
Collaborator

No description provided.

ldecicco-USGS and others added 30 commits August 8, 2025 09:28
Merge branch 'main' of github.com:DOI-USGS/dataRetrieval into develop

# Conflicts:
#	tutorials/images/publish.png
Merge branch 'main' of github.com:DOI-USGS/dataRetrieval

# Conflicts:
#	tutorials/basic_slides_deck.qmd
#	tutorials/changes_slides_deck.qmd
Update dependency requirement: `curl` >=7.0.0
@ldecicco-USGS
Collaborator Author

@ehinman or @jzemmels - if either of you could take a quick look - this PR adds the latest-daily service. Additionally, a roxygen update changes all the Rd files, but the differences are pretty subtle. A few other minor updates in here actually came from pre-shutdown.

@ehinman
Collaborator

ehinman commented Nov 18, 2025

I can take a look this afternoon into tomorrow.

bbox = NA,
limit = NA,
max_results = NA,
convertType = TRUE){
Collaborator

Is there a reason for the order of the input parameters in the documentation and the function itself? The orders are different.

Collaborator Author

The order in the documentation doesn't matter. The order in the function does matter: a user doesn't necessarily need to name each argument, but if they don't, they need to supply the arguments in the correct order. I'll double-check that the order matches the other functions, because what we don't want to do is change it later.
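In R, unnamed arguments are matched by position against the order in the function definition, while named arguments can appear in any order. A minimal sketch with a hypothetical function (not from dataRetrieval) mirroring the signature style above:

```r
# Hypothetical function, for illustration only.
f <- function(bbox = NA, limit = NA, max_results = NA) {
  list(bbox = bbox, limit = limit, max_results = max_results)
}

f(NA, 10)$limit                       # positional: the second argument is limit
f(max_results = 5, limit = 10)$limit  # named: order is irrelevant
```

This is why reordering arguments later can silently break callers that rely on position.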

- return_list <- data.frame(matrix(nrow = 0, ncol = length(properties)))
+ return_list <- data.frame(matrix(nrow = 0,
+                                  ncol = length(properties)))
return_list <- lapply(return_list, as.character)
Collaborator

Is this needed? Ensures that each empty column's format is character?

Collaborator Author

yeah, otherwise it comes back as logical.
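This matches base R behavior: a data frame built from an all-NA zero-row matrix defaults to logical columns, so the `lapply(..., as.character)` step forces them to character. A quick illustration:

```r
# A zero-row data frame from an empty matrix: columns default to logical.
df <- data.frame(matrix(nrow = 0, ncol = 3))
sapply(df, class)   # every column starts out "logical"

# Coerce each (empty) column to character, as in the function above.
# Note lapply() returns a plain list, not a data frame.
df <- lapply(df, as.character)
sapply(df, class)   # now every element is "character"
```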

# no service specified:
availableData <- read_waterdata_ts_meta(monitoring_location_id = "USGS-05114000")
- expect_equal(ncol(availableData), 18)
+ expect_gte(ncol(availableData), 18)
Collaborator

Well that's snazzy.


- That's a LOT of columns that come back. We won't look at them here, but let's jump over to RStudio to look through the results.
+ That's a LOT of columns that come back. We won't look at them here, but you can use `View` in RStudio to explore on your own.
Collaborator

Suggested change:
- That's a LOT of columns that come back. We won't look at them here, but you can use `View` in RStudio to explore on your own.
+ That's a LOT of columns that came back. We won't look at them here, but you can use `View` in RStudio to explore on your own.

Collaborator

Hmmm, now re-reading, I'm not sure which would be better. Could say "That's a LOT of columns returned."

pal <- colorNumeric("viridis", latest_dane_county_daily$value)
leaflet(data = latest_dane_county_daily |>
sf::st_transform(crs = leaflet_crs)) |>
Collaborator

Suggest individually defining `leaflet_crs` in case someone skips to a particular section and doesn't know where it was originally defined.

Collaborator

@ehinman left a comment

First pass looks good, just want to test out the new function briefly tomorrow.

#' @param monitoring_location_id `r get_params("latest-daily")$monitoring_location_id`
#' @param parameter_code `r get_params("latest-daily")$parameter_code`
#' @param statistic_id `r get_params("latest-daily")$statistic_id`
#' @param time `r get_params("latest-daily")$time`
Collaborator

time seems like a weird input here. I tried the following:

test = read_waterdata_latest_daily(monitoring_location_id = "USGS-12456500", time = "2018-02-12T23:20:50Z")

And it returns zero results. Is there ever a time this input would be helpful? Perhaps I used it incorrectly.

Collaborator Author

I think if you did an exact time, probably not. But you could have a scenario where you have a list of sites or a bounding box and say "give me the latest results for the last 30 days", so you'd only get back sites that have new data (instead of sites whose latest data point was 30 years ago or something).
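Assuming the service accepts ISO-8601 open-ended intervals of the form `start/..` (the notation used elsewhere in this thread), a "last 30 days" filter string could be built like this sketch:

```r
# Sketch only: build an open-ended ISO-8601 interval covering the last 30 days.
# The "start/.." interval form is assumed from the example later in this thread.
start <- format(Sys.Date() - 30, "%Y-%m-%dT00:00:00Z")
time_filter <- paste0(start, "/..")
time_filter  # something like "2025-10-25T00:00:00Z/.."
```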

#' @param time `r get_params("latest-daily")$time`
#' @param value `r get_params("latest-daily")$value`
#' @param unit_of_measure `r get_params("latest-daily")$unit_of_measure`
#' @param approval_status `r get_params("latest-daily")$approval_status`
Collaborator

Suggest adding somewhere (whether in function documentation or in examples) that the "latest" will often be "Provisional" unless the gage is currently non-functional and the latest is from a while ago.

Collaborator Author

I would prefer we petition the WaterData API folks to do that, so our documentation doesn't start to deviate.

#'
#' \donttest{
#' site <- "USGS-02238500"
#' pcode <- "00060"
Collaborator

You set the pcode here but do not use it below in the examples.

#' @param skipGeometry This option can be used to skip response geometries for
#' each feature. The returning object will be a data frame with no spatial
#' information.
#' @param convertType logical, defaults to `TRUE`. If `TRUE`, the function
Collaborator

The `last_modified` column is POSIXct regardless of whether `convertType` is TRUE or FALSE.

Collaborator

@ehinman left a comment

This looks good, Laura. Thanks for the opportunity to review. I went through the diffs, tested the latest daily function, and took a look at the built documentation page for the new function.

Left some minor comments peppered throughout. My only additional thought is about the time input. Your response makes total sense for filtering to sites that have a latest daily measurement within a certain time range. One thing I noticed when looking at this example:
test = read_waterdata_latest_daily(bbox = c(-112.249246,40.516572,-111.293435,41.247515), parameter_code = '00060', convertType = TRUE, time = "2025-11-18T00:00:00Z/..")

As expected, it returns all latest daily measurements taken yesterday, but the last_modified column is from ~6:30AM UTC (midnight-ish central time, I think?) on 11-19-2025. Presumably, this is when the "daily" calculations are made for this group of gages. I'm wondering if somewhere we should alert people that (depending on their time zone) they shouldn't expect the latest daily from the previous day to be available at midnight, but maybe one or two hours after midnight: for automated pulls and the like.

Collaborator

@jzemmels left a comment

Updates look good. Tested out the new function a few ways and didn't run into issues, other than a minor comment. Approved.

Also tested out building the documentation/pkgdown sites. Some findings, which you're free to ignore:

  • pkgdown.yml needs to have read_waterdata_latest_daily added to the reference section.
  • These lines threw an error when building vignettes. The fix was to remove the list() wrapping around function arguments.
  • Suggest adding the zoo and maps packages to the DESCRIPTION Suggests section, as they're required for running the vignettes.

Collaborator

I'm trying to remember: was there a reason why we didn't want to add parameter-codes as a possible endpoint for this function? I tested out a few CQL statements and it seemed to work.

Collaborator Author

The zoo and maps packages are only in vignettes that we added to .Rbuildignore. We want the CRAN version of dataRetrieval to be nice and streamlined, with as few imports and suggested packages as possible. So that is why they are not on the Suggests list.

And whoo-boy...that movingAverage vignette is old-school! If you can't tell from the graphs, it is the precursor to HASP. I feel like we should delete all the text and say: if you are interested in doing those calculations, see
https://doi-usgs.github.io/HASP/
(and I am very, very close to pushing a big update to HASP that will use all the waterdata functions).

Collaborator Author

The read_waterdata function was first developed before the parameter-code was an endpoint. I'll update it - it also needs field measurements and now latest-daily.

time = NA_character_,
bbox = NA,
limit = NA,
max_results = NA,
Collaborator

Perhaps I'm still misunderstanding the usage of these two arguments, but max_results doesn't seem to change the size of the output. Examples:

read_waterdata_latest_daily(monitoring_location_id = "USGS-01491000")
read_waterdata_latest_daily(monitoring_location_id = "USGS-01491000", max_results = 10)
read_waterdata_latest_daily(monitoring_location_id = "USGS-01491000", limit = 10)
read_waterdata_latest_daily(monitoring_location_id = "USGS-01491000", max_results = 5, limit = 10)

The same happens for other read_waterdata_ functions.

Collaborator Author

Fixed! The args got moved around and max_results became meaningless. Should be good to go now.

return_list$value <- as.numeric()
}

if(convertType && "contributing_drainage_area" %in% names(return_list)){
Collaborator

Just ran into a new, rather specific error.

Reprex:

dplyr::bind_rows(
    read_waterdata_monitoring_location(monitoring_location_id = "USGS-092403167282001"),
    read_waterdata_monitoring_location(monitoring_location_id = "USGS-01435000")
)

class(read_waterdata_monitoring_location(monitoring_location_id = "USGS-092403167282001")$drainage_area)
class(read_waterdata_monitoring_location(monitoring_location_id = "USGS-01435000")$drainage_area)

Only dplyr::bind_rows complains; rbind just casts the numeric drainage area to character.

I may be wrong, but I think adding an if statement for "drainage_area" here should fix it.
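The difference in strictness is easy to reproduce with toy data frames (not the actual service output): `dplyr::bind_rows()` refuses to combine a character column with a numeric one, while base `rbind()` silently coerces everything to character.

```r
# Toy stand-ins for the two sites' results: same column name, different type.
a <- data.frame(drainage_area = "120")   # character, as one site returns it
b <- data.frame(drainage_area = 120.5)   # numeric, as the other returns it

# dplyr::bind_rows(a, b)  # errors: can't combine <character> and <double>

combined <- rbind(a, b)        # base R coerces the numeric column silently
class(combined$drainage_area)  # "character"
```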

Collaborator Author

I didn't realize there were both contributing_drainage_area and drainage_area. We have contributing_drainage_area in there; I just need to add drainage_area to the list in cleanup_cols.

convertType = TRUE){

service <- "latest-daily"
output_id <- "latest_daily_id"
Collaborator

Random thought: should this just be "daily_id"?

Collaborator Author

@ldecicco-USGS Nov 24, 2025

yup! nope

@ldecicco-USGS
Collaborator Author

> This looks good, Laura. [...] I'm wondering if somewhere we should alert people that (depending on their time zone) they shouldn't expect the latest daily from the previous day to be available at midnight, but maybe one or two hours after midnight: for automated pulls and the like.

I'm going to keep this in mind for the continuous endpoint (coming next!), where we have to deal a lot more with timezones.

@ldecicco-USGS ldecicco-USGS merged commit 0d998c4 into DOI-USGS:develop Nov 24, 2025
1 check passed