Skip to content

Added gh_fill.R#35

Open
gvanweerden wants to merge 3 commits into
MichaelChirico:masterfrom
gvanweerden:master
Open

Added gh_fill.R#35
gvanweerden wants to merge 3 commits into
MichaelChirico:masterfrom
gvanweerden:master

Conversation

@gvanweerden
Copy link
Copy Markdown

Added a function to fill geohashes with the contained geohashes.

Initial add of function to fill geohash with contained geohashes
Comment thread R/gh_fill.R
@@ -0,0 +1,28 @@
#' Fill geohash prefix with members
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The package isn't currently using roxygen2, so you'll need to either write the .Rd directly, or convert the rest of the package to use roxygen2.

Comment thread R/gh_fill.R


gh_fill <- function(geohashes, precision) {
if (uniqueN(nchar(geohashes)) > 1) {
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uniqueN is a data.table function, which we don't import. length(unique()) should be fine here.

Comment thread R/gh_fill.R
unlist(strsplit("0123456789bcdefghjkmnpqrstuvwxyz", split = ""))

grid <-
do.call(data.table::CJ, append(list(geohashes), replicate(new_levels, base32, FALSE)))
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you do a performance comparison of CJ() vs. expand.grid() for typical inputs? E.g. 1e4 inputs and adding 1-3 characters of precision...

Depending on that we'll add the data.table dependency

Comment thread R/gh_fill.R
if (uniqueN(nchar(geohashes)) > 1) {
stop("Input Geohashes must all have the same precision level.")
}
if (sum(grepl("['ailo]", geohashes)) > 0) {
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prefer any(grepl()).

also, elsewhere in the package, we allow upper-case input, we should strive to match case here.

Comment thread R/gh_fill.R
#' @param geohashes Character vector of input geohashes. They must all be of same precision
#' @param precision Positive integer scalar controlling the 'zoom level' – how many characters should be used in the output.
#' @return Character vector of geohashes corresponding to the input.
#' @export
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add examples and tests, it will especially make it easier to review when I can see expected inputs/outputs

Comment thread R/gh_fill.R

new_levels <- precision - nchar(geohashes[1])

base32 <-
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's probably better to run this at compile-time in the package environment, rather than every time gh_fill() is invoked

@MichaelChirico
Copy link
Copy Markdown
Owner

Thanks Garnet! Sorry for the long delay. I no longer work with geohashes very much so this package has lost priority (open to new maintainers!).

If I understand right, the function would indeed be useful. Needs some package "productionization" stuff, and I'd like to justify adding data.table as a dependency if you don't mind.

@MichaelChirico
Copy link
Copy Markdown
Owner

Friendly ping if you'd like this merged, I've got a new version ready to submit soon.

@gvanweerden
Copy link
Copy Markdown
Author

Hi Michael, I have added the changes that you referenced in the earlier messages. I hope it's been done in an okay manner.

I wrote a test to compare data.table::CJ() vs base::expand.grid. Often when I use this function it is to create vectors in the 10's of millions. I'm just not sure if the speed increase warrants a dependency on data.table.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants