Commit b5439d0

Merge pull request #151 from mountainMath/1996
Merging dev into master. Will update to CRAN later this week. I'm thinking we roll this out now with the changes to make 1996 work seamlessly and worry about the rest later. Not sure yet how we want to surface the geometry intersection capabilities - let's leave that for 0.3.3.
2 parents ebc6b8a + 724d570 commit b5439d0

42 files changed

Lines changed: 268 additions & 197 deletions

DESCRIPTION

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 Package: cancensus
 Type: Package
 Title: Access, Retrieve, and Work with Canadian Census Data and Geography
-Version: 0.3.1
+Version: 0.3.2
 Authors@R: c(
 person("Jens", "von Bergmann", email = "jens@mountainmath.ca", role = c("aut"), comment = "API creator and maintainer"),
 person("Dmitry", "Shkolnik", email = "shkolnikd@gmail.com", role = c("aut", "cre"), comment = "Package maintainer, responsible for correspondence"),

NEWS.md

Lines changed: 9 additions & 0 deletions
@@ -1,3 +1,12 @@
+# cancensus 0.3.2
+
+## Major changes
+- Support for 1996 census
+- Public availability of dissemination block area level data
+
+## Minor changes
+- Fixes [bug](https://github.com/mountainMath/cancensus/issues/150) in `find_census_vectors()`
+
 # cancensus 0.3.1

 ## Minor changes

R/cancensus.R

Lines changed: 13 additions & 29 deletions
@@ -9,7 +9,7 @@
 #'
 #' @param dataset A CensusMapper dataset identifier.
 #' @param regions A named list of census regions to retrieve. Names must be valid census aggregation levels.
-#' @param level The census aggregation level to retrieve, defaults to \code{"Regions"}. One of \code{"Regions"}, \code{"PR"}, \code{"CMA"}, \code{"CD"}, \code{"CSD"}, \code{"CT"} or \code{"DA"}.
+#' @param level The census aggregation level to retrieve, defaults to \code{"Regions"}. One of \code{"Regions"}, \code{"PR"}, \code{"CMA"}, \code{"CD"}, \code{"CSD"}, \code{"CT"}, \code{"DA"}, \code{"EA"} (for 1996), or \code{"DB"} (for 2001-2016).
 #' @param vectors An R vector containing the CensusMapper variable names of the census variables to download. If no vectors are specified only geographic data will get downloaded.
 #' @param geo_format By default is set to \code{NA} and appends no geographic information. To include geographic information with census data, specify one of either \code{"sf"} to return an \code{\link[sf]{sf}} object (requires the \code{sf} package) or \code{"sp"} to return a \code{\link[sp]{SpatialPolygonsDataFrame-class}} object (requires the \code{rgdal} package).
 #' @param labels Set to "detailed" by default, but truncated Census variable names can be selected by setting labels = "short". Use \code{label_vectors(...)} to return variable label information in detail.
@@ -54,11 +54,11 @@ get_census <- function (dataset, regions, level=NA, vectors=c(), geo_format = NA

 # Turn the region list into a valid JSON dictionary.
 if (is.character(regions)) {
-if (!quiet) warning(paste("passing `regions` as a character vector is",
+if (!quiet) warning(paste("Passing `regions` as a character vector is",
 "deprecated, and will be removed in future",
 "versions"))
 } else if (is.null(names(regions)) || !all(names(regions) %in% VALID_LEVELS)) {
-stop("regions must be composed of valid census aggregation levels.")
+stop("Regions must be composed of valid census aggregation levels.")
 } else {
 regions <- jsonlite::toJSON(lapply(regions,as.character)) # cast to character in case regions are supplied as numeric/integer
 }
@@ -81,7 +81,7 @@ get_census <- function (dataset, regions, level=NA, vectors=c(), geo_format = NA

 # Check if the aggregation level is valid.
 if (!level %in% VALID_LEVELS) {
-stop("the `level` parameter must be one of 'Regions', 'PR', 'CMA', 'CD', 'CSD', 'CT', or 'DA'")
+stop("the `level` parameter must be one of 'Regions', 'PR', 'CMA', 'CD', 'CSD', 'CT', 'DA', 'EA' or 'DB'")
 }

 # Check that we can read the requested geo format.
@@ -115,26 +115,16 @@ get_census <- function (dataset, regions, level=NA, vectors=c(), geo_format = NA
 httr::GET(url)
 }
 handle_cm_status_code(response, NULL)
-na_strings <- c("x", "F", "...", "..", "-","N","*","**")
-
-as.num = function(x, na.strings = "NA") {
-stopifnot(is.character(x))
-na = x %in% na.strings
-x[na] = 0
-x = as.numeric(x)
-x[na] = NA_real_
-x
-}
+

 # Read the data file and transform to proper data types
 result <- if (requireNamespace("readr", quietly = TRUE)) {
 # Use readr::read_csv if it's available.
 httr::content(response, type = "text", encoding = "UTF-8") %>%
-readr::read_csv(na = na_strings,
+readr::read_csv(na = cancensus_na_strings,
 col_types = list(.default = "c")) %>%
 dplyr::mutate_at(c(dplyr::intersect(names(.),c("Population","Households","Dwellings","Area (sq km)")),
-names(.)[grepl("v_",names(.))]),
-as.num,na.strings=na_strings) %>%
+names(.)[grepl("v_",names(.))]), as.num) %>%
 dplyr::mutate(Type = as.factor(.data$Type),
 `Region Name` = as.factor(.data$`Region Name`))
 } else {
@@ -143,8 +133,7 @@ get_census <- function (dataset, regions, level=NA, vectors=c(), geo_format = NA
 utils::read.csv(colClasses = "character", stringsAsFactors = FALSE, check.names = FALSE) %>%
 dplyr::as_tibble(.name_repair = "minimal") %>%
 dplyr::mutate_at(c(dplyr::intersect(names(.),c("Population","Households","Dwellings","Area (sq km)")),
-names(.)[grepl("v_",names(.))]),
-as.num,na.strings=na_strings) %>%
+names(.)[grepl("v_",names(.))]), as.num) %>%
 dplyr::mutate(Type = as.factor(.data$Type),
 `Region Name` = as.factor(.data$`Region Name`))
 }
@@ -188,8 +177,8 @@ get_census <- function (dataset, regions, level=NA, vectors=c(), geo_format = NA
 geos
 } else if (!is.na(geo_format)) {
 # the sf object needs to be first in join to retain all spatial information
-dplyr::select(result, -.data$Population, -.data$Dwellings,
--.data$Households, -.data$Type) %>%
+to_remove <- setdiff(dplyr::intersect(names(geos),names(result)),"GeoUID")
+dplyr::select(result, -dplyr::one_of(to_remove)) %>%
 dplyr::inner_join(geos, ., by = "GeoUID")
 }
 }
@@ -255,7 +244,7 @@ get_census_geometry <- function (dataset, regions, level=NA, geo_format = "sf",

 # This is the set of valid census aggregation levels, also used in the named
 # elements of the `regions` parameter.
-VALID_LEVELS <- c("Regions","C","PR", "CMA", "CD", "CSD", "CT", "DA", "DB")
+VALID_LEVELS <- c("Regions","C","PR", "CMA", "CD", "CSD", "CT", "DA", 'EA', "DB")

 #' Query the CensusMapper API for available datasets.
 #'
@@ -330,9 +319,6 @@ dataset_attribution <- function(datasets){

 commons %>% lapply(function(c){
 matches <- attribution[grepl(paste0("^",c,"$"),attribution)]
-
-#years <- stringr::str_extract(matches, "\\d{4}") %>% sort()
-# avoid stringr dependency
 parts <- strsplit(c, split = "\\\\d\\{4\\}") %>%
 unlist()
 years <- matches
@@ -347,8 +333,6 @@ dataset_attribution <- function(datasets){
 paste0(collapse="; ")
 }

-
-
 #' Return Census variable names and labels as a tidy data frame
 #'
 #' @param x A data frame, \code{sp} or \code{sf} object returned from
@@ -443,7 +427,7 @@ transform_geo <- function(g, level) {
 g <- g %>%
 dplyr::mutate_at(dplyr::intersect(names(g), as_character), as.character) %>%
 dplyr::mutate_at(dplyr::intersect(names(g), as_numeric), as.numeric) %>%
-dplyr::mutate_at(dplyr::intersect(names(g), as_integer), as.integer) %>%
+dplyr::mutate_at(dplyr::intersect(names(g), as_integer), as.int) %>%
 dplyr::mutate_at(dplyr::intersect(names(g), as_factor), as.factor)

 # Change names
@@ -464,7 +448,7 @@ transform_geo <- function(g, level) {
 c('ruid','CT_UID'),
 c('rguid','CMA_UID'))
 }
-if (level=='DA') {
+if (level=='DA'|level=='EA') {
 name_change <- name_change %>% rbind(
 c('rpid','CSD_UID'),
 c('rgid','CD_UID'),
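
Review note on the join change in `get_census`: the new code drops every column that `geos` and `result` share, except the `GeoUID` key, instead of a hard-coded list. A minimal sketch of that idea follows. It assumes toy data frames with made-up values and a hypothetical `v_example_1` column, and it uses base R `merge()` in place of `dplyr::inner_join()` so that it runs without dependencies.

```r
# Toy stand-ins for the geometry table and the census data table
# (hypothetical values, not real census data).
geos <- data.frame(GeoUID = c("A", "B"), Population = c(10, 20),
                   Type = c("CT", "CT"), stringsAsFactors = FALSE)
result <- data.frame(GeoUID = c("A", "B"), Population = c(10, 20),
                     Type = c("CT", "CT"), v_example_1 = c(1.5, 2.5),
                     stringsAsFactors = FALSE)

# Columns present in both tables, excluding the join key.
to_remove <- setdiff(intersect(names(geos), names(result)), "GeoUID")

# Drop the shared columns from `result`, then join with `geos` first.
trimmed <- result[, !(names(result) %in% to_remove), drop = FALSE]
joined <- merge(geos, trimmed, by = "GeoUID")
names(joined)
```

Computing the overlap at run time means the join should not produce duplicated `.x`/`.y` columns when a dataset lacks, or adds, one of the previously hard-coded columns.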

R/census_regions.R

Lines changed: 1 addition & 2 deletions
@@ -36,9 +36,8 @@
 #' @examples
 #' list_census_regions('CA16')
 list_census_regions <- function(dataset, use_cache = TRUE, quiet = FALSE) {
-dataset = toupper(dataset)
 cache_file <- file.path(tempdir(),paste0(dataset, "_regions.rda"))
-#cache_file <- cache_path(dataset, "_regions.rda")
+
 if (!use_cache || !file.exists(cache_file)) {
 if (!quiet) message("Querying CensusMapper API for regions data...")
 response <- httr::GET(paste0("https://censusmapper.ca/data_sets/", dataset,

R/census_vectors.R

Lines changed: 2 additions & 1 deletion
@@ -25,7 +25,6 @@
 #' # List all vectors for a given Census dataset in CensusMapper
 #' list_census_vectors('CA16')
 list_census_vectors <- function(dataset, use_cache = TRUE, quiet = TRUE) {
-#cache_file <- cache_path(dataset, "_vectors.rda")
 cache_file <- file.path(tempdir(),paste0(dataset, "_vectors.rda"))
 if (!use_cache || !file.exists(cache_file)) {
 url <- paste0(cancensus_base_url(),"/api/v1/vector_info/", dataset, ".csv")
@@ -56,6 +55,8 @@ list_census_vectors <- function(dataset, use_cache = TRUE, quiet = TRUE) {
 grepl("^3.", add) ~ gsub(".", ", ", gsub("^3\\.", "Median of ", add),
 fixed = TRUE),
 grepl("^4.", add) ~ gsub(".", ", ", gsub("^4\\.", "Average to ", add),
+fixed = TRUE),
+grepl("^9.", add) ~ gsub(".", ", ", gsub("^9\\.", "Standard error based on ", add),
 fixed = TRUE)
 )) %>%
 dplyr::select(.data$vector, .data$type, .data$label, .data$units,
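
Review note on the new `^9.` branch: the two nested `gsub()` calls first swap the numeric prefix for a phrase, then turn the remaining dots into comma separators. The sketch below shows that behaviour on its own. The `add` values, the vector names in them, and the `expand_add()` wrapper are hypothetical. Unlike the diff, it escapes the dot in the `grepl()` test, because an unescaped `.` matches any character; whether the `add` field can hold values where that matters is not known from this commit.

```r
# Hypothetical `add` values; the vector names are made up for illustration.
add <- c("9.v_example_1.v_example_2", "4.v_example_3")

# Illustrative wrapper covering only the "9." case.
expand_add <- function(add) {
  ifelse(grepl("^9\\.", add),
         gsub(".", ", ",
              gsub("^9\\.", "Standard error based on ", add),
              fixed = TRUE),
         add)
}

expanded <- expand_add(add)
expanded

# An unescaped dot matches any character, so "^9." also matches "90".
unescaped_hit <- grepl("^9.", "90")
escaped_hit <- grepl("^9\\.", "90")
```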

R/helpers.R

Lines changed: 22 additions & 2 deletions
@@ -32,8 +32,28 @@ dataset_from_vector_list <- function(vector_list){
 dataset
 }

-# To activate later
-# valid_datasets <- c("CA01","CA06","CA11","CA16",
+cancensus_na_strings <- c("x", "F", "...", "..", "-","N","*","**")
+
+as.num = function(x, na.strings = cancensus_na_strings) {
+stopifnot(is.character(x))
+na = x %in% na.strings
+x[na] = 0
+x = as.numeric(x)
+x[na] = NA_real_
+x
+}
+
+as.int = function(x, na.strings = cancensus_na_strings) {
+stopifnot(is.character(x))
+na = x %in% na.strings
+x[na] = 0
+x = as.integer(x)
+x[na] = NA_integer_
+x
+}
+
+# List of eligible datasets
+# VALID_DATASETS <- c("CA1996","CA01","CA06","CA11","CA16",
 # "CA01xSD", "CA06xSD", "CA11xSD", "CA16xSD",
 # "TX2000", "TX2001", "TX2002", "TX2003", "TX2004",
 # "TX2005", "TX2006", "TX2007", "TX2008", "TX2009",
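
Review note on the relocated helpers: `as.num` now defaults to the shared `cancensus_na_strings`, so callers no longer pass `na.strings` explicitly. The sketch below copies the helper from this diff and runs it on a made-up input vector, to show how suppression markers become `NA` without conversion warnings. The `values` vector is an assumption for illustration only.

```r
# Copied from the diff above: markers treated as missing in census extracts.
cancensus_na_strings <- c("x", "F", "...", "..", "-", "N", "*", "**")

as.num <- function(x, na.strings = cancensus_na_strings) {
  stopifnot(is.character(x))
  na <- x %in% na.strings
  x[na] <- 0           # placeholder so as.numeric() does not warn
  x <- as.numeric(x)
  x[na] <- NA_real_    # restore the missing values
  x
}

# Hypothetical column as it might arrive from a CSV read as character.
values <- c("1250", "x", "3.5", "..", "F")
converted <- as.num(values)
converted

# Non-character input is rejected by the stopifnot() guard.
rejected <- inherits(try(as.num(1), silent = TRUE), "try-error")
```

Because of that guard, the `as.int` call added to `transform_geo` would presumably fail on a column that is already numeric; it seems to rely on those columns arriving as character.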

README.md

Lines changed: 6 additions & 6 deletions
@@ -12,8 +12,8 @@ Access, retrieve, and work with Canadian Census data and geography.
 * Download data and Census geography in tidy and analysis-ready format
 * Convenience tools for searching for and working with Census regions and variable hierarchies
 * Provides Census geography in multiple R spatial formats
-* Provides data and geography at multiple Census geographic levels including province, Census Metropolitan Area, Census Division, Census Subdivision, Census Tract, and Dissemination Areas
-* Provides data for the 2016, 2011, 2006, and 2001 Census releases
+* Provides data and geography at multiple Census geographic levels including province, Census Metropolitan Area, Census Division, Census Subdivision, Census Tract, Dissemination Areas, Enumeration Areas (for 1996), and Dissemination Blocks (for 2001-2016)
+* Provides data for the 2016, 2011, 2006, 2001, and 1996 Census releases
 * Access to taxfiler data at the Census Tract level for tax years 2000 through 2017

 ### Reference
@@ -35,7 +35,7 @@ library(cancensus)

 ### API key

-This package relies on queries to the CensusMapper API, which requires a Censusmapper API key. You can obtain a free API key by [signing up](https://censusmapper.ca/users/sign_up) for a CensusMapper account. CensusMapper API keys are free; however, API requests are limited in volume. For larger quotas, please get in touch with Jens [directly](mailto:jens@censusmapper.ca).
+**cancensus** requires a valid CensusMapper API key to use. You can obtain a free API key by [signing up](https://censusmapper.ca/users/sign_up) for a CensusMapper account. CensusMapper API keys are free and public API quotas are generous; however, due to incremental costs of serving large quantities of data, there are limits to API usage in place. For most use cases, these API limits should not be an issue. Production uses with large extracts of fine-grained geographies may run into API quota limits. For larger quotas, please get in touch with Jens [directly](mailto:jens@censusmapper.ca).

 To check your API key, just go to "Edit Profile" (in the top-right of the CensusMapper menu bar). Once you have your key, you can store it in your system environment so it is automatically used in API calls. To do so just enter `options(cancensus.api_key = "your_api_key")`.

@@ -45,7 +45,7 @@ For performance reasons, and to avoid unnecessarily drawing down API quotas, **c

 ### Currently available datasets

-**cancensus** can access Statistics Canada Census data for the 2001 Census, the 2006 Census, the 2011 Census and National Household Survey, as well as the 2016 Census. You can run `list_census_datasets` to check what datasets are currently available for access through the CensusMapper API. Additional data for the 2016 Census will be included in Censusmapper within a day or two after public release by Statistics Canada. Statistics Canada maintains a release schedule for the Census 2016 Program which can be viewed on their [website](http://www12.statcan.gc.ca/census-recensement/2016/ref/release-dates-diffusion-eng.cfm).
+**cancensus** can access Statistics Canada Census data for Census years 1996, 2001, 2006, 2011, and 2016. You can run `list_census_datasets` to check what datasets are currently available for access through the CensusMapper API. Additional data for the 2016 Census will be included in Censusmapper within a day or two after public release by Statistics Canada. Statistics Canada maintains a release schedule for the Census 2016 Program which can be viewed on their [website](http://www12.statcan.gc.ca/census-recensement/2016/ref/release-dates-diffusion-eng.cfm).

 Thanks to contributions by the Canada Mortgage and Housing Corporation (CMHC), **cancensus** now includes additional Census-linked datasets as open-data releases. These include annual taxfiler data at the census tract level for tax years 2000 through 2017, which includes data on incomes and demographics, as well as specialized crosstabs for Structural type of dwelling by Document type, which details occupancy status for residences. These crosstabs are available for the 2001, 2006, 2011, and 2016 Census years at all levels starting with census tract.

@@ -107,7 +107,7 @@ We'd love to feature examples of work or projects that use cancensus.
 If you wish to cite cancensus:

 von Bergmann, J., Aaron Jacobs, Dmitry Shkolnik (2020). cancensus: R package to
-access, retrieve, and work with Canadian Census data and geography. v0.3.1.
+access, retrieve, and work with Canadian Census data and geography. v0.3.2.


 A BibTeX entry for LaTeX users is
@@ -116,7 +116,7 @@ A BibTeX entry for LaTeX users is
 author = {Jens {von Bergmann} and Dmitry Shkolnik and Aaron Jacobs},
 title = {cancensus: R package to access, retrieve, and work With Canadian Census data and geography},
 year = {2020},
-note = {R package version 0.3.1},
+note = {R package version 0.3.2},
 url = {https://mountainmath.github.io/cancensus/},
 }
 ```

cran-comments.md

Lines changed: 6 additions & 0 deletions
@@ -1,3 +1,9 @@
+## Update - v.0.3.2
+
+- Add functionality for 1996 census and more refined geographies
+- Expanded vignettes
+- Fix minor bugs flagged by users
+
 ## Update - v.0.3.1

 Addressing warning and note in CRAN checks from upload v.0.3.0

docs/404.html

Lines changed: 1 addition & 1 deletion

docs/LICENSE-text.html

Lines changed: 1 addition & 1 deletion
