Description
Please specify whether your issue is about:
- [x] a possible bug
- [ ] a question about package functionality
- [ ] a suggested code or documentation change, improvement to the code, or feature request
Expected behavior: when an area is specified, extraction should be limited to that area. Actual behavior: in some PDFs, extract_tables() does not respect the boundaries given and includes text from other portions of the table. I believe it has something to do with tables that run vertically down the page.
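Before attributing the problem to the extraction itself, it may help to rule out a coordinate mix-up, since tabulizer expects each area as c(top, left, bottom, right) in PDF points. The sketch below is only an illustration: `check_area()` is a hypothetical helper, not part of tabulizer, and the 612 x 792 page size is an assumed US Letter page rather than the measured size of this PDF (the real dimensions can be read with `get_page_dims()`).

```r
# Hypothetical helper (not part of tabulizer): sanity-check an area vector
# given as c(top, left, bottom, right) in PDF points against a page size.
check_area <- function(area, page_width, page_height) {
  stopifnot(is.numeric(area), length(area) == 4)
  top <- area[1]; left <- area[2]; bottom <- area[3]; right <- area[4]
  problems <- character(0)
  if (top >= bottom) {
    problems <- c(problems, "top must be less than bottom")
  }
  if (left >= right) {
    problems <- c(problems, "left must be less than right")
  }
  if (top < 0 || bottom > page_height) {
    problems <- c(problems, "vertical extent falls outside the page")
  }
  if (left < 0 || right > page_width) {
    problems <- c(problems, "horizontal extent falls outside the page")
  }
  problems
}

# Area from this report, checked against an ASSUMED US Letter page (612 x 792 pt).
# An empty result means the coordinates are at least internally consistent.
check_area(c(240, 169, 730, 230), page_width = 612, page_height = 792)
```

If the check returns no problems, the coordinates are well formed for the assumed page, which points toward the extraction step rather than the input.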
If you are reporting (1) a bug or (2) a question about code, please supply:
- ensure that you can install and successfully load rJava
- a fully reproducible example using a publicly available dataset (or provide your data)
- if an error is occurring, include the output of traceback() run immediately after the error occurs
- the output of sessionInfo()
Put your code here:
## rJava loads successfully
# install.packages("rJava")
library("rJava")
## load package
library("tabulizer")
## code goes here
location <- 'https://county.milwaukee.gov/files/county/county-clerk/Election-Commission/ElectionResultsCopy-1/2016Copy-1/Fall-General-ElectionCopy-1/11-8-16CertificationReportCopy-1.pdf'
pagess <- list(3)
# area is given as c(top, left, bottom, right) in PDF points
area_1 <- list(c(240, 169, 730, 230))
out <- extract_tables(location, pages = pagess, area = area_1, guess = FALSE)
## session info for your system
sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] shiny_1.5.0 tabulizer_0.2.2 htmltab_0.7.1.1 rjson_0.2.20 RCurl_1.98-1.2 forcats_0.5.0 stringr_1.4.0
[8] purrr_0.3.4 readr_1.4.0 tidyr_1.1.2 tibble_3.0.4 ggplot2_3.3.2 tidyverse_1.3.0 rvest_0.3.6
[15] xml2_1.3.2 dplyr_1.0.2 jsonlite_1.7.1
loaded via a namespace (and not attached):
[1] Rcpp_1.0.5 lubridate_1.7.9 png_0.1-7 assertthat_0.2.1 digest_0.6.27
[6] mime_0.9 R6_2.5.0 cellranger_1.1.0 backports_1.1.10 reprex_0.3.0
[11] httr_1.4.2 pillar_1.4.6 rlang_0.4.8 curl_4.3 readxl_1.3.1
[16] rstudioapi_0.11 miniUI_0.1.1.1 tabulizerjars_1.0.1 munsell_0.5.0 tinytex_0.27
[21] broom_0.7.2 compiler_4.0.3 httpuv_1.5.4 modelr_0.1.8 xfun_0.19
[26] pkgconfig_2.0.3 htmltools_0.5.0 tidyselect_1.1.0 XML_3.99-0.5 fansi_0.4.1
[31] crayon_1.3.4 dbplyr_2.0.0 withr_2.3.0 later_1.1.0.1 bitops_1.0-6
[36] grid_4.0.3 xtable_1.8-4 gtable_0.3.0 lifecycle_0.2.0 DBI_1.1.0
[41] magrittr_1.5 scales_1.1.1 cli_2.1.0 stringi_1.5.3 fs_1.5.0
[46] promises_1.1.1 ellipsis_0.3.1 generics_0.1.0 vctrs_0.3.4 tools_4.0.3
[51] glue_1.4.2 hms_0.5.3 fastmap_1.0.1 colorspace_1.4-1 rJava_0.9-13
[56] haven_2.3.1