Description
I wrote the following RMarkdown document to illustrate my issue and proposed solution (just copy it to a file and render it to an output format of your choice):
Let's consider the following function to make a plot based on a dataset:
```{r plot-fun}
# Let's just make a histogram of a given column of a given input data.frame
# labeled with the given name.
plot_fun <- function(df, name, col) {
  stopifnot(is.data.frame(df))
  ncols <- ncol(df)
  stopifnot(ncols > 0)
  cnames <- colnames(df)
  if(is.character(col)) { col <- match(col, cnames) }
  stopifnot(!is.na(col))
  stopifnot(is.integer(col))
  stopifnot(col <= ncols)
  cname <- cnames[col]
  col <- df[, col]
  stopifnot(is.numeric(col))
  hist(col, main = name, xlab = cname)
}
```
Now we can call this function to make a specific plot and control the name of
the resulting file via the `fig.path` and `dev` chunk options along with the
chunk's `label`:
```{r}
#| iris-sepal-width,
#| dev = "png",
#| fig.path = "figures/histograms/one-per-chunk/hist-",
#| fig.cap = paste0("The above image was saved to [`",
#| knitr::fig_path("png", number=1L), "`](",
#| knitr::fig_path("png", number=1L), ").")
# chunk options (repeated for inclusion in rendered output):
# iris-sepal-width, # <- chunk label; Will be part of file name!
# dev = "png", # <- graphical device; Determines the file extension!
# fig.path = "figures/histograms/one-per-chunk/hist-", # <- figure path prefix
# fig.cap = paste0("The above image was saved to [`", # <- figure caption
# knitr::fig_path("png", number=1L), "`](",
# knitr::fig_path("png", number=1L), ").")
plot_fun(iris, "iris", "Sepal.Width")
```
That chunk's output in the rendered document contains the specified plot
without the need to explicitly include it in the source document (other than
(implicitly!) printing it).
Also, without the need to explicitly save the plot to a file in the chunk's
source code, the image is saved as a PNG file under
[`figures/histograms/one-per-chunk/hist-iris-sepal-width-1.png`
](figures/histograms/one-per-chunk/hist-iris-sepal-width-1.png).
Note that I could have specified several graphical devices, for example to
include a low-resolution PNG in my HTML output, a higher-resolution PNG
version to be used for my PDF output, and also save a vector version in
SVG and/or PDF format that only ends up on disk.
This way, I could browse the small report with all its figures but use a
bigger, better-quality version of a given figure for a presentation or poster,
without the need to re-render the document.
Finding the corresponding file(s) would be easy in this case due to the
carefully chosen file name.
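
As a rough sketch of such a multi-device setup: as far as I know, `knitr`
embeds the image from the first entry of `dev` in the rendered document and
only writes the files from the remaining devices to disk. The values below are
illustrative assumptions, and the block is fenced as plain `r` so that it is
not evaluated when rendering this document:

```r
# Sketch only: request several devices for all subsequent chunks.
# Assumption: the first device (PNG) is the one embedded in the output;
# the SVG and PDF copies are written under the same `fig.path` prefix.
knitr::opts_chunk$set(
  dev = c("png", "svg", "pdf"),
  dpi = 96  # illustrative value for the low-resolution PNG
)
```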
Now let's expand our plot function to generate that plot for *each* numerical
column of the dataset:
```{r plot-fun-iterative}
plot_fun_iterative <- function(df, name) {
  stopifnot(is.data.frame(df))
  cols <- colnames(df)[sapply(df, is.numeric)]
  for(col in cols) { plot_fun(df, name, col) }
}
```
And now let's further assume we want to generate those plots for a whole range
of datasets:
```{r datasets}
datasets <- list(iris = iris, mtcars = mtcars)
```
We can now loop over the above list to generate all of our plots in a single
code chunk.
But first, for the sake of illustration, let's add a little helper code
chunk to 'predict' which plots (and in which order!) will be generated in the
actual plotting chunk (see below):
```{r helper-chunk-to-generate-figure-captions}
# Note that this kind of chunk serves an illustrative purpose only and normally
# I would not (have to) include such a chunk in my actual RMarkdown document.
# Generate a data.frame with metadata of all plots (in order!) to be generated
# in the next code chunk (see below):
plots <-
  do.call(rbind,
          lapply(datasets,
                 function(df)
                   data.frame(column = colnames(df)[sapply(df, is.numeric)])))
plots$dataset <- sub("[.].*", "", rownames(plots))
# Pre-compute desired paths (in order!) of the plots to be generated in the
# next code chunk (see below):
desired_paths <- paste0("figures/histograms/many-per-chunk/hist-",
                        plots$dataset, "-",
                        gsub("[.]", "-", tolower(plots$column)),
                        "-1.png")
# Define a helper function to generate the corresponding figure captions given
# the (actual) figure paths (in order!) of the plots to be generated in the
# next code chunk (see below):
fig_cap_fun <- function(actual_paths) {
  paste0("The above image was saved to [`", actual_paths, "`](", actual_paths,
         ") but I wish I could somehow make `knitr` save it to `",
         desired_paths, "` instead.")
}
```
Now we can use the above helper function to set each figure's caption, referring
to both its actual and its desired path:
```{r}
#| hist,
#| dev = "png",
#| fig.path = "figures/histograms/many-per-chunk/",
#| fig.cap = fig_cap_fun(knitr::fig_path("png", number = 1:nrow(plots)))
# chunk options (repeated for inclusion in rendered output):
# hist, # <- chunk label; Will be part of file name!
# dev = "png", # <- graphical device; Determines the file extension!
# fig.path = "figures/histograms/many-per-chunk/", # <- figure path prefix
# fig.cap = fig_cap_fun(knitr::fig_path("png", number = 1:nrow(plots)))
# ^-- figure captions
for(name in names(datasets)) {
  plot_fun_iterative(datasets[[name]], name)
}
```
As you can see, while it is possible to map each figure to a file path of
choice in the corresponding caption, there is currently no way to *save* the
images under those paths.
Thus, when listing the `figures` directory, it is impossible to map a given
figure to the actual data without opening the file and inspecting it:
```{r list-figures}
list.files("figures", recursive=TRUE, full.names=TRUE)
```
Compare that to the path I'd like to use instead:
```{r desired-paths}
c(desired_paths,
list.files("figures/histograms/one-per-chunk", full.names=TRUE))
```
Finally, imagine that, in addition to looping over datasets and columns like
above, I also had several plot functions (*e.g.* frequency *and* count
histograms, box and violin plots, maybe even scatter plots of (all!) pairs of
columns, each with several regression models).
As-is, `knitr` makes it possible to generate a vast number of plots for data
exploration with very little code, laying them out nicely and annotating them
properly.
But the moment I want to use one of the image files corresponding to a specific
figure in the rendered document elsewhere, it gets very tedious.
Thus, I suggest a new chunk option (`fig.basename`/`fig.stem`?) that
* defaults to `knitr::opts_current$get("label")`,
* accepts character vectors (recycling/truncating if length doesn't match the
number of figures generated), and
* accepts a function that returns a single character string but can access the
current state of local (*e.g.* loop) variables.
The final filename would then use this instead of the chunk label and (ideally,
but not necessarily) keep track of the figure numbering per value instead of a
single count per chunk.
This could be implemented by wrapping `knitr::opts_current$get("fig.cur")` into
a function that (optionally) accepts the file name stem (again, defaulting to
the current chunk's label, for backwards compatibility).
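
To make the idea more concrete, here is a minimal sketch in plain R of how the
file names could be derived. None of this is existing `knitr` API:
`expand_stems()` and `make_fig_namer()` are hypothetical names used only for
illustration, and the sketch covers only the character-vector case and the
per-stem numbering, not the function-valued option. The block is fenced as
plain `r` so that it is not evaluated when rendering this document:

```r
# Hypothetical: expand the proposed option value into one stem per figure.
expand_stems <- function(basename, n_figs, label) {
  if (is.null(basename)) basename <- label  # proposed default: the chunk label
  rep_len(basename, n_figs)                 # recycle or truncate to n_figs
}

# Hypothetical: build file names from a stem, keeping one counter per stem
# instead of a single counter per chunk.
make_fig_namer <- function(prefix, ext = "png") {
  counts <- integer(0)  # named integer vector: stem -> figures seen so far
  function(stem) {
    n <- if (stem %in% names(counts)) counts[[stem]] + 1L else 1L
    counts[stem] <<- n
    paste0(prefix, stem, "-", n, ".", ext)
  }
}

expand_stems(c("a", "b"), 5L, "hist")  # "a" "b" "a" "b" "a"

next_path <- make_fig_namer("figures/histograms/many-per-chunk/hist-")
next_path("iris-sepal-length")  # ".../hist-iris-sepal-length-1.png"
next_path("iris-sepal-width")   # ".../hist-iris-sepal-width-1.png"
next_path("iris-sepal-length")  # ".../hist-iris-sepal-length-2.png"
```

The closure keeps the counters private to one chunk, which is roughly what a
stem-aware replacement for the single `fig.cur` counter would have to do.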
But even without that last, more complex, modification, much more meaningful
filenames like this could be generated already:
```{r desired-paths-without-numbering-accounting}
# Reconstruct image paths without restarting the numbering at `1`:
c(sapply(seq_along(desired_paths),
         function(i) { sub("-1[.]png$", paste0("-", i, ".png"),
                           desired_paths[i]) }),
  list.files("figures/histograms/one-per-chunk", full.names=TRUE))
```

I would be willing to contribute a PR but want to make sure it has a chance of being accepted before I invest (any more) time into this.
This issue has been bothering me for years. I finally sat down and dug through the docs and code, trying to find a way to make it work, but concluded that it is currently not possible.
My use cases include wanting to:
- (re-)use (the latest version, as of the time of rendering, of) a figure in a(n) (R)Markdown (e.g. xaringan) presentation,
- insert such a figure into an Office document,
- collect a (small) subset of such figures to share with collaborators,
- declare (specific) such figures as outputs to be included in the (Snakemake) report when rendering an (RMarkdown) document in a Snakemake rule,
and several more.
The only solution I have found so far is not to rely on knitr's capability to 'print' plots, but rather to save them manually and include them explicitly, which adds a whole lot of boilerplate code to each and every code chunk that creates more than one plot.
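
For illustration, the manual workaround looks roughly like the following sketch. `save_hist()` and `save_numeric_hists()` are made-up helper names, and the `device` argument is only there so that another `grDevices` device can be swapped in:

```r
# Hypothetical helper: open a device on `path`, draw the histogram, close it.
save_hist <- function(x, path, main = "", xlab = "", device = grDevices::png) {
  dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)
  device(path)
  on.exit(grDevices::dev.off(), add = TRUE)
  graphics::hist(x, main = main, xlab = xlab)
  invisible(path)
}

# Hypothetical helper: one file per numeric column, named after dataset and
# column; returns the paths of the files written.
save_numeric_hists <- function(df, name, out_dir, ext = "png",
                               device = grDevices::png) {
  cols <- colnames(df)[sapply(df, is.numeric)]
  vapply(cols, function(col) {
    stem <- paste0("hist-", name, "-", gsub("[.]", "-", tolower(col)))
    save_hist(df[[col]], file.path(out_dir, paste0(stem, ".", ext)),
              main = name, xlab = col, device = device)
  }, character(1))
}
```

Inside a chunk, I then have to call something like `paths <- save_numeric_hists(iris, "iris", "figures/histograms/manual")` followed by `knitr::include_graphics(unname(paths))`, instead of simply letting the plots print.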
At least to me, my suggested solution would be a huge improvement and (with my very basic understanding of the inner workings of knitr and related packages) I think it could be done in a fully backwards compatible way.
If you see any side effects or edge cases I did not consider, please let me know.
Thank you for considering my feature request / PR offer and for creating (and maintaining) this wonderful piece of software that has made my life easier for over a decade.