Proposal: expose docker cache-from capability

## Expected Behaviour




When building a function with `faas-cli build`, it should be possible to reference an external source for the Docker build/layer cache. The most common case would be referencing a previous build. When enabled, this would allow infrequently changed steps, for example `pip install`, to be cached between builds, reducing the total build time.

## Current Behaviour






Only the local build cache can be used by `faas-cli build`. This is most noticeable in CI/CD workflows, where the Docker builder is often isolated and created fresh for each build. This seems to be the case in GitHub Actions, for example. Because the build cache starts empty, every layer of the function image must be rebuilt, even if only a small change was made at the end of the Dockerfile. This is very noticeable in Python and Node.js projects, where the final step is often just copying a small amount of function code but an earlier step runs a slow `pip install` or `npm install`.

## Why do you need this?

This will improve build times in CI/CD environments.



## Who is this for?

I work at [Contiamo](contiamo.com), but the feature could benefit any function build.

## Are you a GitHub Sponsor (Yes/No?)



Check at: https://github.com/sponsors/openfaas

- [x] Yes
- [ ] No

## List All Possible Solutions and Workarounds





1. One possible solution is to add `docker pull` as a step prior to running `faas-cli build`. This would seed the Docker cache with the relevant images. The disadvantage is that it will _always_ pull those images, even if they cannot be used as a cache. This costs time and network bandwidth that may not have been needed, and can make the build longer than if the step was skipped. The other options below can be more efficient because the builder only downloads layer data when a cache hit is possible.

2. Use the `--shrinkwrap` flag to prepare the build context and then use `docker build --cache-from` to pass references to the candidate images for the build cache.

3. Allow passing a flag or setting a YAML config option for `faas-cli build` so that it can set the `--cache-from` flag. This would enable the `inline` cache mode when the user also passes the `BUILDKIT_INLINE_CACHE=1` build arg.

4. Allow passing arbitrary `flag=value` pairs to `faas-cli build` so that the developer can set the appropriate flags: `--cache-from` when using `docker build`, or `--cache-from` / `--cache-to` when using `docker buildx`.
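Workaround 2 can be sketched as a small script. This is only an illustration: `FUNCTION` and `IMAGE` are placeholder values, the `run` helper is hypothetical, and the `build/<function>` path assumes the default shrinkwrap output location. By default it only prints the commands it would run.

```sh
#!/bin/sh
# Sketch of workaround 2. FUNCTION and IMAGE are placeholder values, and the
# build/<function> path assumes the default shrinkwrap output location.
FUNCTION="telephone"
IMAGE="ghcr.io/lucasroesler/my-function:latest"

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

shrinkwrap_build() {
  # Step 1: only render the build context, do not build the image.
  run faas-cli build -f stack.yaml --shrinkwrap
  # Step 2: build the rendered context, using the previous image as a cache
  # candidate and embedding inline cache metadata in the new image.
  run env DOCKER_BUILDKIT=1 docker build \
    --cache-from "$IMAGE" \
    --build-arg BUILDKIT_INLINE_CACHE=1 \
    -t "$IMAGE" \
    "build/$FUNCTION"
}

shrinkwrap_build
```

The downside of this approach is that the developer has to maintain the `docker build` invocation outside of `faas-cli`, which is what options 3 and 4 try to avoid.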

## Which Solution Do You Recommend?



I think options 3 and 4 are the best alternatives. Option 3 is the most focused on just this problem, but option 4 would be the most flexible.

For any solution, I think the feature should be _opt in_, meaning additional caching is not enabled by default. The developer must explicitly enable this additional caching behavior.

### Option 3 experience / implementation

For option 3 I think we could implement this by providing a `--cache-from` flag in `faas-cli` and adding a new section to the function spec.

The DX would look like this:

```sh
faas-cli build -f stack.yaml --cache-from=ghcr.io/lucasroesler/my-function:latest,ghcr.io/lucasroesler/my-other-function:latest --build-arg BUILDKIT_INLINE_CACHE=1
```

Alternatively, in the YAML it could look like this:

```yaml
version: 1.0
provider:
  name: openfaas
  gateway: http://127.0.0.1:8080
functions:
  telephone:
    lang: python3-flask-debian
    handler: ./telephone
    image: ghcr.io/lucasroesler/my-function:latest
    build_cache:
      from:
        - ghcr.io/lucasroesler/my-function:latest
        - ghcr.io/lucasroesler/other-cache-candidate:latest
    build_args:
      BUILDKIT_INLINE_CACHE: 1
```

In both configurations, the values should simply be passed to `--cache-from` as is, without modification. This would then allow use of the advanced options when `buildx` is explicitly enabled in the environment, for example `"type=local,src=path/to/dir"`:

```
--cache-from stringArray        External cache sources (e.g., "user/app:cache","type=local,src=path/to/dir")
```
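As a rough sketch of the "pass through as is" behaviour, the helper below emits one `--cache-from` flag per configured entry without touching the value. The function name `cache_from_flags` is hypothetical and is not part of `faas-cli`; it only illustrates that entries with buildx-style options keep their commas intact.

```sh
#!/bin/sh
# Hypothetical helper: emit one --cache-from flag per configured entry,
# leaving each value untouched so buildx-style options keep their commas.
cache_from_flags() {
  for ref in "$@"; do
    printf '%s\n' "--cache-from=$ref"
  done
}

cache_from_flags \
  "ghcr.io/lucasroesler/my-function:latest" \
  "type=local,src=path/to/dir"
```

Emitting one flag per entry matches the `stringArray` type shown in the help output above, where each occurrence of the flag carries a single cache source.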

It would be nice to also support the `--cache-to` flag from buildx, but this flag is not supported by the default `docker build` and would cause an error. However, it allows for much more advanced caching options, such as storing the cache in a local folder, in blob storage, or in a registry. It also allows "max" mode, which caches all of the build layers, including intermediate layers from multi-stage builds. This provides significantly more cache hit opportunities. If we want to allow this, and add the required documentation, I think it could look like this:

```yaml
version: 1.0
provider:
  name: openfaas
  gateway: http://127.0.0.1:8080
functions:
  telephone:
    lang: python3-flask-debian
    handler: ./telephone
    image: ghcr.io/lucasroesler/my-function:latest
    build_cache:
      from:
        - ghcr.io/lucasroesler/my-function:latest
        - ghcr.io/lucasroesler/other-cache-candidate:latest
      to:
        - ghcr.io/lucasroesler/my-function:latest
    build_args:
      BUILDKIT_INLINE_CACHE: 1
```

or

```yaml
version: 1.0
provider:
  name: openfaas
  gateway: http://127.0.0.1:8080
functions:
  telephone:
    lang: python3-flask-debian
    handler: ./telephone
    image: ghcr.io/lucasroesler/my-function:latest
    build_cache:
      from:
        - ghcr.io/lucasroesler/my-function:cache
        - ghcr.io/lucasroesler/other-cache-candidate:cache
      to:
        - type=registry,ref=ghcr.io/lucasroesler/my-function:cache,mode=max
```

### Option 4 experience / implementation

Option 4 enables the same experience, but would look like this:

```sh
faas-cli build -f stack.yaml --builder-flag "--cache-from=ghcr.io/lucasroesler/my-function:latest,ghcr.io/lucasroesler/my-other-function:latest" --build-arg BUILDKIT_INLINE_CACHE=1
```

and

```yaml
version: 1.0
provider:
  name: openfaas
  gateway: http://127.0.0.1:8080
functions:
  telephone:
    lang: python3-flask-debian
    handler: ./telephone
    image: ghcr.io/lucasroesler/my-function:latest
    builder_flags:
      - --cache-from=ghcr.io/lucasroesler/my-function:latest,ghcr.io/lucasroesler/other-cache-candidate:latest
    build_args:
      BUILDKIT_INLINE_CACHE: 1
```

### Caching impact

#### Inline caching

Docker and BuildKit enable several styles of layer caching. Note that BuildKit is required for this feature.

The first and simplest is called `inline` caching. It adds some additional metadata to the image config to indicate that the layers can be reused as a build cache, and it requires that the image is built with the `inline` cache enabled. The result has _no_ impact on the actual image or layer sizes because only additional metadata is pushed to the remote registry.

I tested this with an image and the image size was the same with and without the inline cache. This can be tested with any Docker image:

```sh
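# Build once without and once with the inline cache metadata.
DOCKER_BUILDKIT=1 docker build -t caching-test:without-cache .
DOCKER_BUILDKIT=1 docker build --build-arg BUILDKIT_INLINE_CACHE=1 -t caching-test:with-cache .
# Compare the reported sizes of the two tags.
docker images | grep "caching-test"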
```

During subsequent builds, the builder will download _just_ this metadata to determine if a cache hit is possible and then, only when it is useful, download the actual layer data.

#### Other caching modes

Note that there are two caching modes, `min` and `max`. Inline caching uses `min` mode, which is why it has no impact on the final size: it is just a tiny amount of metadata.

With `max` mode, _all_ build layers are saved, including ephemeral layers from multi-stage builds. This clearly results in more data, and it is not supported by the `inline` cache type. Instead, these layers can be exported to a local folder, to blob storage, or to a Docker registry.

To use these other destinations or the max mode, we would need to enable support for the `--cache-to` flag.
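For illustration, a buildx invocation that imports and exports a `max` mode registry cache might be assembled as below. This is a sketch under assumptions: `IMAGE`, `CACHE_REF` and `CONTEXT` are placeholder values, the function name is hypothetical, and the function only prints the command rather than running it. Depending on the buildx driver in use, a separate builder (for example one using the `docker-container` driver) may be needed before a registry cache export works.

```sh
#!/bin/sh
# Sketch only: IMAGE, CACHE_REF and CONTEXT are placeholder values.
IMAGE="ghcr.io/lucasroesler/my-function:latest"
CACHE_REF="ghcr.io/lucasroesler/my-function:cache"
CONTEXT="build/telephone"

# Print the buildx command that imports and exports a registry cache.
# mode=max asks the exporter to store all layers, including intermediate
# multi-stage layers, under the separate cache reference.
buildx_max_cache_cmd() {
  echo docker buildx build \
    --cache-from "type=registry,ref=$CACHE_REF" \
    --cache-to "type=registry,ref=$CACHE_REF,mode=max" \
    --push -t "$IMAGE" "$CONTEXT"
}

buildx_max_cache_cmd
```

Keeping the cache under its own tag (`:cache` here) means the extra layers never become part of the function image itself.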

#### Additional caching background

This feature is also implemented in the `docker/build-push-action` GitHub Action, see https://github.com/docker/build-push-action/blob/master/docs/advanced/cache.md. This could provide a good example of how to document the feature.

Relevant docs about docker/buildkit caching:

- https://github.com/moby/buildkit#export-cache
- https://docs.docker.com/build/building/cache/backends/inline/
- https://github.com/docker/buildx/blob/master/docs/guides/cache/index.md
- https://docs.docker.com/engine/reference/commandline/build/#specifying-external-cache-sources

## Context




I have a Python function that we build frequently because it bundles a machine learning model in the image. As a result, the last layer just copies the machine learning model, while all of the other layers (the dependencies and the function code) change infrequently.

In our CI/CD system (GitHub Actions) the build cache is always empty, which means our builds spend a lot of time on the `apt-get` and `pip install` stages even though these are not actually changing. They would normally be skipped when building on my local laptop, where the build cache contains previous versions of the function.

## Your Environment



- FaaS-CLI version ( Full output from: `faas-cli version` ): 0.14.11

- Docker version ( Full output from: `docker version` ):

  ```
  Client: Docker Engine - Community
  Version:           20.10.14
  API version:       1.41
  Go version:        go1.16.15
  Git commit:        a224086
  Built:             Thu Mar 24 01:47:58 2022
  OS/Arch:           linux/amd64
  Context:           default
  Experimental:      true

  Server: Docker Engine - Community
  Engine:
  Version:          20.10.14
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.15
  Git commit:       87a90dc
  Built:            Thu Mar 24 01:45:50 2022
  OS/Arch:          linux/amd64
  Experimental:     false
  containerd:
  Version:          1.5.11
  GitCommit:        3df54a852345ae127d1fa3092b95168e4a88e2f8
  runc:
  Version:          1.0.3
  GitCommit:        v1.0.3-0-gf46b6ba
  docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
  ```

Proposal: expose docker cache-from capability #940