Local OCI store garbage collection can misidentify dangling blobs and remove them erroneously #1093

@amisevsk

Description

Background: I currently work on the KitOps project, which uses the oras-go library to manage storing OCI artifacts. In our use case, we use different media types for blob layers to distinguish between types of data in an artifact, though most blobs are packaged in the same way (tarballs of files).

The in-memory graph of a local OCI store stores predecessor and successor maps of descriptors to represent the graph of references. This graph is used in garbage collection in order to prune dangling blobs when e.g. a manifest is removed.

However, the graph is keyed by descriptors (media type, digest, and size), while the underlying CAS store on disk is keyed only by digest. If two manifests refer to the same blob digest with different media types, the graph treats them as two distinct nodes, each with a single predecessor. Garbage collection will therefore find the blob to be dangling when either manifest is deleted and remove it from disk, rendering the other manifest invalid.
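
The mismatch can be illustrated with a toy model. This is only a sketch and not oras-go's implementation: `Descriptor`, `Graph`, `AddManifest`, `RemoveManifest`, and `DigestReferenced` are hypothetical names, and the graph is reduced to a single predecessor map keyed by the full descriptor.

```go
package main

import "fmt"

// Descriptor is a simplified, hypothetical stand-in for an OCI descriptor.
// It is comparable, so it can be used directly as a map key.
type Descriptor struct {
	MediaType string
	Digest    string
	Size      int64
}

// Graph is a toy model (not the oras-go type). For each descriptor it
// tracks the set of manifest names that reference it.
type Graph struct {
	predecessors map[Descriptor]map[string]struct{}
}

func NewGraph() *Graph {
	return &Graph{predecessors: make(map[Descriptor]map[string]struct{})}
}

// AddManifest records that the named manifest references each layer.
func (g *Graph) AddManifest(name string, layers ...Descriptor) {
	for _, l := range layers {
		if g.predecessors[l] == nil {
			g.predecessors[l] = make(map[string]struct{})
		}
		g.predecessors[l][name] = struct{}{}
	}
}

// RemoveManifest drops the named manifest and returns every descriptor
// left without a predecessor, i.e. what a descriptor-keyed garbage
// collector would consider dangling.
func (g *Graph) RemoveManifest(name string) []Descriptor {
	var dangling []Descriptor
	for d, preds := range g.predecessors {
		if _, ok := preds[name]; !ok {
			continue
		}
		delete(preds, name)
		if len(preds) == 0 {
			delete(g.predecessors, d)
			dangling = append(dangling, d)
		}
	}
	return dangling
}

// DigestReferenced reports whether any remaining descriptor uses the digest.
func (g *Graph) DigestReferenced(digest string) bool {
	for d := range g.predecessors {
		if d.Digest == digest {
			return true
		}
	}
	return false
}

func main() {
	const digest = "sha256:65c705d810aa0ec82d1ea886c170462c9f17203aa313bfc2cf01d099d4e5ff96"
	g := NewGraph()
	g.AddManifest("sample-one", Descriptor{"application/vnd.kitops.modelkit.docs.v1.tar", digest, 2048})
	g.AddManifest("sample-two", Descriptor{"application/vnd.kitops.modelkit.model.v1.tar", digest, 2048})

	for _, d := range g.RemoveManifest("sample-one") {
		fmt.Println("considered dangling:", d.MediaType)
	}
	fmt.Println("digest still referenced:", g.DigestReferenced(digest))
}
```

In this model, removing `sample-one` reports its layer descriptor as dangling even though `sample-two` still references the same digest. Deleting the on-disk blob based on that result is what breaks the remaining manifest.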

Reproducer

This reproducer uses OCI artifacts I generated with the Kit CLI, tagged sample-one and sample-two here: https://quay.io/repository/amisevsk0/oras-cli?tab=tags. The artifact tagged sample-one has layers

  "layers": [
    {
      "mediaType": "application/vnd.kitops.modelkit.docs.v1.tar",
      "digest": "sha256:65c705d810aa0ec82d1ea886c170462c9f17203aa313bfc2cf01d099d4e5ff96",
      "size": 2048
    }
  ],

whereas the artifact tagged sample-two has layers

 "layers": [
    {
      "mediaType": "application/vnd.kitops.modelkit.model.v1.tar",
      "digest": "sha256:65c705d810aa0ec82d1ea886c170462c9f17203aa313bfc2cf01d099d4e5ff96",
      "size": 2048
    }
  ],

These two artifacts refer to the same blob (with digest sha256:65c705d810aa0ec82d1ea886c170462c9f17203aa313bfc2cf01d099d4e5ff96) but use different mediaTypes. The underlying file is just a text file containing the text testfile. Below is a Go program that uses the oras-go library to fetch these artifacts to a local OCI store, deletes one of them, and then attempts to copy the other artifact to another OCI store, which fails (as an example -- anything that requires reading the second artifact's blobs will similarly fail).

package main

import (
	"context"
	"fmt"
	"log"
	"os"
	"path/filepath"

	"oras.land/oras-go/v2"
	"oras.land/oras-go/v2/content/oci"
	"oras.land/oras-go/v2/registry/remote"
)

func dieIfErr(err error) {
	if err != nil {
		log.Fatalf("Unexpected error: %s\n", err)
	}
}

func main() {
	ctx := context.Background()
	tmpDir, err := os.MkdirTemp("", "oras-remove-reproducer-*")
	dieIfErr(err)
	storePathA := filepath.Join(tmpDir, "storage-a")
	storePathB := filepath.Join(tmpDir, "storage-b")

	log.Printf("Setting up local OCI stores in %s and %s\n", storePathA, storePathB)
	localStoreA, err := oci.New(storePathA)
	dieIfErr(err)
	localStoreB, err := oci.New(storePathB)
	dieIfErr(err)

	log.Println("Setting up remote repository to fetch reproducer artifacts")
	remoteRegistry, err := remote.NewRegistry("quay.io")
	dieIfErr(err)
	remoteRepo, err := remoteRegistry.Repository(ctx, "amisevsk0/oras-cli")
	dieIfErr(err)

	log.Println("Copying reproducer artifacts to local OCI store A")
	sampleOneDesc, err := oras.Copy(ctx, remoteRepo, "sample-one", localStoreA, "sample-one", oras.DefaultCopyOptions)
	dieIfErr(err)
	_, err = oras.Copy(ctx, remoteRepo, "sample-two", localStoreA, "sample-two", oras.DefaultCopyOptions)
	dieIfErr(err)

	log.Println("Removing sample-one tagged manifest from local store A")
	err = localStoreA.Delete(ctx, sampleOneDesc)
	dieIfErr(err)

	log.Println("Copying sample-two tagged manifest from local store A to local store B")
	_, err = oras.Copy(ctx, localStoreA, "sample-two", localStoreB, "sample-two", oras.DefaultCopyOptions)
	if err != nil {
		fmt.Printf("Unable to copy 'sample-two' to another OCI store: %s\n", err)
		os.Exit(1)
	}

	log.Println("Error did not reproduce")
}

Running the above program fails with

Unable to copy 'sample-two' to another OCI store: failed to perform "Fetch" on source: sha256:65c705d810aa0ec82d1ea886c170462c9f17203aa313bfc2cf01d099d4e5ff96: application/vnd.kitops.modelkit.model.v1.tar: not found

as the blob (digest sha256:65c705d8...) was deleted by garbage collection in the earlier Delete call.
