Failure of the image rescan when there is a misconfigured container registry row #3650

Open
@jopemachine

Description

Summary  

  • If there is a misconfigured container registry row, the whole image rescan task fails.

Steps to Reproduce

  1. Insert a misconfigured container registry row (for example, one whose project value does not exist in the registry).
  2. Run the image rescan command (via GQL or the CLI).

Expected Behavior  

  • The misconfigured container registry should be skipped, and the image rescan should proceed for the remaining rows, with the failure logged in the manager.
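
The expected behavior could look roughly like the sketch below. It is not the manager's actual code: `scan_one`, `scan_all`, and the registry names are hypothetical stand-ins, and it uses the standard library's `asyncio.gather(..., return_exceptions=True)` rather than `aiotools.TaskGroup` so that one failing scan is collected as a result instead of aborting the others.

```python
import asyncio
import logging

log = logging.getLogger(__name__)


async def scan_one(registry: str) -> str:
    # Hypothetical stand-in for a per-registry rescan.
    # Names starting with "bad" simulate a misconfigured row.
    if registry.startswith("bad"):
        raise RuntimeError(f"failed to fetch repositories in {registry}")
    return registry


async def scan_all(registries: list[str]) -> tuple[list[str], list[str]]:
    # return_exceptions=True returns each failure as a value, so one
    # misconfigured registry does not cancel the sibling scans.
    results = await asyncio.gather(
        *(scan_one(r) for r in registries), return_exceptions=True
    )
    scanned: list[str] = []
    skipped: list[str] = []
    for registry, result in zip(registries, results):
        if isinstance(result, BaseException):
            # Record the failure and continue with the remaining rows.
            log.warning("Skipping registry %r: %r", registry, result)
            skipped.append(registry)
        else:
            scanned.append(result)
    return scanned, skipped


if __name__ == "__main__":
    print(asyncio.run(scan_all(["ok-1", "bad-1", "ok-2"])))
```

Whether the skip should happen per registry row or per project inside a registry is a design choice this sketch does not settle; it only shows the failure being isolated and logged.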

Actual Behavior  

  • The entire image rescanning task fails.

Logs/Errors

For example, if a container registry row has an arbitrary, non-existent project value, the image rescan fails with an error like the following:

❯ ./backend.ai mgr image rescan cr.backend.ai
2025-02-11 06:28:37.648 INFO ai.backend.manager.models.image [410803] Scanning kernel images from the registry "cr.backend.ai"
2025-02-11 06:28:37.649 INFO ai.backend.manager.models.image [410803] Scanning kernel images from the registry "cr.backend.ai"
2025-02-11 06:28:37.649 INFO ai.backend.manager.models.image [410803] Scanning kernel images from the registry "cr.backend.ai"
2025-02-11 06:28:37.649 INFO ai.backend.manager.models.image [410803] Scanning kernel images from the registry "cr.backend.ai"
2025-02-11 06:28:37.649 INFO ai.backend.manager.container_registry.base [410803] rescan_single_registry()
2025-02-11 06:28:37.650 INFO ai.backend.manager.container_registry.base [410803] rescan_single_registry()
2025-02-11 06:28:37.650 INFO ai.backend.manager.container_registry.base [410803] rescan_single_registry()
2025-02-11 06:28:37.650 INFO ai.backend.manager.container_registry.base [410803] rescan_single_registry()
2025-02-11 06:28:37.729 ERROR ai.backend.manager.cli.image_impl [410803] An error occurred.
  + Exception Group Traceback (most recent call last):
  |   File "/home/jopemachine/.local/backend.ai/repos/feat_add_status_column_to_image/src/ai/backend/manager/cli/image_impl.py", line 186, in rescan_images
  |     await rescan_images_func(db, registry_or_image, project)
  |   File "/home/jopemachine/.local/backend.ai/repos/feat_add_status_column_to_image/src/ai/backend/manager/models/image.py", line 280, in rescan_images
  |     await scan_registries(db, matching_registries, reporter=reporter)
  |   File "/home/jopemachine/.local/backend.ai/repos/feat_add_status_column_to_image/src/ai/backend/manager/models/image.py", line 175, in scan_registries
  |     async with aiotools.TaskGroup() as tg:
  |                ^^^^^^^^^^^^^^^^^^^^
  |   File "/home/jopemachine/.local/backend.ai/repos/feat_add_status_column_to_image/dist/export/python/virtualenvs/python-default/3.12.8/lib/python3.12/site-packages/aiotools/taskgroup/base.py", line 39, in __aexit__
  |     raise TaskGroupError(eg.message, eg.exceptions) from None
  | aiotools.taskgroup.types.TaskGroupError: unhandled errors in a TaskGroup (1 sub-exception)
  +-+---------------- 1 ----------------
    | Exception Group Traceback (most recent call last):
    |   File "/home/jopemachine/.local/backend.ai/repos/feat_add_status_column_to_image/src/ai/backend/manager/container_registry/base.py", line 117, in rescan_single_registry
    |     async with aiotools.TaskGroup() as tg:
    |                ^^^^^^^^^^^^^^^^^^^^
    |   File "/home/jopemachine/.local/backend.ai/repos/feat_add_status_column_to_image/dist/export/python/virtualenvs/python-default/3.12.8/lib/python3.12/site-packages/aiotools/taskgroup/base.py", line 39, in __aexit__
    |     raise TaskGroupError(eg.message, eg.exceptions) from None
    | aiotools.taskgroup.types.TaskGroupError: unhandled errors in a TaskGroup (1 sub-exception)
    +-+---------------- 1 ----------------
      | Traceback (most recent call last):
      |   File "/home/jopemachine/.local/backend.ai/repos/feat_add_status_column_to_image/src/ai/backend/manager/container_registry/base.py", line 118, in rescan_single_registry
      |     async for image in self.fetch_repositories(client_session):
      |   File "/home/jopemachine/.local/backend.ai/repos/feat_add_status_column_to_image/src/ai/backend/manager/container_registry/harbor.py", line 210, in fetch_repositories
      |     raise RuntimeError(
      | RuntimeError: ('failed to fetch repositories in project weipefjeif', 'UNAUTHORIZED', 'unauthorized')
