From c0ffeeb759a866844d0686625acaa3252651a396 Mon Sep 17 00:00:00 2001
From: bupd
Date: Mon, 6 Jan 2025 17:38:35 +0530
Subject: [PATCH 1/6] add single active replication feature

Signed-off-by: bupd
---
 proposals/new/single-active-replication.md | 86 ++++++++++++++++++++++
 1 file changed, 86 insertions(+)
 create mode 100644 proposals/new/single-active-replication.md

diff --git a/proposals/new/single-active-replication.md b/proposals/new/single-active-replication.md
new file mode 100644
index 00000000..a5d4a849
--- /dev/null
+++ b/proposals/new/single-active-replication.md
@@ -0,0 +1,86 @@
+# Proposal: Single Active Replication
+
+Author: Prasanth Baskar/[bupd](https://github.com/bupd)
+
+PR: [https://github.com/goharbor/harbor/pull/21347](https://github.com/goharbor/harbor/pull/21347)
+
+## Abstract
+
+This proposal introduces a new feature that adds and option to prevent the parallel execution of replications. By adding a "Single Active Replication" checkbox in the replication policy, users can ensure that only one replication task for the same artifact is executed at a time, preventing unnecessary resource consumption, reducing bandwidth throttling, and improving replication performance.
+
+## Background
+
+In many Harbor deployments, scheduled replications of large artifacts often overlap, leading to unnecessary consumption of resources and reduced system performance. When multiple replications of the same artifact occur in parallel, especially for large images (e.g., 80 GB and beyond), it can strain network bandwidth and system queues, causing significant delays and timeouts. which each layer consisting bigger than 4 to 5GBs.
+
+The common use case involves scheduled replications, which may overlap during the execution of large image replications. This causes redundant transfer of the same image across multiple replication jobs, further impacting the performance and bandwidth utilization. Hence, it is important to limit replication for the same artifact to a single execution at a time to ensure more efficient resource usage.
+
+## Goals
+
+- Avoid overlapping replications of the same artifact.
+- Improve resource allocation by adding an option to limit to a single replication execution per policy.
+- Prevent unnecessary network and bandwidth throttling by not repeating replication jobs for the same artifact.
+- Enhance performance and stability, especially for large artifacts.
+
+## Proposal
+
+A new option, **"Single Active Replication"**, will be added in the replication policy UI to ensure that replication jobs for the same artifact do not run simultaneously. The default state will be **unchecked**, meaning replication tasks can still run in parallel unless the user opts for single execution.
+
+When the "Single Active Replication" option is enabled, any replication task for the same artifact will not start until the current replication for that artifact finishes. This ensures that bandwidth is not overloaded and the queues are better managed.
+
+Additionally, the implementation will involve adding a **single_active_replication** column in the replication policy in db and updating the worker execution logic to skip replication if a task is already running.
+
+## Changes Made
+
+- Added a **"Single Active Replication"** checkbox in the replication policy UI.
+- Implemented `execution skipping` logic to prevent the start of overlapping replication tasks.
+- Updated the `replication policy` model to include the **single_active_replication** flag.
+- Updated replication worker logic to account for the **single active replication** constraint.
+- Added a new `single_active_replication` column in the database schema for the policy.
+
+
+
+## Benefits:
+
+- Prevents Overlapping Replication.
+- Frees up bandwidth for other operations.
+- Ensures efficient transfers for large artifacts.
+
+## Implementation
+
+### UI
+
+A **"Single Active Replication"** checkbox will be added in the replication policy UI. By default, it will be unchecked.
+
+### DB Schema
+
+Add a new column `single_active_replication` to the replication policy model:
+
+```go
+type Policy struct {
+    // ...
+    SingleActiveReplication bool `orm:"column(single_active_replication)"`
+}
+```
+
+SQL migration:
+
+```sql
+ALTER TABLE replication_policy ADD COLUMN IF NOT EXISTS single_active_replication boolean;
+```
+
+### API
+
+- Create Policy:
+
+  ```rest
+  POST /replication/policies
+  { "single_active_replication": true }
+  ```
+
+- Update Policy:
+
+  ```rest
+  PUT /replication/policies
+  { "single_active_replication": true }
+  ```
+
From c9dee19f956865cc25ac89ec2c1113d42ac8265c Mon Sep 17 00:00:00 2001
From: Prasanth Baskar <89722848+bupd@users.noreply.github.com>
Date: Wed, 8 Jan 2025 19:52:53 +0530
Subject: [PATCH 2/6] Update wording

Signed-off-by: Prasanth Baskar <89722848+bupd@users.noreply.github.com>
---
 proposals/new/single-active-replication.md | 29 ++++++++++++++++++----
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/proposals/new/single-active-replication.md b/proposals/new/single-active-replication.md
index a5d4a849..f7f82f12 100644
--- a/proposals/new/single-active-replication.md
+++ b/proposals/new/single-active-replication.md
@@ -2,7 +2,7 @@
 
 Author: Prasanth Baskar/[bupd](https://github.com/bupd)
 
-PR: [https://github.com/goharbor/harbor/pull/21347](https://github.com/goharbor/harbor/pull/21347)
+Discussion & PR: [https://github.com/goharbor/harbor/pull/21347](https://github.com/goharbor/harbor/pull/21347)
 
 ## Abstract
 
@@ -10,10 +10,23 @@ This proposal introduces a new feature that adds and option to prevent the paral
 
 ## Background
 
-In many Harbor deployments, scheduled replications of large artifacts often overlap, leading to unnecessary consumption of resources and reduced system performance. When multiple replications of the same artifact occur in parallel, especially for large images (e.g., 80 GB and beyond), it can strain network bandwidth and system queues, causing significant delays and timeouts. which each layer consisting bigger than 4 to 5GBs.
+In many Harbor deployments, scheduled replications of large artifacts often overlap, leading to unnecessary consumption of resources and reduced system performance. When multiple replications of the same artifact occur in parallel, especially for large images (e.g.,512 MB and beyond), it can strain network bandwidth and system queues, causing significant delays and timeouts.
 
 The common use case involves scheduled replications, which may overlap during the execution of large image replications. This causes redundant transfer of the same image across multiple replication jobs, further impacting the performance and bandwidth utilization. Hence, it is important to limit replication for the same artifact to a single execution at a time to ensure more efficient resource usage.
 
+## Motivation
+Harbor’s current replication process runs multiple executions in parallel, copying artifact layers sequentially without coordination. This leads to redundant copying of the same layers across different executions, wasting bandwidth especially in environments with limited network speeds (e.g., 1 Mbit/s). As the number of replication executions increases, so does the exponential bandwidth consumption, which can severely degrade performance and cause replication failures.
+
+## User Stories
+### Story 1
+As a user with limited resources, I do not want to waste bandwidth or system resources during replication, ensuring that layers are copied efficiently without unnecessary redundancy.
+
+### Story 2
+As a user with a 1 Mbit connection, I need to maintain two Harbor registries as identical as possible while minimizing latency and avoiding excessive resource consumption during replication.
+
+### Story 3
+As a user, I want to avoid replication executions getting stuck in "InProgress" status, as this prevents me from managing and deleting replication policies effectively.
+
 ## Goals
 
 - Avoid overlapping replications of the same artifact.
@@ -31,8 +44,8 @@ Additionally, the implementation will involve adding a **single_active_replicati
 
 ## Changes Made
 
-- Added a **"Single Active Replication"** checkbox in the replication policy UI.
-- Implemented `execution skipping` logic to prevent the start of overlapping replication tasks.
+- Added a **"Single active replication"** checkbox in the replication policy UI.
+- Implemented logic to prevent overlapping replication tasks.
 - Updated the `replication policy` model to include the **single_active_replication** flag.
 - Updated replication worker logic to account for the **single active replication** constraint.
 - Added a new `single_active_replication` column in the database schema for the policy.
@@ -40,10 +53,10 @@ Additionally, the implementation will involve adding a **single_active_replicati
 
 
 ## Benefits:
-
 - Prevents Overlapping Replication.
 - Frees up bandwidth for other operations.
 - Ensures efficient transfers for large artifacts.
+- Ensures no bandwidth is wasted.
 
 ## Implementation
 
@@ -51,6 +64,12 @@ Additionally, the implementation will involve adding a **single_active_replicati
 
 A **"Single Active Replication"** checkbox will be added in the replication policy UI. By default, it will be unchecked.
 
+![image](https://github.com/user-attachments/assets/a6d10236-577b-4249-9763-7b8584c2a426)
+
+![Screenshot_2025-01-08_18-44-27](https://github.com/user-attachments/assets/3dcc0d84-68dc-49a1-a51d-b578189cb244)
+
+
+
 ### DB Schema
 
 Add a new column `single_active_replication` to the replication policy model:
From c04ba9fb8f447ab80a0a15fe0bdf1e278117860d Mon Sep 17 00:00:00 2001
From: Prasanth Baskar
Date: Wed, 11 Jun 2025 15:40:56 +0530
Subject: [PATCH 3/6] update & add more technical details

Signed-off-by: Prasanth Baskar
---
 proposals/new/single-active-replication.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/proposals/new/single-active-replication.md b/proposals/new/single-active-replication.md
index f7f82f12..233ee054 100644
--- a/proposals/new/single-active-replication.md
+++ b/proposals/new/single-active-replication.md
@@ -45,9 +45,10 @@ Additionally, the implementation will involve adding a **single_active_replicati
 ## Changes Made
 
 - Added a **"Single active replication"** checkbox in the replication policy UI.
-- Implemented logic to prevent overlapping replication tasks.
+- Implemented a best-effort check to avoid concurrent executions of the same replication policy by inspecting ongoing replication tasks.
+> Note: No locking is enforced; there is no lock or unlock logic in the database, Core, or Jobservice.
+- If any previous replications for the same policy are still running, the new execution is skipped, thereby enforcing single active replication.
 - Updated the `replication policy` model to include the **single_active_replication** flag.
-- Updated replication worker logic to account for the **single active replication** constraint.
 - Added a new `single_active_replication` column in the database schema for the policy.
 
 
From 38ce450e671f8907146e87e7edad54b1435eb5f4 Mon Sep 17 00:00:00 2001
From: Prasanth Baskar
Date: Wed, 11 Jun 2025 17:42:22 +0530
Subject: [PATCH 4/6] update wording

Signed-off-by: Prasanth Baskar
---
 proposals/new/single-active-replication.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/proposals/new/single-active-replication.md b/proposals/new/single-active-replication.md
index 233ee054..b1dd5a64 100644
--- a/proposals/new/single-active-replication.md
+++ b/proposals/new/single-active-replication.md
@@ -38,9 +38,9 @@ As a user, I want to avoid replication executions getting stuck in "InProgress"
 
 A new option, **"Single Active Replication"**, will be added in the replication policy UI to ensure that replication jobs for the same artifact do not run simultaneously. The default state will be **unchecked**, meaning replication tasks can still run in parallel unless the user opts for single execution.
 
-When the "Single Active Replication" option is enabled, any replication task for the same artifact will not start until the current replication for that artifact finishes. This ensures that bandwidth is not overloaded and the queues are better managed.
+When the "Single Active Replication" option is enabled, any replication task for the same replication policy will not start until the current replication for that policy finishes. This ensures that bandwidth is not overloaded.
 
-Additionally, the implementation will involve adding a **single_active_replication** column in the replication policy in db and updating the worker execution logic to skip replication if a task is already running.
+Additionally, the implementation will involve adding a **single_active_replication** column in the replication policy in db and updating the worker execution logic to skip replication if a task is already running for the same policy.
 
 ## Changes Made
 
From 776bd7e54e1e8658ca069864e301314a82e54af5 Mon Sep 17 00:00:00 2001
From: Prasanth Baskar
Date: Wed, 11 Jun 2025 17:49:31 +0530
Subject: [PATCH 5/6] add out of scope

Signed-off-by: Prasanth Baskar
---
 proposals/new/single-active-replication.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/proposals/new/single-active-replication.md b/proposals/new/single-active-replication.md
index b1dd5a64..e80013d2 100644
--- a/proposals/new/single-active-replication.md
+++ b/proposals/new/single-active-replication.md
@@ -51,7 +51,11 @@ Additionally, the implementation will involve adding a **single_active_replicati
 - Updated the `replication policy` model to include the **single_active_replication** flag.
 - Added a new `single_active_replication` column in the database schema for the policy.
 
-
+## Out of Scope
+This proposal does not address
+- The same artifact being replicated simultaneously by different replication policies.
+- There is no per-artifact locking or de-duplication mechanism.
+- Tasks already running before this will not be interrupted.
 
 ## Benefits:
 - Prevents Overlapping Replication.
From 4b9e23aa05bff96686da6bb23b488a9dba906b9b Mon Sep 17 00:00:00 2001
From: Prasanth Baskar
Date: Thu, 26 Jun 2025 22:34:35 +0530
Subject: [PATCH 6/6] improve wording in proposal

Signed-off-by: Prasanth Baskar
---
 proposals/new/single-active-replication.md | 71 ++++++++++++----------
 1 file changed, 40 insertions(+), 31 deletions(-)

diff --git a/proposals/new/single-active-replication.md b/proposals/new/single-active-replication.md
index e80013d2..25c5a54d 100644
--- a/proposals/new/single-active-replication.md
+++ b/proposals/new/single-active-replication.md
@@ -6,41 +6,49 @@ Discussion & PR: [https://github.com/goharbor/harbor/pull/21347](https://github.
 
 ## Abstract
 
-This proposal introduces a new feature that adds and option to prevent the parallel execution of replications. By adding a "Single Active Replication" checkbox in the replication policy, users can ensure that only one replication task for the same artifact is executed at a time, preventing unnecessary resource consumption, reducing bandwidth throttling, and improving replication performance.
+This proposal introduces a new feature that adds an option to prevent the parallel execution of replications in a given replication policy. By adding a "Single Active Replication" checkbox in the replication policy, users can ensure that only one replication task for the same replication policy is executed at a time, preventing unnecessary resource consumption, reducing bandwidth throttling, and improving overall replication performance.
 
-## Background
+## Problem
 
-In many Harbor deployments, scheduled replications of large artifacts often overlap, leading to unnecessary consumption of resources and reduced system performance. When multiple replications of the same artifact occur in parallel, especially for large images (e.g.,512 MB and beyond), it can strain network bandwidth and system queues, causing significant delays and timeouts.
-
-The common use case involves scheduled replications, which may overlap during the execution of large image replications. This causes redundant transfer of the same image across multiple replication jobs, further impacting the performance and bandwidth utilization. Hence, it is important to limit replication for the same artifact to a single execution at a time to ensure more efficient resource usage.
-
-## Motivation
-Harbor’s current replication process runs multiple executions in parallel, copying artifact layers sequentially without coordination. This leads to redundant copying of the same layers across different executions, wasting bandwidth especially in environments with limited network speeds (e.g., 1 Mbit/s). As the number of replication executions increases, so does the exponential bandwidth consumption, which can severely degrade performance and cause replication failures.
+Currently, Harbor allows multiple replication executions of the same policy to run in parallel. When policies are scheduled frequently and involve large artifacts (e.g., 1 GB+), overlapping executions may:
+- Copy the same big artifact layers
+- Consume unnecessary bandwidth and IOPS
+- Cause throttling or failures in low-bandwidth environments
+- Result in exponential performance degradation
+This issue is exacerbated when artifact sizes increase and scheduling intervals are frequent.
 
 ## User Stories
 ### Story 1
-As a user with limited resources, I do not want to waste bandwidth or system resources during replication, ensuring that layers are copied efficiently without unnecessary redundancy.
+As a user with Harbor deployed across multiple regions, I rely on scheduled replication policies to keep them synchronized. The constant problem I face is that these scheduled replications often overlap, leading to the same policy being run simultaneously. I desperately need replications for a given policy to run sequentially, not concurrently.
 
 ### Story 2
-As a user with a 1 Mbit connection, I need to maintain two Harbor registries as identical as possible while minimizing latency and avoiding excessive resource consumption during replication.
+As a user with a 1 Mbit connection, I need to maintain two Harbor registries as identical as possible. I need to ensure my scheduled replication policies don't overwhelm my network. When large artifacts are involved, current concurrent replication attempts for the same policy lead to severe latency and excessive resource consumption due to replication jobs for the same policy piling up.
 
-### Story 3
-As a user, I want to avoid replication executions getting stuck in "InProgress" status, as this prevents me from managing and deleting replication policies effectively.
 
 ## Goals
+- Add an option to avoid overlapping replications in the same replication policy.
+- Use the option to limit to a single active replication execution per policy.
+- Prevent unnecessary network and bandwidth throttling by not repeating replication jobs for the same replication policy.
 
-- Avoid overlapping replications of the same artifact.
-- Improve resource allocation by adding an option to limit to a single replication execution per policy.
-- Prevent unnecessary network and bandwidth throttling by not repeating replication jobs for the same artifact.
-- Enhance performance and stability, especially for large artifacts.
+## Non Goals
+This proposal does not address
+- The same artifact being replicated simultaneously by different replication policies.
+- There is no per-artifact locking or de-duplication mechanism.
+- Tasks already running before this will not be interrupted.
 
 ## Proposal
 
-A new option, **"Single Active Replication"**, will be added in the replication policy UI to ensure that replication jobs for the same artifact do not run simultaneously. The default state will be **unchecked**, meaning replication tasks can still run in parallel unless the user opts for single execution.
+A new option, **"Single Active Replication"**, will be added in the replication policy UI to ensure that replication jobs for the same replication policy do not run simultaneously. The default state will be **unchecked**, meaning replication tasks can still run in parallel unless the user opts for single execution.
 
 When the "Single Active Replication" option is enabled, any replication task for the same replication policy will not start until the current replication for that policy finishes. This ensures that bandwidth is not overloaded.
 
-Additionally, the implementation will involve adding a **single_active_replication** column in the replication policy in db and updating the worker execution logic to skip replication if a task is already running for the same policy.
+
+## Benefits:
+- Prevents Overlapping Replication.
+- Frees up bandwidth for other operations.
+- Ensures efficient transfers for large artifacts.
+- Ensures no bandwidth is wasted.
+
 
 ## Changes Made
 
@@ -51,17 +59,6 @@ Additionally, the implementation will involve adding a **single_active_replicati
 - Updated the `replication policy` model to include the **single_active_replication** flag.
 - Added a new `single_active_replication` column in the database schema for the policy.
 
-## Out of Scope
-This proposal does not address
-- The same artifact being replicated simultaneously by different replication policies.
-- There is no per-artifact locking or de-duplication mechanism.
-- Tasks already running before this will not be interrupted.
-
-## Benefits:
-- Prevents Overlapping Replication.
-- Frees up bandwidth for other operations.
-- Ensures efficient transfers for large artifacts.
-- Ensures no bandwidth is wasted.
 
 ## Implementation
 
@@ -73,6 +70,10 @@ A **"Single Active Replication"** checkbox will be added in the replication poli
 
 ![Screenshot_2025-01-08_18-44-27](https://github.com/user-attachments/assets/3dcc0d84-68dc-49a1-a51d-b578189cb244)
 
+### Harbor Core
+A new condition is added in the replication flow checking if single_active_replication is enabled for the replication policy. If a replication policy has single_active_replication enabled, the system first checks the database for any currently running executions of that policy.
+- If an execution with running status is found, a new replication job is created but **marked as "skipped"** instead of replication being executed normally.
+- If no active executions are found, the system proceeds with the normal replication flow, initiating the new replication job as usual.
 
 
 ### DB Schema
@@ -98,13 +99,21 @@ ALTER TABLE replication_policy ADD COLUMN IF NOT EXISTS single_active_replicatio
 
   ```rest
   POST /replication/policies
-  { "single_active_replication": true }
+  {
+    "single_active_replication": true
+    ...other fields
+  }
   ```
 
 - Update Policy:
 
   ```rest
   PUT /replication/policies
-  { "single_active_replication": true }
+  {
+    "id": 1,
+    "name": "scheduled-replication",
+    "single_active_replication": true
+    ...other fields
+  }
   ```
 
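
Editor's note, not part of the patch series: the skip decision described under "Harbor Core" in PATCH 6/6 can be sketched in Go roughly as below. This is a minimal illustration under stated assumptions, not Harbor's actual code. The `Execution` type, the `shouldSkip` helper, and the `"Running"` status string are hypothetical names introduced here; only the `SingleActiveReplication` field comes from the proposal. As the proposal itself notes, the check is best-effort and takes no lock.

```go
package main

import "fmt"

// Policy mirrors only the field added by this proposal; all other
// policy fields are omitted.
type Policy struct {
	ID                      int64
	SingleActiveReplication bool
}

// Execution is a hypothetical, simplified view of a replication execution.
type Execution struct {
	PolicyID int64
	Status   string // "Running" is an assumed status value
}

// shouldSkip reports whether a new execution of p should be marked as
// skipped: the flag is enabled and another execution of the same policy
// is still running. It is a plain read with no locking.
func shouldSkip(p Policy, existing []Execution) bool {
	if !p.SingleActiveReplication {
		return false
	}
	for _, e := range existing {
		if e.PolicyID == p.ID && e.Status == "Running" {
			return true
		}
	}
	return false
}

func main() {
	running := []Execution{{PolicyID: 1, Status: "Running"}}
	fmt.Println(shouldSkip(Policy{ID: 1, SingleActiveReplication: true}, running))  // true
	fmt.Println(shouldSkip(Policy{ID: 1, SingleActiveReplication: false}, running)) // false
	fmt.Println(shouldSkip(Policy{ID: 2, SingleActiveReplication: true}, running))  // false
}
```

Because nothing is locked between the check and the creation of the new execution, two triggers arriving at nearly the same time could both pass the check; the proposal accepts this trade-off explicitly.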