Conversation

@GitHK
Contributor

@GitHK GitHK commented Nov 19, 2025

What do these changes do?

If a task is marked for removal and its asyncio task has not been found for more than 1 minute, its entry is removed from Redis.

Why is this necessary? If an instance where an asyncio task was running is rebooted, the entry remains in Redis. This ensures cleanup.
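
A minimal sketch of the cleanup rule described here, assuming a plain dict stands in for Redis and that each entry carries an optional marked_for_removal_at timestamp. The names TaskEntry, is_stale and cleanup_stale_entries are hypothetical and not taken from the PR; only the one-minute threshold comes from the description.

```python
import datetime
from dataclasses import dataclass
from typing import Optional

# Assumption: mirrors the "more than 1 minute" threshold from the description.
TASK_REMOVAL_MAX_WAIT = datetime.timedelta(minutes=1)


@dataclass
class TaskEntry:
    """Hypothetical stand-in for the task data stored in Redis."""

    task_id: str
    marked_for_removal_at: Optional[datetime.datetime] = None


def is_stale(entry: TaskEntry, now: datetime.datetime) -> bool:
    """True if the entry was marked for removal longer ago than the max wait."""
    if entry.marked_for_removal_at is None:
        return False
    return now - entry.marked_for_removal_at > TASK_REMOVAL_MAX_WAIT


def cleanup_stale_entries(
    store: dict,
    local_task_ids: set,
    now: datetime.datetime,
) -> list:
    """Delete stale entries that have no local asyncio task; return their ids.

    Entries whose task still runs locally are left alone here, on the
    assumption that the regular removal path handles them.
    """
    removed = [
        task_id
        for task_id, entry in store.items()
        if task_id not in local_task_ids and is_stale(entry, now)
    ]
    for task_id in removed:
        del store[task_id]
    return removed
```

Passing now as an argument keeps the rule easy to test with fixed timestamps; a periodic worker would pass datetime.datetime.now(tz=datetime.timezone.utc).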

Related issue/s

How to test

Dev-ops 🚨

When releasing this to any deployment, do the following:

  • Go to Redis-Commander
  • Open the redis console on DB number 6
  • Run the script below (which removes all the keys in that database):
EVAL "for _,k in ipairs(redis.call('KEYS','*')) do redis.call('DEL',k) end" 0
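
KEYS walks the whole keyspace in a single command, which can block the server on a large database. As an alternative, here is a hedged sketch of the same wipe done from Python with SCAN-style iteration. It assumes a client that exposes scan_iter and delete the way redis-py does; the KeyValueClient protocol and delete_all_keys are hypothetical names introduced for illustration.

```python
from typing import Iterable, Protocol


class KeyValueClient(Protocol):
    """Subset of a redis-py style client interface this sketch relies on."""

    def scan_iter(self, match: str = "*", count: int = 100) -> Iterable: ...

    def delete(self, *names) -> int: ...


def delete_all_keys(client: KeyValueClient, batch_size: int = 500) -> int:
    """Delete every key in the client's selected database, in batches.

    Iterates with SCAN instead of KEYS so the server is not held up by one
    long-running command. Returns the number of keys actually deleted.
    """
    deleted = 0
    batch = []
    for key in client.scan_iter(match="*", count=batch_size):
        batch.append(key)
        if len(batch) >= batch_size:
            deleted += client.delete(*batch)
            batch.clear()
    if batch:
        # Flush the final, partially filled batch.
        deleted += client.delete(*batch)
    return deleted
```

With redis-py, the client would be one connected to DB number 6 of the deployment. If the intent is simply to empty the selected database, Redis's built-in FLUSHDB command has the same effect.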

@GitHK GitHK self-assigned this Nov 19, 2025
@GitHK GitHK added the bug (buggy, it does not work as expected) and t:maintenance (Some planned maintenance work) labels and removed the bug label Nov 19, 2025
@GitHK GitHK changed the title from "Allow for cleanup of Redis is task is not present" to "🐛 Allow for cleanup of Redis is task is not present" Nov 19, 2025
@GitHK GitHK added this to the Imparable milestone Nov 19, 2025
@codecov

codecov bot commented Nov 19, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.43%. Comparing base (daf0b87) to head (12abbec).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8620      +/-   ##
==========================================
+ Coverage   87.51%   89.43%   +1.91%     
==========================================
  Files        2010     1368     -642     
  Lines       78979    58011   -20968     
  Branches     1378      154    -1224     
==========================================
- Hits        69122    51884   -17238     
+ Misses       9453     6077    -3376     
+ Partials      404       50     -354     
Flag Coverage Δ *Carryforward flag
integrationtests 63.95% <ø> (ø) Carriedforward from daf0b87
unittests 87.84% <ø> (+1.49%) ⬆️

*This pull request uses carry forward flags.

Components Coverage Δ
pkg_aws_library ∅ <ø> (∅)
pkg_celery_library ∅ <ø> (∅)
pkg_dask_task_models_library ∅ <ø> (∅)
pkg_models_library ∅ <ø> (∅)
pkg_notifications_library ∅ <ø> (∅)
pkg_postgres_database ∅ <ø> (∅)
pkg_service_integration ∅ <ø> (∅)
pkg_service_library ∅ <ø> (∅)
pkg_settings_library ∅ <ø> (∅)
pkg_simcore_sdk 84.58% <ø> (ø)
agent 93.44% <ø> (ø)
api_server 91.37% <ø> (ø)
autoscaling 95.83% <ø> (ø)
catalog 92.06% <ø> (ø)
clusters_keeper 99.14% <ø> (ø)
dask_sidecar 91.72% <ø> (ø)
datcore_adapter 97.95% <ø> (ø)
director 75.72% <ø> (ø)
director_v2 91.29% <ø> (ø)
dynamic_scheduler 96.66% <ø> (ø)
dynamic_sidecar 90.83% <ø> (ø)
efs_guardian 89.83% <ø> (ø)
invitations 90.90% <ø> (ø)
payments 92.70% <ø> (ø)
resource_usage_tracker 92.00% <ø> (ø)
storage 86.70% <ø> (+0.24%) ⬆️
webclient ∅ <ø> (∅)
webserver 86.67% <ø> (+0.01%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@mergify
Contributor

mergify bot commented Nov 19, 2025

🧪 CI Insights

Here's what we observed from your CI run for 12abbec.

❌ Job Failures

Pipeline: CI; Job: unit-tests; Health on master: Broken; Retries: 0

✅ Passed Jobs With Interesting Signals

Pipeline: CI; Job: integration-tests; Signal: Base branch is broken, but the job passed. Looks like this might be a real fix 💪; Health on master: Broken; Retries: 0

@GitHK GitHK changed the title from "🐛 Allow for cleanup of Redis is task is not present" to "🐛 Allow for cleanup of Redis when asyncio task is not present" Nov 19, 2025
@GitHK GitHK changed the title from "🐛 Allow for cleanup of Redis when asyncio task is not present" to "🐛 Allow for cleanup of Redis when asyncio task is not present 🚨" Nov 19, 2025
@GitHK GitHK marked this pull request as ready for review November 19, 2025 16:01
@GitHK GitHK requested a review from pcrespov as a code owner November 19, 2025 16:01
Member

@sanderegg sanderegg left a comment

Question: will this work with multiple replicas of the service as well? I guess different replicas might have different tasks in their process

mapping=_to_redis_hash_mapping(
    {
        _MARKED_FOR_REMOVAL_FIELD: True,
        _MARKED_FOR_REMOVAL_AT_FIELD: datetime.datetime.now(
Member

Q: why not have a single optional _MARKED_FOR_REMOVAL_AT_FIELD field? If it is null, the task is not marked for removal; if it is set, you want to remove it. No need for the additional boolean.
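
A sketch of what this suggestion could look like, using a plain dataclass rather than the Pydantic model from the PR so that it stays self-contained. TaskDataSketch and its methods are hypothetical names; the point is only that the boolean can be derived from the timestamp instead of being stored next to it.

```python
import datetime
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskDataSketch:
    """Hypothetical, reduced stand-in for the reviewed model."""

    task_id: str
    # None means "not marked"; a timestamp means "marked at that moment".
    marked_for_removal_at: Optional[datetime.datetime] = None

    @property
    def marked_for_removal(self) -> bool:
        """Derived flag, so it can never contradict the timestamp."""
        return self.marked_for_removal_at is not None

    def mark_for_removal(self) -> None:
        """Record the moment at which removal was requested (UTC)."""
        self.marked_for_removal_at = datetime.datetime.now(
            tz=datetime.timezone.utc
        )
```

With a single source of truth there is no state in which the flag and the timestamp disagree.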

            "the entry form Redis"
        )
    ),
] = None
Member

Again I do not see the point of having 2 variables here.


task_to_cancel = self._created_tasks.pop(task_id, None)
if task_to_cancel is not None:
    _logger.debug("Removing asyncio task related to task_id='%s'", task_id)
Member

tip: use with log_context

    await self._tasks_data.delete_task_data(task_id)
else:
    task_data = await self._tasks_data.get_task_data(task_id)
    if task_data.marked_for_removal_at is not None and datetime.datetime.now(  # type: ignore[union-attr]
Member

See? This would simplify this if condition... actually the boolean is useless here.

Member

yes i tend to agree

task_data = await self._tasks_data.get_task_data(task_id)
if task_data.marked_for_removal_at is not None and datetime.datetime.now(  # type: ignore[union-attr]
    tz=datetime.UTC
) - task_data.marked_for_removal_at > datetime.timedelta(  # type: ignore[union-attr]
Member

use "assert task_data.marked_for_removal_at  # nosec" instead of silencing the type checker with "# type: ignore"
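
A sketch of the assert-based narrowing, assuming the union-attr errors come from get_task_data returning an optional value. TaskDataSketch, should_remove and the 60-second value are hypothetical stand-ins chosen for illustration.

```python
import datetime
from dataclasses import dataclass
from typing import Optional

# Assumption: 60 seconds, matching the "1 minute" in the PR description.
_TASK_REMOVAL_MAX_WAIT = 60


@dataclass
class TaskDataSketch:
    """Hypothetical, reduced stand-in for the reviewed model."""

    marked_for_removal_at: Optional[datetime.datetime] = None


def should_remove(
    task_data: Optional[TaskDataSketch], now: datetime.datetime
) -> bool:
    """Decide whether a Redis entry has waited long enough to be removed."""
    # The assert narrows Optional[TaskDataSketch] to TaskDataSketch for the
    # type checker; "# nosec" tells bandit that the assert is intentional.
    assert task_data is not None  # nosec
    if task_data.marked_for_removal_at is None:
        return False
    # Past the None check, the timestamp is narrowed to datetime as well,
    # so no "type: ignore" comment is needed on the subtraction.
    elapsed = now - task_data.marked_for_removal_at
    return elapsed > datetime.timedelta(seconds=_TASK_REMOVAL_MAX_WAIT)
```

Asserts are stripped when Python runs with -O, so this pattern documents an invariant for the type checker and should not be relied on as runtime validation.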

) - task_data.marked_for_removal_at > datetime.timedelta(  # type: ignore[union-attr]
    seconds=_TASK_REMOVAL_MAX_WAIT
):
    _logger.debug(
Member

with log_context here would also tell you that the function finished.
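
To illustrate the idea behind this tip, here is a minimal context manager that logs both the start and the end of a block. It is a hypothetical stand-in written for this example, not the project's log_context helper, whose actual signature is not shown in this conversation.

```python
import contextlib
import logging
from typing import Iterator


@contextlib.contextmanager
def log_context_sketch(
    logger: logging.Logger, level: int, msg: str
) -> Iterator[None]:
    """Hypothetical stand-in: logs when the wrapped block starts and ends."""
    logger.log(level, "Starting %s ...", msg)
    try:
        yield
    except Exception:
        # Make failures visible too, then let the exception propagate.
        logger.log(level, "Raised while %s", msg)
        raise
    else:
        logger.log(level, "Finished %s", msg)


_logger = logging.getLogger(__name__)

# Usage: the deletion is bracketed by a start and a finish message.
with log_context_sketch(_logger, logging.DEBUG, "removing stale task_id='abc'"):
    pass  # e.g. the delete_task_data call would go here
```

Compared with a single debug call before the operation, the closing message confirms that the removal completed rather than only that it was attempted.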

Member

@pcrespov pcrespov left a comment

thx. Left some comments for your consideration

BTW is wiping out the entire Redis database 6 necessary for this change?

    bool,
    Field(description=("if True, indicates the task is marked for removal")),
] = False
marked_for_removal_at: Annotated[
Member

  • Why is the combination marked_for_removal=False and marked_for_removal_at!=None allowed? What does it mean?
    Or marked_for_removal=True and marked_for_removal_at=None? Does this mean it can be removed anytime, but there is no guarantee of when?

TEST_CHECK_STALE_INTERVAL_S: Final[float] = 1


def strip_markd_for_removal_at(task_data: TaskData) -> TaskData:
Member

  1. typo: marked
  2. stick to a small set of verbs, e.g. CRUD or similar ones. Moreover, "strip" is associated with string operations, which have nothing to do with this.
  3. Here it could be clear_marked_for_removal

Labels

t:maintenance Some planned maintenance work

3 participants