Skip to content

Conversation

@NamelessOIer
Copy link
Collaborator

@NamelessOIer NamelessOIer commented Dec 9, 2025

#172, reimplement of #308

Summary by CodeRabbit

  • New Features

    • Jobs can declare dependencies (after/afterok/afternotok/afterany) with AND/OR logic; per-job dependency status and readiness are exposed; scheduler enqueues and emits dependency events; submission now reports per-job success/error.
  • Bug Fixes

    • Better handling of missing dependees, DB-backed job status lookups, improved lifecycle-triggering of dependency events, and more robust error propagation in task submission paths.
  • Documentation

    • Added Job Dependency and Pending Reason reference pages and CLI docs for -d/--dependency.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 9, 2025

📝 Walkthrough

Walkthrough

Adds end-to-end job-dependency support: protobuf additions and error codes, per-task dependency state and dependents, scheduler wiring (DB lookup, dependency-event queue, lifecycle-triggered events), submission futures/promises now carry CraneExpected, and CLI/docs for -d/--dependency.

Changes

Cohort / File(s) Summary
Protobuf definitions
protos/PublicDefs.proto
Added DependencyType, DependencyCondition, Dependencies, DependencyStatus; added dependencies to TaskToCtld, dependency_status to TaskInfo; added ERR_MISSING_DEPENDENCY and ERR_DB_INSERT_FAILED to ErrCode.
Ctld public defs & runtime
src/CraneCtld/CtldPublicDefs.h, src/CraneCtld/CtldPublicDefs.cpp
Introduced DependenciesInJob struct and per-task dependency state; added SetDependency, UpdateDependency, AddDependent, TriggerDependencyEvents, Dependencies() on TaskInCtld; track dependents per DependencyType; compute ready_time/ready status and serialize dependency_status into TaskInfo.
Scheduler & submission flow
src/CraneCtld/TaskScheduler.h, src/CraneCtld/TaskScheduler.cpp
Built dependee→dependents mappings at init; DB query for missing dependees and enqueue dependency events; evaluate Dependencies().is_met() before scheduling; lifecycle triggers for AFTER/AFTER_OK/AFTER_NOT_OK/AFTER_ANY; added DependencyEvent queue and AddDependencyEvent; changed submit queue, promises, and public signatures to use CraneExpected<task_id_t>-wrapped futures.
DB client
src/CraneCtld/DbClient.h, src/CraneCtld/DbClient.cpp
Added MongodbClient::FetchJobStatus to fetch multiple jobs' status/exit_code/start/end times; uses projection and exception-logged error handling.
gRPC server submission handling
src/CraneCtld/RpcService/CtldGrpcServer.cpp
Unwrapped nested CraneExpected results for single and batch submissions/streams; propagate inner errors via CraneErrStr(...); use 0 placeholder for failed batch items.
CLI docs & references
docs/**/crun.md, docs/**/calloc.md, docs/**/cbatch.md, docs/**/reference/job_dependency.md, docs/**/reference/pending_reason.md, mkdocs.yaml, docs/zh/**
Added -d, --dependency CLI option docs, Job Dependency and Pending Reason reference pages, i18n entries and examples; updated navigation.

Sequence Diagram(s)

sequenceDiagram
    participant Client as gRPC Client
    participant GrpcServer as Ctld gRPC Server
    participant Scheduler as TaskScheduler
    participant DB as MongoDB
    participant Queue as DependencyEventQueue
    participant Task as TaskInCtld

    Client->>GrpcServer: Submit task (with `dependencies`)
    GrpcServer->>Scheduler: SubmitTaskToScheduler(task)
    Scheduler->>Task: SetDependency(grpc_deps)
    Task-->>Scheduler: dependency state initialized

    Scheduler->>Scheduler: build dependee→dependents map
    alt missing dependees found
        Scheduler->>DB: FetchJobStatus(dependee_ids)
        DB-->>Scheduler: statuses & timestamps
        Scheduler->>Queue: Enqueue computed dependency events
    end

    Scheduler->>Scheduler: Check Dependencies().is_met(now)
    alt deps met
        Scheduler->>Scheduler: schedule task (enqueue/run)
    else deps not met
        Scheduler-->>Task: remain pending (PendingReason: Dependency / DependencyNeverSatisfied)
    end

    Note over Task,Scheduler: lifecycle events trigger dependency events
    Task->>Scheduler: start/finish/cancel (timestamp & status)
    Scheduler->>Queue: Enqueue AFTER/AFTER_OK/AFTER_NOT_OK/AFTER_ANY events
    Queue->>Task: UpdateDependency(dependee_id, event_time)
    Task-->>Scheduler: dependency updated → may transition to schedulable
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

Possibly related PRs

Suggested labels

Doc

Suggested reviewers

  • L-Xiafeng
  • RileyWen

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 2.63% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: dependency' accurately describes the main change: implementing job dependency functionality. It is concise and specific, covering the primary feature addition across proto definitions, scheduler logic, CLI options, and documentation.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch dev/dependency_re

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 15b94a8 and 11d8672.

📒 Files selected for processing (1)
  • src/CraneCtld/CtldPublicDefs.cpp
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/CraneCtld/CtldPublicDefs.cpp

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (6)
src/CraneCtld/RpcService/CtldGrpcServer.cpp (1)

409-421: Batch submit now cleanly propagates both sync and async errors; verify reply layout

The single‑task and batch submit paths now both unwrap CraneExpected<std::future<CraneExpected<task_id_t>>> and:

  • on inner success: set ok=true/append real task_id,
  • on inner failure: set ok=false/append task_id=0 and task_result.error(),
  • on outer failure: set ok=false/append task_id=0 and result.error().

This gives per‑task visibility into both submission‑time and scheduling‑time failures, which matches the new dependency/DB error codes.

One thing to double‑check on your side: clients’ expectations about SubmitBatchTasksReply’s task_id_list and code_list:

  • Current logic appends a code only for failed submissions, so task_id_list.size() may be larger than code_list.size() when there is a mix of success and failure.
  • If the proto/clients treat these lists as parallel arrays by index, you may instead want to always append a code (e.g. SUCCESS for successful tasks) to keep them aligned.

If the existing contract is “codes only for failed entries”, then this implementation is already consistent.

Also applies to: 443-469

src/CraneCtld/CtldPublicDefs.h (1)

414-430: Dependency tracking model in TaskInCtld/DependenciesInJob is coherent with the proto semantics

The new dependency machinery hangs together well:

  • DependenciesInJob::is_met() and is_failed() combined with update() (min/max over event_time + delay) correctly implement:
    • AND semantics (all deps must fire; any failure makes the job permanently unschedulable),
    • OR semantics (first satisfying dep wins; only “all failed” yields a failure state).
  • Default state (deps empty, is_or=false, ready_time=InfinitePast) makes tasks without dependencies immediately eligible.
  • The dependents[DependencyType_ARRAYSIZE] fan‑out plus AddDependent / TriggerDependencyEvents keep the “who depends on me” relation local to the producer job and delegate actual wake‑ups through g_task_scheduler->AddDependencyEvent(), which is a clean separation.
  • The special‑case in AddDependent for AFTER when the job is already Running (immediately triggering with start_time) matches “after job start” semantics and avoids storing unnecessary dependents.

As a tiny readability nit, you could make is_failed() compare ready_time == absl::InfiniteFuture() instead of >= to make the intent explicit, but functionally it’s equivalent.

Also applies to: 735-736, 748-748, 923-928

src/CraneCtld/CtldPublicDefs.cpp (2)

989-1004: Potential overwriting of multiple dependency conditions per job-id

Two small points to double‑check:

  1. In SetDependency, you index dependencies.deps only by dep.job_id():
dependencies.deps[dep.job_id()] = {dep.type(), dep.delay_seconds()};

If the proto ever allows multiple dependency conditions on the same job ID (e.g., different DependencyType entries for the same dependee), later entries silently overwrite earlier ones. If that’s not intended, consider either:

  • Rejecting such input explicitly, or
  • Extending the internal structure to support multiple conditions per dependee.
  1. SetDependency doesn’t clear dependencies.deps before repopulating. Today it’s only called on fresh or reconstructed TaskInCtld instances, but if it’s ever reused, stale entries could leak. A defensive clear would make the API safer:
void TaskInCtld::SetDependency(const crane::grpc::Dependencies& grpc_deps) {
-  if (grpc_deps.is_or()) {
+  dependencies.deps.clear();
+
+  if (grpc_deps.is_or()) {
     dependencies.is_or = true;
     dependencies.ready_time = absl::InfiniteFuture();
   } else {
     dependencies.is_or = false;
     dependencies.ready_time = absl::InfinitePast();
   }
   ...
}

Also, SetFieldsOfTaskInfo emits dependency_status whenever deps is non‑empty or ready_time != absl::InfinitePast(). This assumes the default ready_time is exactly InfinitePast; otherwise tasks without dependencies might still show a dependency block. Worth ensuring the default matches that assumption.

Also applies to: 1178-1199


1006-1022: AddDependent/TriggerDependencyEvents semantics and lifecycle

The overall design (dependents stored on the dependee; events fanned out via TriggerDependencyEvents into TaskScheduler::AddDependencyEvent) is sound, and the “already running AFTER” fast‑path using start_time correctly preserves delay semantics.

Two minor observations:

  • The comment typo “aready satisfied” in AddDependent should be fixed when you next touch this code.
  • TriggerDependencyEvents never clears dependents[dep_type]. Today each (dep_type, dependee) is only triggered once, so it’s harmless, but if future changes made events fire multiple times per type, DependenciesInJob::update would start logging spurious “Dependency for job not found” errors. Clearing after use would be safer:
void TaskInCtld::TriggerDependencyEvents(
    const crane::grpc::DependencyType& dep_type, absl::Time event_time) {
-  for (task_id_t dependent_id : dependents[dep_type]) {
+  auto& vec = dependents[dep_type];
+  for (task_id_t dependent_id : vec) {
     g_task_scheduler->AddDependencyEvent(dependent_id, task_id, event_time);
   }
+  vec.clear();
}

Not urgent, but would reduce risk of noisy logs if the triggering pattern evolves.

src/CraneCtld/TaskScheduler.cpp (2)

399-487: Recovery-time dependency reconstruction and DB backfill

The recovery block that:

  • Builds dependee_to_dependents from m_pending_task_map_,
  • Hooks dependents to in-memory pending/running jobs via AddDependent, and
  • For missing dependees, queries MongoDB and enqueues DependencyEvents with appropriate timestamps (or InfiniteFuture),

is a solid way to restore dependency state after restart.

A couple of behavioral notes to be aware of:

  • For AFTER dependencies you fall back to cancel time when time_start == 0. That effectively treats a job cancelled before it ever ran as satisfying AFTER. If your intent is “AFTER job leaves the queue” rather than “AFTER job started”, this is correct; otherwise it may surprise users.
  • For AFTER_ANY, when time_end == 0 you use absl::Now() and log an error. That’s a reasonable best-effort fallback, but the error log will be noisy if this can happen in normal operation.

If the semantics above are intentional, the implementation looks good; otherwise it might be worth tightening the conditions or distinguishing “never started” from “ended”.


1521-1587: End-to-end CraneExpected submission flow is reasonable

SubmitTaskToScheduler now:

  • Performs all synchronous checks (UID validity, account/partition permission, QoS enablement, partition allowlist, optional fields, QoS limits) and returns std::unexpected(err) on failure.
  • On success, calls TryMallocQosResource, logs and returns std::unexpected(res) if the QoS pool is exhausted.
  • Only then calls g_task_scheduler->SubmitTaskAsync(std::move(task)) and returns the resulting std::future<CraneExpected<task_id_t>>.

This separation of sync vs async errors is clear. One minor stylistic nit: since you’re inside TaskScheduler, calling SubmitTaskAsync(std::move(task)) directly would avoid the global g_task_scheduler indirection, but functionally it’s equivalent.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 60d9003 and c38d42e.

📒 Files selected for processing (8)
  • protos/PublicDefs.proto (4 hunks)
  • src/CraneCtld/CtldPublicDefs.cpp (5 hunks)
  • src/CraneCtld/CtldPublicDefs.h (4 hunks)
  • src/CraneCtld/DbClient.cpp (1 hunks)
  • src/CraneCtld/DbClient.h (1 hunks)
  • src/CraneCtld/RpcService/CtldGrpcServer.cpp (4 hunks)
  • src/CraneCtld/TaskScheduler.cpp (13 hunks)
  • src/CraneCtld/TaskScheduler.h (5 hunks)
🧰 Additional context used
🧠 Learnings (17)
📓 Common learnings
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 625
File: src/CraneCtld/TaskScheduler.cpp:2811-2874
Timestamp: 2025-09-22T07:13:20.635Z
Learning: In the CraneSched codebase, there are latch mechanisms used to ensure variable lifetime safety when capturing references in detached thread pool tasks, allowing reference captures that might otherwise be unsafe.
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 496
File: src/CraneCtld/TaskScheduler.cpp:1713-1721
Timestamp: 2025-05-09T02:15:30.422Z
Learning: TaskScheduler中的资源回收设计:当Craned节点离线时,通过TaskStatusChangeAsync异步处理任务状态变更和资源回收。尽管可能存在短暂的资源更新延迟,但这是设计上可接受的权衡。
Learnt from: huerni
Repo: PKUHPC/CraneSched PR: 469
File: src/Craned/TaskManager.cpp:1162-1175
Timestamp: 2025-05-08T07:39:29.207Z
Learning: In CraneSched, when a task fails in the LaunchTaskInstanceMt_ method, ActivateTaskStatusChangeAsync_ does not automatically release PMIx resources or cgroup. Explicit cleanup is required by calling g_pmix_server->DeregisterTask() and g_cg_mgr->ReleaseCgroupByTaskIdOnly().
📚 Learning: 2025-05-09T01:54:21.256Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 496
File: src/CraneCtld/CranedKeeper.cpp:51-53
Timestamp: 2025-05-09T01:54:21.256Z
Learning: The ConfigureCraned function in src/CraneCtld/CranedKeeper.cpp is called from a thread pool, so there's no need to worry about it blocking the gRPC completion queue thread.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
  • src/CraneCtld/TaskScheduler.h
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-02T07:06:36.103Z
Learnt from: RileyWen
Repo: PKUHPC/CraneSched PR: 475
File: src/CraneCtld/CtldGrpcServer.cpp:113-118
Timestamp: 2025-05-02T07:06:36.103Z
Learning: In CraneSched, gRPC methods should generally return Status::OK even when handling error conditions, as non-OK statuses cause the connection to terminate. Error information should be communicated within the RPC response payload instead.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
📚 Learning: 2025-06-23T07:53:30.513Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 520
File: src/Craned/CranedServer.cpp:50-51
Timestamp: 2025-06-23T07:53:30.513Z
Learning: In the CraneSched codebase, `g_ctld_client` is a guaranteed global variable that is always initialized before any gRPC service methods are called, so null pointer checks are not necessary when calling methods on it.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
📚 Learning: 2025-08-12T08:58:39.772Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 577
File: src/Craned/Supervisor/TaskManager.cpp:124-130
Timestamp: 2025-08-12T08:58:39.772Z
Learning: In the CraneSched project using C++23, Class Template Argument Deduction (CTAD) allows std::unique_ptr declarations without explicit template parameters when the type can be deduced from the initializer, such as `std::unique_ptr task = std::move(m_task_map_.at(task_id))` where the template parameter is deduced from the move operation.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
  • src/CraneCtld/CtldPublicDefs.h
  • src/CraneCtld/TaskScheduler.h
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-26T11:04:30.580Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 399
File: src/Craned/Craned/SupervisorKeeper.h:32-32
Timestamp: 2025-05-26T11:04:30.580Z
Learning: The Supervisor component in the Crane system is designed to manage only one task per instance. The task specification is provided to the Supervisor during startup by Craned (likely through InitSupervisorRequest), so the ExecuteTask() method doesn't need to accept task parameters since the Supervisor already knows which task to execute.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
  • src/CraneCtld/TaskScheduler.h
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-08T07:39:29.207Z
Learnt from: huerni
Repo: PKUHPC/CraneSched PR: 469
File: src/Craned/TaskManager.cpp:1162-1175
Timestamp: 2025-05-08T07:39:29.207Z
Learning: In CraneSched, when a task fails in the LaunchTaskInstanceMt_ method after PMIx registration, ActivateTaskStatusChangeAsync_ does not automatically release PMIx resources or cgroup resources. Explicit cleanup is required by calling g_pmix_server->DeregisterTask() and g_cg_mgr->ReleaseCgroupByTaskIdOnly() before returning from the function.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
  • src/CraneCtld/TaskScheduler.h
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-06-30T08:43:44.470Z
Learnt from: huerni
Repo: PKUHPC/CraneSched PR: 537
File: src/Craned/TaskManager.cpp:145-147
Timestamp: 2025-06-30T08:43:44.470Z
Learning: In the CraneSched codebase, src/Craned/Craned.cpp guarantees that g_config.CranedRes.contains(g_config.CranedIdOfThisNode) through explicit validation during startup. The code checks if g_config.CranedRes.contains(g_config.Hostname) and exits the process if not found, then sets g_config.CranedIdOfThisNode = g_config.Hostname. TaskManager constructor is called after this validation, so g_config.CranedRes[g_config.CranedIdOfThisNode] is guaranteed to be valid.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
  • src/CraneCtld/CtldPublicDefs.h
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-04-02T09:30:13.014Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 461
File: src/Craned/JobManager.cpp:141-149
Timestamp: 2025-04-02T09:30:13.014Z
Learning: In JobManager, if a uid exists in m_uid_to_job_ids_map_, its corresponding task_ids set is guaranteed to be non-empty due to the invariant maintained in AllocJobs and FreeJobs methods.

Applied to files:

  • src/CraneCtld/DbClient.cpp
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-09T01:56:53.142Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 496
File: src/CraneCtld/CtldPublicDefs.h:756-763
Timestamp: 2025-05-09T01:56:53.142Z
Learning: In the CraneSched codebase, the `execution_node` field in JobToD is intentionally set to the first element of `executing_craned_ids` vector without guards, as it represents the main execution node for a job. This is by design and assumes `executing_craned_ids` is never empty when `GetJobOfNode` is called.

Applied to files:

  • src/CraneCtld/CtldPublicDefs.cpp
📚 Learning: 2025-12-08T08:11:40.323Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 727
File: src/Craned/Supervisor/TaskManager.cpp:1401-1407
Timestamp: 2025-12-08T08:11:40.323Z
Learning: In src/Craned/Supervisor/TaskManager.cpp, StepInstance::Prepare() is always called before any task's Spawn() method in the execution flow (via EvGrpcExecuteTaskCb_). If Prepare() fails, tasks are immediately finished and Spawn() is never invoked. For Crun tasks, StepInstance::Prepare() guarantees that x11_meta is set (even if x11 is false), so accessing x11_meta.value() in Spawn() when both IsCrun() and x11 are true is safe without additional has_value() checks.

Applied to files:

  • src/CraneCtld/CtldPublicDefs.cpp
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-23T02:32:43.952Z
Learnt from: 1daidai1
Repo: PKUHPC/CraneSched PR: 458
File: src/CraneCtld/CtldPublicDefs.h:0-0
Timestamp: 2025-05-23T02:32:43.952Z
Learning: In the CraneSched project, allocated_res_view is handled/updated separately before calling SetAllocatedRes, so it does not need to be updated again within the method itself.

Applied to files:

  • src/CraneCtld/CtldPublicDefs.h
📚 Learning: 2025-05-23T02:32:43.952Z
Learnt from: 1daidai1
Repo: PKUHPC/CraneSched PR: 458
File: src/CraneCtld/CtldPublicDefs.h:0-0
Timestamp: 2025-05-23T02:32:43.952Z
Learning: In the CraneSched project, allocated_res_view is updated before calling SetAllocatedRes, so it doesn't need to be updated again within the method itself.

Applied to files:

  • src/CraneCtld/CtldPublicDefs.h
📚 Learning: 2025-05-09T02:15:30.422Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 496
File: src/CraneCtld/TaskScheduler.cpp:1713-1721
Timestamp: 2025-05-09T02:15:30.422Z
Learning: TaskScheduler中的资源回收设计:当Craned节点离线时,通过TaskStatusChangeAsync异步处理任务状态变更和资源回收。尽管可能存在短暂的资源更新延迟,但这是设计上可接受的权衡。

Applied to files:

  • src/CraneCtld/TaskScheduler.h
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-26T11:00:54.563Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 399
File: src/Craned/Craned/SupervisorKeeper.cpp:174-178
Timestamp: 2025-05-26T11:00:54.563Z
Learning: The CraneSched project uses C++23 standard, allowing the use of modern C++ features like std::ranges::to and other C++23 language features and library components.

Applied to files:

  • src/CraneCtld/TaskScheduler.h
📚 Learning: 2025-05-08T07:39:29.207Z
Learnt from: huerni
Repo: PKUHPC/CraneSched PR: 469
File: src/Craned/TaskManager.cpp:1162-1175
Timestamp: 2025-05-08T07:39:29.207Z
Learning: In CraneSched, when a task fails in the LaunchTaskInstanceMt_ method, ActivateTaskStatusChangeAsync_ does not automatically release PMIx resources or cgroup. Explicit cleanup is required by calling g_pmix_server->DeregisterTask() and g_cg_mgr->ReleaseCgroupByTaskIdOnly().

Applied to files:

  • src/CraneCtld/TaskScheduler.h
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-08T09:35:39.809Z
Learnt from: huerni
Repo: PKUHPC/CraneSched PR: 469
File: src/Craned/Pmix/PmixCollRing.cpp:0-0
Timestamp: 2025-05-08T09:35:39.809Z
Learning: In the PMIx implementation for CraneSched, objects referenced in asynchronous gRPC callbacks (like `coll_ring_ctx`) remain valid as long as the parent object (`this`) is not destroyed. The `Coll` class uses shared_ptr management to ensure its lifetime extends through all pending callbacks.

Applied to files:

  • src/CraneCtld/TaskScheduler.h
🧬 Code graph analysis (4)
src/CraneCtld/DbClient.h (1)
src/CraneCtld/DbClient.cpp (2)
  • FetchJobStatus (498-548)
  • FetchJobStatus (500-500)
src/CraneCtld/DbClient.cpp (2)
src/Craned/Supervisor/TaskManager.h (1)
  • job_id (51-51)
src/Craned/Supervisor/SupervisorPublicDefs.h (1)
  • exit_code (37-37)
src/CraneCtld/CtldPublicDefs.h (1)
src/CraneCtld/CtldPublicDefs.cpp (10)
  • update (122-137)
  • update (122-122)
  • SetDependency (989-1000)
  • SetDependency (989-989)
  • UpdateDependency (1002-1004)
  • UpdateDependency (1002-1002)
  • AddDependent (1006-1015)
  • AddDependent (1006-1007)
  • TriggerDependencyEvents (1017-1022)
  • TriggerDependencyEvents (1017-1018)
src/CraneCtld/TaskScheduler.cpp (1)
src/CraneCtld/TaskScheduler.h (1)
  • AddDependencyEvent (881-887)
🪛 Cppcheck (2.18.0)
src/CraneCtld/DbClient.cpp

[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)

src/CraneCtld/CtldPublicDefs.cpp

[information] 22-22: Include file

(missingIncludeSystem)


[information] 22-22: Include file

(missingInclude)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)

src/CraneCtld/TaskScheduler.cpp

[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)

🔇 Additional comments (13)
src/CraneCtld/DbClient.h (1)

122-125: New FetchJobStatus API signature looks consistent

Return type and parameters match the implementation and existing job-related methods; the comment clearly documents the tuple layout. No issues from the header side.

protos/PublicDefs.proto (1)

135-162: Dependency proto extensions are structurally sound and backward‑compatible

The new DependencyType/DependencyCondition/Dependencies/DependencyStatus messages and the added fields on TaskToCtld (15) and TaskInfo (29) use fresh field numbers and match the dependency model implemented in CtldPublicDefs (OR/AND via is_or plus ready-time and infinite sentinels). New error codes ERR_MISSING_DEPENDENCY and ERR_DB_INSERT_FAILED fit naturally into ErrCode without breaking existing values.

No structural or compatibility issues spotted here.

Also applies to: 186-187, 482-485, 730-731

src/CraneCtld/RpcService/CtldGrpcServer.cpp (1)

301-315: Nested CraneExpected handling in CforedStream looks correct

The new unwrapping of CraneExpected<std::future<CraneExpected<task_id_t>>> correctly distinguishes:

  • outer submission errors (submit_result.error()),
  • async scheduling errors (task_result.error()),
    and converts both to a std::expected<task_id_t, std::string> via CraneErrStr for WriteTaskIdReply. That’s consistent with the rest of the error plumbing in this file.
src/CraneCtld/CtldPublicDefs.cpp (4)

22-22: Include dependency on TaskScheduler.h looks appropriate

Including TaskScheduler.h here is reasonable now that TaskInCtld calls g_task_scheduler in its dependency helpers; no issues spotted.


122-137: DependenciesInJob::update logic is consistent with OR/AND semantics

The implementation correctly:

  • Finds the dependee, logs and returns if missing.
  • Computes dep_ready_time = event_time + delay_seconds.
  • Aggregates via min for OR and max for AND, then erases the processed entry.

This matches the usual “earliest satisfied OR / latest satisfied AND” pattern and prepares state for is_met/is_failed without obvious edge-case bugs.


1083-1084: Initializing dependency state from TaskToCtld is consistent

Wiring SetDependency(val.dependencies()) into SetFieldsByTaskToCtld is the right place to initialize per‑task dependency state during both new submissions and recovery. This keeps all proto→in‑memory mapping centralized.


1178-1199: TaskInfo.dependency_status serialization appears correct

The new dependency_status block:

  • Only emits when there are outstanding deps or a non‑default ready_time.
  • Reports is_or, a list of still‑pending conditions, and either ready_time or the infinite‑past/future sentinels.

This matches the internal DependenciesInJob model and should be sufficient for front‑end/CLI displays to distinguish “waiting for deps” vs “never satisfiable” (assuming is_failed() is implemented consistently).

src/CraneCtld/TaskScheduler.h (1)

764-768: API and queue changes for dependency-aware, error-aware submission look coherent

  • Changing SubmitTaskAsync to return std::future<CraneExpected<task_id_t>> and SubmitTaskToScheduler to return CraneExpected<std::future<CraneExpected<task_id_t>>> cleanly separates synchronous validation errors from async submission/DB failures.
  • Updating m_submit_task_queue_ to carry std::promise<CraneExpected<task_id_t>> is consistent with the new return type, and all call sites in the cpp now set either an id or an std::unexpected value.
  • The DependencyEvent struct plus m_dependency_event_queue_ and AddDependencyEvent inline helper provide a simple multi‑producer/single‑consumer channel from TaskInCtld back into the scheduler loop. Using absl::Time here matches the rest of the scheduling code.

I don’t see interface or lifetime issues with these changes.

Also applies to: 779-781, 880-887, 1026-1028, 1068-1075

src/CraneCtld/TaskScheduler.cpp (5)

843-859: Dependency event consumption and scheduling gating

Inside ScheduleThread_:

  • Dequeuing DependencyEvents into a local vector and calling UpdateDependency while holding m_pending_task_map_mtx_ ensures that dependency state for pending jobs is updated atomically with respect to scheduling.
  • The subsequent check:
if (!job->Dependencies().is_met(now)) {
  if (job->Dependencies().is_failed())
    job->pending_reason = "DependencyNeverSatisfied";
  else
    job->pending_reason = "Dependency";
  continue;
}

cleanly separates “still waiting” vs “logically impossible” dependencies at the scheduler level.

Assuming DependenciesInJob::is_met() / is_failed() interpret ready_time == InfiniteFuture() as “never satisfiable”, this should behave as expected.

Also applies to: 872-879


1290-1292: Triggering AFTER events on start-time is consistent

Calling:

job->TriggerDependencyEvents(crane::grpc::DependencyType::AFTER,
                             job->StartTime());

when jobs transition to running is the right place to satisfy AFTER dependencies, and using the actual start time preserves correct behavior for delayed delay_seconds constraints (including jobs that were already started before a dependent was submitted and attached via AddDependent).


1376-1385: New SubmitTaskAsync implementation matches queue type

The new SubmitTaskAsync correctly:

  • Constructs a std::promise<CraneExpected<task_id_t>>,
  • Enqueues {std::unique_ptr<TaskInCtld>, promise} into m_submit_task_queue_, and
  • Returns the associated std::future<CraneExpected<task_id_t>>.

No obvious ownership or lifetime issues; the promise is always either fulfilled in CleanSubmitQueueCb_ or in one of the error branches.


2528-2646: Runtime dependency wiring and missing-dependency rejection on submission

The changes in CleanSubmitQueueCb_ do several important things:

  • Switch SubmitQueueElem to use std::promise<CraneExpected<task_id_t>> and correctly fulfill it with either id or std::unexpected(err).
  • For each accepted task, attach it as a dependent to any dependees found in m_pending_task_map_ or m_running_task_map_ via AddDependent, under the appropriate locks.
  • If any dependee is missing from both pending and running maps, log a warning, free QoS, fulfill the promise with ERR_MISSING_DEPENDENCY, and record the job for purging from the embedded DB.

This ensures:

  • All dependency edges are hooked up before the job becomes visible in the pending queue.
  • Submissions with obviously invalid dependency references are rejected quickly and cleaned up in both RAM and the embedded DB.

One behavioral nuance to be aware of: unlike the recovery path, new submissions do not consult MongoDB for already‑completed dependees; they are simply rejected as “missing dependency”. If you expect users to be able to submit “after X” jobs where X has already finished, this will currently fail. If that’s not supported by design, the behavior here is consistent, but it may be worth making that explicit in user‑facing docs.


2783-2796: Status-change driven AFTER_* event triggering

In CleanTaskStatusChangeQueueCb_, when a job finishes you now:

  • Compute an accurate end_time from the google::protobuf::Timestamp provided by Craned and call task->SetEndTime(end_time).
  • Trigger:
task->TriggerDependencyEvents(AFTER_ANY, end_time);
task->TriggerDependencyEvents(AFTER_OK,
    task_exit_code == 0 ? end_time : absl::InfiniteFuture());
task->TriggerDependencyEvents(AFTER_NOT_OK,
    task_exit_code != 0 ? end_time : absl::InfiniteFuture());

This nicely maps completion semantics into the dependency model:

  • AFTER_ANY: always signaled at actual end time.
  • AFTER_OK: only signaled for zero exit code; otherwise the InfiniteFuture timestamp lets DependenciesInJob treat it as never satisfiable.
  • AFTER_NOT_OK: symmetric to the above.

The only gap, as noted earlier, is that pending‑state cancellations never pass through this path, so AFTER_NOT_OK on those jobs requires special handling if it’s meant to be satisfiable.

Comment on lines +498 to +743
std::unordered_map<
task_id_t, std::tuple<crane::grpc::TaskStatus, uint32_t, int64_t, int64_t>>
MongodbClient::FetchJobStatus(const std::unordered_set<task_id_t>& job_ids) {
std::unordered_map<task_id_t, std::tuple<crane::grpc::TaskStatus, uint32_t,
int64_t, int64_t>>
result;

if (job_ids.empty()) {
return result;
}

try {
document filter;
filter.append(kvp("task_id", [&job_ids](sub_document task_id_doc) {
array task_id_array;
for (const auto& job_id : job_ids) {
task_id_array.append(static_cast<std::int32_t>(job_id));
}
task_id_doc.append(kvp("$in", task_id_array));
}));

mongocxx::options::find options;
document projection;
projection.append(kvp("task_id", 1));
projection.append(kvp("state", 1));
projection.append(kvp("exit_code", 1));
projection.append(kvp("time_end", 1));
projection.append(kvp("time_start", 1));
options.projection(projection.view());

mongocxx::cursor cursor =
(*GetClient_())[m_db_name_][m_task_collection_name_].find(filter.view(),
options);

for (auto view : cursor) {
task_id_t task_id = view["task_id"].get_int32().value;
crane::grpc::TaskStatus status =
static_cast<crane::grpc::TaskStatus>(view["state"].get_int32().value);
uint32_t exit_code = view["exit_code"].get_int32().value;
int64_t time_end = view["time_end"].get_int64().value;
int64_t time_start = view["time_start"].get_int64().value;

result.emplace(task_id,
std::make_tuple(status, exit_code, time_end, time_start));
}
} catch (const std::exception& e) {
CRANE_ERROR("Failed to fetch job status by IDs: {}", e.what());
}

return result;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Guard FetchJobStatus against step-only documents (missing has_job_info)

The query currently filters only on task_id, so it can match documents created by InsertSteps that contain just task_id and steps but lack state/exit_code/time_start/time_end. Accessing those fields will throw, hit the catch, and you’ll return an empty/partial map indistinguishable from “no rows”.

To align with FetchJobRecords() and avoid this silent drop, consider restricting to full job docs:

  try {
-    document filter;
-    filter.append(kvp("task_id", [&job_ids](sub_document task_id_doc) {
+    document filter;
+    // Only query documents with complete job information.
+    filter.append(kvp("has_job_info", true));
+    filter.append(kvp("task_id", [&job_ids](sub_document task_id_doc) {
       array task_id_array;
       for (const auto& job_id : job_ids) {
         task_id_array.append(static_cast<std::int32_t>(job_id));
       }
       task_id_doc.append(kvp("$in", task_id_array));
     }));

If you also need to distinguish “query failed” from “no such tasks”, you might later wrap the result in a CraneExpected<...> instead of returning a bare map, but that’s optional for this PR.

🧰 Tools
🪛 Cppcheck (2.18.0)

[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)

🤖 Prompt for AI Agents
In src/CraneCtld/DbClient.cpp around lines 498-548, FetchJobStatus currently
queries only by task_id and can match step-only documents that lack job fields,
causing field access to throw and return an indistinguishable empty/partial map;
update the MongoDB filter to only select full job documents (e.g., add a
condition such as "has_job_info": true or require existence of
"state"/"time_start"/"time_end" using $exists/$type) so only documents
containing state/exit_code/time_start/time_end are returned, and keep the
existing projection and field reads unchanged.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
src/CraneCtld/DbClient.cpp (1)

498-548: Guard FetchJobStatus against step-only documents

The query filters only on task_id without ensuring the document has complete job info. Documents created by InsertSteps only have task_id and steps array but lack state, exit_code, time_start, time_end. Accessing these fields on such documents will throw an exception, causing the function to return an incomplete map.

Add has_job_info: true to the filter to match only complete job documents, consistent with FetchJobRecords():

   try {
     document filter;
+    filter.append(kvp("has_job_info", true));
     filter.append(kvp("task_id", [&job_ids](sub_document task_id_doc) {
       array task_id_array;
       for (const auto& job_id : job_ids) {
         task_id_array.append(static_cast<std::int32_t>(job_id));
       }
       task_id_doc.append(kvp("$in", task_id_array));
     }));
src/CraneCtld/TaskScheduler.cpp (1)

2486-2495: Cancelled pending tasks still missing AFTER_NOT_OK events and exit code.

This is the same issue raised in previous reviews: when pending tasks are cancelled, they don't set a non-zero exit code or trigger AFTER_NOT_OK dependency events, leaving dependents stuck.

Compare to task completion (lines 2787-2795), which conditionally fires AFTER_OK and AFTER_NOT_OK based on exit_code. Apply the same logic here:

  1. Set task->SetExitCode(ExitCode::EC_TERMINATED) (or an appropriate cancellation code) before line 2489.
  2. Add task->TriggerDependencyEvents(crane::grpc::DependencyType::AFTER_NOT_OK, end_time) after line 2494.
🧹 Nitpick comments (3)
src/CraneCtld/CtldPublicDefs.cpp (2)

22-22: Consider reducing coupling between CtldPublicDefs and TaskScheduler.

The include creates a circular dependency between these headers (CtldPublicDefs.h ← TaskScheduler.h ← CtldPublicDefs.h). While this works due to include guards, it tightly couples the modules. Consider one of these refactors for future work:

  • Inject the event queue or callback into TaskInCtld rather than accessing g_task_scheduler directly.
  • Use an observer/listener pattern where TaskInCtld notifies registered listeners.
  • Move dependency event triggering to TaskScheduler itself.

122-137: Consider improving error handling in dependency updates.

If a dependency is not found in the deps map (Line 124), the method logs an error but continues without indicating failure to the caller. This could leave dependency state inconsistent—the task won't know a dependency update was missed.

Consider returning a boolean success indicator or using CraneExpected to allow callers to detect and handle missing dependencies (e.g., by querying the database or failing the dependent task).

src/CraneCtld/TaskScheduler.cpp (1)

448-459: Verify AFTER dependency semantics for cancelled tasks without start time.

For AFTER dependencies, lines 452-454 handle cancelled tasks by using time_end, but the logic in lines 450-458 suggests that if a dependee was cancelled before it started (i.e., time_start == 0), the event time falls through to the error case (line 455-457).

Confirm whether AFTER dependencies on a pending‑cancelled job (no time_start) should:

  • Fire immediately using time_end (current behavior for status == Cancelled),
  • Never satisfy (remain at InfiniteFuture), or
  • Be treated as an error condition.

If the intent is that AFTER fires when the dependee starts or is cancelled before starting, document this explicitly.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c38d42e and da432fb.

📒 Files selected for processing (14)
  • docs/en/command/calloc.md (1 hunks)
  • docs/en/command/cbatch.md (1 hunks)
  • docs/en/command/crun.md (1 hunks)
  • docs/zh/command/calloc.md (1 hunks)
  • docs/zh/command/cbatch.md (1 hunks)
  • docs/zh/command/crun.md (1 hunks)
  • protos/PublicDefs.proto (4 hunks)
  • src/CraneCtld/CtldPublicDefs.cpp (5 hunks)
  • src/CraneCtld/CtldPublicDefs.h (4 hunks)
  • src/CraneCtld/DbClient.cpp (1 hunks)
  • src/CraneCtld/DbClient.h (1 hunks)
  • src/CraneCtld/RpcService/CtldGrpcServer.cpp (4 hunks)
  • src/CraneCtld/TaskScheduler.cpp (13 hunks)
  • src/CraneCtld/TaskScheduler.h (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
🧰 Additional context used
🧠 Learnings (16)
📚 Learning: 2025-05-09T01:56:53.142Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 496
File: src/CraneCtld/CtldPublicDefs.h:756-763
Timestamp: 2025-05-09T01:56:53.142Z
Learning: In the CraneSched codebase, the `execution_node` field in JobToD is intentionally set to the first element of `executing_craned_ids` vector without guards, as it represents the main execution node for a job. This is by design and assumes `executing_craned_ids` is never empty when `GetJobOfNode` is called.

Applied to files:

  • src/CraneCtld/CtldPublicDefs.cpp
  • src/CraneCtld/DbClient.cpp
📚 Learning: 2025-12-08T08:11:40.323Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 727
File: src/Craned/Supervisor/TaskManager.cpp:1401-1407
Timestamp: 2025-12-08T08:11:40.323Z
Learning: In src/Craned/Supervisor/TaskManager.cpp, StepInstance::Prepare() is always called before any task's Spawn() method in the execution flow (via EvGrpcExecuteTaskCb_). If Prepare() fails, tasks are immediately finished and Spawn() is never invoked. For Crun tasks, StepInstance::Prepare() guarantees that x11_meta is set (even if x11 is false), so accessing x11_meta.value() in Spawn() when both IsCrun() and x11 are true is safe without additional has_value() checks.

Applied to files:

  • src/CraneCtld/CtldPublicDefs.cpp
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-09T02:15:30.422Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 496
File: src/CraneCtld/TaskScheduler.cpp:1713-1721
Timestamp: 2025-05-09T02:15:30.422Z
Learning: TaskScheduler中的资源回收设计:当Craned节点离线时,通过TaskStatusChangeAsync异步处理任务状态变更和资源回收。尽管可能存在短暂的资源更新延迟,但这是设计上可接受的权衡。

Applied to files:

  • src/CraneCtld/TaskScheduler.h
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-26T11:00:54.563Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 399
File: src/Craned/Craned/SupervisorKeeper.cpp:174-178
Timestamp: 2025-05-26T11:00:54.563Z
Learning: The CraneSched project uses C++23 standard, allowing the use of modern C++ features like std::ranges::to and other C++23 language features and library components.

Applied to files:

  • src/CraneCtld/TaskScheduler.h
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-08-12T08:58:39.772Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 577
File: src/Craned/Supervisor/TaskManager.cpp:124-130
Timestamp: 2025-08-12T08:58:39.772Z
Learning: In the CraneSched project using C++23, Class Template Argument Deduction (CTAD) allows std::unique_ptr declarations without explicit template parameters when the type can be deduced from the initializer, such as `std::unique_ptr task = std::move(m_task_map_.at(task_id))` where the template parameter is deduced from the move operation.

Applied to files:

  • src/CraneCtld/TaskScheduler.h
  • src/CraneCtld/TaskScheduler.cpp
  • src/CraneCtld/CtldPublicDefs.h
📚 Learning: 2025-05-08T07:39:29.207Z
Learnt from: huerni
Repo: PKUHPC/CraneSched PR: 469
File: src/Craned/TaskManager.cpp:1162-1175
Timestamp: 2025-05-08T07:39:29.207Z
Learning: In CraneSched, when a task fails in the LaunchTaskInstanceMt_ method after PMIx registration, ActivateTaskStatusChangeAsync_ does not automatically release PMIx resources or cgroup resources. Explicit cleanup is required by calling g_pmix_server->DeregisterTask() and g_cg_mgr->ReleaseCgroupByTaskIdOnly() before returning from the function.

Applied to files:

  • src/CraneCtld/TaskScheduler.h
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-08T07:39:29.207Z
Learnt from: huerni
Repo: PKUHPC/CraneSched PR: 469
File: src/Craned/TaskManager.cpp:1162-1175
Timestamp: 2025-05-08T07:39:29.207Z
Learning: In CraneSched, when a task fails in the LaunchTaskInstanceMt_ method, ActivateTaskStatusChangeAsync_ does not automatically release PMIx resources or cgroup. Explicit cleanup is required by calling g_pmix_server->DeregisterTask() and g_cg_mgr->ReleaseCgroupByTaskIdOnly().

Applied to files:

  • src/CraneCtld/TaskScheduler.h
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-09T01:54:21.256Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 496
File: src/CraneCtld/CranedKeeper.cpp:51-53
Timestamp: 2025-05-09T01:54:21.256Z
Learning: The ConfigureCraned function in src/CraneCtld/CranedKeeper.cpp is called from a thread pool, so there's no need to worry about it blocking the gRPC completion queue thread.

Applied to files:

  • src/CraneCtld/TaskScheduler.h
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-26T11:04:30.580Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 399
File: src/Craned/Craned/SupervisorKeeper.h:32-32
Timestamp: 2025-05-26T11:04:30.580Z
Learning: The Supervisor component in the Crane system is designed to manage only one task per instance. The task specification is provided to the Supervisor during startup by Craned (likely through InitSupervisorRequest), so the ExecuteTask() method doesn't need to accept task parameters since the Supervisor already knows which task to execute.

Applied to files:

  • src/CraneCtld/TaskScheduler.h
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-08T09:35:39.809Z
Learnt from: huerni
Repo: PKUHPC/CraneSched PR: 469
File: src/Craned/Pmix/PmixCollRing.cpp:0-0
Timestamp: 2025-05-08T09:35:39.809Z
Learning: In the PMIx implementation for CraneSched, objects referenced in asynchronous gRPC callbacks (like `coll_ring_ctx`) remain valid as long as the parent object (`this`) is not destroyed. The `Coll` class uses shared_ptr management to ensure its lifetime extends through all pending callbacks.

Applied to files:

  • src/CraneCtld/TaskScheduler.h
📚 Learning: 2025-06-30T08:43:44.470Z
Learnt from: huerni
Repo: PKUHPC/CraneSched PR: 537
File: src/Craned/TaskManager.cpp:145-147
Timestamp: 2025-06-30T08:43:44.470Z
Learning: In the CraneSched codebase, src/Craned/Craned.cpp guarantees that g_config.CranedRes.contains(g_config.CranedIdOfThisNode) through explicit validation during startup. The code checks if g_config.CranedRes.contains(g_config.Hostname) and exits the process if not found, then sets g_config.CranedIdOfThisNode = g_config.Hostname. TaskManager constructor is called after this validation, so g_config.CranedRes[g_config.CranedIdOfThisNode] is guaranteed to be valid.

Applied to files:

  • src/CraneCtld/TaskScheduler.cpp
  • src/CraneCtld/CtldPublicDefs.h
📚 Learning: 2025-05-25T04:11:50.268Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 399
File: src/Craned/Supervisor/TaskManager.cpp:220-220
Timestamp: 2025-05-25T04:11:50.268Z
Learning: In TaskManager.cpp, when step->IsCrun() is checked before dynamic_cast to CrunMetaInExecution*, the cast is guaranteed to succeed due to the program logic ensuring the correct meta type is used for Crun tasks. Null checks for these dynamic_cast operations are unnecessary in this context.

Applied to files:

  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-04-02T09:30:13.014Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 461
File: src/Craned/JobManager.cpp:141-149
Timestamp: 2025-04-02T09:30:13.014Z
Learning: In JobManager, if a uid exists in m_uid_to_job_ids_map_, its corresponding task_ids set is guaranteed to be non-empty due to the invariant maintained in AllocJobs and FreeJobs methods.

Applied to files:

  • src/CraneCtld/TaskScheduler.cpp
  • src/CraneCtld/DbClient.cpp
📚 Learning: 2025-05-23T02:32:43.952Z
Learnt from: 1daidai1
Repo: PKUHPC/CraneSched PR: 458
File: src/CraneCtld/CtldPublicDefs.h:0-0
Timestamp: 2025-05-23T02:32:43.952Z
Learning: In the CraneSched project, allocated_res_view is handled/updated separately before calling SetAllocatedRes, so it does not need to be updated again within the method itself.

Applied to files:

  • src/CraneCtld/CtldPublicDefs.h
📚 Learning: 2025-05-23T02:32:43.952Z
Learnt from: 1daidai1
Repo: PKUHPC/CraneSched PR: 458
File: src/CraneCtld/CtldPublicDefs.h:0-0
Timestamp: 2025-05-23T02:32:43.952Z
Learning: In the CraneSched project, allocated_res_view is updated before calling SetAllocatedRes, so it doesn't need to be updated again within the method itself.

Applied to files:

  • src/CraneCtld/CtldPublicDefs.h
📚 Learning: 2025-04-02T10:11:33.562Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 461
File: src/Craned/CgroupManager.cpp:685-685
Timestamp: 2025-04-02T10:11:33.562Z
Learning: In the CgroupManager's GetJobBpfMapCgroupsV2 method, the developer has confirmed that cg_ino_job_id_map will always contain the key->cgroup_id element, making the CRANE_ASSERT check appropriate rather than requiring additional error handling.

Applied to files:

  • src/CraneCtld/DbClient.cpp
🧬 Code graph analysis (3)
src/CraneCtld/TaskScheduler.h (1)
src/CraneCtld/TaskScheduler.cpp (4)
  • SubmitTaskAsync (1376-1385)
  • SubmitTaskAsync (1376-1377)
  • SubmitTaskToScheduler (1521-1588)
  • SubmitTaskToScheduler (1522-1522)
src/CraneCtld/TaskScheduler.cpp (1)
src/CraneCtld/TaskScheduler.h (1)
  • AddDependencyEvent (881-887)
src/CraneCtld/CtldPublicDefs.h (1)
src/CraneCtld/CtldPublicDefs.cpp (10)
  • update (122-137)
  • update (122-122)
  • SetDependency (989-1000)
  • SetDependency (989-989)
  • UpdateDependency (1002-1004)
  • UpdateDependency (1002-1002)
  • AddDependent (1006-1015)
  • AddDependent (1006-1007)
  • TriggerDependencyEvents (1017-1022)
  • TriggerDependencyEvents (1017-1018)
🪛 Cppcheck (2.18.0)
src/CraneCtld/CtldPublicDefs.cpp

[information] 22-22: Include file

(missingIncludeSystem)


[information] 22-22: Include file

(missingInclude)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)

src/CraneCtld/TaskScheduler.cpp

[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)


[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)

src/CraneCtld/DbClient.cpp

[information] Limiting analysis of branches. Use --check-level=exhaustive to analyze all branches.

(normalCheckLevelMaxBranches)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (29)
docs/zh/command/calloc.md (1)

37-37: LGTM!

The new -d, --dependency option is well documented with clear format specifications and supported dependency types.

docs/zh/command/crun.md (1)

15-15: LGTM!

Documentation for the dependency option is consistent with the other command documentation in this PR.

docs/en/command/calloc.md (1)

37-37: LGTM!

The dependency option documentation is clear and consistent with the Chinese documentation.

docs/en/command/crun.md (1)

15-15: LGTM!

Documentation is consistent with other command documentation files.

docs/en/command/cbatch.md (1)

72-72: LGTM!

The dependency option is appropriately placed under Scheduling Options and documented consistently with other commands.

docs/zh/command/cbatch.md (1)

72-72: LGTM!

Chinese documentation is consistent with the English version.

src/CraneCtld/DbClient.h (1)

122-125: LGTM! Clear API for dependency resolution.

The method signature follows existing patterns in MongodbClient and provides the necessary status information for the dependency subsystem to resolve missing dependee tasks.

protos/PublicDefs.proto (4)

135-162: Well-designed dependency modeling.

The dependency types align with standard scheduler semantics (similar to Slurm's dependency options), and the use of is_or to distinguish AND/OR logic is clear. The ready_time oneof handling infinite future/past states elegantly represents pending/failed conditions without requiring sentinel values.


186-186: LGTM! Dependency field added to task submission.

Field number and placement are appropriate.


485-485: LGTM! Exposes dependency status for client queries.

The optional field allows tasks without dependencies to omit the status, reducing message size.


730-731: LGTM! Error codes support dependency handling.

Sequential numbering and clear names for dependency-related error cases.

src/CraneCtld/CtldPublicDefs.h (4)

414-430: Dependency state logic is sound.

The is_met() and is_failed() methods correctly implement AND/OR semantics:

  • AND dependencies: All must succeed (deps empty + ready_time reached), any failure causes task failure (ready_time = InfiniteFuture).
  • OR dependencies: Any success satisfies (ready_time = earliest), all must fail to cause task failure (deps empty + InfiniteFuture).

The update() method signature indicates it will process events and maintain consistency between deps map and ready_time.


735-735: LGTM! Efficient dependent tracking.

Using an array indexed by DependencyType provides O(1) access when triggering dependency events for each condition type.


748-748: LGTM! Per-task dependency state.

Stores the dependency conditions this task must satisfy before running.


923-928: LGTM! Complete dependency management API.

The methods cover the full dependency lifecycle: initialization from submission data, event processing, dependent registration, and event propagation on status changes.

src/CraneCtld/CtldPublicDefs.cpp (4)

989-1000: LGTM! Correct dependency initialization.

The initialization logic correctly sets ready_time to InfiniteFuture for OR dependencies (will be reduced to earliest satisfied time) and InfinitePast for AND dependencies (will be increased to latest satisfied time).


1002-1004: LGTM! Clean delegation.

Delegates to DependenciesInJob::update() as expected.


1017-1022: LGTM! Dependency event propagation.

Enqueues events for all registered dependents. The use of g_task_scheduler->AddDependencyEvent() relies on the global being initialized at startup, which is consistent with the system's initialization order.


1178-1199: LGTM! Efficient dependency status serialization.

Only serializes dependency status when present, and correctly handles the three ready_time states (normal timestamp, infinite future for pending OR, infinite past for initial AND) using the protobuf oneof.

src/CraneCtld/TaskScheduler.h (4)

764-767: LGTM! Improved error handling via CraneExpected.

Wrapping the result in CraneExpected<task_id_t> enables callers to distinguish between successful submission (with new task ID) and failures (e.g., missing dependencies, validation errors). This is essential for the dependency subsystem to provide meaningful feedback.


779-780: LGTM! Two-stage error handling for async submission.

The nested CraneExpected structure correctly models two error stages:

  1. Synchronous validation (outer): Authorization, resource checks, etc.
  2. Asynchronous processing (inner): DB insertion, dependency resolution, etc.

This aligns with the async submission workflow.


880-887: LGTM! Clean dependency event API.

The inline implementation is straightforward, and the comment documenting the InfiniteFuture convention for failed dependencies is helpful for maintainability.


1068-1074: LGTM! Event queue for dependency propagation.

The DependencyEvent structure captures the necessary information (which task depends on which, when it was satisfied), and ConcurrentQueue ensures thread-safe async processing.

src/CraneCtld/TaskScheduler.cpp (6)

843-859: LGTM! Dependency event processing is correct.

The batch dequeue and update loop properly propagates dependency events to pending tasks.


872-879: LGTM! Dependency-aware scheduling guard is correct.

Tasks with unsatisfied dependencies are correctly skipped with clear pending reasons.


1290-1292: LGTM! AFTER dependency event triggered at task start.

Correctly fires AFTER dependency events when a task begins execution.


1376-1385: LGTM! CraneExpected refactoring is consistent and correct.

The signature changes properly wrap async submission results in CraneExpected for improved error propagation. The nested CraneExpected<std::future<CraneExpected<task_id_t>>> correctly separates immediate validation errors (outer) from async submission errors (inner).

Also applies to: 1521-1588


2588-2646: Dependency validation and missing-dependency rejection logic is correct.

The flow properly:

  1. Persists tasks to embedded DB (line 2575-2584)
  2. Validates dependencies against pending/running maps (lines 2595-2628)
  3. Rejects tasks with missing dependencies (lines 2620-2627) by setting ERR_MISSING_DEPENDENCY and skipping pending-map insertion
  4. Purges rejected tasks from embedded DB (lines 2638-2645)

Note: Tasks are written to embedded DB before dependency validation, so rejected tasks require cleanup. While this could be optimized by checking dependencies first, the current approach is acceptable given that missing dependencies should be rare.


2783-2795: LGTM! Task completion dependency event handling is the correct pattern.

This properly triggers all dependency types based on exit code:

  • AFTER_ANY fires unconditionally with end_time
  • AFTER_OK fires only when exit_code == 0
  • AFTER_NOT_OK fires only when exit_code != 0

Using InfiniteFuture for unsatisfied conditions correctly prevents dependent tasks from proceeding. This is the pattern that should be applied to cancelled pending tasks (lines 2486-2495).

@github-actions github-actions bot added the test-passed Build and test success label Dec 9, 2025
@NamelessOIer NamelessOIer linked an issue Dec 10, 2025 that may be closed by this pull request
@NamelessOIer NamelessOIer added the enhancement New feature or request label Dec 10, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
docs/zh/reference/job_dependency.md (1)

27-29: Specify language identifiers on fenced code blocks.

Code blocks at lines 27 and 164 are missing language specifications. Add text identifier for syntax format and output examples.

- ```
+ ```text
  <type>:<job_id>[+<delay>][:<job_id>[+<delay>]]...
- ```
+ ```text

and

- ```
+ ```text
  JobId=105
  ...
  Dependency=PendingDependencies=afterok:100+01:00:00 Status=WaitForAll
- ```
+ ```text

Also applies to: 164-168

docs/zh/reference/pending_reason.md (1)

16-22: Specify language identifiers on fenced code blocks.

Code blocks at lines 16 and 31 are missing language specifications. Add text identifier for command outputs.

- ```
+ ```text
  JOBID    PARTITION  NAME     USER   ST   TIME     NODES  NODELIST(REASON)
  101      CPU        job1     user1  PD   0:00     2      (Priority)
  102      CPU        job2     user1  PD   0:00     4      (Resource)
  103      GPU        job3     user2  PD   0:00     1      (Dependency)
  104      CPU        job4     user1  PD   0:00     2      (Held)
- ```
+ ```text

and

- ```
+ ```text
  JobId=101
  ...
  State=PENDING
  Reason=Priority
- ```
+ ```text

Also applies to: 31-36

docs/en/reference/pending_reason.md (1)

16-22: Specify language identifiers on fenced code blocks.

Code blocks at lines 16 and 31 are missing language specifications. Add text identifier for command outputs.

- ```
+ ```text
  JOBID    PARTITION  NAME     USER   ST   TIME     NODES  NODELIST(REASON)
  101      CPU        job1     user1  PD   0:00     2      (Priority)
  102      CPU        job2     user1  PD   0:00     4      (Resource)
  103      GPU        job3     user2  PD   0:00     1      (Dependency)
  104      CPU        job4     user1  PD   0:00     2      (Held)
- ```
+ ```text

and

- ```
+ ```text
  JobId=101
  ...
  State=PENDING
  Reason=Priority
- ```
+ ```text

Also applies to: 31-36

docs/en/reference/job_dependency.md (1)

27-29: Specify language identifiers on fenced code blocks.

Code blocks at lines 27 and 164 are missing language specifications. Add text identifier for syntax format and output examples.

- ```
+ ```text
  <type>:<job_id>[+<delay>][:<job_id>[+<delay>]]...
- ```
+ ```text

and

- ```
+ ```text
  JobId=105
  ...
  Dependency=PendingDependencies=afterok:100+01:00:00 Status=WaitForAll
- ```
+ ```text

Also applies to: 164-168

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between da432fb and 2d3f0ec.

📒 Files selected for processing (4)
  • docs/en/reference/job_dependency.md (1 hunks)
  • docs/en/reference/pending_reason.md (1 hunks)
  • docs/zh/reference/job_dependency.md (1 hunks)
  • docs/zh/reference/pending_reason.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/zh/reference/job_dependency.md

[uncategorized] ~69-~69: 数词与名词之间一般应存在量词,可能缺少量词。
Context: ...100 开始 并且 作业 101 成功完成。 #### OR 逻辑(任一条件满足) 使用 ? 分隔不同的依赖条件: ```bash afterok:...

(wa5)

🪛 markdownlint-cli2 (0.18.1)
docs/en/reference/pending_reason.md

16-16: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


31-31: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

docs/zh/reference/job_dependency.md

27-27: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


164-164: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

docs/en/reference/job_dependency.md

27-27: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


164-164: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

docs/zh/reference/pending_reason.md

16-16: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


31-31: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

@github-actions github-actions bot added test-passed Build and test success and removed test-passed Build and test success labels Dec 10, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
src/CraneCtld/TaskScheduler.cpp (1)

2486-2498: Cancelled pending tasks still never satisfy AFTER_NOT_OK dependencies

For cancelled pending tasks you now:

  • set status = Cancelled and a common end_time,
  • fire AFTER and AFTER_ANY with end_time, and
  • fire AFTER_OK and AFTER_NOT_OK with absl::InfiniteFuture().

Given how DependenciesInJob::update aggregates ready_time, this means a dependent that specifies only AFTER_NOT_OK on a never-started job will end up with ready_time == InfiniteFuture and (assuming is_failed() looks for that) will be stuck in "DependencyNeverSatisfied" rather than ever running. That partially addresses the earlier issue (no longer silently stuck in plain "Dependency"), but it still treats a cancelled pending job as neither OK nor NOT_OK.

If the intended semantics are “AFTER_NOT_OK runs when the dependee finishes unsuccessfully, including cancellations”, you likely want to:

  • assign a non-zero exit code (e.g. ExitCode::EC_TERMINATED) to cancelled pending tasks, and
  • call TriggerDependencyEvents(DependencyType::AFTER_NOT_OK, end_time) (with AFTER_OK staying at InfiniteFuture()).

Otherwise, please document clearly that AFTER_NOT_OK only applies to jobs that actually started execution so users don’t expect it to trigger after cancelling a queued job.

🧹 Nitpick comments (2)
src/CraneCtld/CtldPublicDefs.cpp (1)

989-1000: Clear dependencies before re-populating from gRPC to avoid stale state

If TaskInCtld::SetDependency() is ever invoked more than once on the same instance (e.g. requeue/update), old entries remain for dependees that were dropped from the protobuf. Safer to reset before filling:

void TaskInCtld::SetDependency(const crane::grpc::Dependencies& grpc_deps) {
-  if (grpc_deps.is_or()) {
-    dependencies.is_or = true;
-    dependencies.ready_time = absl::InfiniteFuture();
-  } else {
-    dependencies.is_or = false;
-    dependencies.ready_time = absl::InfinitePast();
-  }
-  for (const auto& dep : grpc_deps.deps()) {
-    dependencies.deps[dep.job_id()] = {dep.type(), dep.delay_seconds()};
-  }
+  dependencies.deps.clear();
+  dependencies.is_or = grpc_deps.is_or();
+  dependencies.ready_time = grpc_deps.is_or() ? absl::InfiniteFuture()
+                                              : absl::InfinitePast();
+  for (const auto& dep : grpc_deps.deps()) {
+    dependencies.deps[dep.job_id()] = {dep.type(), dep.delay_seconds()};
+  }
}
src/CraneCtld/TaskScheduler.cpp (1)

1376-1385: CraneExpected-wrapped async submission path is wired correctly

SubmitTaskAsync now queues {std::unique_ptr<TaskInCtld>, std::promise<CraneExpected<task_id_t>>} and returns a std::future<CraneExpected<task_id_t>>, while SubmitTaskToScheduler validates the task, does QoS accounting, and on success returns that future wrapped in CraneExpected. This matches the updated CleanSubmitQueueCb_ logic (which fulfills the promise with either a task id or an error). Minor style nit: inside SubmitTaskToScheduler, calling SubmitTaskAsync(std::move(task)) on this instead of via g_task_scheduler would avoid another hard dependency on the global.

Also applies to: 1521-1585

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 148f59f and a05a02c.

📒 Files selected for processing (2)
  • src/CraneCtld/CtldPublicDefs.cpp (5 hunks)
  • src/CraneCtld/TaskScheduler.cpp (13 hunks)
🧰 Additional context used
🧠 Learnings (12)
📚 Learning: 2025-05-08T07:39:29.207Z
Learnt from: huerni
Repo: PKUHPC/CraneSched PR: 469
File: src/Craned/TaskManager.cpp:1162-1175
Timestamp: 2025-05-08T07:39:29.207Z
Learning: In CraneSched, when a task fails in the LaunchTaskInstanceMt_ method after PMIx registration, ActivateTaskStatusChangeAsync_ does not automatically release PMIx resources or cgroup resources. Explicit cleanup is required by calling g_pmix_server->DeregisterTask() and g_cg_mgr->ReleaseCgroupByTaskIdOnly() before returning from the function.

Applied to files:

  • src/CraneCtld/CtldPublicDefs.cpp
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-08T07:39:29.207Z
Learnt from: huerni
Repo: PKUHPC/CraneSched PR: 469
File: src/Craned/TaskManager.cpp:1162-1175
Timestamp: 2025-05-08T07:39:29.207Z
Learning: In CraneSched, when a task fails in the LaunchTaskInstanceMt_ method, ActivateTaskStatusChangeAsync_ does not automatically release PMIx resources or cgroup. Explicit cleanup is required by calling g_pmix_server->DeregisterTask() and g_cg_mgr->ReleaseCgroupByTaskIdOnly().

Applied to files:

  • src/CraneCtld/CtldPublicDefs.cpp
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-06-30T08:43:44.470Z
Learnt from: huerni
Repo: PKUHPC/CraneSched PR: 537
File: src/Craned/TaskManager.cpp:145-147
Timestamp: 2025-06-30T08:43:44.470Z
Learning: In the CraneSched codebase, src/Craned/Craned.cpp guarantees that g_config.CranedRes.contains(g_config.CranedIdOfThisNode) through explicit validation during startup. The code checks if g_config.CranedRes.contains(g_config.Hostname) and exits the process if not found, then sets g_config.CranedIdOfThisNode = g_config.Hostname. TaskManager constructor is called after this validation, so g_config.CranedRes[g_config.CranedIdOfThisNode] is guaranteed to be valid.

Applied to files:

  • src/CraneCtld/CtldPublicDefs.cpp
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-09T01:56:53.142Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 496
File: src/CraneCtld/CtldPublicDefs.h:756-763
Timestamp: 2025-05-09T01:56:53.142Z
Learning: In the CraneSched codebase, the `execution_node` field in JobToD is intentionally set to the first element of `executing_craned_ids` vector without guards, as it represents the main execution node for a job. This is by design and assumes `executing_craned_ids` is never empty when `GetJobOfNode` is called.

Applied to files:

  • src/CraneCtld/CtldPublicDefs.cpp
📚 Learning: 2025-12-08T08:11:40.323Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 727
File: src/Craned/Supervisor/TaskManager.cpp:1401-1407
Timestamp: 2025-12-08T08:11:40.323Z
Learning: In src/Craned/Supervisor/TaskManager.cpp, StepInstance::Prepare() is always called before any task's Spawn() method in the execution flow (via EvGrpcExecuteTaskCb_). If Prepare() fails, tasks are immediately finished and Spawn() is never invoked. For Crun tasks, StepInstance::Prepare() guarantees that x11_meta is set (even if x11 is false), so accessing x11_meta.value() in Spawn() when both IsCrun() and x11 are true is safe without additional has_value() checks.

Applied to files:

  • src/CraneCtld/CtldPublicDefs.cpp
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-09T02:15:30.422Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 496
File: src/CraneCtld/TaskScheduler.cpp:1713-1721
Timestamp: 2025-05-09T02:15:30.422Z
Learning: TaskScheduler中的资源回收设计:当Craned节点离线时,通过TaskStatusChangeAsync异步处理任务状态变更和资源回收。尽管可能存在短暂的资源更新延迟,但这是设计上可接受的权衡。

Applied to files:

  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-08-12T08:58:39.772Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 577
File: src/Craned/Supervisor/TaskManager.cpp:124-130
Timestamp: 2025-08-12T08:58:39.772Z
Learning: In the CraneSched project using C++23, Class Template Argument Deduction (CTAD) allows std::unique_ptr declarations without explicit template parameters when the type can be deduced from the initializer, such as `std::unique_ptr task = std::move(m_task_map_.at(task_id))` where the template parameter is deduced from the move operation.

Applied to files:

  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-26T11:00:54.563Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 399
File: src/Craned/Craned/SupervisorKeeper.cpp:174-178
Timestamp: 2025-05-26T11:00:54.563Z
Learning: The CraneSched project uses C++23 standard, allowing the use of modern C++ features like std::ranges::to and other C++23 language features and library components.

Applied to files:

  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-25T04:11:50.268Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 399
File: src/Craned/Supervisor/TaskManager.cpp:220-220
Timestamp: 2025-05-25T04:11:50.268Z
Learning: In TaskManager.cpp, when step->IsCrun() is checked before dynamic_cast to CrunMetaInExecution*, the cast is guaranteed to succeed due to the program logic ensuring the correct meta type is used for Crun tasks. Null checks for these dynamic_cast operations are unnecessary in this context.

Applied to files:

  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-04-02T09:30:13.014Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 461
File: src/Craned/JobManager.cpp:141-149
Timestamp: 2025-04-02T09:30:13.014Z
Learning: In JobManager, if a uid exists in m_uid_to_job_ids_map_, its corresponding task_ids set is guaranteed to be non-empty due to the invariant maintained in AllocJobs and FreeJobs methods.

Applied to files:

  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-09T01:54:21.256Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 496
File: src/CraneCtld/CranedKeeper.cpp:51-53
Timestamp: 2025-05-09T01:54:21.256Z
Learning: The ConfigureCraned function in src/CraneCtld/CranedKeeper.cpp is called from a thread pool, so there's no need to worry about it blocking the gRPC completion queue thread.

Applied to files:

  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-26T11:04:30.580Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 399
File: src/Craned/Craned/SupervisorKeeper.h:32-32
Timestamp: 2025-05-26T11:04:30.580Z
Learning: The Supervisor component in the Crane system is designed to manage only one task per instance. The task specification is provided to the Supervisor during startup by Craned (likely through InitSupervisorRequest), so the ExecuteTask() method doesn't need to accept task parameters since the Supervisor already knows which task to execute.

Applied to files:

  • src/CraneCtld/TaskScheduler.cpp
🧬 Code graph analysis (1)
src/CraneCtld/TaskScheduler.cpp (1)
src/CraneCtld/TaskScheduler.h (1)
  • AddDependencyEvent (881-887)
🔇 Additional comments (7)
src/CraneCtld/CtldPublicDefs.cpp (3)

122-137: Dependency aggregation in DependenciesInJob::update looks correct

Using event_time + delay_seconds and aggregating via min/max for OR/AND, then erasing the satisfied entry, matches the model implied by is_met/is_failed and the scheduler’s dependency event queue. No change needed.


1006-1022: AddDependent / TriggerDependencyEvents integrate cleanly with the scheduler queue

Short‑circuiting AFTER when the dependee is already Running by immediately enqueueing a dependency event, and otherwise recording dependents per DependencyType, aligns with how ScheduleThread_ drains m_dependency_event_queue_ via UpdateDependency(). Assuming TaskInCtld lives only under a valid g_task_scheduler, this is fine.


1178-1199: TaskInfo.dependency_status serialization covers pending, finite, and ±∞ readiness

Conditioning emission on either non‑empty deps or a non‑InfinitePast ready_time, and then mapping ready_time to infinite_future / infinite_past flags or a concrete Unix seconds value, gives clients enough information to distinguish “no deps”, “waiting”, “ready at T”, and “never/always ready”. Looks good as-is.

src/CraneCtld/TaskScheduler.cpp (4)

399-487: Init-time dependency reconstruction and MongoDB resolution look coherent; verify is_failed() semantics

The recovery block correctly:

  • builds a dependee→dependents index from m_pending_task_map_,
  • attaches dependents to recovered pending/running tasks, and
  • for truly missing dependees, queries MongoDB and computes event_time per DependencyType before enqueuing AddDependencyEvent(job_id, dep_id, event_time).

This relies on DependenciesInJob::is_failed() treating “all events delivered and ready_time == absl::InfiniteFuture()” as permanently unsatisfiable; otherwise dependents of finished dependees with unsatisfied AFTER_OK / AFTER_NOT_OK could stay in "Dependency" rather than "DependencyNeverSatisfied". Please double-check that implementation matches this assumption.


843-859: Dependency event pump and gating in ScheduleThread_ are consistent

Draining m_dependency_event_queue_ under m_pending_task_map_mtx_, applying each event via it->second->UpdateDependency(event.dependee_job_id, event.event_time), and then gating scheduling on !Dependencies().is_met(now) with a Dependency / DependencyNeverSatisfied pending reason uses the new model as intended. Emitting AFTER events at job start (TriggerDependencyEvents(DependencyType::AFTER, StartTime())) plugs into the same pipeline cleanly.

Also applies to: 872-879, 1290-1292


2532-2535: Submission-time dependency registration and missing-dep rejection behaviour

CleanSubmitQueueCb_ now:

  • stores promises as std::promise<CraneExpected<task_id_t>>,
  • for each accepted task, wires its dependencies to existing pending dependees (under m_pending_task_map_mtx_) and then to running dependees (under m_running_task_map_mtx_), and
  • if any dependee is neither pending nor running, rejects the task with ERR_MISSING_DEPENDENCY, frees QoS, and schedules its DB entry for purge (tasks_to_purge).

Functionally this is sound and avoids leaving such tasks in a permanent “dependency” state. Just note the behavioural change: jobs depending on already-finished or unknown jobs now fail fast at submission instead of entering the queue and later becoming "DependencyNeverSatisfied". If that’s the intended UX for --dependency, it’s worth making explicit in the docs.

Also applies to: 2585-2586, 2592-2649, 2664-2666


2787-2799: Final-state dependency events correctly differentiate ANY / OK / NOT_OK

Using the gRPC timestamp to compute end_time, then:

  • always firing AFTER_ANY at end_time,
  • firing AFTER_OK at end_time only when exit_code == 0 (else InfiniteFuture), and
  • firing AFTER_NOT_OK at end_time only when exit_code != 0 (else InfiniteFuture),

aligns with the DependenciesInJob aggregation logic and gives predictable behaviour for dependents of completed/failed jobs.

@NamelessOIer NamelessOIer force-pushed the dev/dependency_re branch 2 times, most recently from cbb7326 to 3ef3692 Compare December 10, 2025 11:14
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/CraneCtld/RpcService/CtldGrpcServer.cpp (1)

442-468: Maintain alignment between task_id_list and code_list in SubmitBatchTasks

The success path appends only to task_id_list without appending to code_list, while error paths append to both. This breaks the 1:1 correspondence between the repeated fields defined in SubmitBatchTasksReply, making it impossible for clients to correlate task results with error codes when some or all tasks succeed.

Always append a code entry for each task:

  for (auto& res : results) {
    if (res.has_value()) {
      CraneExpected<task_id_t> task_result = res.value().get();
      if (task_result.has_value()) {
        response->mutable_task_id_list()->Add(task_result.value());
+       response->mutable_code_list()->Add(CraneErrCode::SUCCESS);
      } else {
        response->mutable_task_id_list()->Add(0);
        response->mutable_code_list()->Add(task_result.error());
      }
    } else {
      response->mutable_task_id_list()->Add(0);
      response->mutable_code_list()->Add(res.error());
    }
  }
♻️ Duplicate comments (1)
src/CraneCtld/TaskScheduler.cpp (1)

2515-2528: Cancelled pending tasks trigger incorrect dependency events

When pending tasks are cancelled, the code triggers both AFTER_OK and AFTER_NOT_OK with absl::InfiniteFuture():

task->TriggerDependencyEvents(crane::grpc::AFTER_OK,
                              absl::InfiniteFuture());
task->TriggerDependencyEvents(crane::grpc::AFTER_NOT_OK,
                              absl::InfiniteFuture());

This means dependents waiting on either AFTER_OK or AFTER_NOT_OK will never have their dependencies satisfied, causing them to remain stuck in "Dependency" state indefinitely.

Semantic issue: A cancelled job should be treated as a failure. Therefore:

  • AFTER_NOT_OK should fire with end_time (dependency satisfied at cancellation time)
  • AFTER_OK should fire with absl::InfiniteFuture() (will never be satisfied)

Additionally, no exit code is set before triggering events. Set a non-zero exit code (e.g., ExitCode::EC_TERMINATED) before triggering dependency events.

Apply this diff:

   task->SetStatus(crane::grpc::Cancelled);
   task->SetEndTime(end_time);
   g_account_meta_container->FreeQosResource(*task);
+  task->SetExitCode(ExitCode::EC_TERMINATED);

   task->TriggerDependencyEvents(crane::grpc::DependencyType::AFTER, end_time);
   task->TriggerDependencyEvents(crane::grpc::DependencyType::AFTER_ANY,
                                 end_time);
   task->TriggerDependencyEvents(crane::grpc::AFTER_OK,
-                                absl::InfiniteFuture());
+                                absl::InfiniteFuture());  // Never satisfied
   task->TriggerDependencyEvents(crane::grpc::AFTER_NOT_OK,
-                                absl::InfiniteFuture());
+                                end_time);  // Satisfied at cancel time
🧹 Nitpick comments (8)
docs/en/reference/pending_reason.md (1)

16-22: Add languages to fenced code blocks to satisfy markdownlint

The two “Example output” blocks use bare fenced code; markdownlint (MD040) flags this. Consider marking them as plain text:

-Example output:
-```
+Example output:
+```text
 JOBID    PARTITION  NAME     USER   ST   TIME     NODES  NODELIST(REASON)
 ...
-```
+```

...

-Example output:
-```
+Example output:
+```text
 JobId=101
 ...
 State=PENDING
 Reason=Priority
-```
+```

This keeps the rendering while making markdownlint happy.

Based on static analysis hints.

Also applies to: 31-36

docs/zh/reference/job_dependency.md (1)

27-29: Specify languages on plain fenced blocks to satisfy markdownlint

Two fenced blocks are missing a language (MD040):

  • Dependency syntax example
  • ccontrol show job output example

You can mark them as plain text:

-```
+```text
 <type>:<job_id>[+<delay>][:<job_id>[+<delay>]]...
-```
+```

...

-```
+```text
 JobId=105
 ...
 Dependency=PendingDependencies=afterok:100+01:00:00 Status=WaitForAll
-```
+```

This keeps rendering unchanged while clearing the lint warnings.

Based on static analysis hints.

Also applies to: 164-168

docs/en/reference/job_dependency.md (1)

27-29: Add languages to plain fenced blocks for markdownlint

The dependency syntax block and ccontrol show job output use bare ``` fences, which triggers MD040. Consider marking them as text:

-```
+```text
 <type>:<job_id>[+<delay>][:<job_id>[+<delay>]]...
-```
+```

...

-```
+```text
 JobId=105
 ...
 Dependency=PendingDependencies=afterok:100+01:00:00 Status=WaitForAll
-```
+```

This keeps the examples readable while satisfying markdownlint.

Based on static analysis hints.

Also applies to: 164-168

src/CraneCtld/DbClient.h (1)

122-125: Use a named type for the job-status tuple and confirm task_id_t width

The new API is clear in the comment, but the raw tuple type:

std::tuple<crane::grpc::TaskStatus, uint32_t, int64_t, int64_t>

is not self‑describing at call sites and is easy to misuse if the field order changes.

Consider introducing a named alias or small struct and using it in the map:

// Near other type aliases
using JobStatusInfo =
    std::tuple<crane::grpc::TaskStatus, uint32_t, int64_t, int64_t>;
// or a struct with named fields if you prefer

and then:

-  // Fetch job status (state, exit_code, time_end, time_start)
-  std::unordered_map<task_id_t, std::tuple<crane::grpc::TaskStatus, uint32_t,
-                                           int64_t, int64_t>>
+  // Fetch job status (state, exit_code, time_end, time_start)
+  std::unordered_map<task_id_t, JobStatusInfo>
   FetchJobStatus(const std::unordered_set<task_id_t>& job_ids);

Also, the implementation stores and queries task_id as int32_t. Please double‑check that task_id_t cannot exceed 32‑bit range; otherwise, you may want to store it as 64‑bit in Mongo to avoid silent truncation.

src/CraneCtld/CtldPublicDefs.cpp (4)

21-23: Including TaskScheduler here is reasonable

Pulling in TaskScheduler.h to support the new g_task_scheduler->AddDependencyEvent(...) calls from TaskInCtld is appropriate. If include depth ever becomes an issue, those calls could be wrapped behind a small helper in TaskScheduler to reduce coupling, but it’s not required now.


992-1007: Make SetDependency robust to repeated calls

SetDependency initializes is_or and ready_time based on the incoming Dependencies and then just appends/overwrites entries in dependencies.deps. If this method is ever called more than once on the same TaskInCtld instance (e.g., recovery/requeue paths), stale entries could remain in the map.

You can defensively reset the struct:

void TaskInCtld::SetDependency(const crane::grpc::Dependencies& grpc_deps) {
-  if (grpc_deps.is_or()) {
+  dependencies.deps.clear();
+
+  if (grpc_deps.is_or()) {
     dependencies.is_or = true;
     dependencies.ready_time = absl::InfiniteFuture();
   } else {
     dependencies.is_or = false;
     dependencies.ready_time = absl::InfinitePast();
   }
   for (const auto& dep : grpc_deps.deps()) {
     dependencies.deps[dep.job_id()] = {dep.type(), dep.delay_seconds()};
   }
}

This doesn’t change current behavior for first-time initialization but makes future reuse of TaskInCtld safer.


1020-1025: Consider clearing dependents after triggering dependency events

TriggerDependencyEvents iterates over dependents[dep_type] and calls AddDependencyEvent for each, but leaves the vector intact. If, for any reason, the same dependency event type is triggered multiple times for a job, dependents will receive redundant events.

You may want to clear the list after firing:

void TaskInCtld::TriggerDependencyEvents(
    const crane::grpc::DependencyType& dep_type, absl::Time event_time) {
-  for (task_id_t dependent_id : dependents[dep_type]) {
+  auto& list = dependents[dep_type];
+  for (task_id_t dependent_id : list) {
     g_task_scheduler->AddDependencyEvent(dependent_id, task_id, event_time);
   }
+  list.clear();
}

This is not strictly required if the scheduler guarantees each event type is emitted at most once per job, but it makes the behavior more robust.


1181-1202: DependencyStatus serialization matches in-memory model

The new block in SetFieldsOfTaskInfo:

  • emits dependency_status only when there are deps or a non-InfinitePast ready_time,
  • mirrors dependencies.is_or,
  • serializes every remaining pending dependency, and
  • encodes ready_time using the ready_time / infinite_future / infinite_past oneof.

This is consistent with how DependenciesInJob and SetDependency initialize and update ready_time, and should give clients enough information to display both pending dependencies and their effective ready time.

If you ever reuse a TaskInfo instance, consider clearing dependency_status first to avoid duplicate pending entries, though current usage likely always constructs a fresh message.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a05a02c and 3ef3692.

📒 Files selected for processing (19)
  • docs/en/command/calloc.md (1 hunks)
  • docs/en/command/cbatch.md (1 hunks)
  • docs/en/command/crun.md (1 hunks)
  • docs/en/reference/job_dependency.md (1 hunks)
  • docs/en/reference/pending_reason.md (1 hunks)
  • docs/zh/command/calloc.md (1 hunks)
  • docs/zh/command/cbatch.md (1 hunks)
  • docs/zh/command/crun.md (1 hunks)
  • docs/zh/reference/job_dependency.md (1 hunks)
  • docs/zh/reference/pending_reason.md (1 hunks)
  • mkdocs.yaml (2 hunks)
  • protos/PublicDefs.proto (4 hunks)
  • src/CraneCtld/CtldPublicDefs.cpp (5 hunks)
  • src/CraneCtld/CtldPublicDefs.h (4 hunks)
  • src/CraneCtld/DbClient.cpp (1 hunks)
  • src/CraneCtld/DbClient.h (1 hunks)
  • src/CraneCtld/RpcService/CtldGrpcServer.cpp (4 hunks)
  • src/CraneCtld/TaskScheduler.cpp (13 hunks)
  • src/CraneCtld/TaskScheduler.h (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • docs/zh/command/calloc.md
  • docs/en/command/crun.md
  • docs/zh/reference/pending_reason.md
  • src/CraneCtld/DbClient.cpp
  • src/CraneCtld/CtldPublicDefs.h
  • docs/en/command/cbatch.md
🧰 Additional context used
🧠 Learnings (14)
📚 Learning: 2025-05-09T01:54:21.256Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 496
File: src/CraneCtld/CranedKeeper.cpp:51-53
Timestamp: 2025-05-09T01:54:21.256Z
Learning: The ConfigureCraned function in src/CraneCtld/CranedKeeper.cpp is called from a thread pool, so there's no need to worry about it blocking the gRPC completion queue thread.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
  • src/CraneCtld/TaskScheduler.cpp
  • src/CraneCtld/TaskScheduler.h
📚 Learning: 2025-05-02T07:06:36.103Z
Learnt from: RileyWen
Repo: PKUHPC/CraneSched PR: 475
File: src/CraneCtld/CtldGrpcServer.cpp:113-118
Timestamp: 2025-05-02T07:06:36.103Z
Learning: In CraneSched, gRPC methods should generally return Status::OK even when handling error conditions, as non-OK statuses cause the connection to terminate. Error information should be communicated within the RPC response payload instead.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
📚 Learning: 2025-05-08T07:39:29.207Z
Learnt from: huerni
Repo: PKUHPC/CraneSched PR: 469
File: src/Craned/TaskManager.cpp:1162-1175
Timestamp: 2025-05-08T07:39:29.207Z
Learning: In CraneSched, when a task fails in the LaunchTaskInstanceMt_ method after PMIx registration, ActivateTaskStatusChangeAsync_ does not automatically release PMIx resources or cgroup resources. Explicit cleanup is required by calling g_pmix_server->DeregisterTask() and g_cg_mgr->ReleaseCgroupByTaskIdOnly() before returning from the function.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
  • src/CraneCtld/CtldPublicDefs.cpp
  • src/CraneCtld/TaskScheduler.cpp
  • src/CraneCtld/TaskScheduler.h
📚 Learning: 2025-05-08T07:39:29.207Z
Learnt from: huerni
Repo: PKUHPC/CraneSched PR: 469
File: src/Craned/TaskManager.cpp:1162-1175
Timestamp: 2025-05-08T07:39:29.207Z
Learning: In CraneSched, when a task fails in the LaunchTaskInstanceMt_ method, ActivateTaskStatusChangeAsync_ does not automatically release PMIx resources or cgroup. Explicit cleanup is required by calling g_pmix_server->DeregisterTask() and g_cg_mgr->ReleaseCgroupByTaskIdOnly().

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
  • src/CraneCtld/CtldPublicDefs.cpp
  • src/CraneCtld/TaskScheduler.cpp
  • src/CraneCtld/TaskScheduler.h
📚 Learning: 2025-06-23T07:53:30.513Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 520
File: src/Craned/CranedServer.cpp:50-51
Timestamp: 2025-06-23T07:53:30.513Z
Learning: In the CraneSched codebase, `g_ctld_client` is a guaranteed global variable that is always initialized before any gRPC service methods are called, so null pointer checks are not necessary when calling methods on it.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
📚 Learning: 2025-08-12T08:58:39.772Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 577
File: src/Craned/Supervisor/TaskManager.cpp:124-130
Timestamp: 2025-08-12T08:58:39.772Z
Learning: In the CraneSched project using C++23, Class Template Argument Deduction (CTAD) allows std::unique_ptr declarations without explicit template parameters when the type can be deduced from the initializer, such as `std::unique_ptr task = std::move(m_task_map_.at(task_id))` where the template parameter is deduced from the move operation.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
  • src/CraneCtld/TaskScheduler.cpp
  • src/CraneCtld/TaskScheduler.h
📚 Learning: 2025-05-26T11:04:30.580Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 399
File: src/Craned/Craned/SupervisorKeeper.h:32-32
Timestamp: 2025-05-26T11:04:30.580Z
Learning: The Supervisor component in the Crane system is designed to manage only one task per instance. The task specification is provided to the Supervisor during startup by Craned (likely through InitSupervisorRequest), so the ExecuteTask() method doesn't need to accept task parameters since the Supervisor already knows which task to execute.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
  • src/CraneCtld/TaskScheduler.cpp
  • src/CraneCtld/TaskScheduler.h
📚 Learning: 2025-06-30T08:43:44.470Z
Learnt from: huerni
Repo: PKUHPC/CraneSched PR: 537
File: src/Craned/TaskManager.cpp:145-147
Timestamp: 2025-06-30T08:43:44.470Z
Learning: In the CraneSched codebase, src/Craned/Craned.cpp guarantees that g_config.CranedRes.contains(g_config.CranedIdOfThisNode) through explicit validation during startup. The code checks if g_config.CranedRes.contains(g_config.Hostname) and exits the process if not found, then sets g_config.CranedIdOfThisNode = g_config.Hostname. TaskManager constructor is called after this validation, so g_config.CranedRes[g_config.CranedIdOfThisNode] is guaranteed to be valid.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
  • src/CraneCtld/CtldPublicDefs.cpp
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-09T01:56:53.142Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 496
File: src/CraneCtld/CtldPublicDefs.h:756-763
Timestamp: 2025-05-09T01:56:53.142Z
Learning: In the CraneSched codebase, the `execution_node` field in JobToD is intentionally set to the first element of `executing_craned_ids` vector without guards, as it represents the main execution node for a job. This is by design and assumes `executing_craned_ids` is never empty when `GetJobOfNode` is called.

Applied to files:

  • src/CraneCtld/CtldPublicDefs.cpp
📚 Learning: 2025-12-08T08:11:40.323Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 727
File: src/Craned/Supervisor/TaskManager.cpp:1401-1407
Timestamp: 2025-12-08T08:11:40.323Z
Learning: In src/Craned/Supervisor/TaskManager.cpp, StepInstance::Prepare() is always called before any task's Spawn() method in the execution flow (via EvGrpcExecuteTaskCb_). If Prepare() fails, tasks are immediately finished and Spawn() is never invoked. For Crun tasks, StepInstance::Prepare() guarantees that x11_meta is set (even if x11 is false), so accessing x11_meta.value() in Spawn() when both IsCrun() and x11 are true is safe without additional has_value() checks.

Applied to files:

  • src/CraneCtld/CtldPublicDefs.cpp
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-09T02:15:30.422Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 496
File: src/CraneCtld/TaskScheduler.cpp:1713-1721
Timestamp: 2025-05-09T02:15:30.422Z
Learning: TaskScheduler中的资源回收设计:当Craned节点离线时,通过TaskStatusChangeAsync异步处理任务状态变更和资源回收。尽管可能存在短暂的资源更新延迟,但这是设计上可接受的权衡。

Applied to files:

  • src/CraneCtld/TaskScheduler.cpp
  • src/CraneCtld/TaskScheduler.h
📚 Learning: 2025-05-26T11:00:54.563Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 399
File: src/Craned/Craned/SupervisorKeeper.cpp:174-178
Timestamp: 2025-05-26T11:00:54.563Z
Learning: The CraneSched project uses C++23 standard, allowing the use of modern C++ features like std::ranges::to and other C++23 language features and library components.

Applied to files:

  • src/CraneCtld/TaskScheduler.cpp
  • src/CraneCtld/TaskScheduler.h
📚 Learning: 2025-04-02T09:30:13.014Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 461
File: src/Craned/JobManager.cpp:141-149
Timestamp: 2025-04-02T09:30:13.014Z
Learning: In JobManager, if a uid exists in m_uid_to_job_ids_map_, its corresponding task_ids set is guaranteed to be non-empty due to the invariant maintained in AllocJobs and FreeJobs methods.

Applied to files:

  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-08T09:35:39.809Z
Learnt from: huerni
Repo: PKUHPC/CraneSched PR: 469
File: src/Craned/Pmix/PmixCollRing.cpp:0-0
Timestamp: 2025-05-08T09:35:39.809Z
Learning: In the PMIx implementation for CraneSched, objects referenced in asynchronous gRPC callbacks (like `coll_ring_ctx`) remain valid as long as the parent object (`this`) is not destroyed. The `Coll` class uses shared_ptr management to ensure its lifetime extends through all pending callbacks.

Applied to files:

  • src/CraneCtld/TaskScheduler.h
🧬 Code graph analysis (2)
src/CraneCtld/DbClient.h (1)
src/CraneCtld/DbClient.cpp (2)
  • FetchJobStatus (498-548)
  • FetchJobStatus (500-500)
src/CraneCtld/TaskScheduler.cpp (1)
src/CraneCtld/TaskScheduler.h (1)
  • AddDependencyEvent (881-887)
🪛 LanguageTool
docs/zh/reference/job_dependency.md

[uncategorized] ~69-~69: 数词与名词之间一般应存在量词,可能缺少量词。
Context: ...100 开始 并且 作业 101 成功完成。 #### OR 逻辑(任一条件满足) 使用 ? 分隔不同的依赖条件: ```bash afterok:...

(wa5)

🪛 markdownlint-cli2 (0.18.1)
docs/en/reference/job_dependency.md

27-27: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


164-164: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

docs/en/reference/pending_reason.md

16-16: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


31-31: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

docs/zh/reference/job_dependency.md

16-16: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


31-31: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (11)
docs/zh/command/cbatch.md (1)

72-72: Dependency option description looks good

The -d/--dependency syntax and listed types are consistent with the dedicated Job Dependency reference; no changes needed here.

mkdocs.yaml (1)

85-86: Nav entries and translations for Job Dependency/Pending Reason look correct

New reference pages and their Chinese nav translations are wired consistently with existing entries; no further changes needed.

Also applies to: 124-125

docs/zh/command/crun.md (1)

15-15: Dependency option documentation is consistent

The -d/--dependency description matches the cbatch docs and the Job Dependency reference page, so the CLI behavior is documented consistently.

docs/en/command/calloc.md (1)

37-37: Dependency option text aligns with reference documentation

The -d, --dependency option syntax and supported types align with the Job Dependency reference and other command docs; no changes needed.

protos/PublicDefs.proto (4)

135-162: Dependency proto definitions are consistent and minimal

The new DependencyType, DependencyCondition, Dependencies, and DependencyStatus messages form a clean, orthogonal model that matches the usage in CtldPublicDefs (is_or + per-dependency delay + a single ready_time/∞ flag). No structural issues from a proto/generation standpoint.


186-187: Confirm that field number 15 was previously unused in TaskToCtld

Using Dependencies dependencies = 15; is fine as long as tag 15 wasn’t populated by an older field in already-deployed clusters. If it was, old clients/servers would misinterpret the wire format.

If you’re unsure, please double-check the git history of TaskToCtld (or generated descriptors in a running deployment) to ensure 15 was never used before this change.


482-486: TaskInfo.dependency_status wiring matches implementation

Adding optional DependencyStatus dependency_status = 29; lines up with TaskInCtld::SetFieldsOfTaskInfo, which only populates the field when there are dependencies or a meaningful ready_time. This is backward compatible (higher tag, optional) and won’t affect older clients ignoring the new field.


730-731: New error codes integrate cleanly

ERR_MISSING_DEPENDENCY = 84 and ERR_DB_INSERT_FAILED = 85 are appended at the end of ErrCode, preserving existing numeric assignments. This keeps wire compatibility while making dependency-specific failures distinguishable.

src/CraneCtld/RpcService/CtldGrpcServer.cpp (2)

300-313: Nested CraneExpected unwrapping in CforedStream is correct

The conversion from CraneExpected<std::future<CraneExpected<task_id_t>>> to std::expected<task_id_t, std::string> correctly:

  • unwraps the future,
  • propagates the inner CraneErrCode via CraneErrStr,
  • and falls back to the outer error if submission itself failed.
    This preserves previous semantics while aligning with the new return type.

406-415: SubmitBatchTask now surfaces per-task CraneErrCode from async path

The updated logic unwraps the inner CraneExpected<task_id_t> and:

  • sets ok=true and task_id on success,
  • otherwise sets ok=false and assigns response->code from task_result.error(),
    falling back to the outer result.error() if scheduling failed before the future was created. This looks consistent with the new scheduler API.
src/CraneCtld/CtldPublicDefs.cpp (1)

1086-1087: Wiring TaskToCtld.dependencies into TaskInCtld is appropriate

Calling SetDependency(val.dependencies()); from SetFieldsByTaskToCtld ensures the in-memory dependency state is always derived from the protobuf submission payload. This ties the proto and internal representation together cleanly.

Comment on lines +122 to +137
void DependenciesInJob::update(task_id_t job_id, absl::Time event_time) {
auto it = deps.find(job_id);
if (it == deps.end()) {
CRANE_ERROR("Dependency for job {} not found", job_id);
return;
}
const auto& [dep_type, delay_seconds] = it->second;

absl::Time dep_ready_time = event_time + absl::Seconds(delay_seconds);
if (is_or) {
ready_time = std::min(ready_time, dep_ready_time);
} else {
ready_time = std::max(ready_time, dep_ready_time);
}
deps.erase(it);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Unused dep_type in DependenciesInJob::update

dep_type from it->second is never used, which will trigger -Wunused-variable warnings:

const auto& [dep_type, delay_seconds] = it->second;

Consider dropping it or marking it intentionally unused, e.g.:

-  const auto& [dep_type, delay_seconds] = it->second;
+  const auto& [/*dep_type*/, delay_seconds] = it->second;

or by using std::get<1>(it->second) instead.

🤖 Prompt for AI Agents
In src/CraneCtld/CtldPublicDefs.cpp around lines 122 to 137, the structured
binding captures dep_type which is never used and triggers -Wunused-variable;
fix by omitting or marking it unused — either replace the binding with
extracting only the second element (e.g. use std::get<1>(it->second) to obtain
delay_seconds) or keep the structured binding but mark the first element
intentionally unused (e.g. [[maybe_unused]] or a named underscore) so only
delay_seconds is used when computing dep_ready_time.

Comment on lines 1009 to 1197
void TaskInCtld::AddDependent(crane::grpc::DependencyType dep_type,
task_id_t dep_job_id) {
if (dep_type == crane::grpc::DependencyType::AFTER &&
status == crane::grpc::TaskStatus::Running) {
// already satisfied
g_task_scheduler->AddDependencyEvent(dep_job_id, task_id, start_time);
} else {
dependents[dep_type].push_back(dep_job_id);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's locate and read the AddDependent method with surrounding context
find . -name "CtldPublicDefs.cpp" -type f

Repository: PKUHPC/CraneSched

Length of output: 95


🏁 Script executed:

# Read the file around the specified lines to understand the full context
cat -n src/CraneCtld/CtldPublicDefs.cpp | sed -n '1000,1030p'

Repository: PKUHPC/CraneSched

Length of output: 1361


🏁 Script executed:

# Search for TaskStatus enum to understand possible status values
rg "enum.*TaskStatus" --type cpp -A 20

Repository: PKUHPC/CraneSched

Length of output: 43


🏁 Script executed:

# Look for TriggerDependencyEvents to understand when dependencies are released
rg "TriggerDependencyEvents" --type cpp -B 2 -A 5

Repository: PKUHPC/CraneSched

Length of output: 3966


🏁 Script executed:

# Check how start_time is initialized in TaskInCtld
rg "start_time" --type cpp | grep -i "init\|=" | head -20

Repository: PKUHPC/CraneSched

Length of output: 1711


🏁 Script executed:

# Search for HasStarted or similar methods in TaskInCtld
rg "HasStarted\|start_time\s*!=\s*0\|start_time.*valid" --type cpp -B 2 -A 2

Repository: PKUHPC/CraneSched

Length of output: 43


🏁 Script executed:

# Look for TaskStatus enum definition
find . -name "*.proto" -type f | head -10

Repository: PKUHPC/CraneSched

Length of output: 178


🏁 Script executed:

# Search for TaskStatus enum in proto files
rg "enum.*TaskStatus" --type proto -A 15

Repository: PKUHPC/CraneSched

Length of output: 90


🏁 Script executed:

# Look for TaskInCtld constructor to see how start_time is initialized
rg "TaskInCtld::" --type cpp | grep -i construct

Repository: PKUHPC/CraneSched

Length of output: 43


🏁 Script executed:

# Read TaskInCtld class definition to understand initialization
cat -n src/CraneCtld/CtldPublicDefs.h | head -100

Repository: PKUHPC/CraneSched

Length of output: 4192


🏁 Script executed:

# Search for start_time initialization in TaskInCtld
rg "start_time\s*=" --type cpp | grep -i "ctldpublic\|taskinctld" -A 2 -B 2

Repository: PKUHPC/CraneSched

Length of output: 658


🏁 Script executed:

# Look for how status transitions happen during job lifecycle
rg "status\s*=" --type cpp | grep -i "running\|completed\|failed" | head -15

Repository: PKUHPC/CraneSched

Length of output: 718


🏁 Script executed:

# Check the complete context around where TriggerDependencyEvents(AFTER) is called at job start
rg "TriggerDependencyEvents.*AFTER" --type cpp -B 10 -A 3

Repository: PKUHPC/CraneSched

Length of output: 3571


🏁 Script executed:

# Search for TaskStatus in proto files (use text search)
rg "TaskStatus" protos/PublicDefs.proto -A 20

Repository: PKUHPC/CraneSched

Length of output: 3394


🏁 Script executed:

# Look for TaskInCtld class declaration to find member initialization
rg "class TaskInCtld" --type cpp -A 80

Repository: PKUHPC/CraneSched

Length of output: 43


🏁 Script executed:

# Search for where start_time is first initialized in TaskInCtld
rg "start_time.*=" src/CraneCtld/CtldPublicDefs.cpp | head -20

Repository: PKUHPC/CraneSched

Length of output: 256


🏁 Script executed:

# Look for the full TaskInCtld member declarations
cat -n src/CraneCtld/CtldPublicDefs.h | sed -n '1,300p' | tail -200

Repository: PKUHPC/CraneSched

Length of output: 6430


🏁 Script executed:

# Check if there's any code that re-processes already-finished jobs for dependencies
rg "AddDependent" --type cpp -B 5 -A 5

Repository: PKUHPC/CraneSched

Length of output: 4529


🏁 Script executed:

# Look for TaskInCtld class definition and member variables
rg "class TaskInCtld" --type cpp -A 100 | head -120

Repository: PKUHPC/CraneSched

Length of output: 43


🏁 Script executed:

# Search for start_time initialization or default value
rg "absl::Time.*start_time" src/CraneCtld/CtldPublicDefs.h -A 2 -B 2

Repository: PKUHPC/CraneSched

Length of output: 1018


🏁 Script executed:

# Look for HasStarted or similar method in TaskInCtld
rg "HasStarted|has.*started" --type cpp -i

Repository: PKUHPC/CraneSched

Length of output: 43


🏁 Script executed:

# Find the constructor or initialization of TaskInCtld to see default start_time
rg "TaskInCtld::" src/CraneCtld/CtldPublicDefs.cpp | head -20

Repository: PKUHPC/CraneSched

Length of output: 1078


🏁 Script executed:

# Look at the else clause handling missing dependees
rg "missing_dependee_ids" src/CraneCtld/TaskScheduler.cpp -B 10 -A 10

Repository: PKUHPC/CraneSched

Length of output: 1997


🏁 Script executed:

# Check how missing dependencies are handled
rg "missing_deps\|missing_dependee" src/CraneCtld/TaskScheduler.cpp -B 5 -A 5

Repository: PKUHPC/CraneSched

Length of output: 43


🏁 Script executed:

# Get more context on the MongoDB fetch path for handling finished dependencies
rg "missing_dependee_ids" src/CraneCtld/TaskScheduler.cpp -A 40 | head -70

Repository: PKUHPC/CraneSched

Length of output: 2864


🏁 Script executed:

# Look at the complete switch statement for AFTER handling
rg "case crane::grpc::DependencyType::AFTER" src/CraneCtld/TaskScheduler.cpp -A 20

Repository: PKUHPC/CraneSched

Length of output: 1718


🏁 Script executed:

# Check if start_time has a default/initialization value
cat -n src/CraneCtld/CtldPublicDefs.cpp | grep -A 5 "m_start_time_"

Repository: PKUHPC/CraneSched

Length of output: 910


🏁 Script executed:

# Look for where start_time gets set to understand its default value
rg "SetStartTime|m_start_time_\s*=" src/CraneCtld/CtldPublicDefs.cpp | head -10

Repository: PKUHPC/CraneSched

Length of output: 400


🏁 Script executed:

# Look for TaskInCtld constructor to see if start_time has a default value
rg "TaskInCtld\(" src/CraneCtld/CtldPublicDefs.cpp -A 20 | head -40

Repository: PKUHPC/CraneSched

Length of output: 43


🏁 Script executed:

# Check how status transitions are set during job lifecycle
rg "SetStatus.*Running\|SetStatus.*Completed" src/CraneCtld/TaskScheduler.cpp -B 2 -A 2 | head -40

Repository: PKUHPC/CraneSched

Length of output: 43


🏁 Script executed:

# Verify when TriggerDependencyEvents(AFTER) is actually called
rg "TriggerDependencyEvents.*AFTER" src/CraneCtld/TaskScheduler.cpp -B 5 -A 2

Repository: PKUHPC/CraneSched

Length of output: 1356


AFTER dependencies on already-finished jobs may never be released

TaskInCtld::AddDependent treats an AFTER dependency as "already satisfied" only when status == Running. If a dependent job is created after the dependee has finished (Completed/Failed/Cancelled), status will no longer be Running, so the dependent ID is stored in dependents[AFTER]. Since TriggerDependencyEvents(AFTER, ...) is called at job start and completion (both before the dependent is added), the dependency may remain permanently unsatisfied.

This is realistic: users can submit --dependency=after:<finished_job_id> before the finished job is archived to the database.

The existing code in the MongoDB fetch path (for archived jobs) correctly handles this case by checking if (time_start != 0) instead of status, indicating that the correct fix is to check whether the job has ever started, not whether it is currently running. Consider applying the same logic in AddDependent:

 void TaskInCtld::AddDependent(crane::grpc::DependencyType dep_type,
                               task_id_t dep_job_id) {
-  if (dep_type == crane::grpc::DependencyType::AFTER &&
-      status == crane::grpc::TaskStatus::Running) {
-    // already satisfied
-    g_task_scheduler->AddDependencyEvent(dep_job_id, task_id, start_time);
-  } else {
-    dependents[dep_type].push_back(dep_job_id);
-  }
+  if (dep_type == crane::grpc::DependencyType::AFTER &&
+      start_time != absl::InfiniteFuture()) {
+    // already satisfied (job has started at some point)
+    g_task_scheduler->AddDependencyEvent(dep_job_id, task_id, start_time);
+  } else {
+    dependents[dep_type].push_back(dep_job_id);
+  }
 }
🤖 Prompt for AI Agents
In src/CraneCtld/CtldPublicDefs.cpp around lines 1009 to 1017, the code treats
AFTER dependencies as satisfied only when status == Running, which misses
dependents added after a job has already started and finished; change the
condition to check whether the job has ever started (start_time != 0) and, if
so, call g_task_scheduler->AddDependencyEvent(dep_job_id, task_id, start_time)
immediately; otherwise, append the dep_job_id to dependents[dep_type] as before.

@github-actions github-actions bot added the test-passed Build and test success label Dec 10, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/CraneCtld/RpcService/CtldGrpcServer.cpp (1)

507-534: Fix alignment inconsistency between task_id_list and code_list in SubmitBatchTasks

The current implementation treats code_list inconsistently:

  • Early error case (line 502–503): Adds entries to both task_id_list and code_list
  • Success case (line 525): Adds to task_id_list only
  • Error cases in loop (lines 528, 532): Adds 0 to task_id_list and error code to code_list

This creates misaligned lists. Either:

  1. Make lists always 1:1 aligned by appending SUCCESS (or a success code) whenever a task succeeds, or
  2. Keep code_list sparse (errors only) and handle the early error case consistently with the loop

The ErrCode::SUCCESS = 0 enum value suggests alignment may be intended.

♻️ Duplicate comments (1)
src/CraneCtld/DbClient.cpp (1)

693-743: Guard FetchJobStatus against step-only documents by filtering on has_job_info

The query currently matches any document with a task_id in the $in list, including step-only documents created by InsertSteps that lack state/exit_code/time_start/time_end. Accessing those fields will throw, hit the catch, and you’ll return a map indistinguishable from “no such jobs”.

To align with FetchJobRecords() and avoid silently dropping those entries, reuse the has_job_info marker so only full job docs are selected:

-  try {
-    document filter;
-    filter.append(kvp("task_id", [&job_ids](sub_document task_id_doc) {
+  try {
+    document filter;
+    // Only query documents that contain full job information.
+    filter.append(kvp("has_job_info", true));
+    filter.append(kvp("task_id", [&job_ids](sub_document task_id_doc) {
       array task_id_array;
       for (const auto& job_id : job_ids) {
         task_id_array.append(static_cast<std::int32_t>(job_id));
       }
       task_id_doc.append(kvp("$in", task_id_array));
     }));

This keeps the projection and field reads unchanged but avoids exceptions and makes “no result” genuinely mean “no full job record”.

🧹 Nitpick comments (10)
docs/zh/command/calloc.md (1)

36-41: Clarify --dependency syntax or reference job_dependency doc

The brief grammar here is slightly simplified compared to docs/zh/reference/job_dependency.md (e.g., delay on subsequent job_ids). Not wrong, but it may confuse users when they see more complete forms elsewhere.

Consider either:

  • Explicitly stating this is only a simplified overview, or
  • Linking to “作业依赖 (Job Dependency)” for full syntax and examples instead of repeating partial grammar here.
docs/zh/command/crun.md (1)

13-21: Align --dependency help text with central Job Dependency reference

The inline grammar description is correct but abbreviated relative to docs/zh/reference/job_dependency.md (which also covers delay placement, AND/OR semantics, error cases, etc.).

To avoid divergence over time, consider trimming this to a short description and pointing readers to the “作业依赖 (Job Dependency)” reference page for full syntax and examples.

src/CraneCtld/CtldPublicDefs.h (2)

429-445: Dependency state modelling looks coherent; consider edge‑case semantics doc

DependenciesInJob::is_met / is_failed combined with update and ready_time give clear AND/OR semantics and a way to mark “never satisfied” via absl::InfiniteFuture(). This matches how TaskScheduler now uses InfiniteFuture for failed/missing dependencies.

Two follow‑ups I recommend:

  • The deps map is currently public and is accessed directly from TaskScheduler. Encapsulating it (e.g., exposing a ForEachDep or getter returning const& only) would make future invariants around ready_time and is_or safer to maintain.
  • The interaction between “failed dependency” and ready_time == InfiniteFuture() is subtle. A short comment near is_failed() or update() clarifying that InfiniteFuture means “dependency can never be satisfied” (and thus leads to DependencyNeverSatisfied) would help future maintainers reason about the design.

762-776: TaskInCtld dependency fields fit the design

The per‑type dependents[DependencyType_ARRAYSIZE] and the embedded DependenciesInJob dependencies; provide exactly what TaskScheduler needs (both outgoing deps and reverse edges). Layout (in [2]/[3] persisted section) is consistent with other persisted runtime fields.

No functional issues spotted here. You may later want a brief comment explaining that dependents is not persisted (only in‑RAM wiring), while dependencies is persisted via TaskToCtld / DB, but that’s optional.

src/CraneCtld/TaskScheduler.cpp (5)

398-487: Startup dependency reconstruction is sound; consider logging for ambiguous AFTER cases

The recovery block that rebuilds dependents and seeds dependency events from MongoDB does the right things:

  • Rewires dependents across pending/running tasks.
  • Collects missing dependees into missing_dependee_ids and bulk‑fetches their final (status, exit_code, time_end, time_start).
  • Emits per‑type events with sensible fallbacks:
    • AFTER: prefers time_start, falls back to time_end for Cancelled, logs if neither exists.
    • AFTER_ANY/AFTER_OK/AFTER_NOT_OK: use time_end with guards on status/exit_code.
    • Missing DB row ⇒ warn and leave event_time as InfiniteFuture (so the dependent will eventually be marked DependencyNeverSatisfied).

Two minor improvements you might consider:

  • For AFTER on a completed (non‑cancelled) job with time_start == 0 but time_end != 0, current code logs an error and never sets event_time, effectively treating it as “never satisfied”. If this is expected, a more explicit log message (“treating dependency as unsatisfiable”) would be clearer; otherwise, falling back to time_end might be more user‑friendly.
  • You already batch DB access per missing_dependee_ids; good. If dependencies grow large, a brief debug trace summarizing how many jobs ended up with InfiniteFuture timestamps could help debugging “DependencyNeverSatisfied” states.

Overall logic is correct; suggestions are optional.


1678-1745: Two‑layer CraneExpected + future return type is a bit heavy but consistent

SubmitTaskToScheduler now returns CraneExpected<std::future<CraneExpected<task_id_t>>>, so:

  • Outer CraneExpected captures synchronous validation failures (UID, account, partition, QOS, basic validity).
  • Inner CraneExpected (inside the future) reports asynchronous persistence/queueing failures or delivers the allocated task_id.

This separation is logically clean and keeps call‑sites explicit about what failed when. It is a bit verbose for callers but consistent with your error‑handling style; no functional issues.

You might eventually want a thin helper to unwrap this pattern at the RPC layer, but not required in this PR.


2882-2940: Dependency wiring on submission is mostly correct; edge cases handled; minor suggestions

The new block in CleanSubmitQueueCb_ that wires dependents at submission time:

  • For each accepted task:
    • Iterates its job->Dependencies().deps.
    • If dependee is in m_pending_task_map_, calls AddDependent(dep_type, id).
    • Else pushes (dep_type, dep_job_id) into running_deps.
  • Then under m_running_task_map_mtx_:
    • Adds dependents for any dependees found in m_running_task_map_.
    • Flags missing_deps if any dependency is neither pending nor running.
  • On missing_deps:
    • Logs a warning.
    • Frees QoS resources for the new job.
    • Fulfills its promise with ERR_MISSING_DEPENDENCY.
    • Records (task_id, TaskDbId) in tasks_to_purge.
  • After the loop:
    • Updates m_pending_map_cached_size_ and calls PurgeEndedTasks(tasks_to_purge).

This is a solid design:

  • Prevents dangling dependencies on non‑existent jobs.
  • Cleans up embedded DB entries for rejected submissions.
  • Avoids holding the running‑map lock longer than needed.

Two small suggestions:

  • The name ERR_BEYOND_TASK_ID used later for capacity rejection is slightly cryptic; a more explicit code like ERR_PENDING_QUEUE_FULL would be clearer (if you’re free to add new error codes).
  • You might want to log the offending dependency IDs when a job is rejected for missing dependencies to help users debug typos in --dependency job IDs.

Functionally, this looks good.


2951-2956: Capacity‑rejection now returns structured errors correctly

For tasks rejected because the pending queue is beyond capacity:

  • QoS resources are freed.
  • The promise is set to std::unexpected(CraneErrCode::ERR_BEYOND_TASK_ID).

This matches the new CraneExpected usage and prevents resource leaks. Only nit is the error code naming as noted above; semantics are correct.


3135-3158: End‑time normalization and dependency events on job completion are well‑designed

On finalization of a job in CleanTaskStatusChangeQueueCb_ you now:

  • Derive end_time from the reported timestamp.
  • Clamp it to StartTime + time_limit + 5 seconds tolerance to handle clock skew / delayed status.
  • Call SetEndTime(end_time).
  • Emit dependency events:
    • AFTER_ANY at end_time.
    • AFTER_OK at end_time if exit_code == 0, else at InfiniteFuture().
    • AFTER_NOT_OK at end_time if exit_code != 0, else at InfiniteFuture().

This integrates cleanly with DependenciesInJob:

  • Satisfied dependencies get a real timestamp.
  • Unsatisfied ones get InfiniteFuture, leading to DependencyNeverSatisfied if/when all deps resolve that way.

The only minor concern is relying solely on exit_code (ignoring status) for OK vs NOT_OK; if you ever introduce special non‑0 exit codes that should still count as “OK”, you’ll need to adjust this mapping. For current semantics, it’s fine.

src/CraneCtld/TaskScheduler.h (1)

782-783: SubmitTaskToScheduler’s return type is explicit and consistent

Declaring CraneExpected<std::future<CraneExpected<task_id_t>>> SubmitTaskToScheduler(...) in the header matches the implementation in the .cpp. The signature is admittedly a bit nested, but it cleanly separates synchronous validation from asynchronous persistence/queuing failures.

If you find this pattern leaks into many call‑sites, a small wrapper type (e.g., using SubmitTaskResult = CraneExpected<std::future<CraneExpected<task_id_t>>>;) could improve readability, but not required.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3ef3692 and a4e0183.

📒 Files selected for processing (19)
  • docs/en/command/calloc.md (1 hunks)
  • docs/en/command/cbatch.md (1 hunks)
  • docs/en/command/crun.md (1 hunks)
  • docs/en/reference/job_dependency.md (1 hunks)
  • docs/en/reference/pending_reason.md (1 hunks)
  • docs/zh/command/calloc.md (1 hunks)
  • docs/zh/command/cbatch.md (1 hunks)
  • docs/zh/command/crun.md (1 hunks)
  • docs/zh/reference/job_dependency.md (1 hunks)
  • docs/zh/reference/pending_reason.md (1 hunks)
  • mkdocs.yaml (2 hunks)
  • protos/PublicDefs.proto (4 hunks)
  • src/CraneCtld/CtldPublicDefs.cpp (5 hunks)
  • src/CraneCtld/CtldPublicDefs.h (4 hunks)
  • src/CraneCtld/DbClient.cpp (1 hunks)
  • src/CraneCtld/DbClient.h (1 hunks)
  • src/CraneCtld/RpcService/CtldGrpcServer.cpp (4 hunks)
  • src/CraneCtld/TaskScheduler.cpp (13 hunks)
  • src/CraneCtld/TaskScheduler.h (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
  • docs/en/command/crun.md
  • docs/zh/command/cbatch.md
  • docs/en/command/cbatch.md
  • docs/en/command/calloc.md
  • src/CraneCtld/CtldPublicDefs.cpp
  • src/CraneCtld/DbClient.h
  • mkdocs.yaml
  • docs/en/reference/job_dependency.md
🧰 Additional context used
🧠 Learnings (18)
📓 Common learnings
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 399
File: src/Craned/Craned/SupervisorKeeper.cpp:174-178
Timestamp: 2025-05-26T11:00:54.563Z
Learning: The CraneSched project uses C++23 standard, allowing the use of modern C++ features like std::ranges::to and other C++23 language features and library components.
📚 Learning: 2025-05-09T01:54:21.256Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 496
File: src/CraneCtld/CranedKeeper.cpp:51-53
Timestamp: 2025-05-09T01:54:21.256Z
Learning: The ConfigureCraned function in src/CraneCtld/CranedKeeper.cpp is called from a thread pool, so there's no need to worry about it blocking the gRPC completion queue thread.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
  • src/CraneCtld/TaskScheduler.cpp
  • src/CraneCtld/TaskScheduler.h
📚 Learning: 2025-05-02T07:06:36.103Z
Learnt from: RileyWen
Repo: PKUHPC/CraneSched PR: 475
File: src/CraneCtld/CtldGrpcServer.cpp:113-118
Timestamp: 2025-05-02T07:06:36.103Z
Learning: In CraneSched, gRPC methods should generally return Status::OK even when handling error conditions, as non-OK statuses cause the connection to terminate. Error information should be communicated within the RPC response payload instead.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
📚 Learning: 2025-06-23T07:53:30.513Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 520
File: src/Craned/CranedServer.cpp:50-51
Timestamp: 2025-06-23T07:53:30.513Z
Learning: In the CraneSched codebase, `g_ctld_client` is a guaranteed global variable that is always initialized before any gRPC service methods are called, so null pointer checks are not necessary when calling methods on it.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
📚 Learning: 2025-05-08T07:39:29.207Z
Learnt from: huerni
Repo: PKUHPC/CraneSched PR: 469
File: src/Craned/TaskManager.cpp:1162-1175
Timestamp: 2025-05-08T07:39:29.207Z
Learning: In CraneSched, when a task fails in the LaunchTaskInstanceMt_ method after PMIx registration, ActivateTaskStatusChangeAsync_ does not automatically release PMIx resources or cgroup resources. Explicit cleanup is required by calling g_pmix_server->DeregisterTask() and g_cg_mgr->ReleaseCgroupByTaskIdOnly() before returning from the function.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
  • src/CraneCtld/TaskScheduler.cpp
  • src/CraneCtld/TaskScheduler.h
📚 Learning: 2025-05-26T11:04:30.580Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 399
File: src/Craned/Craned/SupervisorKeeper.h:32-32
Timestamp: 2025-05-26T11:04:30.580Z
Learning: The Supervisor component in the Crane system is designed to manage only one task per instance. The task specification is provided to the Supervisor during startup by Craned (likely through InitSupervisorRequest), so the ExecuteTask() method doesn't need to accept task parameters since the Supervisor already knows which task to execute.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
  • src/CraneCtld/TaskScheduler.cpp
  • src/CraneCtld/TaskScheduler.h
📚 Learning: 2025-12-08T08:11:40.332Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 727
File: src/Craned/Supervisor/TaskManager.cpp:1401-1407
Timestamp: 2025-12-08T08:11:40.332Z
Learning: In src/Craned/Supervisor/TaskManager.cpp, StepInstance::Prepare() is always called before any task's Spawn() method in the execution flow (via EvGrpcExecuteTaskCb_). If Prepare() fails, tasks are immediately finished and Spawn() is never invoked. For Crun tasks, StepInstance::Prepare() guarantees that x11_meta is set (even if x11 is false), so accessing x11_meta.value() in Spawn() when both IsCrun() and x11 are true is safe without additional has_value() checks.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
📚 Learning: 2025-08-12T08:58:39.772Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 577
File: src/Craned/Supervisor/TaskManager.cpp:124-130
Timestamp: 2025-08-12T08:58:39.772Z
Learning: In the CraneSched project using C++23, Class Template Argument Deduction (CTAD) allows std::unique_ptr declarations without explicit template parameters when the type can be deduced from the initializer, such as `std::unique_ptr task = std::move(m_task_map_.at(task_id))` where the template parameter is deduced from the move operation.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
  • src/CraneCtld/CtldPublicDefs.h
  • src/CraneCtld/TaskScheduler.cpp
  • src/CraneCtld/TaskScheduler.h
📚 Learning: 2025-06-30T08:43:44.470Z
Learnt from: huerni
Repo: PKUHPC/CraneSched PR: 537
File: src/Craned/TaskManager.cpp:145-147
Timestamp: 2025-06-30T08:43:44.470Z
Learning: In the CraneSched codebase, src/Craned/Craned.cpp guarantees that g_config.CranedRes.contains(g_config.CranedIdOfThisNode) through explicit validation during startup. The code checks if g_config.CranedRes.contains(g_config.Hostname) and exits the process if not found, then sets g_config.CranedIdOfThisNode = g_config.Hostname. TaskManager constructor is called after this validation, so g_config.CranedRes[g_config.CranedIdOfThisNode] is guaranteed to be valid.

Applied to files:

  • src/CraneCtld/RpcService/CtldGrpcServer.cpp
  • src/CraneCtld/CtldPublicDefs.h
  • src/CraneCtld/TaskScheduler.cpp
📚 Learning: 2025-05-23T02:32:43.952Z
Learnt from: 1daidai1
Repo: PKUHPC/CraneSched PR: 458
File: src/CraneCtld/CtldPublicDefs.h:0-0
Timestamp: 2025-05-23T02:32:43.952Z
Learning: In the CraneSched project, allocated_res_view is handled/updated separately before calling SetAllocatedRes, so it does not need to be updated again within the method itself.

Applied to files:

  • src/CraneCtld/CtldPublicDefs.h
📚 Learning: 2025-05-23T02:32:43.952Z
Learnt from: 1daidai1
Repo: PKUHPC/CraneSched PR: 458
File: src/CraneCtld/CtldPublicDefs.h:0-0
Timestamp: 2025-05-23T02:32:43.952Z
Learning: In the CraneSched project, allocated_res_view is updated before calling SetAllocatedRes, so it doesn't need to be updated again within the method itself.

Applied to files:

  • src/CraneCtld/CtldPublicDefs.h
📚 Learning: 2025-05-09T02:15:30.422Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 496
File: src/CraneCtld/TaskScheduler.cpp:1713-1721
Timestamp: 2025-05-09T02:15:30.422Z
Learning: TaskScheduler中的资源回收设计:当Craned节点离线时,通过TaskStatusChangeAsync异步处理任务状态变更和资源回收。尽管可能存在短暂的资源更新延迟,但这是设计上可接受的权衡。

Applied to files:

  • src/CraneCtld/TaskScheduler.cpp
  • src/CraneCtld/TaskScheduler.h
📚 Learning: 2025-05-08T07:39:29.207Z
Learnt from: huerni
Repo: PKUHPC/CraneSched PR: 469
File: src/Craned/TaskManager.cpp:1162-1175
Timestamp: 2025-05-08T07:39:29.207Z
Learning: In CraneSched, when a task fails in the LaunchTaskInstanceMt_ method, ActivateTaskStatusChangeAsync_ does not automatically release PMIx resources or cgroup. Explicit cleanup is required by calling g_pmix_server->DeregisterTask() and g_cg_mgr->ReleaseCgroupByTaskIdOnly().

Applied to files:

  • src/CraneCtld/TaskScheduler.cpp
  • src/CraneCtld/TaskScheduler.h
📚 Learning: 2025-05-26T11:00:54.563Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 399
File: src/Craned/Craned/SupervisorKeeper.cpp:174-178
Timestamp: 2025-05-26T11:00:54.563Z
Learning: The CraneSched project uses C++23 standard, allowing the use of modern C++ features like std::ranges::to and other C++23 language features and library components.

Applied to files:

  • src/CraneCtld/TaskScheduler.cpp
  • src/CraneCtld/TaskScheduler.h
📚 Learning: 2025-04-02T09:30:13.014Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 461
File: src/Craned/JobManager.cpp:141-149
Timestamp: 2025-04-02T09:30:13.014Z
Learning: In JobManager, if a uid exists in m_uid_to_job_ids_map_, its corresponding task_ids set is guaranteed to be non-empty due to the invariant maintained in AllocJobs and FreeJobs methods.

Applied to files:

  • src/CraneCtld/TaskScheduler.cpp
  • src/CraneCtld/DbClient.cpp
📚 Learning: 2025-05-08T09:35:39.809Z
Learnt from: huerni
Repo: PKUHPC/CraneSched PR: 469
File: src/Craned/Pmix/PmixCollRing.cpp:0-0
Timestamp: 2025-05-08T09:35:39.809Z
Learning: In the PMIx implementation for CraneSched, objects referenced in asynchronous gRPC callbacks (like `coll_ring_ctx`) remain valid as long as the parent object (`this`) is not destroyed. The `Coll` class uses shared_ptr management to ensure its lifetime extends through all pending callbacks.

Applied to files:

  • src/CraneCtld/TaskScheduler.cpp
  • src/CraneCtld/TaskScheduler.h
📚 Learning: 2025-04-02T10:11:33.562Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 461
File: src/Craned/CgroupManager.cpp:685-685
Timestamp: 2025-04-02T10:11:33.562Z
Learning: In the CgroupManager's GetJobBpfMapCgroupsV2 method, the developer has confirmed that cg_ino_job_id_map will always contain the key->cgroup_id element, making the CRANE_ASSERT check appropriate rather than requiring additional error handling.

Applied to files:

  • src/CraneCtld/DbClient.cpp
📚 Learning: 2025-05-09T01:56:53.142Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 496
File: src/CraneCtld/CtldPublicDefs.h:756-763
Timestamp: 2025-05-09T01:56:53.142Z
Learning: In the CraneSched codebase, the `execution_node` field in JobToD is intentionally set to the first element of `executing_craned_ids` vector without guards, as it represents the main execution node for a job. This is by design and assumes `executing_craned_ids` is never empty when `GetJobOfNode` is called.

Applied to files:

  • src/CraneCtld/DbClient.cpp
🧬 Code graph analysis (4)
src/CraneCtld/RpcService/CtldGrpcServer.cpp (1)
src/Craned/Core/SupervisorKeeper.cpp (1)
  • pair (123-183)
src/CraneCtld/CtldPublicDefs.h (1)
src/CraneCtld/CtldPublicDefs.cpp (10)
  • update (122-137)
  • update (122-122)
  • SetDependency (1172-1183)
  • SetDependency (1172-1172)
  • UpdateDependency (1185-1187)
  • UpdateDependency (1185-1185)
  • AddDependent (1189-1198)
  • AddDependent (1189-1190)
  • TriggerDependencyEvents (1200-1205)
  • TriggerDependencyEvents (1200-1201)
src/CraneCtld/TaskScheduler.cpp (1)
src/CraneCtld/TaskScheduler.h (1)
  • AddDependencyEvent (911-917)
src/CraneCtld/TaskScheduler.h (1)
src/CraneCtld/TaskScheduler.cpp (4)
  • SubmitTaskAsync (1523-1532)
  • SubmitTaskAsync (1523-1524)
  • SubmitTaskToScheduler (1679-1746)
  • SubmitTaskToScheduler (1680-1680)
🪛 LanguageTool
docs/zh/reference/job_dependency.md

[uncategorized] ~69-~69: 数词与名词之间一般应存在量词,可能缺少量词。
Context: ...100 开始 并且 作业 101 成功完成。 #### OR 逻辑(任一条件满足) 使用 ? 分隔不同的依赖条件: ```bash afterok:...

(wa5)

🪛 markdownlint-cli2 (0.18.1)
docs/zh/reference/pending_reason.md

16-16: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


31-31: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

docs/zh/reference/job_dependency.md

27-27: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


164-164: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

docs/en/reference/pending_reason.md

16-16: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


31-31: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (14)
protos/PublicDefs.proto (2)

135-163: Dependency proto additions look structurally sound and wire‑compatible

The new dependency modeling (DependencyType/Condition/Dependencies/DependencyStatus), plus wiring into TaskToCtld.dependencies = 15 and TaskInfo.dependency_status = 29, is consistent and backwards compatible:

  • Field numbers 15 and 29 were previously unused/reserved, and the reserved comment is adjusted accordingly.
  • Dependencies.is_or matches the documented , (AND) vs ? (OR) semantics in the new job dependency docs.
  • DependencyStatus’s ready_time_type oneof with ready_time / infinite_future / infinite_past is a reasonable encoding; the bool arms are effectively enum sentinels inside the oneof.

No changes needed here from a proto/schema perspective.

Also applies to: 186-186, 448-485


724-730: New ErrCode values are appended safely and match usage

Adding ERR_MISSING_DEPENDENCY = 85; and ERR_DB_INSERT_FAILED = 86; at the end of the ErrCode enum preserves existing numeric meanings and gives clear, specific codes for dependency resolution and DB failures.

Assuming these are wired through to callers (as seen in scheduler/DB changes), this is a good refinement of error reporting.

src/CraneCtld/RpcService/CtldGrpcServer.cpp (2)

319-333: Correctly unwraps nested CraneExpected result from SubmitTaskToScheduler in CforedStream

The new logic:

  • Checks the outer CraneExpected<std::future<CraneExpected<task_id_t>>>,
  • Calls .get() only when has_value(),
  • Then inspects the inner CraneExpected<task_id_t> and converts it to std::expected<std::pair<job_id_t, step_id_t>, std::string> using CraneErrStr,

is consistent with the updated TaskScheduler API and preserves the previous success/error behavior for interactive tasks.


471-484: SubmitBatchTask correctly propagates nested scheduler errors

For single batch submission:

  • Outer failure (result.has_value() == false) sets ok=false and code=result.error().
  • Inner failure (task_result.has_value() == false) sets ok=false and code=task_result.error().
  • Only on full success is ok=true and task_id set.

This matches the new CraneExpected<std::future<CraneExpected<task_id_t>>> signature and cleanly separates early-submit vs execution‑time errors.

src/CraneCtld/CtldPublicDefs.h (1)

1018-1023: TaskInCtld dependency API is minimal and appropriate

SetDependency, UpdateDependency, Dependencies() const, AddDependent, and TriggerDependencyEvents give a clean surface for the scheduler and recovery paths. The implementation in CtldPublicDefs.cpp matches the intended semantics (AND/OR, eager AFTER handling, and event enqueuing).

No changes needed from my side.

src/CraneCtld/TaskScheduler.cpp (5)

894-910: Dependency event draining in ScheduleThread_ is correct and cheap

The scheduler now drains m_dependency_event_queue_ once per loop while holding m_pending_task_map_mtx_ and invokes UpdateDependency on matching pending jobs. Using size_approx() + try_dequeue_bulk avoids contention, and limiting updates to pending tasks is logically correct (dependents must be pending to care).

No concurrency or correctness issues spotted.


923-930: Pending‑reason for dependency gating is well‑defined

Using Dependencies().is_met(now) to decide schedulability, and setting:

  • "DependencyNeverSatisfied" when is_failed() is true, else
  • "Dependency"

gives users clear feedback in pending_reason. This lines up cleanly with the InfiniteFuture notion of “unsatisfiable dependency”.

Looks good.


1336-1337: Triggering AFTER events at job start is correct

Emitting DependencyType::AFTER with job->StartTime() when a job enters Configuring/running ensures dependent jobs of type AFTER see the earliest start timestamp. This matches the intended semantics for “run after job starts”.

No further changes needed.


1523-1532: SubmitTaskAsync’s new CraneExpected wrapping is consistent

Switching SubmitTaskAsync to return std::future<CraneExpected<task_id_t>> and propagating promises of the same type aligns with SubmitTaskToScheduler and the broader CraneExpected adoption. The lifetime and queueing behavior remain the same.

All good here.


2823-2859: Submit queue element type change and CraneExpected propagation are fine

SubmitQueueElem is now std::pair<std::unique_ptr<TaskInCtld>, std::promise<CraneExpected<task_id_t>>>, and accepted tasks have their promises fulfilled with either:

  • std::unexpected(ERR_DB_INSERT_FAILED) when appending to embedded DB fails, or
  • The allocated task_id later in the loop.

The reverse‑order walking and SetStatus(Pending) logic are unchanged, and this fits with the new submit API.

Looks correct.

src/CraneCtld/TaskScheduler.h (4)

764-768: SubmitTaskAsync documentation matches CraneExpected semantics

The comment clarifies that the returned future either contains an error code or a newly allocated task ID. This is exactly what the new std::future<CraneExpected<task_id_t>> type does; no mismatch between declaration, comment, and implementation.


910-917: AddDependencyEvent inline helper aligns with failure semantics

The inline AddDependencyEvent that enqueues a DependencyEvent{dependent_job_id, dependee_job_id, timestamp} is straightforward and used consistently. The comment explicitly states that failed dependencies should use absl::InfiniteFuture(), matching how DependenciesInJob::is_failed is implemented.

No issues.


1070-1073: Submit queue type updated correctly to CraneExpected

m_submit_task_queue_ is now ConcurrentQueue<std::pair<std::unique_ptr<TaskInCtld>, std::promise<CraneExpected<task_id_t>>>>, matching both the SubmitTaskAsync signature and CleanSubmitQueueCb_ implementation.

Thread safety and ownership semantics remain unchanged.


1125-1132: DependencyEvent struct and queue integrate cleanly with scheduler loop

Defining DependencyEvent { dependent_job_id, dependee_job_id, event_time } and backing it with ConcurrentQueue<DependencyEvent> m_dependency_event_queue_ provides a simple, lock‑free conduit between dependency producers (job lifecycle, DB recovery) and the scheduler thread.

Given:

  • Producers only ever enqueue.
  • Consumer drains in ScheduleThread_ under m_pending_task_map_mtx_.

this design avoids contention and keeps ordering “good enough” for dependency semantics (exact total order is not required).

Looks solid.

Comment on lines +15 to +37
Example output:
```
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
101 CPU job1 user1 PD 0:00 2 (Priority)
102 CPU job2 user1 PD 0:00 4 (Resource)
103 GPU job3 user2 PD 0:00 1 (Dependency)
104 CPU job4 user1 PD 0:00 2 (Held)
```

### Using ccontrol show job

```bash
ccontrol show job 101
```

Example output:
```
JobId=101
...
State=PENDING
Reason=Priority
```

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add fenced-code languages for example outputs (markdownlint MD040)

The two example output blocks currently use bare code fences, which triggers MD040 and is inconsistent with the rest of the page.

You can fix this by tagging them as plain text, e.g.:

-``` 
+```text
 JOBID    PARTITION  NAME     USER   ST   TIME     NODES  NODELIST(REASON)
 ...

@@
- +text
JobId=101
...
State=PENDING
Reason=Priority

🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

16-16: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


31-31: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
In docs/en/reference/pending_reason.md around lines 15 to 37, the example output
code fences are missing language tags which triggers markdownlint MD040; update
both fenced code blocks to use a language tag (use "text") so they become
```text ... ``` for the JOBID table block and ```text ... ``` for the ccontrol
show job block, leaving the contents unchanged.

Comment on lines +27 to +31
```
<type>:<job_id>[+<delay>][:<job_id>[+<delay>]]...
```

### 依赖类型
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Tag generic fenced blocks with a language to satisfy markdownlint

markdownlint (MD040) flags the syntax and status example blocks that use bare code fences. The rest of the doc already uses bash where appropriate.

Recommend tagging these as plain text:

-```
+```text
<type>:<job_id>[+<delay>][:<job_id>[+<delay>]]...

and:

-```
+```text
JobId=105
...
Dependency=PendingDependencies=afterok:100+01:00:00 Status=WaitForAll

This keeps semantics unchanged but silences MD040 and improves consistency.

Also applies to: 164-168

🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

27-27: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
In docs/zh/reference/job_dependency.md around lines 27 to 31 (and similarly
update the fenced blocks at lines 164 to 168), the code fences are untagged
which trips markdownlint MD040; update those triple-backtick fenced blocks to
include a language tag (use text) so they read ```text ... ```; apply the same
change to both the syntax example and the status example at the other location
to keep consistency and silence the linter.

Comment on lines +15 to +37
输出示例:
```
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
101 CPU job1 user1 PD 0:00 2 (Priority)
102 CPU job2 user1 PD 0:00 4 (Resource)
103 GPU job3 user2 PD 0:00 1 (Dependency)
104 CPU job4 user1 PD 0:00 2 (Held)
```

### 使用 ccontrol show job 查看

```bash
ccontrol show job 101
```

输出示例:
```
JobId=101
...
State=PENDING
Reason=Priority
```

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Specify language for example output code fences (markdownlint MD040)

The cqueue and ccontrol output examples use bare fenced blocks, which markdownlint flags.

You can tag them as plain text:

-```
+```text
JOBID    PARTITION  NAME     USER   ST   TIME     NODES  NODELIST(REASON)
...

and:

-```
+```text
JobId=101
...
State=PENDING
Reason=Priority

This matches the English doc and keeps the content unchanged.

🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

16-16: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


31-31: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
In docs/zh/reference/pending_reason.md around lines 15 to 37, the example output
code fences are untagged and trigger markdownlint MD040; update both fenced
blocks (the cqueue and ccontrol examples) to use the plain-text language tag by
replacing the opening ``` with ```text so the examples are explicitly marked as
text and match the English doc.

Comment on lines +2733 to +2749
absl::Time end_time = absl::Now();

if (!pending_task_ptr_vec.empty()) {
for (auto& task : pending_task_ptr_vec) {
task->SetStatus(crane::grpc::Cancelled);
task->SetEndTime(absl::Now());
task->SetEndTime(end_time);
g_account_meta_container->FreeQosResource(*task);

task->TriggerDependencyEvents(crane::grpc::DependencyType::AFTER,
end_time);
task->TriggerDependencyEvents(crane::grpc::DependencyType::AFTER_ANY,
end_time);
task->TriggerDependencyEvents(crane::grpc::AFTER_OK,
absl::InfiniteFuture());
task->TriggerDependencyEvents(crane::grpc::AFTER_NOT_OK,
absl::InfiniteFuture());

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Cancel‑pending behavior for dependency events may surprise AFTER_NOT_OK users

For cancelled pending tasks you now:

  • Set status Cancelled and end_time = now.
  • Emit:
    • AFTER and AFTER_ANY at end_time.
    • AFTER_OK and AFTER_NOT_OK at absl::InfiniteFuture().

Given how DependenciesInJob interprets InfiniteFuture, any dependent job with AFTER_OK or AFTER_NOT_OK on a cancelled‑pending job will end up with DependencyNeverSatisfied and never run automatically.

If the intended semantics are “AFTER_NOT_OK includes cancelled jobs” (which is how many schedulers model it), then for cancelled jobs you likely want:

  • AFTER_OK at InfiniteFuture()
  • AFTER_NOT_OK at end_time and a non‑zero exit code.

If, instead, you explicitly want cancelled‑pending jobs to neither satisfy AFTER_OK nor AFTER_NOT_OK (i.e., dependents must be resubmitted), it would be good to:

  • Document this in the dependency docs and CLI help, and
  • Perhaps log a warning when a job with such a dependency transitions to "DependencyNeverSatisfied" so users can see why.

I recommend at least clarifying/confirming the intended semantics; behavior is consistent with your current is_failed logic but might be surprising.

🤖 Prompt for AI Agents
In src/CraneCtld/TaskScheduler.cpp around lines 2733-2749, the current
cancel-path emits AFTER and AFTER_ANY at end_time but emits AFTER_OK and
AFTER_NOT_OK at absl::InfiniteFuture(), which causes dependents waiting on
AFTER_NOT_OK to be treated as "never satisfied". Change the cancel behavior so
that for cancelled pending tasks you: keep AFTER and AFTER_ANY at end_time, keep
AFTER_OK at absl::InfiniteFuture(), but emit AFTER_NOT_OK at end_time (not
InfiniteFuture()) and set a non-zero exit code on the task before triggering
AFTER_NOT_OK; ensure g_account_meta_container->FreeQosResource is still called,
and add a concise log message indicating the task was cancelled and will satisfy
AFTER_NOT_OK so dependency resolution is correct.

@github-actions github-actions bot removed the test-passed Build and test success label Dec 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

docs enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

增加作业dependency的功能

2 participants