CREATE MV is stuck in acquire_creating_streaming_job_permit when no other creation jobs are running in the cluster #24605

@hzxa21

Describe the bug

Twice this week, while triggering MV creation in shuriken, I saw a foreground CREATE MV DDL get stuck in the frontend while SHOW JOBS returned an empty result. The meta await tree shows that the DDL is stuck in acquire_creating_streaming_job_permit even though no other job is being created:

--- Meta Traces ---
>> DDL Command 2
CreateStreamingJob(holder_is_insider) [765.906s]
  create_streaming_job(MaterializedView:...) [!!! 765.906s]
    acquire_creating_streaming_job_permit [!!! 765.876s]


>> Global Barrier Worker
Global Barrier Worker [56122.560s]
  next_barrier [369.998ms]
    next_scheduled_barrier [369.998ms]
  control_stream_next_event [369.998ms]

One thing worth mentioning is that we use CANCEL JOB .... + RECOVER to cancel the previous job creation. I suspect there is a corner case somewhere that prevents the permit from being released.
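
To illustrate the kind of corner case suspected here, below is a minimal Python sketch of how a permit can leak when a job is cancelled between acquire and release. It is not RisingWave's implementation: CreatingJobPermits, leaky_job, safe_job, and cancel_then_count are hypothetical names, and the pool is a plain asyncio.Semaphore with a limit of 1. The only point is that a release placed on the success path alone is skipped on cancellation, after which every later acquire waits forever even though no job is running.

```python
import asyncio


class CreatingJobPermits:
    """Hypothetical stand-in for a creating-job permit pool."""

    def __init__(self, limit):
        self._sem = asyncio.Semaphore(limit)
        self.available = limit  # mirrored counter so the leak is observable

    async def acquire(self):
        await self._sem.acquire()
        self.available -= 1

    def release(self):
        self.available += 1
        self._sem.release()


async def leaky_job(permits, started):
    # Bug pattern: the permit is released only on the success path.
    await permits.acquire()
    started.set()
    await asyncio.sleep(3600)  # simulated long-running job creation
    permits.release()          # never reached if the task is cancelled


async def safe_job(permits, started):
    # Fix pattern: release in `finally`, so cancellation also returns the permit.
    await permits.acquire()
    try:
        started.set()
        await asyncio.sleep(3600)
    finally:
        permits.release()


async def cancel_then_count(job):
    """Start a job, cancel it mid-creation, and report the permits left."""
    permits = CreatingJobPermits(limit=1)
    started = asyncio.Event()
    task = asyncio.create_task(job(permits, started))
    await started.wait()
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return permits.available


leaked = asyncio.run(cancel_then_count(leaky_job))
kept = asyncio.run(cancel_then_count(safe_job))
print(leaked, kept)  # prints "0 1"
```

In the leaky variant the pool ends with zero permits and no running job, which matches the symptom above: an empty SHOW JOBS result and a new DDL blocked on acquiring a permit. Whether the CANCEL JOB + RECOVER path actually hits such a gap is still to be confirmed.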

Error message/log


To Reproduce

No response

Expected behavior

No response

How did you deploy RisingWave?

No response

The version of RisingWave

No response

Additional context

No response

Labels: type/bug
