
[gpu]: ISTFT basic impl. #29030


Open · wants to merge 35 commits into master

Conversation


@pkowalc1 pkowalc1 commented Feb 17, 2025

Basic implementation of ISTFT for GPU.

Details:

  • supports everything that the CPU reference implementation supports.

Tickets:

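For context, ISTFT reconstructs the time-domain signal by inverse-transforming each frame and overlap-adding the results, normalizing by the accumulated window overlap. A minimal self-contained sketch of that idea (naive DFT, rectangular window, hop of half a frame; this is illustrative only, not the plugin's actual kernel code):

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <vector>

using cd = std::complex<double>;
const double PI = std::acos(-1.0);

// Naive forward DFT of one real-valued frame.
std::vector<cd> dft(const std::vector<double>& x) {
    const size_t n = x.size();
    std::vector<cd> X(n);
    for (size_t k = 0; k < n; ++k)
        for (size_t t = 0; t < n; ++t)
            X[k] += x[t] * std::polar(1.0, -2.0 * PI * k * t / n);
    return X;
}

// Naive inverse DFT, keeping only the real part.
std::vector<double> idft(const std::vector<cd>& X) {
    const size_t n = X.size();
    std::vector<double> x(n, 0.0);
    for (size_t t = 0; t < n; ++t) {
        cd acc{0.0, 0.0};
        for (size_t k = 0; k < n; ++k)
            acc += X[k] * std::polar(1.0, 2.0 * PI * k * t / n);
        x[t] = acc.real() / n;
    }
    return x;
}

// ISTFT core: inverse-transform each frame, overlap-add into the output,
// then normalize by how many frames covered each sample (the window-square
// sum, which for a rectangular window is just the overlap count).
std::vector<double> istft(const std::vector<std::vector<cd>>& frames,
                          size_t frame_size, size_t hop, size_t out_len) {
    std::vector<double> out(out_len, 0.0), norm(out_len, 0.0);
    for (size_t f = 0; f < frames.size(); ++f) {
        const std::vector<double> frame = idft(frames[f]);
        for (size_t t = 0; t < frame_size; ++t) {
            out[f * hop + t] += frame[t];
            norm[f * hop + t] += 1.0;
        }
    }
    for (size_t i = 0; i < out_len; ++i)
        if (norm[i] > 0.0) out[i] /= norm[i];
    return out;
}
```

Round-tripping a short signal (framing it, applying `dft` per frame, then `istft`) reproduces the original samples up to floating-point error.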
@pkowalc1 pkowalc1 added the WIP work in progress label Feb 17, 2025
@pkowalc1 pkowalc1 requested review from a team as code owners February 17, 2025 12:16
@github-actions github-actions bot added the category: Core (OpenVINO Core, aka ngraph), category: GPU (OpenVINO GPU plugin), and category: CPP API (OpenVINO CPP API bindings) labels Feb 17, 2025
@pkowalc1 pkowalc1 requested review from a team as code owners February 17, 2025 15:56
@github-actions github-actions bot added the category: IE Tests (OpenVINO Test: plugins and common) label Feb 17, 2025
@@ -54,6 +54,7 @@ class OPENVINO_API ISTFT : public Op {
void set_center(const bool center);

bool get_normalized() const;
void set_normalized(const bool normalized);
Contributor:

Can be removed.
It is not required right now; shape inference does not depend on this attribute.

Contributor Author:

This method is needed to properly construct the op on the GPU side (constructor parameters are not supported in the GPU plugin).

Otherwise you would construct a partially undefined op, which would work at this point, but only because you and I know that shape inference currently does not depend on that state.
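To illustrate the point about plugin-side construction, here is a hypothetical sketch of the pattern being discussed (`IstftOp` is an illustrative stand-in, not the real OpenVINO op class): when constructor arguments are unavailable, the op is default-constructed and each attribute must be filled in afterwards, so every getter needs a matching setter.

```cpp
#include <cassert>

// Hypothetical stand-in for an op whose attributes cannot be passed
// through the constructor on the plugin side.
class IstftOp {
public:
    IstftOp() = default;  // plugin-side construction: no ctor args available

    bool get_center() const { return m_center; }
    void set_center(bool center) { m_center = center; }

    bool get_normalized() const { return m_normalized; }
    void set_normalized(bool normalized) { m_normalized = normalized; }

private:
    bool m_center = false;
    bool m_normalized = false;
};
```

Without `set_normalized`, the default-constructed op would silently keep `m_normalized == false`, i.e. a partially undefined op.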

@pkowalc1 pkowalc1 changed the title WIP: [gpu]: ISTFT ref impl. [gpu]: ISTFT basic impl. Apr 3, 2025
@pkowalc1 pkowalc1 removed the WIP work in progress label Apr 3, 2025
@mlukasze mlukasze requested a review from wilson-seok April 9, 2025 05:07
@praasz praasz (Contributor) left a comment

OK for the core part.

@mlukasze (Contributor)

@wilson-seok please review, or assign someone to review

@@ -0,0 +1,107 @@
// Copyright (C) 2018-2025 Intel Corporation
Contributor:

We have a new (and better) infrastructure for adding kernels: ocl_v2. Could you implement the kernel on that instead of ocl?
Here's an example of migration: #29901

Contributor Author:

Well, I can do it in a separate PR if really needed; this one has already passed all the tests (which took a week due to unrelated CI failures...).

Contributor:

I'm voting for this approach

@mlukasze mlukasze requested review from isanghao and mlukasze April 18, 2025 04:49
stream& stream = instance.get_network().get_stream();
// This is needed to clear the output memory before executing the kernel for static shapes model.
// Ref kernel assumes that output memory is already cleared.
instance.output_memory(0).fill(stream, false);
Contributor:

For proper synchronization this call should either be blocking, or the resulting event should be added to the kernel's dependencies.

Contributor Author:

So the kernel may not run on the same stream?

Contributor:

No, they are guaranteed to execute on the same stream, but depending on the queue type there may be a problem:

  1. with an in_order queue, all submitted tasks (the memory fill and the kernel enqueue) execute in the order they are enqueued;
  2. with an out_of_order queue, however, the driver considers dependencies between tasks when scheduling them. Without proper dependencies (barriers or events), the driver does not guarantee that the memory is reset before the kernel launches; in theory the kernel could start in parallel with, or even before, the memory reset.

Additionally, since memory can be reused between primitives, we might end up resetting the memory of some predecessor primitive while it is still executing. Therefore, for an out_of_order queue an additional barrier is required before this fill() call as well.
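To make the wiring concrete, here is a toy model of the recommended pattern (`Stream` and `Event` are hypothetical stand-ins, not the cldnn API, and this model executes tasks immediately rather than simulating out-of-order reordering): the event returned by the fill submission is passed as an explicit dependency of the kernel submission, so an out-of-order queue would be forced to order them correctly.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical event handle returned by each submission.
struct Event { int seq; };

// Toy stream that records the order in which tasks were submitted.
// A real out-of-order queue only honors ordering expressed via deps.
struct Stream {
    std::vector<int> executed;
    int next = 0;
    Event submit(const std::function<void()>& task,
                 const std::vector<Event>& deps = {}) {
        (void)deps;  // the point here is the wiring, not the scheduling
        task();
        executed.push_back(next);
        return Event{next++};
    }
};

// Pattern suggested in the review: the kernel depends on the fill's event.
void run_with_dependency(Stream& s, std::vector<double>& out) {
    Event fill_ev = s.submit([&] { out.assign(out.size(), 0.0); });
    s.submit([&] { out[0] += 1.0; }, {fill_ev});  // kernel waits for fill
}
```

With the dependency declared, the kernel always observes cleared memory regardless of queue type; without it, only an in_order queue happens to give the same result.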

Comment on lines +16 to +22
if (inputs.size() == 4) {
auto prim = cldnn::ISTFT(layer_type_name_ID(op), inputs[0], inputs[1], inputs[2], inputs[3], op->get_center(), op->get_normalized());
p.add_primitive(*op, prim);
} else {
auto prim = cldnn::ISTFT(layer_type_name_ID(op), inputs[0], inputs[1], inputs[2], inputs[3], inputs[4], op->get_center(), op->get_normalized());
p.add_primitive(*op, prim);
}
Contributor:

I think it's not actually necessary to enumerate all the inputs; we can just slightly update ISTFT like this:

    ISTFT(const primitive_id& id,
          const std::vector<input_info>& inputs,
          const bool center,
          const bool normalized)
        : primitive_base(id, inputs),
          center(center),
          normalized(normalized) {}

And simplify primitive creation:

Suggested change:

auto prim = cldnn::ISTFT(layer_type_name_ID(op), inputs, op->get_center(), op->get_normalized());
p.add_primitive(*op, prim);

@@ -0,0 +1,27 @@
// Copyright (C) 2018-2025 Intel Corporation
Contributor:

nit:

Suggested change
// Copyright (C) 2018-2025 Intel Corporation
// Copyright (C) 2025 Intel Corporation

@sshlyapn sshlyapn added this to the 2025.2 milestone Apr 18, 2025
6 participants