Skip to content

OpenCL object lifetimes #471

Open
Open
@manuelmohr

Description

Version

85dfbf7

What is your question or problem?

While porting an existing OpenCL application that previously ran on the standard Intel OpenCL platform targeting integrated GPUs to now run on an OpenCL platform built using the Construction Kit, I observed an exception.

After reducing the test case, the problem boiled down to this (full test case below):

    cl::Kernel kernel;
    {
        cl::Program program(context, sources);
        program.build({device});
        kernel = cl::Kernel(program, "foo");
    }

    cl::Program program = kernel.getInfo<CL_KERNEL_PROGRAM>();

So this program constructs a kernel object, lets the Program object go out of scope (which will call clReleaseProgram()), and then tries to recover the Program object using a getInfo() call on the kernel object.

This works with the Intel OpenCL platform but fails with the Construction Kit because the clGetKernelInfo call fails. With the Construction Kit, after the Program object goes out of scope, the backing cl_program object will have an internal reference count of 1 (because it's still referenced by the cl_kernel) and an external reference count of 0 (because at that point, no application-visible references to the cl_program exist). Trying to build a Program object again will try to call clRetainProgram(), which will fail because of a special handling for exactly this case.

Obviously, what the program is trying to do here seemed a little bit fishy so I tried to understand what the expected behavior is, which proved to be surprisingly difficult. I found KhronosGroup/OpenCL-Docs#620, which seems to discuss exactly this topic, and if I understood the situation correctly, both behaviors (Intel and Construction Kit) are covered by the current OpenCL spec. The Construction Kit seems to take an approach that isn't explicitly considered in that discussion: it's almost like the "variant (2)" mentioned in the discussion (so it tracks internal references using a separate internal reference counter, but waits for both the application-visible reference counter to hit zero and the internal reference counter to hit zero before going through the process of deleting the object), but then it explicitly disallows "resurrecting" objects from attached objects (i.e., going from internal=1 external=0 to internal=1 external=1 is forbidden).

(Out of interest, I also ran the referenced OpenCL object lifetime validation layer against this program, but it didn't point out a problem, but to be fair, I don't fully understand if it covers this situation. Probably it doesn't.)

I think the test program is incorrect according to the OpenCL specification because for the cl_program object, the number of release calls is equal to the number of retain calls, thus it must be regarded as inaccessible. The fact that this object is still (internally) referenced by the cl_kernel object and thus (technically) "recoverable" isn't important. Do you agree with this?

Was there a specific reason for picking this approach to object lifetimes in the Construction Kit? What are the downsides of allowing "resurrecting" objects?

Full test application:

#define CL_HPP_ENABLE_EXCEPTIONS
#define CL_HPP_TARGET_OPENCL_VERSION 300
#include <CL/opencl.hpp>
#include <iostream>
#include <vector>

int main() try {
    std::vector<cl::Platform> all_platforms;
    cl::Platform::get(&all_platforms);

    if (all_platforms.empty()) {
        std::cerr << "No platforms found.\n";
        return 1;
    }
    cl::Platform platform = all_platforms[0];

    std::vector<cl::Device> all_devices;
    platform.getDevices(CL_DEVICE_TYPE_ALL, &all_devices);
    if (all_devices.empty()) {
        std::cerr << "No devices found.\n";
        return 1;
    }

    cl::Device device = all_devices[0];
    std::cout<< "Using device: " << device.getInfo<CL_DEVICE_NAME>() << "\n";

    cl::Context context({device});

    cl::Program::Sources sources;
    std::string kernel_code = "void kernel foo() {}";
    sources.push_back({kernel_code.c_str(), kernel_code.length()});

    cl::Kernel kernel;
    {
        cl::Program program(context, sources);
        if (program.build({device}) != CL_SUCCESS) {
            std::cerr << "Error building: " << program.getBuildInfo<CL_PROGRAM_BUILD_LOG>(device) << std::endl;
            return 1;
        }

        kernel = cl::Kernel(program, "foo");
    }

    cl::Program program = kernel.getInfo<CL_KERNEL_PROGRAM>();

    return 0;
}
catch (const cl::Error& err) {
    std::cerr << "CL error: " << err.what() << std::endl;
    return 1;
}

Metadata

Assignees

No one assigned

    Labels

    questionFurther information is requested

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions