Skip to content

[SYCL] Add SYCL property set registry class #3

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 9 additions & 5 deletions clang/tools/clang-sycl-linker/ClangSYCLLinker.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@
#include "llvm/BinaryFormat/Magic.h"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes to ClangSYCLLinker.cpp can be done in a separate PR. WDYT?

Thanks

#include "llvm/Bitcode/BitcodeWriter.h"
#include "llvm/CodeGen/CommandFlags.h"
#include "llvm/Frontend/Offloading/Utility.h"
#include "llvm/IR/DiagnosticPrinter.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IRReader/IRReader.h"
Expand Down Expand Up @@ -54,6 +55,7 @@
using namespace llvm;
using namespace llvm::opt;
using namespace llvm::object;
using namespace llvm::offloading::sycl;

/// Save intermediary results.
static bool SaveTemps = false;
Expand Down Expand Up @@ -355,18 +357,18 @@ Error runSYCLLink(ArrayRef<std::string> Files, const ArgList &Args) {
// result in multiple bitcode codes.
// The following lines are placeholders to represent multiple files and will
// be refactored once SYCL post link support is available.
SmallVector<std::string> SplitModules;
SplitModules.emplace_back(*LinkedFile);
SmallVector<std::pair<std::string, PropertySetRegistry>> SplitModules;
SplitModules.emplace_back(*LinkedFile, PropertySetRegistry{});

// SPIR-V code generation step.
for (size_t I = 0, E = SplitModules.size(); I != E; ++I) {
auto Stem = OutputFile.rsplit('.').first;
std::string SPVFile(Stem);
SPVFile.append("_" + utostr(I) + ".spv");
auto Err = runSPIRVCodeGen(SplitModules[I], Args, SPVFile, C);
auto Err = runSPIRVCodeGen(SplitModules[I].first, Args, SPVFile, C);
if (Err)
return Err;
SplitModules[I] = SPVFile;
SplitModules[I].first = SPVFile;
}

// Write the final output into file.
Expand All @@ -376,7 +378,7 @@ Error runSYCLLink(ArrayRef<std::string> Files, const ArgList &Args) {
llvm::raw_fd_ostream FS(FD, /*shouldClose=*/true);

for (size_t I = 0, E = SplitModules.size(); I != E; ++I) {
auto File = SplitModules[I];
const auto &[File, Properties] = SplitModules[I];
llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> FileOrErr =
llvm::MemoryBuffer::getFileOrSTDIN(File);
if (std::error_code EC = FileOrErr.getError()) {
Expand All @@ -385,13 +387,15 @@ Error runSYCLLink(ArrayRef<std::string> Files, const ArgList &Args) {
else
return createFileError(File, EC);
}

OffloadingImage TheImage{};
TheImage.TheImageKind = IMG_Object;
TheImage.TheOffloadKind = OFK_SYCL;
TheImage.StringData["triple"] =
Args.MakeArgString(Args.getLastArgValue(OPT_triple_EQ));
TheImage.StringData["arch"] =
Args.MakeArgString(Args.getLastArgValue(OPT_arch_EQ));
TheImage.StringData["sycl_properties"] = Properties.writeJSON();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest simple "properties".

TheImage.Image = std::move(*FileOrErr);

llvm::SmallString<0> Buffer = OffloadBinary::write(TheImage);
Expand Down
102 changes: 102 additions & 0 deletions llvm/include/llvm/Frontend/Offloading/Utility.h
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, I've placed the class in Frontend/Offloading/Utility.h. If we get Frontend/Offloading/SYCL later on then I then that would be the most appropriate place to put it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be a better idea to add this in a separate file called PropertySet.h (associated impl in PropertySet.cpp).

Thanks

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding @sarnex for comment as he has worked with Utility.h before.

Thanks

Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@

#include <cstdint>
#include <memory>
#include <variant>

#include "llvm/ADT/StringMap.h"
#include "llvm/ADT/StringRef.h"
Expand Down Expand Up @@ -159,6 +160,107 @@ namespace intel {
/// the Intel runtime offload plugin.
Error containerizeOpenMPSPIRVImage(std::unique_ptr<MemoryBuffer> &Binary);
} // namespace intel

namespace sycl {
class PropertySetRegistry;

Comment on lines +165 to +166

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this?

Thanks

// A property value. It can be either a 32-bit unsigned integer or a byte array.
class PropertyValue {
Copy link

@maksimsab maksimsab Apr 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this class was overengineered from the very beginning.
This class can be replaced by simple std::variant<uint64_t, std::string>. In the text format it is not going to have much difference either it is uint64_t or uint32_t.

Do you have any objections againts the following declaration?

using PropertyValue = std::variant<uint64_t, std::string>;

public:
using ByteArrayTy = SmallVector<char, 8>;

PropertyValue() = default;
PropertyValue(uint32_t Val) : Value(Val) {}
PropertyValue(StringRef Data)
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've simplified the byte array code compared to the downstream code, but they won't compatible downstream and would cause ABI-break and runtime changes. The downstream implementation always wrote the number of bits the data uses in the first 8 bytes of a byte-array property value, but after some tests, I found this values way never used, except for one place that could be worked around.

https://github.com/intel/llvm/blob/a9db58476ab1e1ff87f128088ac203539d87d22b/llvm/lib/Support/PropertySetIO.cpp#L153-L169

: Value(ByteArrayTy(Data.begin(), Data.end())) {}

template <typename C, typename T = typename C::value_type>
PropertyValue(const C &Data)
: PropertyValue({reinterpret_cast<const char *>(Data.data()),
Data.size() * sizeof(T)}) {}

uint32_t asUint32() const {
assert(getType() == PV_UInt32 && "must be UINT32 value");
return std::get<uint32_t>(Value);
}

StringRef asByteArray() const {
assert(getType() == PV_ByteArray && "must be BYTE_ARRAY value");
const auto &ByteArrayRef = std::get<ByteArrayTy>(Value);
return {ByteArrayRef.data(), ByteArrayRef.size()};
}

// Note: each enumeration value corresponds one-to-one to the
// index of the std::variant.
enum Type { PV_None = 0, PV_UInt32 = 1, PV_ByteArray = 2 };
Type getType() const { return static_cast<Type>(Value.index()); }

bool operator==(const PropertyValue &Rhs) const { return Value == Rhs.Value; }

private:
std::variant<std::monostate, std::uint32_t, ByteArrayTy> Value = {};
Copy link

@maksimsab maksimsab Apr 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just read what monostate is. It allows std::variant to be empty. However, I don't think we should allow our variants to be empty. Therefore, I would discourage a use of monostate.

};

/// A property set is a collection of PropertyValues, each identified by a name.
/// A property set registry is a collection of property sets, each identified by
/// a category name. The registry allows for the addition, removal, and
/// retrieval of property sets and their properties.
class PropertySetRegistry {
using PropertyMapTy = StringMap<unsigned>;
/// A property set. Preserves insertion order when iterating elements.
using PropertySet = MapVector<SmallString<16>, PropertyValue, PropertyMapTy>;
using MapTy = MapVector<SmallString<16>, PropertySet, PropertyMapTy>;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would discourage usage of SmallString<16>.
https://github.com/intel/llvm/blob/sycl/sycl/doc/design/PropertySets.md
Typically, keys are larger than 16 and we wouldn't make use of this limit.
Also, LLVM's Programmers Manual discourages putting SmallString as the member of a frequently-allocated heap data structure.

I would rather make it std::map<std::string, std::map<std::string, PropertyValue>>.

We aren't going to have many keys here and it would make the interface extremely simple.


public:
/// Function for bulk addition of an entire property set in the given
/// \p Category .
template <typename MapTy> void add(StringRef Category, const MapTy &Props) {
assert(PropSetMap.find(Category) == PropSetMap.end() &&
"category already added");
auto &PropSet = PropSetMap[Category];

for (const auto &Prop : Props)
PropSet.insert_or_assign(Prop.first, PropertyValue(Prop.second));
}

/// Adds the given \p PropVal with the given \p PropName into the given \p
/// Category .
template <typename T>
void add(StringRef Category, StringRef PropName, const T &PropVal) {
auto &PropSet = PropSetMap[Category];
PropSet.insert({PropName, PropertyValue(PropVal)});
}

void remove(StringRef Category, StringRef PropName) {
auto PropertySetIt = PropSetMap.find(Category);
if (PropertySetIt == PropSetMap.end())
return;
auto &PropertySet = PropertySetIt->second;
auto PropIt = PropertySet.find(PropName);
if (PropIt == PropertySet.end())
return;
PropertySet.erase(PropIt);
}

static Expected<PropertySetRegistry> readJSON(MemoryBufferRef Buf);

/// Dumps the property set registry to the given \p Out stream.
void writeJSON(raw_ostream &Out) const;
SmallString<0> writeJSON() const;

MapTy::const_iterator begin() const { return PropSetMap.begin(); }
MapTy::const_iterator end() const { return PropSetMap.end(); }

/// Retrieves a property set with given \p Name .
PropertySet &operator[](StringRef Name) { return PropSetMap[Name]; }
/// Constant access to the underlying map.
const MapTy &getPropSets() const { return PropSetMap; }

private:
MapTy PropSetMap;
};

} // namespace sycl
} // namespace offloading
} // namespace llvm

Expand Down
102 changes: 102 additions & 0 deletions llvm/lib/Frontend/Offloading/Utility.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -459,3 +459,105 @@ Error offloading::intel::containerizeOpenMPSPIRVImage(
Img = MemoryBuffer::getMemBufferCopy(YamlFile);
return Error::success();
}

namespace llvm::offloading::sycl {

void PropertySetRegistry::writeJSON(raw_ostream &Out) const {
json::OStream J(Out);
J.object([&] {
for (const auto &PropSet : PropSetMap) {
// Note: we do not output the property sets as objects,
// but as arrays of [name, value] arrays because we have to
// preserve the order of insertion of the properties
// in the property set (note: this currently only used by the
// spec constant metadata). Even if the key-value pairs
// of an object are serialized in the order they were inserted,
// the order is not guaranteed to be preserved when
// deserializing because json::Object uses a DenseMap to store
// its key-value pairs.
J.attributeArray(PropSet.first, [&] {
for (const auto &Props : PropSet.second) {
J.array([&] {
J.value(Props.first);
switch (Props.second.getType()) {
case PropertyValue::PV_UInt32:
J.value(Props.second.asUint32());
break;
case PropertyValue::PV_ByteArray: {
StringRef ByteArrayRef = Props.second.asByteArray();
J.value(json::Array(ByteArrayRef.bytes()));
break;
}
default:
llvm_unreachable(("unsupported property type: " +
utostr(Props.second.getType()))
.c_str());
}
});
}
});
}
});
}

SmallString<0> PropertySetRegistry::writeJSON() const {
SmallString<0> Str;
raw_svector_ostream OS(Str);
writeJSON(OS);
return Str;
}

Expected<PropertySetRegistry>
PropertySetRegistry::readJSON(MemoryBufferRef Buf) {
PropertySetRegistry Res;
Expected<json::Value> V = json::parse(Buf.getBuffer());
if (!V)
return V.takeError();
const json::Object *O = V->getAsObject();
if (!O)
return createStringError("expected JSON object");
for (const auto &[CategoryName, Value] : *O) {
const json::Array *PropsArray = Value.getAsArray();
if (!PropsArray)
return createStringError("expected JSON array for properties");
PropertySet &PropSet = Res.PropSetMap[StringRef(CategoryName)];
for (const json::Value &PropPair : *PropsArray) {
const json::Array *PropArray = PropPair.getAsArray();
if (!PropArray || PropArray->size() != 2)
return createStringError(
"expected property as [PropertyName, PropertyValue] pair");

const json::Value &PropNameVal = (*PropArray)[0];
const json::Value &PropValueVal = (*PropArray)[1];

std::optional<StringRef> PropName = PropNameVal.getAsString();
if (!PropName)
return createStringError("expected property name as string");

PropertyValue Prop;
if (std::optional<uint64_t> Val = PropValueVal.getAsUINT64()) {
Prop = PropertyValue(static_cast<uint32_t>(*Val));
} else if (const json::Array *Val = PropValueVal.getAsArray()) {
SmallVector<unsigned char, 8> Vec;
for (const json::Value &V : *Val) {
std::optional<uint64_t> Byte = V.getAsUINT64();
if (!Byte)
return createStringError("invalid byte array value");
if (*Byte > std::numeric_limits<unsigned char>::max())
return createStringError("byte array value out of range");
Vec.push_back(static_cast<unsigned char>(*Byte));
}
Prop = PropertyValue(Vec);
} else {
return createStringError("unsupported property type");
}

auto [It, Inserted] = PropSet.insert({*PropName, Prop});
if (!Inserted)
return createStringError("duplicate property name");
}
}
return Res;
}

} // namespace llvm::offloading::sycl
1 change: 1 addition & 0 deletions llvm/unittests/Frontend/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@ add_llvm_unittest(LLVMFrontendTests
OpenMPParsingTest.cpp
OpenMPCompositionTest.cpp
OpenMPDecompositionTest.cpp
PropertySetRegistryTest.cpp

DEPENDS
acc_gen
Expand Down
39 changes: 39 additions & 0 deletions llvm/unittests/Frontend/PropertySetRegistryTest.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
#include "llvm/ADT/SmallVector.h"
#include "llvm/Frontend/Offloading/Utility.h"
#include "llvm/Support/MemoryBuffer.h"
#include "gtest/gtest.h"

using namespace llvm::offloading::sycl;
using namespace llvm;

void checkEquality(const PropertySetRegistry &PSR1,
const PropertySetRegistry &PSR2) {
ASSERT_EQ(PSR1.getPropSets().size(), PSR2.getPropSets().size());
for (const auto &[Category1, PropSet1] : PSR1.getPropSets()) {
auto It = PSR2.getPropSets().find(Category1);
ASSERT_TRUE(It != PSR2.getPropSets().end());
const auto &[Category2, PropSet2] = *It;
ASSERT_EQ(PropSet1.size(), PropSet2.size());
for (auto It1 = PropSet1.begin(), It2 = PropSet2.begin(),
E = PropSet1.end();
It1 != E; ++It1, ++It2) {
const auto &[PropName1, PropValue1] = *It1;
const auto &[PropName2, PropValue2] = *It2;
ASSERT_EQ(PropName1, PropName2);
ASSERT_EQ(PropValue1, PropValue2);
}
}
}

TEST(PropertySetRegistryTest, PropertySetRegistry) {
PropertySetRegistry PSR;
PSR.add("Category1", "Prop1", 42);
PSR.add("Category1", "Prop2", "Hello");
SmallVector<int, 3> arr = {4, 16, 32};
PSR.add("Category2", "A", arr);
auto Serialized = PSR.writeJSON();
auto PSR2 = PropertySetRegistry::readJSON({Serialized, ""});
if (auto Err = PSR2.takeError())
FAIL();
checkEquality(PSR, *PSR2);
}