Add predefined datatypes for bfloat16 data #5402
Adds predefined datatypes for little- and big-endian bfloat16 data.
Does not add support for any native bfloat16 types; datatype conversions are performed in software.
Essentially finished, pending CI results and some potential testing on a big-endian system.
H5T_IEEE_F32BE | H5T_IEEE_F32LE |
H5T_IEEE_F64BE | H5T_IEEE_F64LE |
H5T_NATIVE_FLOAT16 | H5T_NATIVE_FLOAT |
H5T_FLT_BF16BE | H5T_FLT_BF16LE |
Open to any alternative names that seem more fitting, but this type is distinct from the IEEE standard, so I basically created a new category for alternative floating-point formats.
Perhaps "FLOAT" instead of "FLT"?
Seems reasonable to me
Agree - I like "FLOAT" better also
Actually, maybe "NONSTD", since it's a contrast to the "IEEE" label?
So... H5T_NONSTD_BF16 or H5T_NONSTD_BFLOAT16? I like "FLOAT" explicitly.
The issue I see with that is that it's not unlikely that these types get adopted into some standard in the future, whether it's IEEE or not.
if (NULL == (bf16_be_dt = H5I_object(H5T_FLT_BF16BE)))
    HGOTO_ERROR(H5E_ARGS, H5E_BADTYPE, NULL, "not a data type");

/* Promote bfloat16 to float instead of float16, as it
bfloat16 types should be promoted to float instead of float16, as the type is the same size as float16, but a different format. Converting between bfloat16 and float is also very simple (by design).
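To illustrate why the conversion is simple: a bfloat16 value has the same bit layout as the upper 16 bits of an IEEE binary32 value, so widening is a shift and narrowing can be a truncation. The following is a minimal sketch, not the conversion code from this PR. The helper names `bf16_to_float` and `float_to_bf16_trunc` are hypothetical, and the sketch assumes that `float` is IEEE binary32 on the host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen a bfloat16 bit pattern to float.
 * Assumes float is IEEE binary32; the 16 bfloat16 bits become the
 * upper half of the 32-bit pattern and the lower half is zero. */
static float bf16_to_float(uint16_t bits)
{
    uint32_t wide = (uint32_t)bits << 16;
    float    f;

    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&f, &wide, sizeof f);
    return f;
}

/* Hypothetical helper: narrow float to bfloat16 by truncation
 * (drops the low 16 mantissa bits, i.e. rounds toward zero).
 * Caveat: a NaN whose payload lives only in the low 16 bits would
 * truncate to infinity; a production conversion needs to handle
 * that case and usually rounds to nearest instead. */
static uint16_t float_to_bf16_trunc(float f)
{
    uint32_t wide;

    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&wide, &f, sizeof wide);
    return (uint16_t)(wide >> 16);
}
```

For example, `float_to_bf16_trunc(1.0f)` yields the pattern `0x3F80`, and `bf16_to_float(0x3F80)` yields `1.0f` again. Values that need more than 7 explicit mantissa bits lose precision on the way down but keep their exponent.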
if (size == 2)
    p_type = H5Tcopy(H5T_IEEE_F16LE);
if (size == 2) {
    if (true == H5Tequal(tid, H5T_IEEE_F16LE) || true == H5Tequal(tid, H5T_IEEE_F16BE))
Had to make sure the correct type between float16 and bfloat16 is picked so that the data comes out correctly.
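A short sketch of why size alone cannot select the type: the same 16-bit pattern decodes to different values as IEEE float16 (5 exponent bits, 10 mantissa bits) and as bfloat16 (8 exponent bits, 7 mantissa bits). The helpers `decode_float16` and `decode_bfloat16` below are hypothetical and independent of the HDF5 code in this PR, and they assume that `float` is IEEE binary32 on the host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as float (assumes IEEE binary32). */
static float bits_to_float(uint32_t u)
{
    float f;

    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Hypothetical helper: decode IEEE binary16
 * (1 sign bit, 5 exponent bits with bias 15, 10 mantissa bits). */
static float decode_float16(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t man  = h & 0x3FFu;

    if (exp == 0x1Fu) /* infinity or NaN */
        return bits_to_float(sign | 0x7F800000u | (man << 13));
    if (exp == 0) {   /* zero or subnormal: man * 2^-24 */
        float v = (float)man * (1.0f / 16777216.0f);
        return sign ? -v : v;
    }
    /* Normal number: rebias the exponent from 15 to 127 */
    return bits_to_float(sign | ((exp + 112u) << 23) | (man << 13));
}

/* Hypothetical helper: decode bfloat16
 * (1 sign bit, 8 exponent bits with bias 127, 7 mantissa bits). */
static float decode_bfloat16(uint16_t b)
{
    return bits_to_float((uint32_t)b << 16);
}
```

The pattern `0x3C00` is 1.0 when read as float16 but 0.0078125 (2^-7) when read as bfloat16, and `0x3F80` is 1.0 as bfloat16 but 1.875 as float16, so picking the wrong 2-byte type silently produces wrong data.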
hsize_t dims[2], adims[1];

/*
 * bfloat16 keeps approximately the same range as the IEEE 32-bit
Eventually better tests can be written, but for now this PR just adds support for predefined types and doesn't add support for a native type. GCC and Clang both have support for a __bf16 type at this point though, so it should be doable in the future.
I'll try to review tomorrow and get you some feedback
It was my understanding that the middle part, "_IEEE_" or similar, represents an architecture or a standard. Should we follow the same rule here?
On Mon, Mar 24, 2025 at 12:11 PM Quincey Koziol wrote:
> In doxygen/dox/DDLBNF200.dox:
> H5T_IEEE_F32BE | H5T_IEEE_F32LE |
> H5T_IEEE_F64BE | H5T_IEEE_F64LE |
> - H5T_NATIVE_FLOAT16 | H5T_NATIVE_FLOAT |
> + H5T_FLT_BF16BE | H5T_FLT_BF16LE |
> Agree - I like "FLOAT" better also
Right - that was why I thought "NONSTD" might be better
Maybe "NONSTD_BFLOAT16"?
I'm not sure that this is necessarily the case and I'd argue we should consider abandoning that. For example,
That name is a bit weird to me because bfloat16 is the standard. There's no other standard for the type, it's just that the standard isn't an IEEE standard.
Also consider I plan to add support for FP8, FP6 and FP4 following this PR, and in each case the formats essentially are the standard until they have wider adoption. https://arxiv.org/abs/2209.05433 for example.
Since Google created it, maybe "GOOGLE_BFLOAT16"?
I was thinking the same...
Frankly speaking, I would drop the middle part for these types and document their implementation. I would vote against introducing GOOGLE.
Perhaps. The reason I used BF16 on the end part was just to match our existing conventions like
I was ok with
For the time being, I'm going to proceed on the other datatypes using I propose that we use a convention of
Makes sense, but I would leave the old types as they are and use the new standard only for the new types.
I certainly wouldn't want to go changing old type names as it would be nothing but a source of annoyance (though we can always introduce new macros that just point to the old names if we wanted).
I don't think the C standard specifies the exact integer format, or even that it's two's complement, so we shouldn't name types "C". I'm not sure there's a better name for "normal" integer types than STD, but I might be wrong. Adding new names for consistency is fine, and something like H5T_INTEGER_U32LE is probably fine, though H5T_INTEGER_I32LE seems redundant. Maybe _S32LE or just _32LE?
Note that this is the case in C23 at least.
I tend to agree, it's just that STD doesn't really particularly mean anything.
Agree. Also why I mentioned we may want to revisit what should go in the "specifics" part of the name if we use the convention. FLOAT_XXX_F32LE is just as redundant, so maybe the single letter on the end of these isn't needed and we should consider something else?
Jordan, is this a summary of what you are proposing:
Yes, this looks like a good summary. Thanks!
Adds predefined datatypes for little- and big-endian bfloat16 data
Does not add support for any native bfloat16 types; datatype conversions are performed in software
Also adds missing float16 predefined types to Fortran