We need to determine how each type of memory can be shared across all the participating devices.
By sharing I mean what properties (in terms of possible uses) each kind of memory has.
For now I have determined the following properties, but they may change or need to be extended:
For now those seem to be sufficient and of interest to us.
The following flag enumeration can describe the memory we'll be dealing with:
enum class DevMemAccessFlag : unsigned char
{
    INVALID = 0,
    READ_ONLY = enum_as_flag_v<DevMemAccessKind::READ>,
    OVERWRITE = enum_as_flag_v<DevMemAccessKind::WRITE>,
    CONCURRENT_READ = enum_as_flag_v<DevMemAccessKind::SHARED_CONCURRENT_READ>,
    CONCURRENT_UPDATE = enum_as_flag_v<DevMemAccessKind::SHARED_CONCURRENT_UPDATE>,
    ATOMIC_UPDATE = enum_as_flag_v<DevMemAccessKind::ATOMIC>,
    ATOMIC_FLOAT = enum_as_flag_v<DevMemAccessKind::ATOMIC_FLOAT>,
    // Convenience aliases:
    UPDATE = enum_as_flag_v<DevMemAccessKind::READ> | enum_as_flag_v<DevMemAccessKind::WRITE>,
    FINE_GRAINED_ACCESS_MASK = (enum_as_flag_v<DevMemAccessKind::SHARED_CONCURRENT_UPDATE>
                                | enum_as_flag_v<DevMemAccessKind::SHARED_CONCURRENT_READ>),
};
/* Given enum_as_flag_v defined as (1U << unsigned(EnumeratorValue)),
   i.e. the bit at the enumerator's numeric value is set to 1. */
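To make the enumeration above easier to experiment with, here is a minimal self-contained sketch. The numeric values of `DevMemAccessKind` are not given in this post, so the bit positions below are assumed, and the `operator|`, `operator&` and `hasAll` helpers are hypothetical additions for illustration, not part of the proposed API.

```cpp
// Hypothetical access kinds: the enumerator names come from the flag
// definitions above, their numeric values (bit positions) are assumed.
enum class DevMemAccessKind : unsigned char {
  READ = 0,
  WRITE = 1,
  SHARED_CONCURRENT_READ = 2,
  SHARED_CONCURRENT_UPDATE = 3,
  ATOMIC = 4,
  ATOMIC_FLOAT = 5,
};

// Bit whose index equals the enumerator's numeric value.
template <DevMemAccessKind EnumeratorValue>
constexpr unsigned enum_as_flag_v = 1U << unsigned(EnumeratorValue);

enum class DevMemAccessFlag : unsigned char {
  INVALID = 0,
  READ_ONLY = enum_as_flag_v<DevMemAccessKind::READ>,
  OVERWRITE = enum_as_flag_v<DevMemAccessKind::WRITE>,
  CONCURRENT_READ = enum_as_flag_v<DevMemAccessKind::SHARED_CONCURRENT_READ>,
  CONCURRENT_UPDATE = enum_as_flag_v<DevMemAccessKind::SHARED_CONCURRENT_UPDATE>,
  ATOMIC_UPDATE = enum_as_flag_v<DevMemAccessKind::ATOMIC>,
  ATOMIC_FLOAT = enum_as_flag_v<DevMemAccessKind::ATOMIC_FLOAT>,
  UPDATE = READ_ONLY | OVERWRITE,
};

// Hypothetical helpers: a scoped enum has no built-in bitwise operators.
constexpr DevMemAccessFlag operator|(DevMemAccessFlag a, DevMemAccessFlag b) {
  return static_cast<DevMemAccessFlag>(static_cast<unsigned>(a) |
                                       static_cast<unsigned>(b));
}

constexpr DevMemAccessFlag operator&(DevMemAccessFlag a, DevMemAccessFlag b) {
  return static_cast<DevMemAccessFlag>(static_cast<unsigned>(a) &
                                       static_cast<unsigned>(b));
}

// True when every bit of `required` is also set in `value`.
constexpr bool hasAll(DevMemAccessFlag value, DevMemAccessFlag required) {
  return (value & required) == required;
}

static_assert(hasAll(DevMemAccessFlag::UPDATE, DevMemAccessFlag::READ_ONLY),
              "UPDATE implies read access");
```

Keeping the helpers `constexpr` means capability checks on fixed flag combinations can be verified at compile time.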
Flag value INVALID means no access.
While the flag values READ_ONLY, OVERWRITE, ATOMIC_UPDATE and ATOMIC_FLOAT are pretty self-explanatory, the flag bits CONCURRENT_READ and CONCURRENT_UPDATE need a comment.
They tell whether we're allowed to concurrently access the memory from another device (in a range that does not overlap the one being updated).
This only states whether such access is possible, regardless of how efficient it might be. We do not encode memory NUMA topology here.
That concept is orthogonal and should be tracked elsewhere; as such it is not a concern of this API.
The main implication of these flags is whether a device (accelerator/host) needs exclusivity while accessing such memory.
In terms of OpenCL Shared Virtual Memory this maps to coarse-grained (the device needs exclusive access) vs fine-grained (concurrent access is allowed).
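The exclusivity rule can be sketched as a small classification function. This is only an illustration under assumptions: the bit layout is made up, `SharingGranularity`, `classify` and `needsExclusiveAccess` are hypothetical names, and treating any single concurrent bit as "fine-grained" is my reading of the description above.

```cpp
// Assumed bit layout, for illustration only.
enum class DevMemAccessFlag : unsigned char {
  INVALID = 0,
  READ_ONLY = 1U << 0,
  OVERWRITE = 1U << 1,
  CONCURRENT_READ = 1U << 2,
  CONCURRENT_UPDATE = 1U << 3,
  UPDATE = READ_ONLY | OVERWRITE,
  FINE_GRAINED_ACCESS_MASK = CONCURRENT_READ | CONCURRENT_UPDATE,
};

constexpr unsigned bits(DevMemAccessFlag f) { return static_cast<unsigned>(f); }

constexpr DevMemAccessFlag operator|(DevMemAccessFlag a, DevMemAccessFlag b) {
  return static_cast<DevMemAccessFlag>(bits(a) | bits(b));
}

// Hypothetical classification, mirroring the OpenCL SVM vocabulary.
enum class SharingGranularity { NoAccess, CoarseGrained, FineGrained };

constexpr SharingGranularity classify(DevMemAccessFlag f) {
  if (f == DevMemAccessFlag::INVALID)
    return SharingGranularity::NoAccess;
  // Any concurrent bit means other devices may access the memory
  // at the same time (in non-overlapping ranges).
  if (bits(f) & bits(DevMemAccessFlag::FINE_GRAINED_ACCESS_MASK))
    return SharingGranularity::FineGrained;
  return SharingGranularity::CoarseGrained;
}

// A device needs exclusivity when access is possible but not concurrent.
constexpr bool needsExclusiveAccess(DevMemAccessFlag f) {
  return classify(f) == SharingGranularity::CoarseGrained;
}
```

A scheduler could use such a check to decide whether ownership of a buffer has to be handed over before another device touches it.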
During nntrainer context creation, once we have enumerated all devices, our responsibility is to determine the supported memory kinds/types.
The first order of business is to create a list of all devices (including the implicit HostDevice),
each corresponding to the abstract interface class DeviceInfo.
In my POC implementation this list is created once per nntrainer context. It is immutable during the lifetime of the context, and it
outlives the other elements of the whole subsystem.
For now I refer to it as DeviceInfoList:
Given that, we ought to create memory descriptions for each memory type.
Those can be described, for example, as a square N times N matrix of DevMemAccessFlag, where N
is the length of the device list.
The DeviceInfo aggregates a list of MemoryKindAccessCapDescription corresponding to the various memories of the device,
in a one-to-one relation.
The DevicesContext aggregates lists of DeviceMemoryPool's and DeviceAllocator's in a many-to-one relation.
Descriptions should be immutable objects once created, so other parts of the API can refer to them by handles called Descriptors.
Descriptors should be lightweight, for example an index into a list, or a pointer.
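The description/descriptor split could look roughly like the sketch below. Everything in it is an assumption made for illustration: the post describes DeviceInfo as an abstract interface, while `SimpleDeviceInfo` is a concrete stand-in, and `MemoryKindDescriptor` together with the `name` field are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one memory kind; a real one would also carry
// the access capabilities discussed above.
struct MemoryKindAccessCapDescription {
  std::string name;
};

// Lightweight handle: just an index into the owning, immutable list.
struct MemoryKindDescriptor {
  std::size_t index;
};

// Concrete stand-in for DeviceInfo, simplified for brevity.
class SimpleDeviceInfo {
public:
  explicit SimpleDeviceInfo(std::vector<MemoryKindAccessCapDescription> kinds)
      : kinds_(std::move(kinds)) {}

  std::size_t memoryKindCount() const { return kinds_.size(); }

  // Hand out a descriptor for the i-th memory kind.
  MemoryKindDescriptor descriptor(std::size_t i) const {
    return MemoryKindDescriptor{i};
  }

  // Resolve a descriptor; at() throws std::out_of_range on a stale index.
  const MemoryKindAccessCapDescription &
  describe(MemoryKindDescriptor d) const {
    return kinds_.at(d.index);
  }

private:
  // const: descriptions never change after construction, so indices stay valid.
  const std::vector<MemoryKindAccessCapDescription> kinds_;
};
```

An index-based descriptor stays valid for as long as the owning list lives, which fits the requirement that the device list outlives the rest of the subsystem.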
Let's imagine an nntrainer context with two devices: one is the implicit host device, which is always present, and the other one is an accelerator.
To see what such a matrix may hold, consider our possibilities:
Based on that we can define:
Which facilitates both MemoryKindAccessCapDescription and MemoryPoolAccessCapDescription.
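The two-device case (implicit host plus one accelerator) can be sketched as a 2 by 2 matrix of DevMemAccessFlag. This is a sketch under stated assumptions: the name `MemoryAccessCapMatrix`, the bit layout, and the reading that cell (owner, accessor) holds the access the accessing device has to memory owned by the other device are all hypothetical, as are the two example memory kinds.

```cpp
#include <array>
#include <cstddef>

// Assumed bit layout, for illustration only.
enum class DevMemAccessFlag : unsigned char {
  INVALID = 0,
  READ_ONLY = 1U << 0,
  OVERWRITE = 1U << 1,
  CONCURRENT_READ = 1U << 2,
  CONCURRENT_UPDATE = 1U << 3,
  UPDATE = READ_ONLY | OVERWRITE,
};

constexpr DevMemAccessFlag operator|(DevMemAccessFlag a, DevMemAccessFlag b) {
  return static_cast<DevMemAccessFlag>(static_cast<unsigned>(a) |
                                       static_cast<unsigned>(b));
}

// Square N x N matrix, N being the number of devices in the context.
template <std::size_t N> struct MemoryAccessCapMatrix {
  // Value-initialized, so every cell starts as INVALID (no access).
  std::array<std::array<DevMemAccessFlag, N>, N> cells{};

  DevMemAccessFlag access(std::size_t owner, std::size_t accessor) const {
    return cells[owner][accessor];
  }
};

constexpr std::size_t HOST = 0;  // implicit host device, always present
constexpr std::size_t ACCEL = 1; // the single accelerator

// Hypothetical kind: accelerator-local memory the host cannot touch.
inline MemoryAccessCapMatrix<2> makeAcceleratorLocalMemory() {
  MemoryAccessCapMatrix<2> m;
  m.cells[ACCEL][ACCEL] = DevMemAccessFlag::UPDATE;
  return m;
}

// Hypothetical kind: host memory both devices may update concurrently.
inline MemoryAccessCapMatrix<2> makeFineGrainedHostMemory() {
  const DevMemAccessFlag shared = DevMemAccessFlag::UPDATE |
                                  DevMemAccessFlag::CONCURRENT_READ |
                                  DevMemAccessFlag::CONCURRENT_UPDATE;
  MemoryAccessCapMatrix<2> m;
  m.cells[HOST][HOST] = shared;
  m.cells[HOST][ACCEL] = shared;
  return m;
}
```

Defaulting every cell to INVALID means a memory kind only has to list the accesses it actually supports.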