@@ -6,37 +6,307 @@ bug fixes (and other actions) for each version of Libfabric since
version 1.0. New major releases include all fixes from minor
releases with earlier release dates.

- v1.22.0, Fri Jul 26, 2024
- ========================
+ v2.0.0 alpha, Fri Aug 30, 2024
+ ==============================

## Core

+ hmem/ze: Fix mismatched library name in an error message
+ Add FI_PEER as a capability
+ Add missing FI_AV_USER_ID to cap tostr
+ Update and clarify peer SRX API flow
+ Prefix public xpmem symbols with ofi
+ Add rbmap foreach node utility function
+ ofi_mem: Add release bufpool validity check
+ hmem/rocr: Don't attempt to get device info when pointer type is unknown.
+ hmem: Added handle field to close_handle
+ Introduce new atomic datatypes and operation
+ Define new tag formats
+ Add new peer group feature
+ Add fi_fabric2() API
+ Deprecate old MR modes
+ Deprecate FI_WAIT_MUTEX_COND
+ Deprecate wait set and poll set
+ Require using libfabric APIs to allocate fi_info structures
+ Cleanup FI_ORDER flags
+ Deprecate support for async memory registration
+ Remove total_buffered_recv
+ Deprecate comp_order attribute
+ Simplify progress definition
+ Simplify threading models
+ Move FI_BUFFERED_RECV to internal flag
+ Simplify the AV API
+ Remove internally used definitions from public headers
+ hmem/cuda: Modify the logging for nvml dlopen
+ hmem/rocr: Fix dmabuf for amd gpu implementation
+
## CXI

+ FI_PATH_MAX is removed in 2.0 API
+
## EFA

- ## Hooks
+ Zero the cq entry array in dgram ep progress
+ Remove unit test for libfabric 1.1 API
+ Replace deprecated MR modes
+ Remove deprecated FI_ORDER flag
+ Update EP's `inject_size` in zero-copy mode
+ Add support for `FI_OPT_INJECT_RMA_SIZE`
+ Query for shm's FI_PEER capability
+ Require FI_MR_LOCAL for zero-copy receive
+ Correctly handle fallback longcts-rtm send completion
+ Adjust the logging for pke exhaustion
+ Fix a memory leak in local read
+ Use dlist_foreach_container_safe to iterate progressed ep list
+ refactor hmem interface initialization
+ Fix a memory leak in efa_rdm_ep_post_handshake
+ disable zero-copy receive if p2p is not supported
+ Update data types for various IOV operations
+ Require shm to be disabled for using zero-copy recv
+ Register user recv buffer for zero-copy receive mode
+ Make fi_cancel return EOPNOTSUPP for zero copy receive mode.
+ Handle receive window overflow
+ Introduce FI_EFA_IFACE to restrict visible NICs
+ Allow disabling unsolicited write recv via env
+
+ ## LPP
+
+ Initial addition

- ## OPX
+ ## PSM2

- ## Peer
+ Fix incorrect unlock function

## PSM3

- ## RXM
+ Fix incorrect unlock function

## SHM

+ Add FI_PEER capability
+ Refactor ze ipc path to use pidfd
+
## TCP

+ Introduce sub-domains to support FI_THREAD_COMPLETION
+
## UCX

+ Support FI_OPT_CUDA_API_PERMITTED in fi_setopt()
+ Fix error code for fi_setopt()/fi_getopt()
+
## Util

+ Initialize ROCR name in memory monitor struct
+ Support specific placement of addr into the av
+
## Verbs

+ Fix resource leak in error handling path
+ Replace __BITS_PER_LONG with LONG_WIDTH
+ Fix issue while displaying addresses with fi_info -a <addr_format>
+
## Fabtests

+ Add LPP specific fabtests
+ Add `inject_size` to `ft_opts`
+ Add pytests for FI_MORE. Test fi_rma_bw and fi_rdm_tagged_bw with flag FI_MORE.
+ Use fi_writemsg to test rma write/writedata with FI_MORE
+ Use fi_sendmsg to test rdm_tagged_bw with FI_MORE
+ Add option for running tests with FI_MORE
+ synapse: Remove dependency of scal
+ Pass `memory_type` to client server test
+
+
+ v1.22.0, Fri Jul 26, 2024
+ =========================
+
+ ## Coll
+
+ - Fix Coverity issues
+
+ ## Core
+
+ - General bug fixes
+ - hmem: change neuron get_dmabuf_fd error code
+ - Fix an error in the error handling path of fi_param_define()
+ - Makefile.am: Add Windows build files to distribution tarball
+ - hmem: disable ZE IPC
+ - Add profile variables for connections and memory allocated
+ - hmem: Fix `cuDeviceCanAccessPeer()` error reporting
+ - man: Update text for `len` parameter
+ - Add page size MR attr field
+ - man: Extend fi_mr_refresh support
+ - man: Improve FI_MR_ALLOCATED documentation
+ - man: Support optional MR desc
+ - man: Improve FI_MR_HMEM documentation
+ - Added ofi_get_realtime interfaces
+ - Add endpoint options for max message size and inject size
+ - Add Windows definition for `EREMOTEIO`
+
+ ## EFA
+
+ - General improvements and bug fixes
+ - Handle recv cancel for zero copy recv
+ - Avoid iterating EP list in CQ read
+ - Add RDMA core errno for remote unknown peer
+ - Map EFA errnos to Libfabric codes
+ - Improve the zero-copy receive feature
+ - Improve the handshake enforcement procedure
+ - Support unsolicited rdma-write recv
+ - Support FI_MORE for eager send and rdma-write
+ - Improve the EFA_IO_COMP error code and explanation
+ - Improve the unit test for LL128 protocol
+ - Distinguish max RMA size from msg size
+
+ ## Hooks
+
+ - dmabuf: Fix incompatible pointer warning
+
+ ## OPX
+
+ - Add missing file needed for fabric direct build to release package
+ - Fix performance issue caused by not setting ACK bit in the single
+ SDMA packet case
+ - TID cache debug improvements
+ - Detection of driver lack of support for TID
+ - Multi-CTS support for TID
+ - Removal of statement that TID is not supported
+ - OPX Tracer improvements
+ - Improvements to OPX shared memory cleanup
+ - H to H performance improvements for build that supports HMEM
+ - Bug fix for a threshold check
+ - Bug fix for FI_SELECTIVE_COMPLETION
+ - CN5000 fixes
+ - Parameterization of various thresholds
+ - Further enhancements to support NVIDIA GPUs, including CUDA-allocated
+ bounce buffers and in-provider support for GDRCopy
+ - Enhancements to enable support for CN5000 hardware
+ - Better checking for TID support
+ - General TID enhancements
+ - Pkey error handling
+ - Send work queue splitting
+ - Support for OPX tracer for profiling purposes
+ - Coverity scan fixes
+ - Fixes and enhancements to logging and debug messages
+ - Intranode RMA read fixes
+ - Fix compile issues
+ - Fix shared memory segment index creation bug
+
+ ## PSM3
+
+ - Update provider to sync with IEFS 11.7.0.0.110
+ - Improved auto-tuning features for PSM3, including dynamic Credit Flows
+ and detecting the presence of the rv kernel module
+ - Improved PSM3 intra-node performance for large message sizes
+
+ ## SHM
+
+ - Added support for write() method to submit DSA work
+ - Touch all buffer pages after DSA page fault
+ - Add return and more descriptive error message
+ - Fix coverity about incorrect sign
+ - Fix memory leaks for srx
+ - Fix atomic read
+
+ ## Sockets
+
+ - Fix Coverity issues
+
+ ## USNIC
+
+ - Fix a few Coverity issues
+
+ ## Util
+
+ - Discard outstanding operations in util_srx_close
+ - Enable profile on the size of bufpool allocated.
+ - Add more predefined profile variables.
+ - Fix issue while displaying addresses with fi_info -a <addr_format>
+ - fi_pingpong: Fix out of scope memory leak
+ - Add source address to fi_pingpong
+
+ ## Verbs
+
+ - Flush CQ for SQ on no SQ credit
+ - Optimize search for device max inline size
+ - Enable profiling
+
+ ## Fabtests
+
+ - pytest/shm: reduce the msg size in test_unexpected_msg
+ - Fix synapseai fabtests build
+ - Add pytests for EFA zero-copy receive
+ - Add benchmark option for `FI_OPT_MAX_MSG_SIZE`
+ - benchmarks: Add synapseai support
+ - Disable fi_rdm_tagged_peek test for ucx and psm3
+ - Add manual init sync to fi_rdm_multiclient and fi_rdm
+ - Refactor ft_sock_sync to take in a socket
+ - Add fi_rdm_bw test
+ - Skip rma_pingpong write tests
+ - Init rx_buf before sending data
+ - Add rma_pingpong tests to makefile
+ - pytest: use different message sizes for rma pingpong
+ - Fix missing fixture memory_type in test_rma_pingpong_range_no_inject
+ - pytest: account for process startup overhead in client-server tests
+ - pytest: save client process output to a file
+ - Support testing inject with cq data
+ - multinode: update arguments
+ - multi_ep: Fix memory leak
+ - rdm_tagged_peek: Align rx's msg_order with tx's
+ - Add backlog > 0 to listen call
+
+
+ v1.21.1, Fri July 26, 2024
+ ==========================
+
+ ## Core
+
+ - Fix integer overflow in ofi_get_mem_size
+ - Fix overflow issue in ofi_rbinit
+ - Disable ZE IPC due to possible memory corruption
+ - Fix an error in the error handling path of fi_param_define()
+
+ ## EFA
+
+ - Add tracepoints for rx pkt processing events
+ - Destroy rx_atomrsp_pool during ep close.
+ - Free user_info during ep close.
+ - Use srx lock from domain directly
+ - Fix error handling in efa_rdm_cq_poll_ibv_cq
+ - Move efa_rdm_cq_poll_ibv_cq to efa_rdm_cq.h
+ - Remove unused cq_attr
+ - Remove unnecessary app_info check
+ - Remove unnecessary ope check
+ - Make the inflight read msg per domain
+
+ ## SHM
+
+ - Added support for write() method to submit DSA work
+ - Touch all buffer pages after DSA page fault
+ - add return and more descriptive error message
+ - fix coverity about incorrect sign
+ - Fix memory leaks for srx
+ - fix atomic read
+
+ ## Verbs
+
+ - Flush CQ for SQ on no SQ credit
+
+ ## Fabtests
+
+ - efa: reset error completion entry for each fi_cq_readerr call
+ - pytest: Skip rma_pingpong write tests
+ - Init rx_buf before sending data
+ - Add rma_pingpong tests to makefile
+ - pytest: use different message sizes for rma pingpong
+ - Fix missing fixture memory_type in test_rma_pingpong_range_no_inject
+ - pytest: account for process startup overhead in client-server tests
+ - pytest: save client process output to a file
+ - Fix memory leaks for efa_exhaust_mr_reg test
+ - Fix memory leak in multi_ep test
+ - Fix memory leak in efa_info_test
+

v1.21.0, Fri Mar 22, 2024
========================