|
1 | | -## Programming difference between NetCDF4 and PnetCDF |
| 1 | +# Comparison of PnetCDF and NetCDF4 |
2 | 2 |
|
| 3 | +* [Supported File Formats](#supported-file-formats) |
| 4 | +* [Programming Differences](#programming-differences) |
| 5 | +* [Define Mode and Data Mode](#define-mode-and-data-mode) |
| 6 | +* [Collective and Independent I/O Mode](#collective-and-independent-io-mode) |
| 7 | +* [Blocking vs. Nonblocking APIs](#blocking-vs-nonblocking-apis) |
| 8 | + |
| 9 | +--- |
| 10 | + |
| 11 | +## Supported File Formats |
| 12 | +* NetCDF4 supports both classic and HDF5-based file formats. |
| 13 | + + Classic file format (CDF-1) -- The ESDS Community Standard defined the file format |
| 14 | + to be used in the NetCDF user community in 1989. The file header bears a |
| 15 | + signature of character string 'CDF-1' and now is commonly referred to as |
| 16 | + [CDF-1](https://parallel-netcdf.github.io/doc/c-reference/pnetcdf-c/CDF_002d1-file-format-specification.html) |
| 17 | + file format. |
| 18 | + * 'CDF-2' format -- The CDF-1 format was later extended to support large |
| 19 | + file size (i.e. larger than 2GB) in 2004. See its specification in |
| 20 | + [ESDS-RFC-011v2.0](https://cdn.earthdata.nasa.gov/conduit/upload/496/ESDS-RFC-011v2.00.pdf). |
| 21 | + Its file header bears the signature 'CDF-2', so the format is |
| 22 | + commonly referred to as the |
| 23 | + [CDF-2](https://parallel-netcdf.github.io/doc/c-reference/pnetcdf-c/CDF_002d2-file-format-specification.html) |
| 24 | + format. |
| 25 | + * [CDF-5](https://parallel-netcdf.github.io/doc/c-reference/pnetcdf-c/CDF_002d5-file-format-specification.html) |
| 26 | + format -- The CDF-2 format was extended by the PnetCDF developer team |
| 27 | + in 2009 to support large variables and additional large data types, such |
| 28 | + as 64-bit integer. |
| 29 | + + HDF5-based file format -- Starting with version 4.0.0, NetCDF includes |
| 30 | + a format based on HDF5, referred to as the NetCDF-4 format. |
| 31 | + It offers new features such as groups, compound types, variable-length |
| 32 | + arrays, new unsigned integer types, etc. |
| 33 | +* PnetCDF supports only the classic file formats. |
| 34 | + + Classic files created by applications using the NetCDF4 library can be read |
| 35 | + by the PnetCDF library and vice versa. |
| 36 | + + PnetCDF provides parallel I/O for accessing files in the classic formats. |
| 37 | + NetCDF4's parallel I/O for classic files uses the PnetCDF library |
| 38 | + underneath. This feature can be enabled when building the NetCDF4 library. |
| 39 | + |
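The formats listed above can be told apart by the first bytes of a file. The following is a minimal sketch, not code from either library: the enum `file_format` and the function `detect_format()` are hypothetical names introduced for illustration. It assumes the classic formats begin with the bytes `C`, `D`, `F` followed by a version byte (1, 2, or 5), and that an HDF5-based file begins with the 8-byte HDF5 signature at offset 0.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical enum for this sketch; not part of NetCDF4 or PnetCDF. */
typedef enum {
    FMT_UNKNOWN = 0,
    FMT_CDF1,   /* classic format */
    FMT_CDF2,   /* 64-bit offset format */
    FMT_CDF5,   /* 64-bit data format */
    FMT_HDF5    /* HDF5-based NetCDF-4 format */
} file_format;

/* Classify a file from the first `len` bytes of its header.
 * Classic files start with 'C','D','F' and a version byte of 1, 2 or 5.
 * HDF5 files start with the signature \211 H D F \r \n \032 \n.
 * Simplification: only offset 0 is checked for the HDF5 signature. */
file_format detect_format(const unsigned char *hdr, size_t len)
{
    static const unsigned char hdf5_sig[8] =
        {0x89, 'H', 'D', 'F', '\r', '\n', 0x1a, '\n'};

    if (len >= 8 && memcmp(hdr, hdf5_sig, 8) == 0)
        return FMT_HDF5;

    if (len >= 4 && hdr[0] == 'C' && hdr[1] == 'D' && hdr[2] == 'F') {
        switch (hdr[3]) {
            case 1: return FMT_CDF1;
            case 2: return FMT_CDF2;
            case 5: return FMT_CDF5;
            default: break;
        }
    }
    return FMT_UNKNOWN;
}
```

A caller would read the first eight bytes of a file with `fread()` and pass them to `detect_format()`. A result of `FMT_HDF5` indicates a file that PnetCDF cannot read.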
| 40 | + |
| 41 | +--- |
| 42 | + |
| 43 | +## Programming Differences |
3 | 44 | * The API names are different between NetCDF4 and PnetCDF. |
4 | 45 | + For C programming, NetCDF4 uses prefix `nc_` while PnetCDF uses `ncmpi_`. |
5 | 46 | + For Fortran 77 programming, NetCDF4 uses prefix `nf_` while PnetCDF uses `nfmpi_`. |
|
35 | 76 | | ${\textsf{\color{green}nc\\_put\\_vara\\_float}}$(ncid, varid, start, count, buf); | ${\textsf{\color{blue}ncmpi\\_put\\_vara\\_float\\_all}}$(ncid, varid, start, count, buf); | |
36 | 77 | | /* close file */ | | |
37 | 78 | | ${\textsf{\color{green}nc\\_close}}$(ncid); | ${\textsf{\color{blue}ncmpi\\_close}}$(ncid); | |
| 79 | + |
| 80 | + |
| 81 | +--- |
| 82 | + |
| 83 | +## Define Mode and Data Mode |
| 84 | + |
| 85 | +In PnetCDF, an opened file is in either define mode or data mode. Switching |
| 86 | +between the modes is done by explicitly calling `"ncmpi_enddef()"` and |
| 87 | +`"ncmpi_redef()"`. NetCDF4, when operating on an HDF5-based file, has no such |
| 88 | +mode-switching requirement. PnetCDF enforces this requirement to keep the |
| 89 | +metadata consistent across all MPI processes and to keep the |
| 90 | +overhead of metadata synchronization small. |
| 91 | + |
| 92 | +* Define mode |
| 93 | + + When calling `"ncmpi_create()"` to create a new file, the file is |
| 94 | + automatically put in the define mode. While in the define mode, the user |
| 95 | + program can create new dimensions, new variables, and netCDF attributes. |
| 96 | + Modification of these data objects' metadata can only be done when the file |
| 97 | + is in the define mode. |
| 98 | + + When opening an existing file, the opened file is automatically put in the |
| 99 | + data mode. To add or modify the metadata, the user program must call |
| 100 | + `"ncmpi_redef()"`. |
| 101 | + |
| 102 | +* Data mode |
| 103 | + + Once the creation or modification of metadata is complete, the user program |
| 104 | + must call `"ncmpi_enddef()"` to leave the define mode and enter the data |
| 105 | + mode. |
| 106 | + + While an open file is in data mode, the user program can make read and |
| 107 | + write requests to the variables that have been created. |
| 108 | + |
| 109 | +<ul> |
| 110 | + <li> The PnetCDF example code below shows switching between define and data |
| 111 | + modes after creating a new file.</li> |
| 112 | + <li> <details> |
| 113 | + <summary>Example code fragment (click to expand)</summary> |
| 114 | + |
| 115 | +```c |
| 116 | + #include <mpi.h> |
| 117 | + #include <pnetcdf.h> |
| 118 | + ... |
| 119 | + /* Create the file */ |
| 120 | + ncmpi_create(MPI_COMM_WORLD, filename, NC_CLOBBER, MPI_INFO_NULL, &ncid); |
| 121 | + |
| 122 | + ... |
| 123 | + /* Define dimensions */ |
| 124 | + ncmpi_def_dim(ncid, "Y", 16, &dimid[0]); |
| 125 | + ncmpi_def_dim(ncid, "X", 32, &dimid[1]); |
| 126 | + |
| 127 | + /* Define a 2D variable of integer type */ |
| 128 | + ncmpi_def_var(ncid, "grid", NC_INT, 2, dimid, &varid); |
| 129 | + |
| 130 | + /* Add an attribute of string type to the variable */ |
| 131 | + str_att = "example attribute of type text"; |
| 132 | + ncmpi_put_att_text(ncid, varid, "str_att_name", strlen(str_att), str_att); |
| 133 | + |
| 134 | + /* Exit the define mode */ |
| 135 | + ncmpi_enddef(ncid); |
| 136 | + |
| 137 | + /* Write to a subarray of the variable */ |
| 138 | + MPI_Offset start[2], count[2]; |
| 139 | + start[0] = 4; |
| 140 | + start[1] = 8; |
| 141 | + count[0] = 10; |
| 142 | + count[1] = 10; |
| 143 | + ncmpi_put_vara_int_all(ncid, varid, start, count, buf_int); |
| 144 | + |
| 145 | + /* Re-enter the define mode */ |
| 146 | + ncmpi_redef(ncid); |
| 147 | + |
| 148 | + /* Define a new 2D variable of float type */ |
| 149 | + ncmpi_def_var(ncid, "temperature", NC_FLOAT, 2, dimid, &var_flt); |
| 150 | + |
| 151 | + /* Exit the define mode */ |
| 152 | + ncmpi_enddef(ncid); |
| 153 | + |
| 154 | + /* Write to a subarray of the variable, var_flt */ |
| 155 | + start[0] = 2; |
| 156 | + start[1] = 8; |
| 157 | + count[0] = 5; |
| 158 | + count[1] = 5; |
| 159 | + ncmpi_put_vara_float_all(ncid, var_flt, start, count, buf_flt); |
| 160 | + |
| 161 | + /* Close the file */ |
| 162 | + ncmpi_close(ncid); |
| 163 | +``` |
| 164 | +</details></li> |
| 165 | +
|
| 166 | + <li> An example shows switching between define and data modes after opening an existing file. |
| 167 | + </li> |
| 168 | + <li> <details> |
| 169 | + <summary>Example code fragment (click to expand)</summary> |
| 170 | +
|
| 171 | +```c |
| 172 | + #include <mpi.h> |
| 173 | + #include <pnetcdf.h> |
| 174 | + ... |
| 175 | + /* Opening an existing file */ |
| 176 | + ncmpi_open(MPI_COMM_WORLD, filename, NC_WRITE, MPI_INFO_NULL, &ncid); |
| 177 | +
|
| 178 | + ... |
| 179 | + /* get the ID of variable named 'grid', a 2D variable of integer type */ |
| 180 | + ncmpi_inq_varid(ncid, "grid", &varid); |
| 181 | +
|
| 182 | + /* Read the variable's attribute named "str_att_name" */ |
| 183 | + char str_att[64]; |
| 184 | + ncmpi_get_att_text(ncid, varid, "str_att_name", str_att); |
| 185 | +
|
| 186 | + /* Read a subarray of the variable 'grid' */ |
| 187 | + MPI_Offset start[2], count[2]; |
| 188 | + start[0] = 4; |
| 189 | + start[1] = 8; |
| 190 | + count[0] = 10; |
| 191 | + count[1] = 10; |
| 192 | + ncmpi_get_vara_int_all(ncid, varid, start, count, buf_int); |
| 193 | +
|
| 194 | + /* Re-enter the define mode */ |
| 195 | + ncmpi_redef(ncid); |
| 196 | +
|
| 197 | + /* Define a new 2D variable of double type */ |
| 198 | + ncmpi_def_var(ncid, "precipitation", NC_DOUBLE, 2, dimid, &var_dbl); |
| 199 | +
|
| 200 | + /* Add an attribute of string type to the variable */ |
| 201 | + const char *unit = "mm/s"; |
| 202 | + ncmpi_put_att_text(ncid, var_dbl, "unit", strlen(unit), unit); |
| 203 | +
|
| 204 | + /* Exit the define mode */ |
| 205 | + ncmpi_enddef(ncid); |
| 206 | +
|
| 207 | + /* Write to a subarray of the variable, var_dbl */ |
| 208 | + start[0] = 2; |
| 209 | + start[1] = 8; |
| 210 | + count[0] = 5; |
| 211 | + count[1] = 5; |
| 212 | + ncmpi_put_vara_double_all(ncid, var_dbl, start, count, buf_dbl); |
| 213 | +
|
| 214 | + /* Close the file */ |
| 215 | + ncmpi_close(ncid); |
| 216 | +``` |
| 217 | +</details></li> |
| 218 | +</ul> |
| 219 | + |
| 220 | + |
| 221 | +--- |
| 222 | +## Collective and Independent I/O Mode |
| 223 | + |
| 224 | +The terminology of collective and independent I/O comes from the MPI standard. A |
| 225 | +collective I/O function call requires all the MPI processes that opened the same |
| 226 | +file to participate. An independent I/O function, on the other hand, can be |
| 227 | +called by an MPI process independently of the others. |
| 228 | + |
| 229 | +For metadata I/O, both PnetCDF and NetCDF4 require the function calls to be |
| 230 | +collective. |
| 231 | + |
| 232 | +* Mode Switch Mechanism |
| 233 | + + PnetCDF -- when a file is in the data mode, it can be put into either |
| 234 | + collective or independent I/O mode. The default mode is collective I/O |
| 235 | + mode. Switching to and exiting from the independent I/O mode is done by |
| 236 | + explicitly calling `"ncmpi_begin_indep_data()"` and |
| 237 | + `"ncmpi_end_indep_data()"`. |
| 238 | + |
| 239 | + + NetCDF4 -- switching between collective and independent mode is done on a |
| 240 | + per-variable basis, by explicitly calling `"nc_var_par_access()"` |
| 241 | + before accessing the variable. For more information, see |
| 242 | + [Parallel I/O with NetCDF-4](https://docs.unidata.ucar.edu/netcdf-c/current/parallel_io.html). |
| 243 | + |
| 244 | +<ul> |
| 245 | + <li> A PnetCDF example shows switching between collective and independent I/O |
| 246 | + modes.</li> |
| 247 | + <li> <details> |
| 248 | + <summary>Example code fragment (click to expand)</summary> |
| 249 | + |
| 250 | +```c |
| 251 | + #include <mpi.h> |
| 252 | + #include <pnetcdf.h> |
| 253 | + ... |
| 254 | + /* Create the file */ |
| 255 | + ncmpi_create(MPI_COMM_WORLD, filename, NC_CLOBBER, MPI_INFO_NULL, &ncid); |
| 256 | + |
| 257 | + ... |
| 258 | + /* Metadata operations to define dimensions and variables */ |
| 259 | + ... |
| 260 | + /* Exit the define mode (by default, into the collective I/O mode) */ |
| 261 | + ncmpi_enddef(ncid); |
| 262 | + |
| 263 | + /* Write to variables collectively */ |
| 264 | + ncmpi_put_vara_int_all(ncid, varid, start, count, buf_int); |
| 265 | + |
| 266 | + /* Read from variables collectively */ |
| 267 | + ncmpi_get_vara_float_all(ncid, var_flt, start, count, buf_flt); |
| 268 | + |
| 269 | + /* Leaving collective I/O mode and entering independent I/O mode */ |
| 270 | + ncmpi_begin_indep_data(ncid); |
| 271 | + |
| 272 | + /* Write to variables independently */ |
| 273 | + ncmpi_put_vara_int(ncid, varid, start, count, buf_int); |
| 274 | + |
| 275 | + /* Read from variables independently */ |
| 276 | + ncmpi_get_vara_float(ncid, var_flt, start, count, buf_flt); |
| 277 | + |
| 278 | + /* Close the file */ |
| 279 | + ncmpi_close(ncid); |
| 280 | +``` |
| 281 | +</details></li> |
| 282 | +</ul> |
| 283 | +
|
| 284 | +<ul> |
| 285 | + <li> A NetCDF4 example shows switching between collective and |
| 286 | + independent I/O modes.</li> |
| 287 | + <li> <details> |
| 288 | + <summary>Example code fragment (click to expand)</summary> |
| 289 | +
|
| 290 | +```c |
| 291 | + #include <mpi.h> |
| 292 | + #include <netcdf.h> |
| 293 | + #include <netcdf_par.h> |
| 294 | + ... |
| 295 | + /* Create the file */ |
| 296 | + nc_create_par(filename, NC_CLOBBER, MPI_COMM_WORLD, MPI_INFO_NULL, &ncid); |
| 297 | +
|
| 298 | + ... |
| 299 | + /* Metadata operations to define dimensions and variables */ |
| 300 | + ... |
| 301 | +
|
| 302 | + /* set the access method to use MPI collective I/O for all variables */ |
| 303 | + nc_var_par_access(ncid, NC_GLOBAL, NC_COLLECTIVE); |
| 304 | +
|
| 305 | + /* Write to variables collectively */ |
| 306 | + nc_put_vara_int(ncid, varid, start, count, buf_int); |
| 307 | +
|
| 308 | + /* Read from variables collectively */ |
| 309 | + nc_get_vara_float(ncid, var_flt, start, count, buf_flt); |
| 310 | +
|
| 311 | + /* set the access method to use MPI independent I/O for all variables */ |
| 312 | + nc_var_par_access(ncid, NC_GLOBAL, NC_INDEPENDENT); |
| 313 | +
|
| 314 | + /* Write to variables independently */ |
| 315 | + nc_put_vara_int(ncid, varid, start, count, buf_int); |
| 316 | +
|
| 317 | + /* Read from variables independently */ |
| 318 | + nc_get_vara_float(ncid, var_flt, start, count, buf_flt); |
| 319 | +
|
| 320 | + /* Close the file */ |
| 321 | + nc_close(ncid); |
| 322 | +``` |
| 323 | +</details></li> |
| 324 | +</ul> |
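The PnetCDF mode rules described in the two sections above (define mode vs. data mode, and collective vs. independent I/O within data mode) can be pictured as a small state machine. The sketch below is a toy model only. Every `toy_` name is hypothetical and none of this is PnetCDF code. It also makes a simplifying assumption that define mode can only be re-entered from collective data mode, which is a choice of this sketch rather than a statement about PnetCDF's behavior.

```c
/* Toy model of the three modes an open PnetCDF file can be in.
 * All names are hypothetical and exist only for this illustration. */
typedef enum { MODE_DEFINE, MODE_DATA_COLL, MODE_DATA_INDEP } toy_mode;
typedef enum { TOY_OK = 0, TOY_EWRONGMODE = -1 } toy_status;
typedef struct { toy_mode mode; } toy_file;

/* A newly created file starts in define mode (cf. ncmpi_create). */
void toy_create(toy_file *f) { f->mode = MODE_DEFINE; }
/* An existing file starts in collective data mode (cf. ncmpi_open). */
void toy_open(toy_file *f) { f->mode = MODE_DATA_COLL; }

/* An operation succeeds only if the file is in the required mode. */
static toy_status require(const toy_file *f, toy_mode m)
{
    return f->mode == m ? TOY_OK : TOY_EWRONGMODE;
}

/* A mode switch succeeds only from the expected starting mode. */
static toy_status transition(toy_file *f, toy_mode from, toy_mode to)
{
    if (f->mode != from) return TOY_EWRONGMODE;
    f->mode = to;
    return TOY_OK;
}

/* Metadata definition, collective access, independent access. */
toy_status toy_def_var(toy_file *f)   { return require(f, MODE_DEFINE); }
toy_status toy_put_coll(toy_file *f)  { return require(f, MODE_DATA_COLL); }
toy_status toy_put_indep(toy_file *f) { return require(f, MODE_DATA_INDEP); }

/* Mode switches, named after the calls they stand in for. */
toy_status toy_enddef(toy_file *f)
{ return transition(f, MODE_DEFINE, MODE_DATA_COLL); }
toy_status toy_redef(toy_file *f)
{ return transition(f, MODE_DATA_COLL, MODE_DEFINE); }
toy_status toy_begin_indep(toy_file *f)
{ return transition(f, MODE_DATA_COLL, MODE_DATA_INDEP); }
toy_status toy_end_indep(toy_file *f)
{ return transition(f, MODE_DATA_INDEP, MODE_DATA_COLL); }
```

In this model, a collective write attempted right after `toy_create()` fails because the file is still in define mode, and an independent write fails until `toy_begin_indep()` has been called.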
| 325 | + |
| 326 | +--- |
| 327 | + |
| 328 | +## Blocking vs. Nonblocking APIs |
| 329 | +* Blocking APIs -- All NetCDF4 APIs are blocking APIs. A call to a blocking API |
| 330 | + does not return until the operation is completed. For |
| 331 | + example, a call to `nc_put_var_float()` returns only when the write data |
| 332 | + has been stored in system space, e.g. the file system. Similarly, a call to |
| 333 | + `nc_get_var_float()` returns only when the user read buffer contains |
| 334 | + the data retrieved from the file. Therefore, when a series of blocking `put/get` |
| 335 | + APIs are called, the calls are committed by the NetCDF4 |
| 336 | + library one at a time, in the order they were made. |
| 337 | +* Nonblocking APIs -- In addition to blocking APIs, PnetCDF provides the |
| 338 | + nonblocking version of the APIs. A nonblocking API means the call to the API |
| 339 | + will return as soon as the `put/get` request has been registered in the |
| 340 | + PnetCDF library. The commitment of the request may happen later, when a call |
| 341 | + to `ncmpi_wait_all/ncmpi_wait` is made. The nonblocking APIs are listed below. |
| 342 | + + `ncmpi_iput_var_xxx()` - posts a nonblocking request to write to a variable. |
| 343 | + + `ncmpi_iget_var_xxx()` - posts a nonblocking request to read from a variable. |
| 344 | + + `ncmpi_bput_var_xxx()` - posts a nonblocking, buffered request to write to a variable. |
| 345 | + + `ncmpi_iput_varn_xxx()` - posts a nonblocking request to write multiple subarrays to a variable. |
| 346 | + + `ncmpi_iget_varn_xxx()` - posts a nonblocking request to read multiple subarrays from a variable. |
| 347 | + + `ncmpi_bput_varn_xxx()` - posts a nonblocking, buffered request to write multiple subarrays to a variable. |
| 348 | + + `ncmpi_wait_all()` - waits for nonblocking requests to complete, using collective MPI-IO. |
| 349 | + + `ncmpi_wait()` - waits for nonblocking requests to complete, using independent MPI-IO. |
| 350 | + + `ncmpi_buffer_attach()` - lets PnetCDF allocate an internal buffer to cache bput write requests. |
| 351 | + + `ncmpi_buffer_detach()` - frees the attached buffer. |
| 352 | +* Nonblocking APIs are most beneficial when there are many `put/get` |
| 353 | + requests, each accessing a small amount of data. PnetCDF tries to |
| 354 | + aggregate and coalesce multiple registered nonblocking requests into a large |
| 355 | + one, because I/O usually performs better when the requests are large |
| 356 | + and contiguous. See example programs |
| 357 | + [nonblocking_write.c](../examples/C/nonblocking_write.c) and |
| 358 | + [bput_varn_int64.c](../examples/C/bput_varn_int64.c). |
| 359 | +* The table below shows the difference in C programming between using blocking |
| 360 | + and nonblocking APIs. |
| 361 | + |
| 362 | +| PnetCDF Blocking APIs | PnetCDF Nonblocking APIs | |
| 363 | +|:-------|:--------| |
| 364 | +| ...<br>/* define 3 variables of NC_FLOAT type */ || |
| 365 | +| ncmpi_def_var(ncid, "PSFC", NC_FLOAT, 2, dimid, &psfc);<br>ncmpi_def_var(ncid, "PRCP", NC_FLOAT, 2, dimid, &prcp);<br>ncmpi_def_var(ncid, "SNOW", NC_FLOAT, 2, dimid, &snow); | ditto | |
| 366 | +| ... || |
| 367 | +| /* exit define mode and enter data mode */<br>ncmpi_enddef(ncid); | ditto | |
| 368 | +| ...<br>/* Call blocking APIs to write 3 variables to the file */ | <br>/* Call nonblocking APIs to post 3 write requests */ | |
| 369 | +| ncmpi_put_vara_float_all(ncid, psfc, start, count, buf_psfc);<br>ncmpi_put_vara_float_all(ncid, prcp, start, count, buf_prcp);<br> ncmpi_put_vara_float_all(ncid, snow, start, count, buf_snow);| ncmpi_iput_vara_float(ncid, psfc, start, count, buf_psfc, &req[0]);<br>ncmpi_iput_vara_float(ncid, prcp, start, count, buf_prcp, &req[1]);<br>ncmpi_iput_vara_float(ncid, snow, start, count, buf_snow, &req[2]);| |
| 370 | +| | /* Wait for nonblocking requests to complete */<br>ncmpi_wait_all(ncid, 3, req, errs); | |
| 371 | + |
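The benefit of aggregating pending nonblocking requests, as described in the list above, can be illustrated with a small standalone sketch. It is not PnetCDF's implementation: the `io_req` record and the `coalesce()` function are hypothetical, and each request is reduced to a plain byte range in the file. The sketch sorts the pending ranges by offset and merges those that touch or overlap, so that fewer and larger contiguous accesses remain.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical request record: a byte range in the file. */
typedef struct {
    long long offset;  /* starting file offset in bytes */
    long long length;  /* number of bytes */
} io_req;

/* qsort comparator: order requests by ascending file offset. */
static int cmp_offset(const void *a, const void *b)
{
    const io_req *x = (const io_req *)a;
    const io_req *y = (const io_req *)b;
    return (x->offset > y->offset) - (x->offset < y->offset);
}

/* Sort the pending requests by offset, then merge ranges that touch
 * or overlap. The merge is done in place; the return value is the
 * number of requests that remain at the front of the array. */
size_t coalesce(io_req *reqs, size_t n)
{
    if (n == 0) return 0;

    qsort(reqs, n, sizeof(io_req), cmp_offset);

    size_t out = 0;  /* index of the request currently being extended */
    for (size_t i = 1; i < n; i++) {
        long long end = reqs[out].offset + reqs[out].length;
        if (reqs[i].offset <= end) {
            /* Contiguous or overlapping: extend the current request. */
            long long new_end = reqs[i].offset + reqs[i].length;
            if (new_end > end)
                reqs[out].length = new_end - reqs[out].offset;
        } else {
            /* Gap in the file: start a new merged request. */
            reqs[++out] = reqs[i];
        }
    }
    return out + 1;
}
```

For example, four pending requests covering bytes 100-149, 0-99, 300-309, and 150-169 collapse into two: one covering bytes 0-169 and one covering bytes 300-309.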
| 372 | + |