README.md (+12 -4)
@@ -63,7 +63,7 @@ The fastest and most memory efficient lattice Boltzmann CFD software, running on
- made flag wireframe / solid surface visualization kernels toggleable with key <kbd>1</kbd>
- added surface pressure visualization (key <kbd>1</kbd> when `FORCE_FIELD` is enabled and `lbm.calculate_force_on_boundaries();` is called)
- added binary `.vtk` export function for meshes with `lbm.write_mesh_to_vtk(Mesh* mesh);` (see the usage sketch after this change list)
- - added `time_step_multiplicator` for `integrate_particles()` function in PARTICLES extension
+ - added `time_step_multiplicator` for `integrate_particles()` function in `PARTICLES` extension
- made the correction of wrong memory reporting on Intel Arc more robust
- fixed bug in `write_file()` template functions
- reverted back to separate `cl::Context` for each OpenCL device, as the shared Context otherwise would allocate extra VRAM on all other unused Nvidia GPUs
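A minimal host-code sketch of how the two new features called out above (`lbm.write_mesh_to_vtk()` and the `time_step_multiplicator` for `integrate_particles()`) might be used in a setup; the mesh-loading helpers are simplified and the exact `integrate_particles()` signature is an assumption based on the changelog wording, not code copied from the repository:

```c++
// Hedged usage sketch, not actual repository code. read_stl() arguments are simplified and
// the integrate_particles() call is a guess from the changelog entry above.
void main_setup() { // FluidX3D setups live in a main_setup() function (see setup.cpp)
	LBM lbm(128u, 128u, 128u, 0.02f); // grid resolution and kinematic shear viscosity in lattice units
	Mesh* mesh = read_stl(get_exe_path()+"../stl/example.stl"); // load a triangle mesh (example path, simplified arguments)
	lbm.voxelize_mesh_on_device(mesh); // voxelize the mesh into solid boundary cells
	lbm.write_mesh_to_vtk(mesh); // new: export the triangle mesh itself as a binary .vtk file
	lbm.run(1000u); // run 1000 LBM time steps
	// new (PARTICLES extension): integrate particle trajectories with a coarser time step,
	// e.g. only every 4th LBM step; the parameter position below is hypothetical
	// lbm.integrate_particles(4u); // time_step_multiplicator = 4
}
```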
@@ -236,6 +236,14 @@ The fastest and most memory efficient lattice Boltzmann CFD software, running on
- fixed bug in insertion-sort in `voxelize_mesh()` kernel causing crash on AMD GPUs
- fixed bug in `voxelize_mesh_on_device()` host code causing initialization corruption on AMD GPUs
- fixed dual CU and IPC reporting on AMD RDNA 1-4 GPUs
- optional [FP16S or FP16C compression](https://www.researchgate.net/publication/362275548_Accuracy_and_performance_of_the_lattice_Boltzmann_method_with_64-bit_32-bit_and_customized_16-bit_number_formats) for thermal DDFs with [DDF-shifting](https://www.researchgate.net/publication/362275548_Accuracy_and_performance_of_the_lattice_Boltzmann_method_with_64-bit_32-bit_and_customized_16-bit_number_formats)
- Smagorinsky-Lilly subgrid turbulence LES model to keep simulations with very large Reynolds number stable (see the formula after this list)
- <details><summary>Does FluidX3D support adaptive mesh refinement?</summary><br>No, not yet. Grid cell size is the same everywhere in the simulation box.<br><br></details>
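The Smagorinsky-Lilly entry above is only a one-line bullet; in generic form (standard LES/LBM relations, not necessarily FluidX3D's exact constants or implementation), the model adds a local eddy viscosity on top of the physical one, which in lattice units simply increases the relaxation time:

$$\nu_t = (C_S\,\Delta x)^2\,|\bar{S}|, \qquad \tau_\mathrm{eff} = 3\,(\nu_0 + \nu_t) + \tfrac{1}{2}$$

Here $C_S \approx 0.1$-$0.2$ is the Smagorinsky constant, $|\bar{S}|$ the local strain-rate magnitude, and $\nu_0$ the physical kinematic viscosity; a larger local $\tau_\mathrm{eff}$ damps under-resolved velocity fluctuations, which is what keeps very-high-Reynolds-number runs stable.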
- - <details><summary>Can FluidX3D model both water and air at the same time?</summary><br>No. FluidX3D can model either water or air, but not both at the same time. For free surface simulations with the <a href="https://github.com/ProjectPhysX/FluidX3D/blob/master/DOCUMENTATION.md#surface-extension">`SURFACE` extension</a>, I went with a <a href="https://doi.org/10.3390/computation10060092">volume-of-fluid</a>/<a href="https://doi.org/10.3390/computation10020021">PLIC</a> modeling approach, as that provides a sharp water-air interface, so individual droplets can be resolved as small as 3 grid cells in diameter. However, this model ignores the gas phase completely, and only models the fluid phase with LBM, together with the surface tension. An alternative I had explored years ago is the class of <a href="http://dx.doi.org/10.1016/j.jcp.2022.111753">phase-field models</a> (the simplest of which is the Shan-Chen model) - they model both fluid and gas phases, but struggle with the 1:1000 density contrast of air:water, and the modeled interface is diffuse over ~5 grid cells. So the smallest resolved droplets are ~10 grid cells in diameter, meaning for the same resolution you need ~37x the memory footprint ((10/3)³ ≈ 37) - infeasible on GPUs. Coming back to the VoF model, it is possible to <a href="http://dx.doi.org/10.1186/s43591-023-00053-7">extend it with a model for the gas phase</a>, but one has to manually track bubble split/merge events, which makes this approach very painful to implement and poorly performing on the hardware.<br><br></details>
+ - <details><summary>Can FluidX3D model both water and air at the same time?</summary><br>No. FluidX3D can model either water or air, but not both at the same time. For free surface simulations with the <a href="https://github.com/ProjectPhysX/FluidX3D/blob/master/DOCUMENTATION.md#surface-extension">SURFACE extension</a>, I went with a <a href="https://doi.org/10.3390/computation10060092">volume-of-fluid</a>/<a href="https://doi.org/10.3390/computation10020021">PLIC</a> modeling approach, as that provides a sharp water-air interface, so individual droplets can be resolved as small as 3 grid cells in diameter. However, this model ignores the gas phase completely, and only models the fluid phase with LBM, together with the surface tension. An alternative I had explored years ago is the class of <a href="http://dx.doi.org/10.1016/j.jcp.2022.111753">phase-field models</a> (the simplest of which is the Shan-Chen model) - they model both fluid and gas phases, but struggle with the 1:1000 density contrast of air:water, and the modeled interface is diffuse over ~5 grid cells. So the smallest resolved droplets are ~10 grid cells in diameter, meaning for the same resolution you need ~37x the memory footprint ((10/3)³ ≈ 37) - infeasible on GPUs. Coming back to the VoF model, it is possible to <a href="http://dx.doi.org/10.1186/s43591-023-00053-7">extend it with a model for the gas phase</a>, but one has to manually track bubble split/merge events, which makes this approach very painful to implement and poorly performing on the hardware.<br><br></details>
- <details><summary>Can FluidX3D compute lift/drag forces?</summary><br>Yes. See <a href="https://github.com/ProjectPhysX/FluidX3D/blob/master/DOCUMENTATION.md#liftdrag-forces">the relevant section in the FluidX3D Documentation</a>!<br><br></details>
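Following up on the lift/drag FAQ entry, here is a hedged host-code sketch of how one could total the boundary force with the `FORCE_FIELD` extension. Only `lbm.calculate_force_on_boundaries();` is taken directly from the change list above; member names like `lbm.F.x`, `lbm.flags`, `read_from_device()` and `get_N()` are my reading of DOCUMENTATION.md rather than quotes from it, so verify them there:

```c++
// Hedged sketch (not repository code): sum the fluid force on all solid (TYPE_S) cells after
// calculate_force_on_boundaries(); drag is the component of this sum along the free-stream
// direction, lift the perpendicular component. Requires the FORCE_FIELD extension.
void print_total_boundary_force(LBM& lbm) {
	lbm.calculate_force_on_boundaries(); // compute per-cell boundary forces on the GPU
	lbm.F.read_from_device();     // copy the per-cell force field to host memory (assumed member name)
	lbm.flags.read_from_device(); // copy the cell flags to host memory (assumed member name)
	float fx=0.0f, fy=0.0f, fz=0.0f;
	for(ulong n=0ull; n<lbm.get_N(); n++) { // add up the force on every solid cell
		if(lbm.flags[n]&TYPE_S) {
			fx += lbm.F.x[n];
			fy += lbm.F.y[n];
			fz += lbm.F.z[n];
		}
	}
	printf("total force on boundaries: (%g, %g, %g)\n", fx, fy, fz);
}
```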
)+R(void extract_F(const uint a, const uint A, const uxx n, global float* transfer_buffer, const global float* F) { // copy the force vector of cell n into slot a of the transfer buffer
	transfer_buffer[       a] = F[                 n]; // F is stored as structure-of-arrays: x block, then y block at offset def_N, then z block at 2*def_N
	transfer_buffer[   A+a] = F[   def_N+(ulong)n];
	transfer_buffer[2u*A+a] = F[2ul*def_N+(ulong)n];
}
)+R(void insert_F(const uint a, const uint A, const uxx n, const global float* transfer_buffer, global float* F) { // inverse of extract_F: write the received force vector back into F
	F[                 n] = transfer_buffer[       a];
	F[   def_N+(ulong)n] = transfer_buffer[   A+a];
	F[2ul*def_N+(ulong)n] = transfer_buffer[2u*A+a];
}
)+R(kernel void transfer_extract_F(const uint direction, const ulong t, global float* transfer_buffer_p, global float* transfer_buffer_m, const global float* F) {
	const uint a=get_global_id(0), A=get_area(direction); // a = domain area index for each side, A = area of the domain boundary
	if(a>=A) return; // area might not be a multiple of cl_workgroup_size, so return here to avoid writing in unallocated memory space
	extract_F(a, A, index_extract_p(a, direction), transfer_buffer_p, F); // pack boundary cells on the + side of the domain
	extract_F(a, A, index_extract_m(a, direction), transfer_buffer_m, F); // pack boundary cells on the - side of the domain
}
)+R(kernel void transfer__insert_F(const uint direction, const ulong t, const global float* transfer_buffer_p, const global float* transfer_buffer_m, global float* F) {
	const uint a=get_global_id(0), A=get_area(direction); // a = domain area index for each side, A = area of the domain boundary
	if(a>=A) return; // area might not be a multiple of cl_workgroup_size, so return here to avoid writing in unallocated memory space
	insert_F(a, A, index_insert_p(a, direction), transfer_buffer_p, F); // unpack the buffer received from the + side neighbor
	insert_F(a, A, index_insert_m(a, direction), transfer_buffer_m, F); // unpack the buffer received from the - side neighbor
}
)+"#endif"+R( // FORCE_FIELD
)+"#ifdef SURFACE"+R(
)+R(void extract_phi_massex_flags(const uint a, const uint A, const uxx n, global char* transfer_buffer, const global float* phi, const global float* massex, const global uchar* flags) {
((global float*)transfer_buffer)[ a] = phi [n];
@@ -2966,10 +3014,11 @@ string opencl_c_container() { return R( // ########################## begin of O
)+R(kernel void graphics_particles(const global float* camera, global int* bitmap, global int* zbuffer, const global float* particles) {
const uxx n = get_global_id(0);
if(n>=(uxx)def_particles_N) return;
	const float3 p = (float3)(particles[n]-def_domain_offset_x, particles[def_particles_N+(ulong)n]-def_domain_offset_y, particles[2ul*def_particles_N+(ulong)n]-def_domain_offset_z); // particle positions are stored as structure-of-arrays (x, y, z blocks), shifted by the domain offset