Releases: ProjectPhysX/FluidX3D
FluidX3D v2.5 (raytracing overhaul)
Raytracing overhaul:
- implemented light absorption in fluid for raytracing graphics (no performance impact, demo on YouTube)
- improved raytracing framerate when camera is inside fluid
- fixed skybox pole flickering artifacts
- refactored raytracing code
Other bug fixes:
- fixed bug where moving objects during re-voxelization would leave an erroneous trail of solid grid cells behind (increased mesh bounding box by 2 cells tolerance)
FluidX3D v2.4 (UI improvements)
UI improvements:
- added a help menu with key H that shows keyboard/mouse controls, visualization settings and simulation stats
- zoom control with keyboard is now keys +/- instead of ./,
- print camera settings in console is now key G instead of H
- a simple mouse click now frees/locks the cursor, in addition to key U
- if the grid resolution is set larger than memory capacity allows, an error will now be printed, suggesting the largest possible grid resolution, so users don't have to guess how large the grid can be
- all source files are now encoded in UTF-8
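The suggested largest grid resolution could be estimated roughly as follows. This is only a sketch, not the actual FluidX3D code: the helper name largest_cubic_resolution, the bytes_per_cell parameter and the restriction to cubic grids are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper, not part of FluidX3D: largest edge length N of a cubic
// grid (N*N*N cells) that fits into memory_bytes, given bytes_per_cell.
// Assumes realistic memory sizes, far below 2^64 bytes.
inline uint64_t largest_cubic_resolution(const uint64_t memory_bytes, const uint64_t bytes_per_cell) {
    assert(bytes_per_cell > 0u);
    const uint64_t max_cells = memory_bytes / bytes_per_cell;
    uint64_t n = (uint64_t)std::cbrt((double)max_cells); // floating-point estimate
    while (n > 0u && n * n * n > max_cells) n--; // correct an estimate that is too large
    while ((n + 1u) * (n + 1u) * (n + 1u) <= max_cells) n++; // correct an estimate that is too small
    return n;
}
```

For example, with room for exactly 1000 cells the helper returns 10, and with room for 999 cells it returns 9.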
Minor optimizations:
- the allocation size for the transfer buffers is now not the maximum of Ax/Ay/Az, but only the maximum of the areas that are actually communicated; this saves a few MB of VRAM in some cases
- the transfer buffer for fi is now laid out as an array of structures instead of a structure of arrays, which is faster; the performance difference is negligible
- refactoring in smart_device_selection() function
- upgraded OpenCL-Wrapper: devices from the same vendor are now in the same OpenCL Context, allowing migration of Memory objects; event-driven synchronisation can now be used
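The two buffer layouts mentioned above differ only in index arithmetic. A minimal sketch, assuming a buffer that holds q values for each of n_cells cells; the function names are hypothetical and not taken from the FluidX3D source.

```cpp
#include <cassert>
#include <cstdint>

// Structure of arrays (SoA): all cells for direction 0, then all cells for direction 1, ...
inline uint64_t index_soa(const uint64_t cell, const uint64_t direction, const uint64_t n_cells) {
    assert(cell < n_cells);
    return direction * n_cells + cell;
}

// Array of structures (AoS): all q values of cell 0, then all q values of cell 1, ...
inline uint64_t index_aos(const uint64_t cell, const uint64_t direction, const uint64_t q) {
    assert(direction < q);
    return cell * q + direction;
}
```

In the AoS layout all values that belong to one cell sit next to each other in memory; in the SoA layout all values that belong to one direction do.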
Bug fixes:
- fixed bug in temperature equilibrium function for temperature extension; lattice speed of sound in D3Q7 is 1/2 and not 1/sqrt(3)
- changed an erroneous double literal in the skybox color functions to a float literal (the double literal was a bug on Intel iGPUs)
- fixed bug in make.sh where multiple console parameters for multi-GPU device IDs would not get forwarded from the ./make.sh call to the bin/FluidX3D executable
- fixed bug in mouse rotation on Windows where the cursor, although free, kept getting centered during rotation
- fixed bug in interactive graphics where text labels on the right side of the screen would not get drawn on both left/right eye screens in VR mode
- fixed bug in LBM::voxelize_stl() size parameter standard initialization
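The D3Q7 speed of sound from the temperature fix above can be checked with the relation c_s² = Σ w_i·c_ix². The sketch below assumes the common D3Q7 weights (1/4 for the rest direction, 1/8 for each of the 6 axis directions); these weights and the function name are assumptions, not quoted from the FluidX3D source.

```cpp
#include <cassert>
#include <cmath>

// Lattice speed of sound from c_s^2 = sum_i w_i * c_ix^2.
// Assumption: D3Q7 weights 1/4 (rest) and 1/8 (each of the 6 axis directions).
inline float d3q7_speed_of_sound() {
    const int cx[7] = { 0, 1, -1, 0, 0, 0, 0 }; // x-components of the 7 D3Q7 velocities
    const float w[7] = { 0.25f, 0.125f, 0.125f, 0.125f, 0.125f, 0.125f, 0.125f };
    float cs2 = 0.0f;
    for (int i = 0; i < 7; i++) cs2 += w[i] * (float)(cx[i] * cx[i]);
    return std::sqrt(cs2); // cs2 = 2 * 1/8 = 1/4, so c_s = 1/2
}
```

Only the two velocities along x contribute, giving c_s² = 1/4 and c_s = 1/2, rather than the 1/sqrt(3) known from D3Q19.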
FluidX3D v2.3 (particles)
Particle update:
- added particles with immersed-boundary method (either passive or 2-way-coupled, only supported with single-GPU)
- minor optimization to GPU voxelization algorithm (workgroup threads outside mesh bounding-box return after ray-mesh intersections have been found)
- displayed GPU memory allocation size is now fully accurate
- fixed bug in write_line() function in src/utilities.hpp
- removed .exe file extension for Linux/macOS
- refactoring and cosmetics
FluidX3D v2.2 (velocity voxelization)
Velocity voxelization update:
- simulation of moving/rotating geometry is now possible, here is a demo
- added option to voxelize moving/rotating geometry on GPU, with automatic velocity initialization for each grid point based on center of rotation, linear velocity and rotational velocity
- cells that are converted from solid->fluid during re-voxelization now have their DDFs properly initialized
- added option to not auto-scale mesh during read_stl(...), with negative size parameter
- added kernel for solid boundary rendering with marching-cubes
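The automatic velocity initialization follows rigid-body kinematics: u = v + ω × (p − c) for a grid point p, center of rotation c, linear velocity v and rotational velocity ω. A minimal sketch with a hypothetical Vec3 type and function names; the FluidX3D kernel itself is not reproduced here.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; }; // hypothetical minimal vector type for this sketch

inline Vec3 cross(const Vec3& a, const Vec3& b) {
    return Vec3{ a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}

// Velocity of a rigid body at grid point p: u = v + omega x (p - c).
inline Vec3 rigid_body_velocity(const Vec3& p, const Vec3& c, const Vec3& v, const Vec3& omega) {
    const Vec3 r{ p.x - c.x, p.y - c.y, p.z - c.z }; // lever arm from the center of rotation
    const Vec3 w = cross(omega, r); // rotational contribution
    return Vec3{ v.x + w.x, v.y + w.y, v.z + w.z };
}
```

For a rotation about the z-axis with ω = (0, 0, 1) around the origin, the point (1, 0, 0) moves in +y direction, on top of whatever linear velocity v is set.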
FluidX3D v2.1 (fast voxelization)
Fast GPU voxelization update:
- new algorithm for .stl mesh GPU voxelization: ~500x faster now, from minutes to milliseconds
- added unvoxelize kernel, to quickly remove all boundaries in the mesh bounding box
- removed old hull voxelization algorithm
Old: naive GPU voxelization
- For each voxel in the 3D grid, cast a ray from the voxel center in an arbitrary direction, and check with all mesh triangles for intersection.
- Count the number of intersections.
- Odd number of intersections means the voxel is inside.
- Runtime: N³×Triangles
New: fast GPU voxelization
- Only for the 2D bottom layer of grid points, shoot vertical rays upward and check with all mesh triangles for intersection.
- The vertical rays pass through all voxels in the columns above, so these don't have to be checked for ray-mesh intersection at all.
- Store all intersection distances in a short array in registers.
- Sort this array with insertion sort.
- Iterate through the vertical column of voxels.
- The first voxel is inside/outside depending on odd/even total intersection count.
- Each time one of the stored distances in the sorted array is passed, switch inside/outside state.
- Optimizations
- Only check inside the bounding box of the mesh.
- Don't always start from the bottom (z-direction), but from the direction where the mesh bounding box has the smallest cross-section area, so the smallest number of ray-mesh intersections have to be tested.
- To avoid errors on the odd/even total number of intersections, shoot a second ray in the opposite direction and only count the intersection number. Both have to be odd for the bottom voxel to start in inside state.
- Runtime: N²×Triangles; for N=500, this is 500x faster than naive voxelization
Known issues:
- voxelization might not always produce binary identical results in multi-GPU (floating-point round-off on ray-triangle intersection distances may differ for different ray origin) --> fixed in v2.16!
FluidX3D v2.0 (multi-GPU upgrade)
Big multi-GPU Update:
- Multi-GPU simulations are now possible on a single node (PC/laptop/server), allowing VRAM to be pooled from multiple GPUs.
- Easy setup with minimal changes to the user: instead of LBM lbm(Nx, Ny, Nz, nu, ...);, use LBM lbm(Nx, Ny, Nz, Dx, Dy, Dz, nu, ...);, with Dx/Dy/Dz indicating how many domains (GPUs) in each spatial direction to use. By default, all identical GPUs will be automatically assigned their domains, however the GPUs can also be manually set with a list of their indices: ./make.sh 2 6 3 4 or /bin/FluidX3D 2 6 3 4.
- All extensions are supported and validated to produce binary identical results compared to single-GPU simulations.
- Multi-GPU also works with non-identical GPUs, regardless of vendor. Yes, you can run FluidX3D on unholy combinations of Nvidia/AMD/Intel GPUs/CPUs at the same time. I only recommend similar memory capacity and bandwidth, as the weakest GPU will bottleneck performance.
- No SLI/Crossfire/NVLink/InfinityFabric is required. All communication runs over PCIe and is compatible with all hardware.
- No MPI installation is required.
- Total grid resolution must be equally divisible into domains, such that all domains are the same size.
- The resolution of each domain is restricted to 4.29 billion grid points (2³², 225GB VRAM), but domain number and thus total grid resolution is unrestricted.
- Under the hood: Complete re-write of C++ backend, to account for the domain decomposition architecture. The code is already fully optimized and shortened for maximum maintainability/upgradeability.
- Grid resolution can now be arbitrary and is no longer restricted to the condition (Nx*Ny*Nz)%WORKGROUP_SIZE==0.
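The two domain constraints listed above (all domains the same size, at most 2³² grid points per domain) can be expressed as a small check. This is a sketch only: valid_decomposition is a hypothetical helper, and reading "equally divisible" as per-axis divisibility of Nx/Ny/Nz by Dx/Dy/Dz is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical check, not FluidX3D's actual code: the total grid Nx*Ny*Nz must split evenly
// into Dx*Dy*Dz domains, and each domain may hold at most 2^32 grid points.
inline bool valid_decomposition(const uint64_t Nx, const uint64_t Ny, const uint64_t Nz,
                                const uint64_t Dx, const uint64_t Dy, const uint64_t Dz) {
    assert(Dx > 0u && Dy > 0u && Dz > 0u);
    if (Nx % Dx != 0u || Ny % Dy != 0u || Nz % Dz != 0u) return false; // domains would differ in size
    const uint64_t cells_per_domain = (Nx / Dx) * (Ny / Dy) * (Nz / Dz);
    return cells_per_domain <= 4294967296ull; // 2^32 grid points per domain
}
```

A 4096³ grid, for instance, exceeds the per-domain limit as a single domain, but passes the check when split into 2x2x4 domains of 2048x2048x1024 grid points each.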
Known issues:
- Raytracing graphics are disabled for multi-GPU. The simulated light rays would have to travel through the entire simulation box, crossing domain boundaries. This is not easily possible, because each GPU only keeps its own domain in VRAM.
FluidX3D v1.4 (Linux graphics)
- Big update for Linux users: Added interactive graphics mode on Linux with X11. No external dependencies, compiles out-of-the-box with the "compile on Linux with X11" command in make.sh.
- Re-wrote C++ graphics library to minimize API dependencies
- Colors are now signed int consistently.
- Fixed streamline visualization in 2D.
FluidX3D v1.3 (minor bug fixes)
- added OpenCL driver bug workaround for old AMD GPUs (binary number literals for flag bitmasks don't work, so these were changed to hexadecimal literals)
- FORCE_FIELD and VOLUME_FORCE can now be used independently
- added unit conversion functions for torque
FluidX3D v1.2 (force/torque computation)
- added functions to compute force/torque on objects
- added function to translate Mesh
- added Stokes drag validation setup
- added more benchmarks in Readme
FluidX3D v1.1 (GPU voxelization)
- added new GPU voxelization
- fixed broken triangle rendering with some Intel iGPUs (driver bug workaround in marching_cubes)
- added tool to print current camera position (key G)
- refactoring