Open
Description
It's been a while since we did any good performance profiling, and we'd like to know the performance profile of the existing master before doing a performance comparison of the new shared-memory parallel versions.
Additionally, I watched this stunning talk by Emery Berger on performance profiling (Performance Matters) where he talks about a few of his profiling tools that take the statistical approach we need to be using in our performance analysis.
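As a rough illustration of what a statistical approach could look like at its simplest, the sketch below times a command several times and reports the mean, the sample standard deviation, and an approximate 95% confidence interval. It is not taken from coz or stabilizer and does not randomize memory layout; the function names `time_command` and `summarize` and the placeholder command are assumptions made for this example only.

```python
import math
import statistics
import subprocess
import sys
import time


def time_command(cmd, repeats=10):
    """Run `cmd` `repeats` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return sample count, mean, sample std dev, and a 95% CI half-width."""
    n = len(durations)
    mean = statistics.mean(durations)
    stdev = statistics.stdev(durations) if n > 1 else 0.0
    # Normal approximation (z = 1.96). For a small number of runs, a
    # Student-t quantile would give a more honest (wider) interval.
    half_width = 1.96 * stdev / math.sqrt(n)
    return {"n": n, "mean": mean, "stdev": stdev, "ci95": half_width}


if __name__ == "__main__":
    # Placeholder command (hypothetical); replace it with the actual
    # ParFlow invocation for one of the test cases.
    cmd = [sys.executable, "-c", "sum(range(100000))"]
    print(summarize(time_command(cmd, repeats=5)))
```

Reporting an interval rather than a single timing makes it easier to tell whether a difference between two builds is larger than the run-to-run noise.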
Discussion:
- What profilers?
- Suggest all of:
- gprof
- We (or at least I) have been using this from the beginning, so it leverages our existing experience with ParFlow performance.
- coz (plasma-umass/coz)
- Causal profiler.
- stabilizer (plasma-umass/stabilizer)
- Profiler that eliminates the effects of address space layout (which can apparently have incredible effects on performance and can stem from benign artifacts, such as long or short usernames)
- What test cases?
- Suggest all of:
- ClayL
- RU-Conus
- TFG-Conus
- Big Sinusoidal
Deliverables:
- Profiling results from parflow/parflow/master
- Profiling results from hydroframe/ParFlow_PerfTeam/pf_cuda
- With neither CUDA nor OpenMP enabled
- With CUDA enabled
- With OpenMP enabled
- (If OpenMP and CUDA can be used together) With both CUDA and OpenMP enabled.
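To keep track of which runs are still outstanding, the deliverables and test cases above can be enumerated as a checklist. This is only a sketch under two assumptions: that the CUDA/OpenMP variants apply to the pf_cuda branch only, and that master is profiled in a single build (labelled `default` here, a hypothetical label). The function name `profiling_matrix` is also made up for this example.

```python
from itertools import product

# Test cases proposed in the discussion above.
TEST_CASES = ["ClayL", "RU-Conus", "TFG-Conus", "Big Sinusoidal"]

# Assumption: master is profiled in one build ("default" is a hypothetical
# label), while pf_cuda is profiled once per backend configuration.
BUILDS = {
    "parflow/parflow/master": ["default"],
    "hydroframe/ParFlow_PerfTeam/pf_cuda": [
        "no CUDA, no OpenMP",
        "CUDA",
        "OpenMP",
        "CUDA + OpenMP (if supported)",
    ],
}


def profiling_matrix():
    """Return every (branch, configuration, test case) combination to profile."""
    runs = []
    for branch, configs in BUILDS.items():
        for config, case in product(configs, TEST_CASES):
            runs.append((branch, config, case))
    return runs


if __name__ == "__main__":
    for branch, config, case in profiling_matrix():
        print(f"{branch} | {config} | {case}")
```

Multiplying this list by the number of profilers gives the full set of results to upload.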
Done When:
- Profiling results for the implementations and test cases are uploaded here.