-
Notifications
You must be signed in to change notification settings - Fork 71
Task Profiler
Justus Calvin edited this page May 24, 2014
·
1 revision
MADNESS includes a task profiler, which is an event-based profiler. It records submit, start, and stop times for all tasks run by the program, which is used to create a task trace files. Task profiles can then be generated with these trace files.
To enable the task profiler, configure MADNESS with the --enable-task-profiler option.
When the task profiler is enabled, the task manager will automatically collect timing information for all tasks. The data is written to a text file before the program exits. The output file is specified with the MAD_TASKPROFILER_NAME environment variable. For example:
export MAD_TASKPROFILER_NAME=<file name>or for C shell
setenv MAD_TASKPROFILER_NAME <file name>If this environment variable is not specified, the timing data is still collected but no task profile data will be output. You will also see the following warning at the beginning of your program.
!!! WARNING: MAD_TASKPROFILER_NAME not set. !!! WARNING: There will be no task profile output.The output file specified will be appended with MPI rank and the number of threads in your application. For example, if your file name is “world_profile” and you are using 8 threads on 2 nodes, the following files will be generated:
world_profile_0x8 world_profile_1x8one on each node.
You can easily generate a flat profile with the taskprofile.pl script (located in src/bin):
$ perl taskprofile.pl world_profile_*Which will parse all files that match world_profile_* and combine the data. A sample profile from src/lib/world/world is given below.
Wall time: 5.122230 (s)
Total task time: 1.583669 (s)
Time (s)
name counts total average % latency
val1d_func(int)
100000 1.4894500 0.0000149 29 0.0000174
val_func()
100000 0.0634100 0.0000006 1 0.0000123
null_func()
100000 0.0158480 0.0000002 0 0.0000205
pounder(madness::WorldContainer<int, Mary, madness::Hash<int> > const&, int)
7 0.0142600 0.0020371 0 0.0000500
TTT::kate(madness::World*, std::string const&, double)
2 0.0002130 0.0001065 0 0.0000790
TTT::mary()
1 0.0001820 0.0001820 0 0.0000540
TTT::carl()
1 0.0001450 0.0001450 0 0.0000610
UNKNOWN
1 0.0000430 0.0000430 0 0.0000420
TTT::fred()
1 0.0000260 0.0000260 0 0.0000590
madness::tr1::detail::memfunc_traits<double (Mary::*)(std::string const&, int, double)>::result_type madness::WorldContainerImpl<int, Mary, madness::Hash<int> >::itemfun<double (Mary::*)(std::string const&, int, double), std::string, int, double>(int const&, double (Mary::*)(std::string const&, int, double), std::string const&, int const&, double const&)
1 0.0000200 0.0000200 0 0.0000100
madness::tr1::detail::memfunc_traits<std::string (Mary::*)(int, int)>::result_type madness::WorldContainerImpl<int, Mary, madness::Hash<int> >::itemfun<std::string (Mary::*)(int, int), int, int>(int const&, std::string (Mary::*)(int, int), int const&, int const&)
1 0.0000200 0.0000200 0 0.0000400
TTT::jody(double, double, double)
1 0.0000050 0.0000050 0 0.0000580
TTT::hugh(std::vector<madness::Future<int>, std::allocator<madness::Future<int> > >&)
1 0.0000050 0.0000050 0 0.0000170
Foo::get3c(int, char, short) const
1 0.0000040 0.0000040 0 0.0000140
TTT::dave(madness::World*)
1 0.0000040 0.0000040 0 0.0000150
TTT::sara(double, std::complex<double> const&)
1 0.0000040 0.0000040 0 0.0000690
Foo::get0f()
1 0.0000030 0.0000030 0 0.0000170
dumb(int, int, int, int, int, int, int)
1 0.0000030 0.0000030 0 0.0000080
Foo::get0()
1 0.0000030 0.0000030 0 0.0000140
TTT::bert(int)
1 0.0000030 0.0000030 0 0.0000380
Foo::get1c(int) const
1 0.0000020 0.0000020 0 0.0000130
Foo::get4c(int, char, short, long) const
1 0.0000020 0.0000020 0 0.0000160
Foo::get4(int, char, short, long)
1 0.0000020 0.0000020 0 0.0000130
Foo::get5c(int, char, short, long, short) const
1 0.0000020 0.0000020 0 0.0000130
Foo::get3(int, char, short)
1 0.0000020 0.0000020 0 0.0000130
Foo::get1(int)
1 0.0000020 0.0000020 0 0.0000220
Foo::get0c() const
1 0.0000020 0.0000020 0 0.0000160
Foo::get2(int, char)
1 0.0000020 0.0000020 0 0.0000130
Foo::get2c(int, char) const
1 0.0000010 0.0000010 0 0.0000140
Foo::get5(int, char, short, long, short)
1 0.0000010 0.0000010 0 0.0000130
0x0
7 0.0000000 0.0000000 0 0.0000757