charm - [charm] TAU crashes on simple Charm++ program

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: "Hynninen, Antti-Pekka" <hynninena AT ornl.gov>
  • To: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>
  • Subject: [charm] TAU crashes on simple Charm++ program
  • Date: Tue, 21 Jul 2015 16:08:26 +0000
  • Accept-language: en-US
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm/>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>


I compiled Charm++ (6.6.1) with TAU (2.24.0) on Titan using:

./build charm++ gemini_gni-crayxe-persistent-smp -optimize
./build Tau gemini_gni-crayxe-persistent-smp --tau-makefile=$TAU_MAKEFILE --no-build-shared -optimize

I then compiled the "simplearrayhello" example from the Charm++ (6.6.1) library using:
make OPTS='-tracemode Tau'

When I run the example with more than one worker thread (here "+ppn 2"), it crashes:

--------------------------------------------------------------
hynninen@titan-login7:/lustre/atlas/scratch/hynninen/stf006> aprun -n1 -N1 -d16 ./hello +ppn 2
Charm++> Running on Gemini (GNI) with 1 processes
Charm++> static SMSG
Charm++> memory pool init block size: 8MB, total memory pool limit 0MB (0 means no limit)
Charm++> memory pool registered memory limit: 200000MB, send limit: 100000MB
Charm++> only comm thread send/recv messages
Charm++> Cray TLB page size: 8192K
Charm++> Running in SMP mode: numNodes 1,  2 worker threads per process
Charm++> The comm. thread both sends and receives messages
Converse/Charm++ Commit ID: v6.6.1-0-g74a2cc5-namd-charm-6.6.1-build-2015-Mar-15-209687
Trace: traceroot: ./hello
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (16-way SMP).
Running Hello on 2 processors for 5 elements
Hello 0 created
Hello 1 created
Hello 2 created
Hello 3 created
Hello 4 created
Hi[17] from element 0
Hi[18] from element 1
[19071:0-0] TAU: Runtime overlap: found Hello::SayHi(int hiNo)::155 (0x2aab741ae790) on the stack, but stop called on Idle (0x100000a43b0)
Hi[19] from element 2
./hello() [0x201c126b]
./hello(Tau_stop_timer+0x196) [0x201c4796]
./hello(traceCommonEndIdle+0x57) [0x200c5237]
./hello(CcdRaiseCondition+0xf5) [0x20197b35]
./hello(CsdScheduleForever+0xf2) [0x20190ff2]
./hello(CsdScheduler+0x2d) [0x201912ed]
./hello() [0x2018f282]
./hello() [0x2018f705]
/lib64/libpthread.so.0(+0x7806) [0x2aaaaaeea806]
/lib64/libc.so.6(clone+0x6d) [0x2aaaafcb964d]
TAU: signal 6 on 0 - calling TAU_PROFILE_EXIT()...
TAU: done.
Application 8873155 exit codes: 1
Application 8873155 resources: utime ~0s, stime ~1s, Rss ~15900, inblocks ~11825, outblocks ~26705
--------------------------------------------------------------

The crash happens at program exit, before the trace/profile is written. Trace data is written for thread 0, but not for the other threads. The example runs fine with a single thread, and it also runs fine when it is not profiled with TAU.
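
For reference, the "Runtime overlap" line in the log above is the kind of message a profiler prints when timers are expected to nest strictly: a stop request is only valid for the timer currently on top of the per-thread timer stack. The following is a minimal sketch of that invariant, assuming a simple per-thread stack of timer names. The TimerStack class and its methods are hypothetical and written only for illustration; this is not TAU's or Charm++'s actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical illustration of strictly nested timers.
// NOT the TAU implementation; names are invented for this sketch.
class TimerStack {
public:
    // Begin timing a region: push it on top of the stack.
    void start(const std::string& name) { stack_.push_back(name); }

    // Stop a region. Returns true only if `name` is the innermost running
    // timer. Returns false on an "overlap", i.e. a stop request for a timer
    // that is not on top of the stack; the stack is left unchanged.
    bool stop(const std::string& name) {
        if (stack_.empty() || stack_.back() != name) {
            return false;
        }
        stack_.pop_back();
        return true;
    }

    // Number of timers currently running.
    std::size_t depth() const { return stack_.size(); }

private:
    std::vector<std::string> stack_;
};

// Mirrors the situation reported in the log: an entry-method timer is on top
// of the stack when a stop is requested for the idle timer.
inline bool stop_idle_while_entry_running() {
    TimerStack timers;
    timers.start("Hello::SayHi");
    return timers.stop("Idle");  // rejected: "Idle" is not on top
}
```

In this sketch the mismatched stop is simply rejected. In the run above, the same kind of mismatch on a worker thread is followed by signal 6 and the backtrace through traceCommonEndIdle and Tau_stop_timer.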

I see similar behavior with every other Charm++ program I have tried (e.g. NAMD and a self-written "hello world").

--
Antti-Pekka Hynninen
email: hynninena AT ornl.gov
phone: 865-241-6123
Scientific Computing
Oak Ridge National Laboratory


