
charm - Re: [charm] TAU crashes on simple Charm++ program

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: Eric Bohm <ebohm AT illinois.edu>
  • To: <charm AT cs.uiuc.edu>
  • Subject: Re: [charm] TAU crashes on simple Charm++ program
  • Date: Wed, 22 Jul 2015 10:29:02 -0500
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm/>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

Hello,

The Tau build is experimental and not widely used, so it may take us a little while to unravel the issue you ran across. The typical process for analyzing the performance of a Charm++ application relies on the Projections tool (http://charm.cs.illinois.edu/manuals/html/projections/manual.html). Please let us know if that is sufficient for your purposes.

On 07/21/2015 11:08 AM, Hynninen, Antti-Pekka wrote:

I compiled Charm++ (6.6.1) with TAU (2.24.0) on Titan using:

./build charm++ gemini_gni-crayxe-persistent-smp -optimize
./build Tau gemini_gni-crayxe-persistent-smp --tau-makefile=$TAU_MAKEFILE --no-build-shared -optimize

I then compiled the "simplearrayhello" example from the Charm++ (6.6.1) distribution using:
make OPTS='-tracemode Tau'

When I run the example with more than one thread, it crashes:

--------------------------------------------------------------
hynninen@titan-login7:/lustre/atlas/scratch/hynninen/stf006> aprun -n1 -N1 -d16 ./hello +ppn 2
Charm++> Running on Gemini (GNI) with 1 processes
Charm++> static SMSG
Charm++> memory pool init block size: 8MB, total memory pool limit 0MB (0 means no limit)
Charm++> memory pool registered memory limit: 200000MB, send limit: 100000MB
Charm++> only comm thread send/recv messages
Charm++> Cray TLB page size: 8192K
Charm++> Running in SMP mode: numNodes 1,  2 worker threads per process
Charm++> The comm. thread both sends and receives messages
Converse/Charm++ Commit ID: v6.6.1-0-g74a2cc5-namd-charm-6.6.1-build-2015-Mar-15-209687
Trace: traceroot: ./hello
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (16-way SMP).
Running Hello on 2 processors for 5 elements
Hello 0 created
Hello 1 created
Hello 2 created
Hello 3 created
Hello 4 created
Hi[17] from element 0
Hi[18] from element 1
[19071:0-0] TAU: Runtime overlap: found Hello::SayHi(int hiNo)::155 (0x2aab741ae790) on the stack, but stop called on Idle (0x100000a43b0)
Hi[19] from element 2
./hello() [0x201c126b]
./hello(Tau_stop_timer+0x196) [0x201c4796]
./hello(traceCommonEndIdle+0x57) [0x200c5237]
./hello(CcdRaiseCondition+0xf5) [0x20197b35]
./hello(CsdScheduleForever+0xf2) [0x20190ff2]
./hello(CsdScheduler+0x2d) [0x201912ed]
./hello() [0x2018f282]
./hello() [0x2018f705]
/lib64/libpthread.so.0(+0x7806) [0x2aaaaaeea806]
/lib64/libc.so.6(clone+0x6d) [0x2aaaafcb964d]
TAU: signal 6 on 0 - calling TAU_PROFILE_EXIT()...
TAU: done.
Application 8873155 exit codes: 1
Application 8873155 resources: utime ~0s, stime ~1s, Rss ~15900, inblocks ~11825, outblocks ~26705
--------------------------------------------------------------

The crash happens at program exit, before the trace/profile is written. Trace data is written for thread 0, but not for the other threads. The example runs fine with a single thread, and also when it is not profiled with TAU.
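
The "Runtime overlap" line in the log above reads like the check a strictly nested timer stack would make: a stop request names a timer ("Idle") that is not the one on top of the stack ("Hello::SayHi"). The following is only a minimal sketch of that kind of check, written in Python for brevity. The class TimerStack and the helper run_events are hypothetical names for illustration; this is not TAU's or Charm++'s actual code, and it does not claim to identify the cause of the crash.

```python
class TimerStack:
    """Hypothetical strictly nested timer stack (illustration only, not TAU code)."""

    def __init__(self):
        self._stack = []

    def start(self, name):
        # Starting a timer pushes it; it must be stopped before any timer below it.
        self._stack.append(name)

    def stop(self, name):
        if not self._stack:
            raise RuntimeError(f"stop called on {name} with an empty stack")
        top = self._stack[-1]
        if top != name:
            # The timer being stopped is not the innermost running timer.
            raise RuntimeError(
                f"Runtime overlap: found {top} on the stack, but stop called on {name}"
            )
        self._stack.pop()


def run_events(events):
    """Replay (action, name) events; return the error text, or None if well nested."""
    timers = TimerStack()
    try:
        for action, name in events:
            if action == "start":
                timers.start(name)
            else:
                timers.stop(name)
    except RuntimeError as err:
        return str(err)
    return None


# Idle ends before the entry method begins: properly nested.
well_nested = [("start", "Idle"), ("stop", "Idle"),
               ("start", "Hello::SayHi"), ("stop", "Hello::SayHi")]

# The entry method starts while Idle is still running, then Idle is stopped first.
overlapping = [("start", "Idle"), ("start", "Hello::SayHi"), ("stop", "Idle")]

print(run_events(well_nested))
print(run_events(overlapping))
```

If each thread kept its own stack, events from one worker thread could not interleave with another's. Whether something like that is involved here is an open question for the developers.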

Similar behavior is observed with all other Charm++ programs I have tried (e.g. NAMD and a self-written "hello world").

--
Antti-Pekka Hynninen
phone: 865-241-6123
Scientific Computing
Oak Ridge National Laboratory


_______________________________________________
charm mailing list
charm AT cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/charm



