
charm - Re: [charm] [ppl] Fwd: backtrace of ChaNGa process

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: Shad Kirmani <sxk5292 AT cse.psu.edu>
  • To: Pritish Jetley <pjetley2 AT illinois.edu>
  • Cc: charm AT cs.uiuc.edu, Jason Holmes <jholmes AT psu.edu>, Padma Raghavan <raghavan AT cse.psu.edu>
  • Subject: Re: [charm] [ppl] Fwd: backtrace of ChaNGa process
  • Date: Tue, 27 Mar 2012 11:51:03 -0400
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

Hello Pritish,

I compiled Charm++ (charm-6.2) with
./build ChaNGa net-linux-x86_64 ibverbs -O3

and then ran 'make' in charm-6.2/tests/charm++/megatest.

I then ran the executable pgm on 64 cores. It again hangs at the same place:
Charmrun> Waiting for 62-th client to connect.
Charmrun> Waiting for 63-th client to connect.
Charmrun> All clients connected.
Charmrun> IP tables sent.
Charmrun> node programs all connected

If you wait long enough, the run sometimes makes progress and produces the following output:
Megatest is running on 64 nodes 64 processors. 
test 0: initiated [inlineem (phil)]
test 0: completed (0.01 sec)
test 1: initiated [callback (olawlor)]
test 1: completed (3.98 sec)
test 2: initiated [immediatering (gengbin)]
....
test 48: initiated [multi nodering (milind)]
test 48: completed (0.02 sec)
test 49: initiated [multi groupring (milind)]
test 49: completed (0.02 sec)
test 50: initiated [all-at-once]
test 50: completed (0.26 sec)
All tests completed, exiting
Charmrun> Graceful exit.
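
To see which tests account for the slow runs, the "completed" lines above can be filtered by elapsed time. This is only a minimal sketch, assuming the output has been saved as text; the function name slow_tests and the 1-second threshold are illustrative choices, not part of megatest or charmrun.

```python
import re

# Matches megatest completion lines such as "test 1: completed (3.98 sec)".
COMPLETED = re.compile(r"^test (\d+): completed \(([\d.]+) sec\)$")


def slow_tests(log_text, threshold_sec=1.0):
    """Return (test_number, seconds) pairs that took longer than threshold_sec.

    slow_tests and threshold_sec are hypothetical names used for illustration.
    """
    slow = []
    for line in log_text.splitlines():
        match = COMPLETED.match(line.strip())
        if match:
            seconds = float(match.group(2))
            if seconds > threshold_sec:
                slow.append((int(match.group(1)), seconds))
    return slow


# A few completion lines copied from the output above.
sample = """\
test 0: completed (0.01 sec)
test 1: completed (3.98 sec)
test 50: completed (0.26 sec)
"""

print(slow_tests(sample))
```

On the lines shown here, only test 1 (callback) exceeds one second; the elided part of the output is not covered by this sample.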


Thanks,
Shad

On Mon, Mar 26, 2012 at 4:43 PM, Pritish Jetley <pjetley2 AT illinois.edu> wrote:
Try "megatest" first. You'll find this suite of tests in: tests/charm++/megatest

Pritish

On Mon, Mar 26, 2012 at 3:30 PM, Shad Kirmani <sxk5292 AT cse.psu.edu> wrote:
> Hello Pritish,
>
> No, I have not. I can try running the barnes code on this architecture. Or do
> you suggest running something simpler? As you can see from the output below,
> Charmrun hangs even before it enters the ChaNGa code, so I do not think this
> is a code issue.
>
> Thanks,
> Shad
>
>
> On Mon, Mar 26, 2012 at 1:58 PM, Pritish Jetley <pjetley2 AT illinois.edu> wrote:
>>
>> Have you successfully run any other Charm++ programs on this architecture?
>>
>> Pritish
>>
>> On Mon, Mar 26, 2012 at 12:22 PM, Shad Kirmani <sxk5292 AT cse.psu.edu> wrote:
>> > Hello,
>> >
>> > Sometimes at startup of ChaNGa compiled for ibverbs, the processes will hang
>> > for a long period of time at the beginning of the job.  A backtrace of a
>> > process looks like this:
>> >
>> > #0  0x00000038daa0b795 in pthread_spin_lock () from /lib64/libpthread.so.0
>> > #1  0x00002b93ecee7a7b in ibv_cmd_create_qp () from /usr/lib64/libmlx4-rdmav2.so
>> > #2  0x000000000061add0 in recvBarrierMessage ()
>> > #3  0x000000000061b882 in CmiBarrier ()
>> > #4  0x00000000006206ec in CmiTimerInit ()
>> > #5  0x00000000006216ec in ConverseCommonInit ()
>> > #6  0x000000000061d723 in ConverseInit ()
>> > #7  0x00000000005afd4c in main ()
>> >
>> > With the verbose flag added to charmrun, the hang occurs right after it says
>> > that all nodes are connected:
>> >
>> > ...
>> > Charmrun> Waiting for 62-th client to connect.
>> > Charmrun> Waiting for 63-th client to connect.
>> > Charmrun> All clients connected.
>> > Charmrun> IP tables sent.
>> > Charmrun> node programs all connected
>> >
>> > We did not see these hangs when ChaNGa was compiled for MPI-linux-x86_64
>> > instead of net-linux-x86_64 with ibverbs.  When the hang occurs, it either
>> > goes away after a period of time and the job runs, or it lasts long enough
>> > that we give up and kill the job.
>> >
>> > This is on a RedHat Enterprise Linux 5 system using libibverbs-1.1.3-2.
>> >
>> > Thanks,
>> > Shad
>>
>>
>>
>> --
>> Pritish Jetley
>> Doctoral Candidate, Computer Science
>> University of Illinois at Urbana-Champaign
>
>



--
Pritish Jetley
Doctoral Candidate, Computer Science
University of Illinois at Urbana-Champaign



