Re: [charm] [ppl] ampi app with mpi-io


  • From: Gengbin Zheng <zhenggb AT gmail.com>
  • To: Jim Edwards <jedwards AT ucar.edu>
  • Cc: charm AT cs.uiuc.edu
  • Subject: Re: [charm] [ppl] ampi app with mpi-io
  • Date: Fri, 15 Jul 2011 13:29:41 -0500
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

This might be caused by mixing MPICH and AMPI libraries.
I hacked the romio library in AMPI this morning, and now the
testmpiio program works on 1 processor. Running on two processors,
the program still hangs in MPI_FILE_OPEN; I'm not sure what is happening.

The same code works with OpenMPI on my desktop.

Gengbin
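
(For reference, since the attached Fortran test program testmpiio2.F90 is not
reproduced in this archive, here is a minimal MPI-IO sketch in C of the kind of
open/write path being discussed. The file name and buffer size are illustrative
only, not taken from the actual test.)

/* Minimal MPI-IO sketch (C). Not the attached testmpiio2.F90; it only
 * illustrates the MPI_File_open / write path the thread is about.
 * File name and buffer size are made up for illustration. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, nranks;
    MPI_File fh;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    const int n = 1024;                  /* ints written per rank */
    int *buf = malloc(n * sizeof(int));  /* heap, not stack; see the
                                            stack-size discussion below */
    for (int i = 0; i < n; i++) buf[i] = rank;

    /* Collective open: this is the call the 2-processor run hangs in. */
    MPI_File_open(MPI_COMM_WORLD, "testmpiio.out",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Each rank writes its own contiguous block of the file. */
    MPI_Offset off = (MPI_Offset)rank * n * sizeof(int);
    MPI_File_write_at(fh, off, buf, n, MPI_INT, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    free(buf);
    MPI_Finalize();
    return 0;
}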

On Thu, Jul 14, 2011 at 2:14 PM, Jim Edwards
<jedwards AT ucar.edu>
wrote:
> So I downloaded the charm CVS trunk and rebuilt, but I'm still stuck in the
> same place:
>
>
>
> Running on 2 processors:  ./testmpiio +tcharm_stacksize 10000000
> aprun -n 2 ./testmpiio +tcharm_stacksize 10000000
> _pmii_daemon(SIGCHLD): [NID 00026] PE 1 exit signal Segmentation fault
> [NID 00026] 2011-07-14 13:12:08 Apid 157441: initiated application termination
> Application 157441 exit codes: 139
> Application 157441 resources: utime 0, stime 0
>
>
> On Thu, Jul 14, 2011 at 9:50 AM, Gengbin Zheng
> <zhenggb AT gmail.com>
> wrote:
>>
>> Hmm, it should be recognized by the binary itself.
>>
>> Gengbin
>>
>> On Thu, Jul 14, 2011 at 10:39 AM, Jim Edwards
>> <jedwards AT ucar.edu>
>> wrote:
>> >
>> >
>> > On Thu, Jul 14, 2011 at 9:29 AM, Gengbin Zheng
>> > <zhenggb AT gmail.com>
>> > wrote:
>> >>
>> >> Jim,
>> >>
>> >> The code declares a 4MB buffer on the stack (thanks to Orion, who spotted
>> >> this). Note that an AMPI rank runs as a user-level thread, which by
>> >> default has a stack size of 1MB. You can expand the stack size, say to
>> >> 10MB, with:
>> >>
>> >> +tcharm_stacksize  10000000
>> >>
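
(To make the stack-size point above concrete: a roughly 4MB automatic buffer
cannot fit in AMPI's default 1MB user-level thread stack. The sketch below is
in C with illustrative sizes; the actual test is Fortran, where the analogue
would be an ALLOCATABLE array rather than a large local array. The two
workarounds are raising the thread stack at run time, as Gengbin suggests, or
moving the buffer to the heap.)

/* Illustration of the stack-size issue (C, illustrative sizes only). */
#include <stdlib.h>

void buffer_on_stack(void)
{
    char buf[4 * 1024 * 1024];  /* ~4MB automatic array: overflows the
                                   default 1MB user-level thread stack
                                   under AMPI */
    buf[0] = 0;
}

void buffer_on_heap(void)
{
    char *buf = malloc(4 * 1024 * 1024);  /* heap allocation is not limited
                                             by the thread stack size */
    if (buf) { buf[0] = 0; free(buf); }
}

/* Alternatively keep the stack buffer and enlarge the thread stack at
 * run time with the option quoted above, e.g.
 *   ./testmpiio +tcharm_stacksize 10000000 */
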
>> > Am I supposed to add that to the charmrun command? It doesn't recognize it
>> > and just passes it on to aprun.
>> >
>> >
>> >
>> >
>> >>
>> >> Gengbin
>> >>
>> >> On Thu, Jul 14, 2011 at 7:55 AM, Jim Edwards
>> >> <jedwards AT ucar.edu>
>> >> wrote:
>> >> > I found that the link step wasn't including libmpio.a, so I added it,
>> >> > and now I get even less output ...
>> >> >
>> >> > Running on 2 processors:  /glade/scratch/jedwards/pnettest/testmpiio
>> >> > aprun -n 2 /glade/scratch/jedwards/pnettest/testmpiio
>> >> > _pmii_daemon(SIGCHLD): [NID 00031] PE 1 exit signal Segmentation fault
>> >> > [NID 00031] 2011-07-14 06:51:35 Apid 157339: initiated application termination
>> >> > Application 157339 exit codes: 139
>> >> > Application 157339 resources: utime 0, stime 0
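
(The exact compile/link line is not shown in the thread; the only thing
established above is that libmpio.a had to be added to it. As a sketch only,
and assuming the charmc wrapper that AMPI builds use, the adjusted steps might
look roughly like this; flags and file names may differ on this system:)

charmc -language ampif -g -DAMPI -c testmpiio2.F90
charmc -language ampif -g -o testmpiio testmpiio2.o -lmpio
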
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Wed, Jul 13, 2011 at 9:18 PM, Jim Edwards
>> >> > <jedwards AT ucar.edu>
>> >> > wrote:
>> >> >>
>> >> >> On Wed, Jul 13, 2011 at 8:38 PM, Gengbin Zheng
>> >> >> <zhenggb AT gmail.com>
>> >> >> wrote:
>> >> >>>
>> >> >>> Hi Jim,
>> >> >>>
>> >> >>> If you compile charm/AMPI and the test code with -g, you can get a
>> >> >>> stack trace with the addr2line command:
>> >> >>>
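
(For example, with a -g build, addresses from the crash backtrace can be
resolved like this; the binary name is taken from the thread, but the addresses
below are placeholders, not from this run:)

addr2line -f -C -e ./testmpiio 0x401a2b 0x4030ef

Here -f prints the enclosing function names and -C demangles C++ symbols, which
helps when the frames land inside ampi.C or tcharm.C.
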
>> >> >>
>> >> >> Thanks, here it is:
>> >> >> libalpsutil.c:0
>> >> >> ??:0
>> >> >> ??:0
>> >> >> ??:0
>> >> >> /glade/scratch/jedwards/pnettest/testmpiio2.F90:66
>> >> >>
>> >> >>
>> >> >>
>> >> >> /glade/home/jedwards/src/charm-6.2/mpi-crayxt/tmp/libs/ck-libs/ampi/ampi.C:404
>> >> >>
>> >> >>
>> >> >>
>> >> >> /glade/home/jedwards/src/charm-6.2/mpi-crayxt/tmp/libs/ck-libs/ampi/ampi.C:554
>> >> >>
>> >> >>
>> >> >>
>> >> >> /glade/home/jedwards/src/charm-6.2/mpi-crayxt/tmp/libs/ck-libs/ampi/ampi.C:570
>> >> >>
>> >> >>
>> >> >>
>> >> >> /glade/home/jedwards/src/charm-6.2/mpi-crayxt/tmp/libs/ck-libs/tcharm/tcharm.C:139
>> >> >> /glade/home/jedwards/src/charm-6.2/mpi-crayxt/tmp/threads.c:1850
>> >> >> md/setjmp64.c:57
>> >> >> md/setjmp64.c:70
>> >> >>
>> >> >>
>> >> >>>
>> >> >>> Could you also please send us your test program? We can take a look
>> >> >>> here.
>> >> >>>
>> >> >> The program is attached. I compile with -DAMPI and have hardcoded the
>> >> >> path to the output file.
>> >> >>
>> >> >>
>> >> >>>
>> >> >>> Gengbin
>> >> >>>
>> >> >
>> >> >
>> >
>> >
>
>




