


  • From: Jim Phillips <jim AT ks.uiuc.edu>
  • To: Leonardo Duarte <leo.duarte AT gmail.com>
  • Cc: Scott Field <sfield AT astro.cornell.edu>, Charm Mailing List <charm AT cs.illinois.edu>
  • Subject: Re: [charm] [ppl] Using Charm AMPI
  • Date: Thu, 29 Oct 2015 16:14:51 -0500 (CDT)


Be sure you are explicitly setting +commap and +pemap. If you don't, you can end up with all of your threads on the same core.
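
For example, Scott's launch line from later in this thread does this explicitly (it assumes 32-core XE nodes, pinning the comm thread to core 0 and the 30 worker threads to cores 1-30):

aprun -n 2 -r 1 -N 1 -d 31 ./ExecutableName +ppn 30 +pemap 1-30 +commap 0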

I also recommend PrgEnv-gnu, as the underlying gcc compilers have thousands of times as many users as the Cray compilers. The Cray Fortran compiler is no doubt decades ahead of gnu, but not its C++ compiler, and I wouldn't be surprised if the Cray malloc makes assumptions that destroy Charm++ performance.

Jim


On Thu, 29 Oct 2015, Leonardo Duarte wrote:

Hello Scott, thanks for your help.

I also swapped to PrgEnv-gnu, loaded hugepages8M and rca, and used the
persistent option to build charm. It worked, but it was extremely slow.
A simple example runs in seconds on my laptop with 2 processors (simulating 2
nodes) and takes 10 minutes on 2 nodes of BW.
Of course I was expecting it to be slower, but not this much.
That's why I decided to use the PrgEnv-cray environment; it's the native
toolchain.

AMPI does not support +ppn 30; it takes the thread count from the aprun parameters.
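(A hedged sketch of how the counts are usually kept consistent, going by Scott's example below: the aprun depth covers the worker threads plus the comm thread, so his 30-worker case uses -d 31 together with +ppn 30, and a 1-worker-per-process run would presumably look something like

aprun -n 2 -N 1 -d 2 ./partopsimapp +pemap 1 +commap 0

with one core per worker thread and one for the comm thread.)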

My startup line with only 2 nodes and only 1 worker thread per process is
not wrong.
Since I was having trouble running it, I simplified the example to
better understand what was going on.

However, it's good to know that your application uses PrgEnv-gnu.
I was worried that mine was too slow because I was using it, or because I
was missing something in the build.

I really want to make it work with PrgEnv-cray right now, but I don't know
what I'm doing wrong.

Thanks for your answer!
Leonardo.

On Thu, Oct 29, 2015 at 11:00 AM, Scott Field <sfield AT astro.cornell.edu> wrote:

Hi Leonardo,

I have a charm++ application running on Blue Waters, and hopefully some
of this will carry over to AMPI.

In addition to the default blue waters environment, I use

module swap PrgEnv-cray PrgEnv-gnu/5.2.40
module load craype-hugepages2M
module load rca

and my charm++ build includes the option "persistent". To launch the
application I do

aprun -n 2 -r 1 -N 1 -d 31 ./ExecutableName +ppn 30 +pemap 1-30 +commap 0

On startup, my charm++ output looks different from yours. In particular, I
see

"Charm++> Running in SMP mode: numNodes 2, 30 worker threads per process"

while yours reads

"*Charm++> Running in SMP mode: numNodes 2, 1 worker threads per
process"*

These differences may or may not explain the errors you see. Hopefully it
helps. Good luck!

Scott


On Thu, Oct 29, 2015 at 1:58 AM, Leonardo Duarte <leo.duarte AT gmail.com> wrote:

Hello Everyone,

I'm a PhD student in the CEE department at UIUC, and I would
really appreciate it if anyone could help me with Charm.

I'm trying to run my code on Blue Waters, and I'm using a library that
uses Charm++ AMPI.
I was able to build and run everything correctly, but it was extremely slow
with PrgEnv-gnu.
Now I'm trying to use the native Cray environment.

I'm using this BW environment and modules:

PrgEnv-cray
module load craype-hugepages8M
module load rca

I built charm with this command line:

./build LIBS gni-crayxe craycc smp -j16 --with-production --build-shared -O3

My code is composed of many shared libraries that are loaded
dynamically by the application using dlopen, dlsym, etc.
I'm able to build my code using these command lines in my makefiles:

To compile code that does not use Charm:

CC -c -fPIC -O2 -I../../core/include -I../../tecgraf/tops/include -o ../../obj/obj64/linear/Linux3/linear.o ../../plugins/behavior/linear/linear.cpp

To link code that does not use Charm:
CC -shared -Wl,-soname,liblinear.so.1 -o liblinear.so.1.0 ../../obj/obj64/linear/Linux3/linear.o -L../../tecgraf/tops/lib64/Linux3 -ltops -L../../bin/lib64/Linux3 -ltopsim

To compile code that uses Charm:
charmc -language model -c -fPIC -O2 -I../../core/include -I../../tecgraf/tops/include -I../../tecgraf/tops/include/vis -I../../../bin/charm/include -o ../../obj/obj64/parebepcg/Linux3/parebepcg.o ../../plugins/linearsystem/ebepcg/parebepcg.cpp

To link code that uses Charm:

charmc -shared -language ampi -Wl,-soname,libparebepcg.so.1 -o libparebepcg.so.1.0 ../../obj/obj64/parebepcg/Linux3/parebepcg.o -L../../tecgraf/tops/lib64/Linux3 -lpartops -ltopsrd -ltops -L../../bin/lib64/Linux3 -lpartopsim

To compile my app:
charmc -language model -c -fPIC -O2 -I../../core/include -I../../tecgraf/tops/include -I../../tecgraf/tops/include/vis -I../../plugins -o ../../obj/obj64/partopsimapp/partopsimapp/Linux3/parmain.o ../../tests/app/parmain.cpp

To link my app:
charmc -language ampi -dynamic -o ../../bin/lib64/Linux3/partopsimapp ../../obj/obj64/partopsimapp/partopsimapp/Linux3/parmain.o -L../../tecgraf/tops/lib64/Linux3 -lpartops -ltopsrd -ltops -L../../bin/lib64/Linux3 -lpartopsim -lpartopsimlib -Wl,--no-as-needed -ldl
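
Given the dlopen-based plugin loading described above, one quick sanity check on the shared objects produced by these link lines (a hedged sketch using the library name from above; adjust paths as needed) is to confirm each plugin finds its dependencies and see which symbols it still expects to be resolved when it is dlopen'ed:

# check that every NEEDED library is found on LD_LIBRARY_PATH
ldd libparebepcg.so.1.0
# list the dynamic symbols the plugin expects to be resolved at load time
nm -D --undefined-only libparebepcg.so.1.0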

This is the error that I get:

_pmiu_daemon(SIGCHLD): [NID 16828] [c19-9c1s1n0] [Thu Oct 29 00:35:04 2015] PE RANK 0 exit signal Segmentation fault
[NID 16828] 2015-10-29 00:35:04 Apid 28607883: initiated application termination
_pmiu_daemon(SIGCHLD): [NID 16829] [c19-9c1s1n1] [Thu Oct 29 00:35:04 2015] PE RANK 1 exit signal Segmentation fault
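
(One way to get a stack trace out of a segfault like this on a Cray machine, assuming the atp module is available on Blue Waters, is to enable Abnormal Termination Processing in the job script before the aprun line:

module load atp
export ATP_ENABLED=1

ATP should then report a backtrace for the failing rank.)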

I put some extra info at the end of the email in case you need it.
I read a lot of things on the internet and I've been trying a lot, but
now I think I need some help.
Am I missing something? Is this the correct way to handle it?
I really appreciate any suggestions.

Thank you.
Leonardo.

Extra info

These are my environment variables:

echo $PATH
.:/u/psp/duarte/bin/lua5:/u/psp/duarte/bin/tolua5:/u/psp/duarte/bin/charm/gni-crayxe-smp-craycc/bin:/u/psp/duarte/bin/charm/gni-crayxe-persistent-smp/bin:/sw/xe/darshan/2.3.0/darshan-2.3.0_cle52/bin:/sw/admin/scripts:/sw/user/scripts:/sw/xe/altd/bin:/usr/local/gsi-openssh-6.2p2-2/bin:/opt/java/jdk1.7.0_45/bin:/usr/local/globus-5.2.4/bin:/usr/local/globus-5.2.4/sbin:/opt/moab/8.1/bin:/opt/moab/8.1/sbin:/opt/torque/5.0.2-bwpatch/sbin:/opt/torque/5.0.2-bwpatch/bin:/opt/cray/mpt/7.2.0/gni/bin:/opt/cray/rca/1.0.0-2.0502.53711.3.125.gem/bin:/opt/cray/alps/5.2.1-2.0502.9041.11.6.gem/sbin:/opt/cray/alps/5.2.1-2.0502.9041.11.6.gem/bin:/opt/cray/dvs/2.5_0.9.0-1.0502.1873.1.142.gem/bin:/opt/cray/xpmem/0.1-2.0502.55507.3.2.gem/bin:/opt/cray/dmapp/7.0.1-1.0502.9501.5.211.gem/bin:/opt/cray/pmi/5.0.6-1.0000.10439.140.3.gem/bin:/opt/cray/ugni/5.0-1.0502.9685.4.24.gem/bin:/opt/cray/udreg/2.3.2-1.0502.9275.1.25.gem/bin:/opt/cray/cce/8.3.10/cray-binutils/x86_64-unknown-linux-gnu/bin:/opt/cray/cce/8.3.10/craylibs/x86-64/bin:/opt/cray/cce/8.3.10/cftn/bin:/opt/cray/cce/8.3.10/CC/bin:/opt/cray/craype/2.3.0/bin:/opt/cray/eslogin/eswrap/1.1.0-1.020200.1231.0/bin:/opt/modules/3.2.10.3/bin:/u/psp/duarte/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/usr/lib/mit/bin:/usr/lib/mit/sbin:/usr/lib/qt3/bin:/opt/cray/bin

echo $LD_LIBRARY_PATH

.:/u/psp/duarte/topsim/bin/lib64/Linux3:/u/psp/duarte/topsim/bin/libd64/Linux3:/u/psp/duarte/bin/charm/gni-crayxe-smp-craycc/lib_so:/u/psp/duarte/bin/charm/gni-crayxe-smp-craycc/lib:/u/psp/duarte/bin/charm/gni-crayxe-persistent-smp/lib:/u/psp/duarte/lib:/sw/xe/darshan/2.3.0/darshan-2.3.0_cle52/lib:/usr/local/globus-5.2.4/lib64:/usr/local/globus/lib64


My app output:

Charm++> memory pool registered memory limit: 200000MB, send limit: 100000MB
Charm++> only comm thread send/recv messages
Charm++> Cray TLB page size: 8192K
Charm++> Running in SMP mode: numNodes 2, 1 worker threads per process
Charm++> The comm. thread both sends and receives messages
Converse/Charm++ Commit ID: v6.6.1-0-g74a2cc5
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 2 unique compute nodes (32-way SMP).
**** Topsim 0.1.0 ****
[0] topParInit() registered
[0] TopParContext created: 0!
[0] topParInit() array created
[1] TopParContext created: 1!
[1] topParInit() registered
[1] topParInit() array created
[0] topParInit() done!
[1] topParInit() done!
[0] PARTOPS: Slave started at processor 0, node: 0, rank: 0.
[0] PARTOPS: MODEL CREATED! rank: 0
[1] PARTOPS: Slave started at processor 1, node: 1, rank: 0.
[1] PARTOPS: MODEL CREATED! rank: 0
Plugin loaded libparebepcg.so
Plugin loaded libpartreader.so
Plugin loaded libisotropic.so
Plugin loaded liblinear.so
Plugin loaded libparsimp.so
Plugin loaded libbrick.so
Plugin loaded libpartreader.so
Plugin loaded libparebepcg.so
Plugin loaded libparloadcontrol.so
Plugin loaded libparwriter.so
Plugin loaded libparsimp.so
Plugin loaded libparjacobi.so
Plugin loaded libbrick.so
Plugin loaded libparwriter.so
Plugin loaded liblinear.so
Plugin loaded libisotropic.so
Plugin loaded libparloadcontrol.so
Plugin loaded libparjacobi.so
Application 28607883 exit codes: 139
Application 28607883 resources: utime ~2s, stime ~2s, Rss ~15384, inblocks ~10927, outblocks ~18489
Thu Oct 29 00:35:04 CDT 2015

This is my PBS script

#!/bin/bash
### set the number of nodes
### set the number of PEs per node
#PBS -l nodes=2:ppn=1:xe
### set the wallclock time
#PBS -l walltime=00:20:00
### set the job name
#PBS -N topsim
### set the job stdout and stderr
#PBS -e topsim.err
#PBS -o topsim.out
### set email notification
#PBS -m bea
#PBS -M leo.duarte AT gmail.com
### In case of multiple allocations, select which one to charge
##PBS -A xyz

# NOTE: lines that begin with "#PBS" are not interpreted by the shell but ARE
# used by the batch system, whereas lines that begin with multiple # signs,
# like "##PBS" are considered "commented out" by the batch system
# and have no effect.

# If you launched the job in a directory prepared for the job to run within,
# you'll want to cd to that directory
# [uncomment the following line to enable this]
cd $PBS_O_WORKDIR

# Alternatively, the job script can create its own job-ID-unique directory
# to run within. In that case you'll need to create and populate that
# directory with executables and perhaps inputs
# [uncomment and customize the following lines to enable this behavior]
# mkdir -p /scratch/sciteam/$USER/$PBS_JOBID
# cd /scratch/sciteam/$USER/$PBS_JOBID
# cp /scratch/job/setup/directory/* .

# To add certain modules that you do not have added via ~/.modules
. /opt/modules/default/init/bash # NEEDED to add module commands to shell
#module swap PrgEnv-cray PrgEnv-gnu
module add craype-hugepages8M
module add rca

#export CRAY_ROOTFS=DSL
echo $LD_LIBRARY_PATH

#export APRUN_XFER_LIMITS=1 # to transfer shell limits to the executable

### launch the application
### redirecting stdin and stdout if needed
### NOTE: (the "in" file must exist for input)

# used for timing
date

aprun -n2 -N1 ./partopsimapp ../../../tests/data/input/config/plugins_simp_parebepcg_jacobi_brick.lua ../../../tests/data/input/examples/CantSymm/CantSymm12_2.pos ../../../tests/data/output/CantSymm12_2_result.pos

# used for timing
date
### For more information see the man page for aprun
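
### A hedged variant of the aprun line above, folding in Jim's +pemap/+commap advice
### and Scott's settings from earlier in the thread; it assumes 32-core XE nodes and
### that the depth should cover the 30 worker threads plus the comm thread.
### (If +ppn is not accepted by this AMPI build, the thread count would come from
### the aprun parameters instead, as noted above.)
# aprun -n 2 -N 1 -d 31 ./partopsimapp +ppn 30 +pemap 1-30 +commap 0 \
#   ../../../tests/data/input/config/plugins_simp_parebepcg_jacobi_brick.lua \
#   ../../../tests/data/input/examples/CantSymm/CantSymm12_2.pos \
#   ../../../tests/data/output/CantSymm12_2_result.pos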






