charm - Re: [charm] [ppl] icc compiler option

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: Jim Phillips <jim AT ks.uiuc.edu>
  • To: "Bennion, Brian" <Bennion1 AT llnl.gov>
  • Cc: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>
  • Subject: Re: [charm] [ppl] icc compiler option
  • Date: Tue, 6 Sep 2011 18:24:25 -0500 (CDT)
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>


It's starting but finding some reason to crash. Maybe just try MPI.

-Jim


On Tue, 6 Sep 2011, Bennion, Brian wrote:

ldd showed all libraries were linked.

++verbose was, well... verbose. The output is below.
bennion1 237:~/g11Dir/NAMD_CVS-2011-08-29_Source/Linux-x86_64-ib-icc-sse3/tests>
../charmrun +p 12 ++verbose ++mpiexec ++remote-shell mympiexec
/g/g14/bennion1/g11Dir/NAMD_CVS-2011-08-29_Source/Linux-x86_64-ib-icc-sse3/namd2
/g/g14/bennion1/g11Dir/NAMD_CVS-2011-08-29_Source/Linux-x86_64-ib-icc-sse3/test/apoa1.namd
Charmrun> charmrun started...
Charmrun> adding client 0: "127.0.0.1", IP:127.0.0.1
Charmrun> adding client 1: "127.0.0.1", IP:127.0.0.1
Charmrun> adding client 2: "127.0.0.1", IP:127.0.0.1
Charmrun> adding client 3: "127.0.0.1", IP:127.0.0.1
Charmrun> adding client 4: "127.0.0.1", IP:127.0.0.1
Charmrun> adding client 5: "127.0.0.1", IP:127.0.0.1
Charmrun> adding client 6: "127.0.0.1", IP:127.0.0.1
Charmrun> adding client 7: "127.0.0.1", IP:127.0.0.1
Charmrun> adding client 8: "127.0.0.1", IP:127.0.0.1
Charmrun> adding client 9: "127.0.0.1", IP:127.0.0.1
Charmrun> adding client 10: "127.0.0.1", IP:127.0.0.1
Charmrun> adding client 11: "127.0.0.1", IP:127.0.0.1
Charmrun> Charmrun = 192.168.117.53, port = 56501
Charmrun> IBVERBS version of charmrun
Charmrun> Sending "$CmiMyNode 192.168.117.53 56501 387 0" to client 0.
Charmrun> find the node program
"/g/g14/bennion1/g11Dir/NAMD_CVS-2011-08-29_Source/Linux-x86_64-ib-icc-sse3/namd2" at
"/g/g11/petefred/NAMD_CVS-2011-08-29_Source/Linux-x86_64-ib-icc-sse3/tests" for 0.
Charmrun> Starting mympiexec ./charmrun.387
Charmrun> mpiexec started
Charmrun> node programs all started
Charmrun> Waiting for 0-th client to connect.
This is dollar star -n 12 ./charmrun.387
srun: Job is in held state, pending scheduler release
srun: job 1000783 queued and waiting for resources
srun: job 1000783 has been allocated resources
Charmrun remote shell(127.0.0.1.0)> remote responding...
Charmrun remote shell(127.0.0.1.0)> remote responding...
Charmrun remote shell(127.0.0.1.0)> remote responding...
Charmrun remote shell(127.0.0.1.0)> remote responding...
Charmrun remote shell(127.0.0.1.0)> remote responding...
Charmrun remote shell(127.0.0.1.0)> remote responding...
Charmrun remote shell(127.0.0.1.0)> remote responding...
Charmrun remote shell(127.0.0.1.0)> remote responding...
Charmrun remote shell(127.0.0.1.0)> remote responding...
Charmrun remote shell(127.0.0.1.0)> remote responding...
Charmrun remote shell(127.0.0.1.0)> remote responding...
Charmrun remote shell(127.0.0.1.0)> remote responding...
Charmrun remote shell(127.0.0.1.0)> starting node-program...
Charmrun remote shell(127.0.0.1.0)> starting node-program...
Charmrun remote shell(127.0.0.1.0)> starting node-program...
Charmrun remote shell(127.0.0.1.0)> starting node-program...
Charmrun remote shell(127.0.0.1.0)> starting node-program...
Charmrun remote shell(127.0.0.1.0)> starting node-program...
Charmrun remote shell(127.0.0.1.0)> starting node-program...
Charmrun remote shell(127.0.0.1.0)> starting node-program...
Charmrun remote shell(127.0.0.1.0)> starting node-program...
Charmrun remote shell(127.0.0.1.0)> starting node-program...
Charmrun remote shell(127.0.0.1.0)> starting node-program...
Charmrun remote shell(127.0.0.1.0)> starting node-program...
Charmrun> Waiting for 1-th client to connect.
Charmrun> Waiting for 2-th client to connect.
Charmrun> Waiting for 3-th client to connect.
Charmrun> Waiting for 4-th client to connect.
Charmrun> Waiting for 5-th client to connect.
Charmrun> Waiting for 6-th client to connect.
Charmrun> Waiting for 7-th client to connect.
Charmrun> Waiting for 8-th client to connect.
Charmrun> Waiting for 9-th client to connect.
Charmrun> Waiting for 10-th client to connect.
Charmrun> Waiting for 11-th client to connect.
Charmrun> All clients connected.
Charmrun> IP tables sent.
Charmrun> node programs all connected
Charmrun> started all node programs in 39.031 seconds.
Charmrun: error on request socket--
Socket closed before recv.
bennion1 238:~/g11Dir/NAMD_CVS-2011-08-29_Source/Linux-x86_64-ib-icc-sse3/tests>
Charmrun remote shell(127.0.0.1.0)> rsh phase successful.
Charmrun remote shell(127.0.0.1.0)> rsh phase successful.
Charmrun remote shell(127.0.0.1.0)> rsh phase successful.
Charmrun remote shell(127.0.0.1.0)> rsh phase successful.
Charmrun remote shell(127.0.0.1.0)> rsh phase successful.
Charmrun remote shell(127.0.0.1.0)> rsh phase successful.
Charmrun remote shell(127.0.0.1.0)> rsh phase successful.
Charmrun remote shell(127.0.0.1.0)> rsh phase successful.
Charmrun remote shell(127.0.0.1.0)> rsh phase successful.
Charmrun remote shell(127.0.0.1.0)> rsh phase successful.
Charmrun remote shell(127.0.0.1.0)> rsh phase successful.
Charmrun remote shell(127.0.0.1.0)> rsh phase successful.

________________________________________
From: Jim Phillips [jim AT ks.uiuc.edu]
Sent: Tuesday, September 06, 2011 3:52 PM
To: Bennion, Brian
Cc: charm AT cs.uiuc.edu
Subject: RE: [ppl] [charm] icc compiler option

No charmrun_err files? Anything missing when you run ldd?

Try adding ++verbose.

-Jim
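
For the ldd check suggested above, a small filter can make missing libraries easier to spot. This is a minimal sketch, assuming the usual glibc ldd output in which an unresolved library is printed as "name => not found"; the function name missing_libs is made up for illustration and is not part of charmrun or NAMD.

```shell
#!/bin/sh
# Hypothetical helper (not part of charmrun or NAMD): reads ldd output on
# stdin and prints the name of every library reported as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example use, assuming namd2 is in the current directory:
#   ldd ./namd2 | missing_libs
```

Empty output means every library resolved. Note that the generated charmrun script quoted later in this thread only writes /tmp/charmrun_err.$$ when the node program exits with status 127, so the absence of that file does not rule out other startup failures.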


On Tue, 6 Sep 2011, Bennion, Brian wrote:

OK, I should have seen that one.
I get a little farther now....

../charmrun +p 12 ++mpiexec ++remote-shell mympiexec
/g/g14/bennion1/g11Dir/NAMD_CVS-2011-08-29_Source/Linux-x86_64-ib-icc-sse3/namd2

/g/g14/bennion1/g11Dir/NAMD_CVS-2011-08-29_Source/Linux-x86_64-ib-icc-sse3/test/apoa1.namd
Charmrun> IBVERBS version of charmrun
This is dollar star -n 12 ./charmrun.32109
srun: Job is in held state, pending scheduler release
srun: job 1000777 queued and waiting for resources
srun: job 1000777 has been allocated resources
Charmrun> started all node programs in 7.786 seconds.
Charmrun: error on request socket--
Socket closed before recv.


________________________________________
From: Jim Phillips [jim AT ks.uiuc.edu]
Sent: Tuesday, September 06, 2011 3:28 PM
To: Bennion, Brian
Cc: charm AT cs.uiuc.edu
Subject: RE: [ppl] [charm] icc compiler option

You probably don't want the "shift; shift;" at the beginning, since it's
stripping off the "-n 12" arguments that I assume srun needs.

-Jim
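
The effect of the two shifts can be seen without a scheduler. The sketch below is written in POSIX sh rather than csh, and the function names are hypothetical. Each function only prints the srun command line that a wrapper of that shape would execute, using the "-n 12 ./charmrun.<id>" argument pattern visible in the output in this thread.

```shell
#!/bin/sh
# Hypothetical demonstration functions; they print the command line
# instead of running srun, so they work on any machine.

# Shape of the wrapper as posted: the first two arguments are dropped.
with_shift() {
    shift; shift
    echo srun "$@" -p pdebug
}

# Shape without the shifts: "-n <procs>" is passed through to srun.
without_shift() {
    echo srun "$@" -p pdebug
}

# Example:
#   with_shift    -n 12 ./charmrun.24175
#   without_shift -n 12 ./charmrun.24175
```

With the shifts in place, srun never sees the task count, which is consistent with the timeout reported in the quoted message.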


On Tue, 6 Sep 2011, Bennion, Brian wrote:

Hello Jim,

OK, this seems like it's going to be painful.
mpiexec doesn't exist on sierra.llnl.gov.

The mympiexec file is below:
#!/bin/csh
echo "This is dollar start "$*
shift; shift; exec srun $* -p pdebug

when the command below is executed

../charmrun +p 12 ++mpiexec ++remote-shell mympiexec ../namd2 apoa1.namd

The output is

Charmrun> IBVERBS version of charmrun
This is dollar star: -n 12 ./charmrun.24175
srun: Job is in held state, pending scheduler release
srun: job 1000773 queued and waiting for resources
Charmrun> error 0 attaching to node:
Timeout waiting for node-program to connect

The charmrun.24175 file is a script that seems to be hanging somewhere. Its
contents are pasted below:


#!/bin/sh
Echo() {
echo 'Charmrun remote shell(127.0.0.1.0)>' $*
}
Exit() {
if [ $1 -ne 0 ]
then
Echo Exiting with error code $1
fi
exit $1
}
Find() {
loc=''
for dir in `echo $PATH | sed -e 's/:/ /g'`
do
test -f "$dir/$1" && loc="$dir/$1"
done
if [ "x$loc" = x ]
then
Echo $1 not found in your PATH "($PATH)"--
Echo set your path in your ~/.charmrunrc
Exit 1
fi
}
test -f "$HOME/.charmrunrc" && . "$HOME/.charmrunrc"
DISPLAY='sierra0:16.0';export DISPLAY
NETMAGIC="24175";export NETMAGIC
CmiMyNode=$OMPI_COMM_WORLD_RANK
test -z "$CmiMyNode" && CmiMyNode=$MPIRUN_RANK
test -z "$CmiMyNode" && CmiMyNode=$PMI_RANK
test -z "$CmiMyNode" && CmiMyNode=$PMI_ID
test -z "$CmiMyNode" && (Echo Could not detect rank from environment ; Exit 1)
export CmiMyNode
NETSTART="$CmiMyNode 192.168.112.1 48334 24175 0";export NETSTART
CmiMyNodeSize='1'; export CmiMyNodeSize
CmiMyForks='0'; export CmiMyForks
CmiNumNodes=$OMPI_COMM_WORLD_SIZE
test -z "$CmiNumNodes" && CmiNumNodes=$MPIRUN_NPROCS
test -z "$CmiNumNodes" && CmiNumNodes=$PMI_SIZE
test -z "$CmiNumNodes" && (Echo Could not detect node count from environment ; Exit 1)
export CmiNumNodes
PATH="$PATH:/bin:/usr/bin:/usr/X/bin:/usr/X11/bin:/usr/local/bin:/usr/X11R6/bin:/usr/openwin/bin"
if test ! -x "/g/g11/NAMD_CVS-2011-08-29_Source/Linux-x86_64-ib-icc-sse3/tests/../namd2"
then
Echo 'Cannot locate this node-program: /g/g11/NAMD_CVS-2011-08-29_Source/Linux-x86_64-ib-icc-sse3/tests/../namd2'
Exit 1
fi
cd "/g/g11/NAMD_CVS-2011-08-29_Source/Linux-x86_64-ib-icc-sse3/tests"
if test $? = 1
then
Echo 'Cannot propagate this current directory:'
Echo '/g/g11/NAMD_CVS-2011-08-29_Source/Linux-x86_64-ib-icc-sse3/tests'
Exit 1
fi
rm -f /tmp/charmrun_err.$$
("/g/g11/NAMD_CVS-2011-08-29_Source/Linux-x86_64-ib-icc-sse3/tests/../namd2" apoa1.namd
res=$?
if [ $res -eq 127 ]
then
(
"/g/g11/NAMD_CVS-2011-08-29_Source/Linux-x86_64-ib-icc-sse3/tests/../namd2"
ldd "/g/g11/NAMD_CVS-2011-08-29_Source/Linux-x86_64-ib-icc-sse3/tests/../namd2"
) > /tmp/charmrun_err.$$ 2>&1
fi
) < /dev/null 1> /dev/null 2> /dev/null
sleep 1
if [ -r /tmp/charmrun_err.$$ ]
then
cat /tmp/charmrun_err.$$
rm -f /tmp/charmrun_err.$$
Exit 1
fi
Exit 0
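
The rank lookup in the generated script above falls back through four environment variables in turn. Below is a compact sketch of the same idea in POSIX sh. detect_rank is a hypothetical name, and unlike the generated script, which calls Exit inside a parenthesized subshell (so only the subshell exits), this version reports failure through its return status.

```shell
#!/bin/sh
# Hypothetical helper mirroring the fallback order used in the generated
# charmrun script: OMPI_COMM_WORLD_RANK, MPIRUN_RANK, PMI_RANK, PMI_ID.
detect_rank() {
    rank=${OMPI_COMM_WORLD_RANK:-${MPIRUN_RANK:-${PMI_RANK:-${PMI_ID:-}}}}
    if [ -z "$rank" ]; then
        echo "Could not detect rank from environment" >&2
        return 1
    fi
    echo "$rank"
}

# Example:
#   CmiMyNode=$(detect_rank) || exit 1
#   export CmiMyNode
```

Whether the launcher on a given cluster sets any of these variables is an assumption worth checking on a compute node before relying on the lookup.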

From: Jim Phillips [jim AT ks.uiuc.edu]
Sent: Tuesday, September 06, 2011 2:22 PM
To: Bennion, Brian
Cc: charm AT cs.uiuc.edu
Subject: RE: [ppl] [charm] icc compiler option

You use charmrun with the ++mpiexec option and a mympiexec script that
runs your srun with whatever options it needs. See this bit in notes.txt:

-- Linux Clusters with InfiniBand or Other High-Performance Networks --

Charm++ provides a special ibverbs network layer that uses InfiniBand
networks directly through the OpenFabrics OFED ibverbs library. This
avoids efficiency and portability issues associated with MPI. Look for
pre-built ibverbs NAMD binaries or specify ibverbs when building Charm++.

Writing batch job scripts to run charmrun in a queueing system can be
challenging. Since most clusters provide directions for using mpiexec
to launch MPI jobs, charmrun provides a ++mpiexec option to use mpiexec
to launch non-MPI binaries. If "mpiexec -np <procs> ..." is not
sufficient to launch jobs on your cluster you will need to write an
executable mympiexec script like the following from TACC:

#!/bin/csh
shift; shift; exec ibrun $*

The job is then launched (with full paths where needed) as:

charmrun +p<procs> ++mpiexec ++remote-shell mympiexec namd2 <configfile>


-Jim


On Tue, 6 Sep 2011, Bennion, Brian wrote:

Thanks for the patch. It is a bit more sophisticated than what I would have
come up with.

How does one start an ibverbs namd2.8 executable when more than 1 node is needed? With
MPI builds I just tell our "srun" scheduler that I need 144 tasks and it
assigns the PEs appropriately.

The same syntax only produced 144 separate but identical jobs.

Brian

________________________________________
From: Jim Phillips [jim AT ks.uiuc.edu]
Sent: Tuesday, September 06, 2011 5:36 AM
To: Bennion, Brian
Cc: charm AT cs.uiuc.edu
Subject: RE: [ppl] [charm] icc compiler option

Hi,

The attached patch should fix this.

-Jim


On Mon, 5 Sep 2011, Bennion, Brian wrote:

bennion1 35:~> icc -v
icc version 12.1.0 (gcc version 4.1.2 compatibility)
ld /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64/crt1.o
/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64/crti.o
/usr/lib/gcc/x86_64-redhat-linux/4.1.2/crtbegin.o --eh-frame-hdr
-dynamic-linker /lib64/ld-linux-x86-64.so.2
-L/usr/local/tools/ifort-12.1.023-beta/lib -o a.out
-L/usr/local/tools/icc-12.1.023-beta/compiler/lib/intel64
-L/usr/lib/gcc/x86_64-redhat-linux/4.1.2
-L/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64
-L/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../.. -L/lib64 -L/lib
-L/usr/lib64 -L/usr/lib -rpath /usr/local/tools/icc-12.1.023-beta/lib -rpath
/usr/local/tools/ifort-12.1.023-beta/lib -Bstatic -limf -lsvml -Bdynamic -lm
-Bstatic -lipgo -ldecimal --as-needed -Bdynamic -lcilkrts -lstdc++
--no-as-needed -lgcc -lgcc_s -Bstatic -lirc -Bdynamic -lc -lgcc -lgcc_s
-Bstatic -lirc_s -Bdynamic -ldl -lc
/usr/lib/gcc/x86_64-redhat-linux/4.1.2/crtend.o
/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64/crtn.o
/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64/crt1.o: In function
`_start':
(.text+0x20): undefined reference to `main'

-----Original Message-----
From: Jim Phillips [mailto:jim AT ks.uiuc.edu]
Sent: Saturday, September 03, 2011 1:32 PM
To: Bennion, Brian
Cc: charm AT cs.uiuc.edu
Subject: Re: [ppl] [charm] icc compiler option


Actually, the Intel 12.x compilers use Version too:

[jphillip@kidlogin2 ~]$ icc -v
Version 12.0.4

Brian, what does icc -v return for you? On what platform?

-Jim


On Sat, 3 Sep 2011, Jim Phillips wrote:


This must be new in the Intel 12.x compilers.

What does "icc -v" look like for you?

-Jim


On Fri, 2 Sep 2011, Bennion, Brian wrote:




Hello,

In the charm632 version that ships with namd2.8 there is a small bug in
the build scripts. Specifically, if icc is requested as the compiler, the
build will fail because icc was not found. In the cc-icc.sh script the
grep command is looking for "Version", but all icc -v commands only show
"version".
The grep command should have the "-i" option to match both spellings.

Brian
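
The two banner spellings reported in this thread can both be matched with a case-insensitive grep. This is a minimal sketch, assuming the compiler check simply greps the banner text. It is not the actual contents of cc-icc.sh, and has_version_banner is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the text on stdin contains "version"
# in any capitalization ("Version 12.0.4" or "icc version 12.1.0 ...").
has_version_banner() {
    grep -i -q 'version'
}

# Example use; redirecting stderr is an assumption about where the
# compiler prints its banner:
#   icc -v 2>&1 | has_version_banner && echo "icc detected"
```

A match this loose accepts any text that mentions a version, so a stricter pattern may be preferable if other compilers can be on the PATH under the same name.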

_______________________________________________
charm mailing list
charm AT cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/charm
_______________________________________________
ppl mailing list
ppl AT cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/ppl
