Charm++ parallel programming system


Re: [charm] [ppl] Charm install Error


  • From: Jim Phillips <jim AT ks.uiuc.edu>
  • To: Abhishek TYAGI <atyagiaa AT connect.ust.hk>
  • Cc: "charm AT cs.illinois.edu" <charm AT cs.illinois.edu>
  • Subject: Re: [charm] [ppl] Charm install Error
  • Date: Sat, 23 Jan 2016 15:21:39 -0600 (CST)


You can use this binary:
http://www.ks.uiuc.edu/~jim/tmp/NAMD_2.11_Linux-x86_64-netlrts-CUDA.tar.gz

You will need to use the new "+devicesperreplica" option so that each replica only uses one device and they do not all use the same device.

I suggest the following options (assuming you want 16 replicas):

./charmrun ++local +p16 ./namd2 +devicesperreplica 1 +replicas 16 +idlepoll +pemap 0-15
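
For other replica counts, the same command line can be generated rather than typed by hand. The following is a minimal sketch, assuming one core (PE) per replica as in the 16-replica example above; the function name `replica_command` and its parameters are hypothetical helpers, not part of NAMD or Charm++, and only the options quoted in this message are used.

```python
import shlex


def replica_command(replicas, namd="./namd2", charmrun="./charmrun"):
    """Build the single-node launch line shown above for `replicas` replicas.

    Assumption: one PE per replica, so +p, +replicas and the +pemap range
    all follow from the same number (as in the 16-replica example).
    """
    if replicas < 1:
        raise ValueError("need at least one replica")
    # Cores 0..replicas-1; a single core is written without a range.
    pemap = "0" if replicas == 1 else f"0-{replicas - 1}"
    args = [
        charmrun, "++local", f"+p{replicas}",
        namd,
        "+devicesperreplica", "1",
        "+replicas", str(replicas),
        "+idlepoll",
        "+pemap", pemap,
    ]
    return " ".join(shlex.quote(a) for a in args)


print(replica_command(16))
```

With `replicas=16` this prints the command suggested above. Any further arguments a real run needs are not covered by this sketch.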

Jim


On Sat, 23 Jan 2016, Abhishek TYAGI wrote:

Dear Jim,

We have InfiniBand, but the queue for it is too long to get a job executed,
so I have no choice other than to use the interactive mode. The
cluster information is here
(https://itsc.ust.hk/services/academic-teaching-support/high-performance-computing/gpu-cluster/access-the-cluster/)

The list of nodes for interactive access is as follows.
Type: Interactive
Num of Nodes: 1
CPU: 2 x Intel E5-2650 (8-core)
Memory: 64GB DDR3-133
Coprocessor: 4 x Nvidia Tesla M2090

I tried to run REUS on the same node with 16 windows; it reaches 10
ns/day on CPUs only, which is too slow. I tried many things, but I am not able
to use the GPUs together with the CPUs. I tried other versions but was not
able to run them.

Thanks

Abi


Abhishek Tyagi
PhD Student
Chemical and Biomolecular Engineering
Hong Kong University of Science and Technology
Clear Water Bay, Hong Kong


________________________________________
From: Jim Phillips <jim AT ks.uiuc.edu>
Sent: Sunday, January 24, 2016 3:12 AM
To: Abhishek TYAGI
Cc: charm AT cs.illinois.edu
Subject: Re: [ppl] [charm] Charm install Error

If your cluster has InfiniBand you can use Linux-x86_64-verbs-smp-CUDA.

If you only have ethernet then I'll need to add a new build to the NAMD
download list.

Jim

On Sat, 23 Jan 2016, Abhishek TYAGI wrote:

Dear Jim,

I am running this version successfully on the GPU cluster; however, I am not
able to use GPUs with the Linux-x86_64-netlrts (multi-copy algorithms)
binaries. They run only on CPUs, and I am not able to add GPU support. Can you
suggest how to make it work?
The GPU cluster: 4 K20 GPUs and 2 E-series CPUs (32 cores).

Abhishek Tyagi
PhD Student
Chemical and Biomolecular Engineering
Hong Kong University of Science and Technology
Clear Water Bay, Hong Kong


________________________________________
From: Jim Phillips <jim AT ks.uiuc.edu>
Sent: Tuesday, December 22, 2015 12:19 PM
To: Abhishek TYAGI
Cc: charm AT cs.illinois.edu
Subject: Re: [ppl] [charm] Charm install Error

You would run NAMD using charmrun. Directions are in the release notes.

Jim

On Tue, 22 Dec 2015, Abhishek TYAGI wrote:

Dear Jim,

Thank you for the reply. For the updated version, do you have any idea how the
tutorial will work? I mean, the tutorial is written for mpirun.

Thanks and regards
Abhi



________________________________________
From: Jim Phillips <jim AT ks.uiuc.edu>
Sent: Tuesday, December 22, 2015 12:17 AM
To: Abhishek TYAGI
Cc: charm AT cs.illinois.edu
Subject: Re: [ppl] [charm] Charm install Error

The REUS feature is now available with verbs-linux-x86_64 and
verbs-linux-x86_64-smp (for CUDA) Charm++ builds, so you should be able to
use the pre-built binaries that are available for download (the ones that
mention "multi-copy algorithms" in the comment).

Wait a few hours and you can get the 2.11 release.

Jim


On Mon, 21 Dec 2015, Abhishek TYAGI wrote:

Dear Jim,

For the NAMD MPI build I am using the NAMD CVS source code. I want to compile
the MPI version. The reason for compiling NAMD with MPI is to run "A Tutorial on
One-dimensional Replica-exchange Umbrella Sampling". According to this tutorial,
mpirun is used to execute REUS. Therefore, I am compiling this version. The
GPU cluster is configured to use InfiniBand.
(https://itsc.ust.hk/services/academic-teaching-support/high-performance-computing/gpu-cluster/hardware/)

The command I had tried is as follows:

[atyagiaa AT login-0 charm-6.7.0]$ env MPICXX=mpicxx ./build charm++ mpi-linux-x86_64 --with-production

[atyagiaa AT login-0 charm-6.7.0]$ which mpicxx
/usr/local/pgi/linux86-64/2013/mpi2/mpich/bin/mpicxx

The details of the Charm++ compilation are attached to this mail.

(For general molecular simulations, I am using NAMD 2.11b1 multicore-CUDA
build.)

Thanks and regards
Abhi


________________________________________
From: Jim Phillips <jim AT ks.uiuc.edu>
Sent: Monday, December 21, 2015 10:02 PM
To: Abhishek TYAGI
Cc: charm AT cs.illinois.edu
Subject: Re: [ppl] [charm] Charm install Error

Also, you should be using verbs-smp or net-ibverbs-smp on a GPU cluster.

Jim


On Mon, 21 Dec 2015, Phil Miller wrote:

Hi Abhi,

Could you please provide the following extra information:
- what underlying compiler does your mpicxx call?
- what was the output of the configure script at the beginning of the build
process?

With that, hopefully we can help resolve this.
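
The errors in the report below all involve `__sync_add_and_fetch` and `__thread`, which are GCC-style extensions, so one way to approach the first question is to test whether the compiler behind `mpicxx` accepts them. The following is a minimal sketch under that assumption; the probe source and the helper name `supports_gcc_extensions` are hypothetical and are not part of the Charm++ build system, and a passing probe does not guarantee that the full build succeeds.

```python
import os
import shutil
import subprocess
import tempfile

# Tiny C++ source using the two extensions that appear in the CkLoop errors.
PROBE = """
static __thread int tls_counter;          // GCC-style thread-local storage
int main() {
    int shared = 0;
    __sync_add_and_fetch(&shared, 1);     // GCC-style atomic builtin
    tls_counter = shared;
    return tls_counter == 1 ? 0 : 1;
}
"""


def supports_gcc_extensions(compiler):
    """Return True/False if `compiler` can/cannot compile PROBE.

    Returns None when the compiler is not found on PATH.
    """
    exe = shutil.which(compiler)
    if exe is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.cpp")
        obj = os.path.join(tmp, "probe.o")
        with open(src, "w") as f:
            f.write(PROBE)
        # Compile only (-c); linking is not needed to detect the errors.
        result = subprocess.run([exe, "-c", src, "-o", obj],
                                capture_output=True, text=True)
        return result.returncode == 0


for name in ("mpicxx", "g++"):
    print(name, supports_gcc_extensions(name))
```

If the wrapper fails the probe while another compiler on the system passes it, that points at the underlying compiler rather than at Charm++ itself.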

Phil

On Mon, Dec 21, 2015 at 4:08 AM, Abhishek TYAGI <atyagiaa AT connect.ust.hk> wrote:

Hi,


I am trying to install charm-6.7.0 on the GPU cluster here at our university,
but I get the following error. I tried changing the input options, but the
same error still occurs. I am trying to compile NAMD for MPI on the
cluster. For your information, mpirun and mpiexec are already installed on
the cluster. As a user I have limited access. The error is as follows:


../../../../bin/charmc -O -DCMK_OPTIMIZE=1 -o
../../../../lib/libmoduleCkMulticast.a ckmulticast.o
ar: creating ../../../../lib/libmoduleCkMulticast.a
/bin/cp CkMulticast.decl.h ../../../../include
/bin/cp CkMulticast.def.h ../../../../include
gmake[1]: Leaving directory
`/d4/atyagiaa/NAMD_CVS-2015-12-17_Source/charm-6.7.0/mpi-linux-x86_64/tmp/libs/ck-libs/multicast'
/usr/bin/gmake -C libs/ck-libs/pythonCCS
gmake[1]: Entering directory
`/d4/atyagiaa/NAMD_CVS-2015-12-17_Source/charm-6.7.0/mpi-linux-x86_64/tmp/libs/ck-libs/pythonCCS'
(CHARMINC=../../../../include;. $CHARMINC/conv-config.sh; \
if test "$CMK_BUILD_PYTHON" != ""; then (/usr/bin/gmake conditional
OPTS='-O -DCMK_OPTIMIZE=1 ' || exit 1); fi)
gmake[2]: Entering directory
`/d4/atyagiaa/NAMD_CVS-2015-12-17_Source/charm-6.7.0/mpi-linux-x86_64/tmp/libs/ck-libs/pythonCCS'
gmake[2]: Nothing to be done for `conditional'.
gmake[2]: Leaving directory
`/d4/atyagiaa/NAMD_CVS-2015-12-17_Source/charm-6.7.0/mpi-linux-x86_64/tmp/libs/ck-libs/pythonCCS'
gmake[1]: Leaving directory
`/d4/atyagiaa/NAMD_CVS-2015-12-17_Source/charm-6.7.0/mpi-linux-x86_64/tmp/libs/ck-libs/pythonCCS'
/usr/bin/gmake -C libs/ck-libs/io
gmake[1]: Entering directory
`/d4/atyagiaa/NAMD_CVS-2015-12-17_Source/charm-6.7.0/mpi-linux-x86_64/tmp/libs/ck-libs/io'
gmake[1]: Nothing to be done for `all'.
gmake[1]: Leaving directory
`/d4/atyagiaa/NAMD_CVS-2015-12-17_Source/charm-6.7.0/mpi-linux-x86_64/tmp/libs/ck-libs/io'
/usr/bin/gmake -C libs/ck-libs/ckloop
gmake[1]: Entering directory
`/d4/atyagiaa/NAMD_CVS-2015-12-17_Source/charm-6.7.0/mpi-linux-x86_64/tmp/libs/ck-libs/ckloop'
../../../../bin/charmc -O -DCMK_OPTIMIZE=1 -lpthread -I../../../../tmp -o
CkLoop.o CkLoop.C
"CkLoop.h", line 105: error: identifier "__sync_add_and_fetch" is undefined
return __sync_add_and_fetch(&curChunkIdx, 1);
^

"CkLoop.h", line 119: error: identifier "__sync_add_and_fetch" is undefined
__sync_add_and_fetch(&finishFlag, counter);
^

"CkLoop.h", line 264: warning: statement is unreachable
return NULL;
^

"CkLoop.C", line 18: error: identifier "__thread" is undefined
static __thread pthread_cond_t thdCondition; //the signal var of each
pthread to be notified
^

"CkLoop.C", line 18: error: "pthread_cond_t" has already been declared in
the
current scope
static __thread pthread_cond_t thdCondition; //the signal var of each
pthread to be notified
^

"CkLoop.C", line 18: error: expected a ";"
static __thread pthread_cond_t thdCondition; //the signal var of each
pthread to be notified
^

"CkLoop.C", line 19: error: identifier "__thread" is undefined
static __thread pthread_mutex_t thdLock; //the lock associated with the
condition variables
^

"CkLoop.C", line 19: error: "pthread_mutex_t" has already been declared in
the
current scope
static __thread pthread_mutex_t thdLock; //the lock associated with the
condition variables
^

"CkLoop.C", line 19: error: expected a ";"
static __thread pthread_mutex_t thdLock; //the lock associated with the
condition variables
^

"CkLoop.C", line 27: error: variable "pthread_mutex_t" is not a type name
static pthread_mutex_t **allLocks = NULL;
^

"CkLoop.C", line 28: error: variable "pthread_cond_t" is not a type name
static pthread_cond_t **allConds = NULL;
^

"CkLoop.C", line 64: error: identifier "thdLock" is undefined
pthread_mutex_init(&thdLock, NULL);
^

"CkLoop.C", line 65: error: identifier "thdCondition" is undefined
pthread_cond_init(&thdCondition, NULL);
^

"CkLoop.C", line 70: error: identifier "__sync_add_and_fetch" is undefined
__sync_add_and_fetch(&gCrtCnt, 1);
^

"CkLoop.C", line 93: warning: missing return statement at end of non-void
function "ndhThreadWork"
}
^

"CkLoop.C", line 98: error: expected an expression
allLocks = (pthread_mutex_t **)malloc(sizeof(void *)*numThreads);
^

"CkLoop.C", line 98: error: expected a ";"
allLocks = (pthread_mutex_t **)malloc(sizeof(void *)*numThreads);
^

"CkLoop.C", line 99: error: expected an expression
allConds = (pthread_cond_t **)malloc(sizeof(void *)*numThreads);
^

"CkLoop.C", line 99: error: expected a ";"
allConds = (pthread_cond_t **)malloc(sizeof(void *)*numThreads);
^

"CkLoop.def.h", line 88: warning: variable "newmsg" was declared but never
referenced
CharmNotifyMsg *newmsg = (CharmNotifyMsg *)this;
^

"CkLoop.def.h", line 123: warning: variable "newmsg" was declared but never
referenced
HelperNotifyMsg *newmsg = (HelperNotifyMsg *)this;
^

"CkLoop.def.h", line 158: warning: variable "newmsg" was declared but never
referenced
DestroyNotifyMsg *newmsg = (DestroyNotifyMsg *)this;
^

"CkLoop.C", line 38: warning: function "HelperOnCore" was declared but
never
referenced
static int HelperOnCore() {
^

17 errors detected in the compilation of "CkLoop.C".
Fatal Error by charmc in directory
/d4/atyagiaa/NAMD_CVS-2015-12-17_Source/charm-6.7.0/mpi-linux-x86_64/tmp/libs/ck-libs/ckloop
Command mpicxx -DCMK_GFORTRAN -I../../../../bin/../include
-D__CHARMC__=1 -DCMK_OPTIMIZE=1 -I../../../../tmp -O -c CkLoop.C -o
CkLoop.o returned error code 2
charmc exiting...
gmake[1]: *** [CkLoop.o] Error 1
gmake[1]: Leaving directory
`/d4/atyagiaa/NAMD_CVS-2015-12-17_Source/charm-6.7.0/mpi-linux-x86_64/tmp/libs/ck-libs/ckloop'
gmake: *** [ckloop] Error 2

Please suggest what I should do.


Thanks in advance

Abhi


Abhishek Tyagi

PhD Student

Chemical and Biomolecular Engineering

Hong Kong University of Science and Technology

Clear Water Bay, Hong Kong









