[charm] charm++ + namd: fail to run 128 core job


  • From: "Sangamesh B" <forum.san AT gmail.com>
  • To: "Charm ML" <charm AT cs.uiuc.edu>
  • Cc: Charm++ M L <ppl AT cs.uiuc.edu>
  • Subject: [charm] charm++ + namd: fail to run 128 core job
  • Date: Wed, 13 Aug 2008 12:33:31 +0530
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

Hi all,

I've built Charm++ version 6.0 with the mvapich2 library and Intel compilers
on a Rocks 4.3, 33-node cluster (dual-processor, quad-core Intel Xeon: 264
cores in total).
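Roughly, the Charm++ build step looks like the following (a sketch only; the
mpi-linux-amd64 target name, the icc option, and the flags are the usual ones
for this kind of setup and may differ from what I actually used):

    # inside the charm-6.0 source tree; the mvapich2 compiler wrappers
    # (mpicc/mpicxx built against icc) must be on PATH
    ./build charm++ mpi-linux-amd64 icc -O -DCMK_OPTIMIZE=1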

NAMD is also built for the Linux-mvapich2 architecture.
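The NAMD side followed the standard config/make flow for a custom arch file
(again a sketch; Linux-mvapich2 is a local arch file, and Make.charm has to
point at the Charm++ tree built above):

    # inside the NAMD 2.6 source tree
    ./config tcl fftw Linux-mvapich2
    cd Linux-mvapich2
    make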

Scaling is good from 8 to 16, 16 to 32, and 32 to 64 cores. But when a
128-core job is submitted, it fails.

# mpirun -machinefile ./machfile -np 128 \
    /data/apps/namd26_mvapich2/Linux-mvapich2/namd2 ./apoa1.namd \
    | tee namd_128cores
Charm++> Running on MPI version: 2.0 multi-thread support: 0/0
rank 65 in job 4 master_host_name_50238 caused collective abort of all ranks
exit status of rank 65: killed by signal 9

The input file is the standard ApoA1 benchmark available on the NAMD website,
i.e. apoa1.tar.gz.

According to the benchmark results given on the site, it runs and scales up
to 256 processors.

But in my case, it does not even run on 128 cores.

Other applications such as Amber 9 and Gromacs work on up to 256 processors,
which means there is no problem with mvapich2 itself.

So, what went wrong?

Thanks,
Sangamesh


