charm - [charm] Running NAMD 2.9 with CUDA stalls when built on CharmArch gemini_gni-crayxe-persistent-smp with largepages

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: "Lai, Jonathan" <jlai7 AT illinois.edu>
  • To: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>
  • Subject: [charm] Running NAMD 2.9 with CUDA stalls when built on CharmArch gemini_gni-crayxe-persistent-smp with largepages
  • Date: Thu, 6 Sep 2012 17:22:24 +0000
  • Accept-language: en-US
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm/>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

Dear PPL,

I am currently trying to run NAMD 2.9 with CUDA on a Cray-XE machine (titan Dev).  If I run my calculation on 5 nodes, the calculation completes.  However, if I increase my job to 10 nodes, the calculation stalls partway through without any error message.  I do not know whether this is a problem with NAMD or with charm++, which is why I am emailing.

I have built charm++ using the following steps:
1) module load craype-hugepages8M
2) Change #define LARGEPAGE 0 to #define LARGEPAGE 1
3) ./build charm++ gemini_gni-crayxe smp persistent --no-build-shared --with-production
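For anyone trying to reproduce this build, step 2 can be scripted rather than edited by hand. The following is only a rough sketch: the helper name enable_largepage is hypothetical, and the path of the source file containing the LARGEPAGE define is an assumption left to the caller, since it is not named above.

```shell
#!/bin/sh
# Hypothetical helper for step 2: flip "#define LARGEPAGE 0" to
# "#define LARGEPAGE 1" in the file given as the first argument.
# The file path is supplied by the caller; it is not named in the post.
enable_largepage() {
    src_file="$1"
    pattern='^#define[[:space:]]\{1,\}LARGEPAGE[[:space:]]\{1,\}0'

    # Fail early if the expected define is not present in the file.
    if ! grep -q "$pattern" "$src_file"; then
        echo "no '#define LARGEPAGE 0' line found in $src_file" >&2
        return 1
    fi

    # Rewrite through a temporary file so this works with any POSIX sed.
    sed "s/$pattern/#define LARGEPAGE 1/" "$src_file" > "$src_file.tmp" &&
        mv "$src_file.tmp" "$src_file"
}

# Usage, following the three steps above (the path is a placeholder):
#   module load craype-hugepages8M
#   enable_largepage path/to/file-with-LARGEPAGE-define
#   ./build charm++ gemini_gni-crayxe smp persistent --no-build-shared --with-production
```

The helper refuses to touch a file that does not contain the expected define, so a wrong path shows up as an error instead of a silently unchanged build.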

Running NAMD 2.9 without CUDA against this charm++ build does not produce the problem described above.  I only see the stall when running the CUDA version of NAMD, and I have not encountered it when running with LARGEPAGE turned off.  Any thoughts?

Cheers,
Jonathan Lai
jlai7 AT illinois.edu

