Re: [charm] charm++ warning messages

  • From: Phil Miller <mille121 AT illinois.edu>
  • To: Bo Zhang <zhang416 AT indiana.edu>
  • Cc: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>
  • Subject: Re: [charm] charm++ warning messages
  • Date: Wed, 2 Oct 2013 12:25:01 -0700
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm/>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

It's indeed odd that you're seeing that message in one case but not the other. I'm not sure why that is, but it should appear in both cases. Either way, every process that gets launched contains both a worker thread and a communication thread, exactly oversubscribing the cores by a factor of 2.
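To spell out the arithmetic, using the numbers from your two scripts below:

  job1: aprun -n 64 on 2 nodes   = 32 processes per node; 32 x (1 worker + 1 comm thread) = 64 threads on 32 cores
  job2: aprun -n 512 on 16 nodes = 32 processes per node; 32 x 2 threads = 64 threads on 32 cores

Both runs oversubscribe each node 2:1, which is exactly the situation the warning describes, so I'd expect it to be printed in both.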

Regardless of that though, if you're intending to take advantage of shared-memory communication on-node (the 'smp' build option), you'll need to reduce the number of processes that aprun launches, and tell the Charm++ runtime system to launch multiple worker threads per process using the "+ppn N" option. To avoid oversubscription, you'll want to pass an argument that's one less than the number of cores available for that process. For instance, if you launch one process per node on the XE6 you're using, you'd want +ppn 31 to leave a core available for the communication thread. If you launch two processes per node, then you'll want +ppn 15, since each process will have a communication thread that will want its own core.
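As a rough sketch, assuming one process per node for job1 (the exact placement flags depend on your site's ALPS configuration, and the lulesh arguments are just carried over from your script):

cd $PBS_O_WORKDIR
# 2 nodes, 1 process per node, all 32 cores given to that process:
# 31 worker threads plus the comm thread fill the node without oversubscribing
aprun -n 2 -N 1 -d 32 ./lulesh 128 32 +ppn 31

With two processes per node it would instead be something like "aprun -n 4 -N 2 -d 16 ./lulesh 128 32 +ppn 15", for the reason above.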


On Wed, Oct 2, 2013 at 12:03 PM, Bo Zhang <zhang416 AT indiana.edu> wrote:
Hello charm team,

I have been running a Charm++ program on a Cray XE6 machine with different configurations. Some of them produce a warning message, and I would like to know the cause.

Script for job1:

#! /bin/bash
#PBS -l nodes=2:ppn=32
#PBS -l walltime=24:00:00
#PBS -q cpu
#PBS -N lulesh_charm
#PBS -V

cd $PBS_O_WORKDIR
aprun -n 64 ./lulesh 128 32

Output for job1:
zhang416@login1:~/work/lulesh-charm> cat *.o91471
Charm++> Running on Gemini (GNI) with 64 processes
Charm++> static SMSG
Charm++> SMSG memory: 316.0KB
Charm++> memory pool init block size: 8MB, total memory pool limit 0MB (0 means no limit)
Charm++> memory pool registered memory limit: 200000MB, send limit: 100000MB
Charm++> only comm thread send/recv messages
Charm++> Cray TLB page size: 8192K
Charm++> Running in SMP mode: numNodes 64,  1 worker threads per process
Charm++> The comm. thread both sends and receives messages
Converse/Charm++ Commit ID:
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 2 unique compute nodes (32-way SMP).

Charm++> Warning: the number of SMP threads (64) is greater than the number of physical cores (32), so threads will sleep while idling. Use +CmiSpinOnIdle or +CmiSleepOnIdle to control this directly.

Script for job2:
#! /bin/bash
#PBS -l nodes=16:ppn=32
#PBS -l walltime=24:00:00
#PBS -q cpu
#PBS -N lulesh_charm
#PBS -V

cd $PBS_O_WORKDIR
aprun -n 512 ./lulesh 256 32

Output for job2:
zhang416@login1:~/work/lulesh-charm/old-log> cat *.o91481
Charm++> Running on Gemini (GNI) with 512 processes
Charm++> static SMSG
Charm++> SMSG memory: 2528.0KB
Charm++> memory pool init block size: 8MB, total memory pool limit 0MB (0 means no limit)
Charm++> memory pool registered memory limit: 200000MB, send limit: 100000MB
Charm++> only comm thread send/recv messages
Charm++> Cray TLB page size: 8192K
Charm++> Running in SMP mode: numNodes 512,  1 worker threads per process
Charm++> The comm. thread both sends and receives messages
Converse/Charm++ Commit ID:
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 16 unique compute nodes (32-way SMP).

I am wondering why I get the warning message in the first case but not the second case.

I am using charm-6.5.1 and built it with

./build charm++ gemini_gni-crayxe   smp   persistent hugepages   -j16  --with-production

I have the rca and craype-hugepages8M modules loaded.

Thanks,

Bo

