[charm] Scalability issues using large chare array


  • From: Steve Petruzza <spetruzza AT sci.utah.edu>
  • To: charm <charm AT lists.cs.illinois.edu>
  • Subject: [charm] Scalability issues using large chare array
  • Date: Mon, 1 Aug 2016 15:44:38 +0300

Hi all,

In my application the main chare creates a single chare array with thousands of elements; these chares eventually execute some work and communicate with each other (not all simultaneously).
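The structure is roughly like the sketch below (the names are simplified placeholders, not my actual code):

// Corresponding .ci interface (roughly):
//   mainmodule myapp {
//     mainchare Main { entry Main(CkArgMsg*); };
//     array [1D] Task { entry Task(); entry void start(); };
//   };
#include "myapp.decl.h"

// The main chare creates one large 1D chare array and broadcasts a start message.
class Main : public CBase_Main {
public:
  Main(CkArgMsg* m) {
    const int numTasks = 120000;                        // thousands of elements
    CProxy_Task tasks = CProxy_Task::ckNew(numTasks);   // single large chare array
    tasks.start();                                      // broadcast to all elements
    delete m;
  }
};

// Each array element does some work and exchanges messages with other elements.
class Task : public CBase_Task {
public:
  Task() {}
  Task(CkMigrateMessage*) {}
  void start() {
    // ... compute, then invoke entry methods on other elements via thisProxy ...
  }
};

#include "myapp.def.h"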
If I run on 1024 cores I get the following at startup:

Charm++> Running on Gemini (GNI) with 1024 processes
Charm++> static SMSG
Charm++> SMSG memory: 5056.0KB
Charm++> memory pool init block size: 8MB, total memory pool limit 0MB (0 means no limit)
Charm++> memory pool registered memory limit: 200000MB, send limit: 100000MB
Charm++> only comm thread send/recv messages
Charm++> Cray TLB page size: 2048K
Charm++> Running in SMP mode: numNodes 1024,  1 worker threads per process
Charm++> The comm. thread both sends and receives messages
Converse/Charm++ Commit ID: v6.7.0-281-g8d5cdd9
Warning> using Isomalloc in SMP mode, you may need to run with '+isomalloc_sync'.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 64 unique compute nodes (16-way SMP).

Charm++> Warning: the number of SMP threads (32) is greater than the number of physical cores (16), so threads will sleep while idling. Use +CmiSpinOnIdle or +CmiSleepOnIdle to control this directly.

WARNING: +p1024 is a command line argument beginning with a '+' but was not parsed by the RTS.
If any of the above arguments were intended for the RTS you may need to recompile Charm++ with different options.

I’m running using:
aprun -n 1024 -N 16 ./charm_app +p1024 

and Charm++ is built as:
./build charm++ gni-crayxe smp -j16 --with-production

If I add +ppn16 (or 15, or less) to the charm_app command line, the number of SMP threads gets multiplied by that factor, so I don’t know how to get rid of that warning about the number of SMP threads.
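My understanding from the Charm++ manual is that in SMP mode aprun should launch fewer processes (each with several worker threads via +ppn) and leave one core per process for the comm thread, so maybe the launch should look something like this instead (just a guess on my side, one process per 16-core node with 15 worker threads):

aprun -n 64 -N 1 -d 16 ./charm_app +ppn15

but I am not sure this is how -n/-N and +ppn are supposed to interact, so please correct me if that is wrong.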

By the way, if I print some runtime stats I see the following:

Charm Kernel Summary Statistics:
Proc 0: [11 created, 11 processed]
Proc 1: [0 created, 0 processed]
Proc 2: [0 created, 0 processed]
Proc 3: [0 created, 0 processed]

… and all the other PEs show [0 created, 0 processed].

Charm Kernel Detailed Statistics (R=requested P=processed):

         Create    Mesgs     Create    Mesgs     Create    Mesgs
         Chare     for       Group     for       Nodegroup for
PE   R/P Mesgs     Chares    Mesgs     Groups    Mesgs     Nodegroups
---- --- --------- --------- --------- --------- --------- ----------
   0  R         11         0        14         1         8      1024
      P         11      7732        14         2         8         0
   1  R          0         0         0         1         0         0
      P          0         0        14         2         0         1
   2  R          0         0         0         2         0         0
      P          0         0        14         3         0         0
   3  R          0         0         0         2         0         0
      P          0         0        14         3         0         0

… and all the other PEs look like PEs 1, 2, 3 above.

Is PE 0 processing all the messages? Why? This does not look scalable.
In fact, when I go above 120K chares the run crashes with a segfault (_pmiu_daemon(SIGCHLD): [NID 16939] [c5-0c2s5n1] [Mon Aug  1 03:12:58 2016] PE RANK 975 exit signal Segmentation fault).
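
To check where the array elements actually end up, I was thinking of printing the PE in each element’s constructor, something like this (hypothetical snippet, not from my actual code):

// Print where each array element is constructed, to see how elements are spread.
Task::Task() {
  CkPrintf("Task %d constructed on PE %d (node %d)\n",
           thisIndex, CkMyPe(), CkMyNode());
}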

Am I building or running improperly?
How can I make sure that the chares are spread across more nodes and processors, to avoid excessive memory allocation on just a few nodes?
Is there any strong coupling between the chare that creates a chare array and the nodes/processors where the array elements actually execute? If I create several smaller chare arrays in the main chare at different points in the execution, instead of one large array at the beginning, would that change anything?
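For example, would explicitly controlling the placement make a difference, either through CkArrayOptions with one of the built-in map groups or by inserting elements on chosen PEs? I am only guessing at the API from the manual here:

// Guess A: let a built-in round-robin map group decide where elements live.
CkArrayOptions opts(numTasks);
opts.setMap(CProxy_RRMap::ckNew());
CProxy_Task tasks = CProxy_Task::ckNew(opts);

// Guess B: create an empty array and insert each element on an explicit PE.
CProxy_Task tasks2 = CProxy_Task::ckNew();
for (int i = 0; i < numTasks; ++i)
  tasks2[i].insert(i % CkNumPes());   // element i goes to PE (i mod number of PEs)
tasks2.doneInserting();               // tell the runtime that insertion is finished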

Thank you,
Steve