charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

Re: [charm] Introduction

From: Elliott Slaughter <slaughter AT cs.stanford.edu>
To: "Van Der Wijngaart, Rob F" <rob.f.van.der.wijngaart AT intel.com>
Cc: Sam White <white67 AT illinois.edu>, Phil Miller <mille121 AT illinois.edu>, "Kale, Laxmikant V" <kale AT illinois.edu>, "Chandrasekar, Kavitha" <kchndrs2 AT illinois.edu>, "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>
Subject: Re: [charm] Introduction
Date: Fri, 20 Oct 2017 14:33:31 -0700

Thanks Rob for the introduction.

I mostly just wanted to sanity check my configuration to make sure I'm doing things the Right Way (tm).

I downloaded Charm++ 6.8.1 and built it with the following commands. This is on Piz Daint, a Cray XC40/50 system.

module load PrgEnv-intel # and unload any other PrgEnv-*
module load craype-hugepages8M
./build charm++ gni-crayxc smp --with-production -j8

I wasn't sure about the SMP part, but Rob had talked about Charm++ having a dedicated core for communication, and I think this is the setting I need to get that configuration.

I set CHARMTOP inside PRK's make.defs file, but otherwise left the settings the same as for the other apps (i.e., -O3 and so on).

My run command looks like the following, where $n is the number of nodes and $d is the decomposition factor. Each node has 12 physical cores, so this leaves 2 extra cores for whatever extra threads Charm++ wants to use. The stencil code is memory bound, so I've found that, even with MPI/OpenMP, filling up all the cores isn't generally beneficial.

srun -n $n -N $n --ntasks-per-node 1 --cpu_bind none stencil +ppn 10 +setcpuaffinity 100 40000 $d
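
For reference, the per-node core budget implied by this command can be sketched as below. It assumes one Charm++ process per node (as with --ntasks-per-node 1) and one communication thread per SMP process in addition to the +ppn worker threads. The +pemap/+commap values it prints are only an illustration of explicit placement, assume cores are numbered 0-11, and have not been tried on Piz Daint.

```shell
#!/bin/sh
# Core budget for one Charm++ SMP process per node.
cores_per_node=12   # physical cores per node, as stated above
workers=10          # worker threads requested with +ppn
comm_threads=1      # assumption: one communication thread per SMP process

threads=$((workers + comm_threads))
spare=$((cores_per_node - threads))
echo "threads per node: $threads, spare cores: $spare"

# Hypothetical explicit placement: workers on cores 0..9, comm thread on core 10.
pemap="0-$((workers - 1))"
commap="$workers"
echo "+ppn $workers +pemap $pemap +commap $commap"
```

With the values above, this reports 11 threads per node and 1 spare core.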

If anything about this configuration looks wrong, or if I'm missing any important settings (or there are settings where I should explore the performance impact of different options), please let me know.

On Fri, Oct 20, 2017 at 1:56 PM, Van Der Wijngaart, Rob F <rob.f.van.der.wijngaart AT intel.com> wrote:

Hello Team,

I wanted to introduce you to Elliott Slaughter, a freshly minted PhD in computer science from Stanford, and member of the Legion team. He had some questions for me about optimal choice of configuration, compiler, and runtime parameters when building Charm++ and executing Charm++ workloads, especially the Parallel Research Kernels. I gave some generic advice, but would like to ask you (or those of you who are still at UIUC) to help him optimize his execution environment. Thanks!

Rob

Elliott Slaughter

"Don't worry about what anybody else is going to do. The best way to predict the future is to invent it." - Alan Kay
