
[charm] Fw: question about charm++ IP address definition as part of cputopology.C


  • From: "Kale, Laxmikant V" <kale AT illinois.edu>
  • To: "charm AT lists.cs.illinois.edu" <charm AT lists.cs.illinois.edu>
  • Cc: "Buch, Ronak Akshay" <rabuch2 AT illinois.edu>, "Choi, Jaemin" <jchoi157 AT illinois.edu>
  • Subject: [charm] Fw: question about charm++ IP address definition as part of cputopology.C
  • Date: Thu, 9 Sep 2021 15:36:43 +0000

Forwarding to the Charm++ mailing list. Ronak and Jaemin will be initial points of contact for this issue. 

--
Laxmikant Kale                  http://charm.cs.uiuc.edu
Professor, Computer Science     kale AT cs.uiuc.edu
201 N. Goodwin Avenue           Ph:  (217) 244-0094
Urbana, IL  61801-2302          FAX: (217) 265-6582

From: Carrier, Pierre <pierre.carrier AT hpe.com>
Sent: Wednesday, September 8, 2021 2:56 PM
To: Kale, Laxmikant V <kale AT illinois.edu>
Cc: McMahon, Kim <kim.mcmahon AT hpe.com>; Gilmer, Brian F <brian.gilmer AT hpe.com>; Warren, Steven <steven.warren AT hpe.com>
Subject: question about charm++ IP address definition as part of cputopology.C
 

Hi Prof. Kale,

 

I work at HPE on some NAMD benchmarks, with others (in CC) who are currently trying to resolve the following problem: on one of our systems, the number of nodes is detected incorrectly when running with 4 GPUs per node. For example, I get the following output:

 

Charm++> Running in SMP mode: 32 processes, 4 worker threads (PEs) + 1 comm threads per process, 128 PEs total

Charm++> Running on 15 hosts (1 sockets x 64 cores x 2 PUs = 128-way SMP)

 

...where the SLURM script uses the following syntax:

 

#SBATCH --nodes=8

...

srun --ntasks=32 --ntasks-per-node=4 --cpu-bind=none  \

             ${NAMD_PATH}/namd2 ++ppn 4 +devices 0,1,2,3 ${INPUT_PATH}/chromat100-bench.namd &> namd.log

 

With that decomposition, I expect to be running on 8 nodes. Following that miscount, the output becomes:

 

FATAL ERROR: Number of devices (4) is not a multiple of number of processes (3).  Sharing devices between processes is inefficient.  Specify +ignoresharing (each process uses all visible devices) if not all devices are visible to each process, otherwise adjust number of processes to evenly divide number of devices, specify subset of devices with +devices argument (e.g., +devices 0,2), or multiply list shared devices (e.g., +devices 0,1,2,0).

 

This error is just a consequence of the fact that the number of nodes (numNodes) is incorrect.
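
For illustration, the check that trips here seems to be a simple divisibility test between the devices visible on a host and the processes placed on it: with 15 detected hosts instead of 8, the 32 processes no longer land 4 per host, so some host ends up with 3 processes and 4 % 3 != 0. A minimal standalone sketch of that test (my own, not the actual NAMD source):

#include <cstdio>

int main() {
  const int nDevices = 4;       // GPUs visible on the host (+devices 0,1,2,3)
  const int nProcsOnHost = 3;   // processes the runtime believes share this host
  // Divisibility test implied by the error message above.
  if (nDevices % nProcsOnHost != 0) {
    std::printf("devices (%d) not a multiple of processes (%d)\n",
                nDevices, nProcsOnHost);
    return 1;
  }
  return 0;
}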

 

I traced the error down to the variable "topomsg->nodes", which is computed incorrectly, and to hostTable.size() at the line where printTopology is called:

 

/* called on PE 0 */
static void cpuTopoHandler(void *m)
{
  _procInfo *rec;
  hostnameMsg *msg = (hostnameMsg *)m;
  int pe;

  if (topomsg == NULL) {
    int i;
    topomsg = (nodeTopoMsg *)CmiAlloc(sizeof(nodeTopoMsg)+CmiNumPes()*sizeof(int));
    CmiSetHandler((char *)topomsg, CpvAccess(cpuTopoRecvHandlerIdx));
    topomsg->nodes = (int *)((char*)topomsg + sizeof(nodeTopoMsg));
    for (i=0; i<CmiNumPes(); i++) topomsg->nodes[i] = -1;
  }
  CmiAssert(topomsg != NULL);

  msg->procs = (_procInfo*)((char*)msg + sizeof(hostnameMsg));
  CmiAssert(msg->n == CmiNumPes());
  for (int i=0; i<msg->n; i++)
  {
    _procInfo *proc = msg->procs+i;

/*   for debug
  skt_print_ip(str, msg->ip);
  printf("hostname: %d %s\n", msg->pe, str);
*/
    skt_ip_t & ip = proc->ip;
    pe = proc->pe;
    auto iter = hostTable.find(ip);
    if (iter != hostTable.end()) {
      rec = iter->second;
    }
    else {
      proc->nodeID = pe;           // we will compact the node ID later
      rec = proc;
      hostTable.emplace(ip, proc);
    }
    topomsg->nodes[pe] = rec->nodeID;
    rec->rank ++;
  }

//////////   for (int i=0; i<CmiNumPes(); i++) topomsg->nodes[i] = -1;
//////////   for (int i=0; i < 16; i++) {
//////////      topomsg->nodes[i +   0] =   0;
//////////      topomsg->nodes[i +  16] =  16;
//////////      topomsg->nodes[i +  32] =  40;
//////////      topomsg->nodes[i +  48] =  60;
//////////      topomsg->nodes[i +  64] =  76;
//////////      topomsg->nodes[i +  80] =  80;
//////////      topomsg->nodes[i +  96] = 108;
//////////      topomsg->nodes[i + 112] = 116;
//////////   }

  for (int i=0; i < CmiNumPes(); i++) {
     printf("DEBUG PIERRE topomsg->nodes[%d]=%d\n", i, topomsg->nodes[i]);
  }

  printTopology(hostTable.size());

  hostTable.clear();
  CmiFree(msg);

  CmiSyncBroadcastAllAndFree(sizeof(nodeTopoMsg)+CmiNumPes()*sizeof(int), (char *)topomsg);
}
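
If I read cpuTopoHandler correctly, the host count is simply the number of distinct IP addresses reported by the PEs: a PE whose IP is already in hostTable reuses that host's nodeID, and every new IP adds a host. Here is a standalone sketch of that grouping (my own illustration with made-up addresses, not the Charm++ code), showing how processes on the same physical node that report different addresses would inflate the count from the expected 8 towards the 15 I observe:

#include <cstdio>
#include <map>
#include <string>
#include <vector>

int main() {
  // Hypothetical addresses reported by 8 PEs on 2 physical nodes. If every
  // process on a node reports the same address we get 2 buckets; here one
  // process on node 1 reports a second interface, so we get 3 "hosts".
  std::vector<std::string> peAddress = {
    "10.0.0.1", "10.0.0.1", "10.0.0.1", "10.0.0.1",    // node 0
    "10.0.0.2", "10.0.0.2", "10.0.0.2", "192.168.0.2"  // node 1, one PE on another NIC
  };

  std::map<std::string, int> hostTable;       // address -> nodeID (first PE seen)
  std::vector<int> nodes(peAddress.size());   // same role as topomsg->nodes
  for (int pe = 0; pe < (int)peAddress.size(); ++pe) {
    auto it = hostTable.find(peAddress[pe]);
    if (it == hostTable.end())
      it = hostTable.emplace(peAddress[pe], pe).first;   // new host, nodeID = pe
    nodes[pe] = it->second;
  }

  std::printf("distinct addresses = %zu hosts\n", hostTable.size());   // prints 3
  for (int pe = 0; pe < (int)peAddress.size(); ++pe)
    std::printf("nodes[%d] = %d\n", pe, nodes[pe]);
  return 0;
}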

 

The commented-out lines that I added show the values I am supposed to get when running on a different system, one that is configured differently with SLURM but runs correctly.

 

Could you please direct me to someone who can explain the principles of this part of the Charm++ code, in particular which variables are read from the system (SLURM?) in order to define proc->ip and the nodeIDs, and how numNodes is determined?
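
For context, my (possibly wrong) understanding is that the address comes from the machine itself rather than from SLURM, i.e. each process resolves the IP of the node it is running on, roughly as in this standalone POSIX illustration (not the actual skt_* routines used by cputopology.C):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <netdb.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>

int main() {
  // Resolve this machine's own hostname to an IPv4 address. Which address
  // comes back depends on /etc/hosts, DNS and the configured interfaces,
  // which is why two processes on the same node could in principle end up
  // with different answers (and thus be counted as different "hosts").
  char host[256];
  if (gethostname(host, sizeof(host)) != 0) { perror("gethostname"); return 1; }

  struct addrinfo hints, *res = nullptr;
  std::memset(&hints, 0, sizeof(hints));
  hints.ai_family = AF_INET;          // IPv4, matching skt_ip_t in the code above
  hints.ai_socktype = SOCK_STREAM;
  if (getaddrinfo(host, nullptr, &hints, &res) != 0 || res == nullptr) {
    std::fprintf(stderr, "could not resolve %s\n", host);
    return 1;
  }

  char ipstr[INET_ADDRSTRLEN];
  auto *sin = (struct sockaddr_in *)res->ai_addr;
  inet_ntop(AF_INET, &sin->sin_addr, ipstr, sizeof(ipstr));
  std::printf("%s resolves to %s\n", host, ipstr);
  freeaddrinfo(res);
  return 0;
}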

 

That part of the Charm++ code was written by:

 

/** This scheme relies on using IP address to identify physical nodes
 *  written by Gengbin Zheng  9/2008

 

...but I believe that he is now at Intel, if LinkedIn is up-to-date.

 

Thank you for your help.

 

Best regards,

Pierre

Pierre Carrier, Ph.D.

Apps & Performance Engineering

pierre.carrier AT hpe.com  

(651)354-3570

 

 




