charm - [charm] Incorrect T-Dimension Size Information

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: Chris Wailes <chris.wailes AT gmail.com>
  • To: charm <charm AT lists.cs.illinois.edu>
  • Subject: [charm] Incorrect T-Dimension Size Information
  • Date: Tue, 20 Mar 2018 10:37:16 -0400

I am attempting to use Charm++ on a Cray XE6 machine with 16-core AMD Abu Dhabi chips. The way this machine is set up, the job management system treats a single CPU as a node with 32 processing elements (16 physical cores / 32 logical cores).

I've been able to run programs from the test/ and examples/ directories using core counts from 1 to 128 (across 4 of the job manager's nodes). Unfortunately, the size of the T dimension as reported by the TopoManager is always 32, instead of the correct value of 128.

This seems to indicate that one of three things is happening:
  1. The part of Charm++ responsible for assigning jobs has the correct size of the T-Dimension that it uses, and there is simply a discrepancy between that value and the value reported from the TopoManager.

  2. The part of Charm++ responsible for assigning jobs also believes that the T-Dimension is only 32, and as a result work is only being allocated to the first 32 processing elements connected to the router.  Everything works fine, but only a quarter of the available resources are being used.

  3. Different parts of the Charm++ runtime have different ideas of what the T-Dimension size is.  Given a chance, the runtime might try to assign a chare to a PE with a T-coordinate >= 32 (assuming 0 indexing), causing a runtime error or exception, but I have been lucky enough not to encounter this yet.
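
One way to start telling these scenarios apart might be to compare the number of PEs the reported dimensions can address with the number of PEs the runtime says it has. The following is only a rough sketch in plain C++, not Charm++ code: TopoReport, addressablePes and dimsCoverAllPes are hypothetical names, and I am assuming the fields would be filled in from the TopoManager dimension getters and from CkNumPes(). It also assumes every (x, y, z, t) coordinate maps to exactly one PE.

```cpp
#include <cassert>

// Hypothetical helper type, not part of Charm++. It holds the sizes the
// topology API reports and the PE count the runtime reports.
struct TopoReport {
    int nx, ny, nz, nt;  // assumed to come from TopoManager's getDimNX/NY/NZ/NT
    int numPes;          // assumed to come from CkNumPes()
};

// Number of PEs addressable if each (x, y, z, t) coordinate names one PE.
long addressablePes(const TopoReport& r) {
    // Every dimension must be at least 1 for the product to be meaningful.
    assert(r.nx > 0 && r.ny > 0 && r.nz > 0 && r.nt > 0);
    return static_cast<long>(r.nx) * r.ny * r.nz * r.nt;
}

// True when the reported dimensions can address every PE in the job.
bool dimsCoverAllPes(const TopoReport& r) {
    return addressablePes(r) >= r.numPes;
}
```

With a T size of 32 and 128 PEs, this check only fails if the X, Y and Z sizes multiply to less than 4, so printing all four dimensions (not just T) next to the PE count would probably be the first thing to try.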

My questions, then, are: which of these three scenarios is occurring, and how do I get the TopoManager to report the correct size for the T dimension?

- Chris



