
charm - [charm] intermittent hang on reduction

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: Robert Steinke <rsteinke AT uwyo.edu>
  • To: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>
  • Subject: [charm] intermittent hang on reduction
  • Date: Wed, 18 Feb 2015 11:39:28 -0700
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm/>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

I've been having an intermittent problem where my code hangs, and I've tracked it down to a reduction. Every array element calls contribute, but the reduction target never gets called. It doesn't happen on every run, but when it does, it is always on the first reduction performed after load balancing. It happens more often with larger numbers of processors.

What is the best way to debug this? Can I look at what the Charm++ code thinks the state of the reduction is, such as how many elements have contributed and how many are expected to contribute? Is there a trace-level option I should use, or is there somewhere in the .def.h code where I should set a breakpoint?

thanks,
Bob Steinke
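
To illustrate the "how many have contributed versus how many are expected" check asked about above, here is a minimal application-level sketch. It is plain C++ and uses no Charm++ calls; `ReductionTally`, `recordContribution`, `complete`, and `missing` are hypothetical names introduced only for this example. It can only confirm the application's side of the story (which elements reached their contribute call), not what the runtime itself has recorded, and in a real run each PE would need its own tally.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical application-level tally; NOT part of Charm++.
// Intended to be updated right before an element calls contribute().
class ReductionTally {
public:
    explicit ReductionTally(int expected) : expected_(expected) {
        assert(expected >= 0);
    }

    // Record that the element with this index is about to contribute.
    // Returns false if the same element was already recorded (a duplicate).
    bool recordContribution(int elementIndex) {
        assert(elementIndex >= 0 && elementIndex < expected_);
        return seen_.insert(elementIndex).second;
    }

    // True once every expected element has been recorded.
    bool complete() const {
        return static_cast<int>(seen_.size()) == expected_;
    }

    // Indices in [0, expected) that have not been recorded yet.
    std::vector<int> missing() const {
        std::vector<int> out;
        for (int i = 0; i < expected_; ++i) {
            if (seen_.count(i) == 0) {
                out.push_back(i);
            }
        }
        return out;
    }

private:
    int expected_;
    std::set<int> seen_;
};
```

With four expected elements and contributions recorded from 0, 1, and 3, `missing()` returns only element 2. If such a tally reports every element as recorded and the reduction target still never fires, that supports the observation that the problem lies past the contribute calls.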



