
charm - [charm] Program hang when using load balancing and lots of PEs

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system




  • From: Robert Steinke <rsteinke AT uwyo.edu>
  • To: Charm Mailing List <charm AT cs.illinois.edu>
  • Subject: [charm] Program hang when using load balancing and lots of PEs
  • Date: Tue, 27 Jan 2015 14:51:22 -0700
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm/>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

I have a program that hangs when I run on lots of PEs and use the load balancer (I'm using MetisLB). If I run on 512 or fewer processors, it is fine. If I try to run on 1024 processors, it hangs shortly after I call CkStartLB() (I'm using TurnManualLBOn()). If I don't call CkStartLB(), it also runs fine on 1024 processors.

Is this a problem that someone else has encountered before?

Is this something that I should try to dig into myself, or is there someone more familiar with the load balancer than I am who is willing to look into it? In the latter case, I will put my effort into creating a minimal test case that reproduces the problem.

Thanks
Bob Steinke




