charm - Re: [charm] [ppl] load balancer question (freeze/crash)

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: Evghenii Gaburov <e-gaburov AT northwestern.edu>
  • To: Eric Bohm <ebohm AT illinois.edu>
  • Cc: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>
  • Subject: Re: [charm] [ppl] load balancer question (freeze/crash)
  • Date: Tue, 4 Oct 2011 15:43:41 +0000
  • Accept-language: en-US
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

> This program does not PUP the MainCB callback member variable
> Variables which are not PUP'd will not retain their value after
> migration. Therefore every migrated element will be calling an
> uninitialized callback in ResumeFromSync.
So, the freeze still occurs even after MainCB is PUP'd.
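
For reference, the quoted point about un-PUP'd members can be illustrated with a small self-contained model. This is only a sketch and not Charm++ code: `Packer`, `Element`, `callbackTarget` and `migrate` are hypothetical stand-ins invented for the illustration, and the `includeCallback` flag exists only to compare the two cases. In a real chare array element the corresponding step would be a line such as `p | MainCB;` inside the element's `pup(PUP::er &p)` routine, assuming MainCB is a PUP-able type such as CkCallback.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a PUP::er-style object: the same routine is
// used for packing on the source and unpacking on the destination.
class Packer {
public:
    Packer(std::vector<char>& buf, bool unpacking)
        : buf_(buf), unpacking_(unpacking), pos_(0) {}

    // Copy a trivially copyable value into or out of the buffer.
    template <typename T>
    void operator|(T& value) {
        if (unpacking_) {
            std::memcpy(&value, buf_.data() + pos_, sizeof(T));
        } else {
            buf_.resize(pos_ + sizeof(T));
            std::memcpy(buf_.data() + pos_, &value, sizeof(T));
        }
        pos_ += sizeof(T);
    }

private:
    std::vector<char>& buf_;
    bool unpacking_;
    std::size_t pos_;
};

// Hypothetical migratable element. callbackTarget plays the role of the
// MainCB member; -1 means "never set".
struct Element {
    int iteration = 0;
    int callbackTarget = -1;

    void pup(Packer& p, bool includeCallback) {
        p | iteration;
        if (includeCallback) {
            p | callbackTarget;  // analogous to "p | MainCB;"
        }
    }
};

// Model a migration: pack on the source, then unpack into a freshly
// constructed element on the destination.
Element migrate(Element& src, bool includeCallback) {
    std::vector<char> wire;
    Packer pack(wire, false);
    src.pup(pack, includeCallback);

    Element dst;  // starts with default member values
    Packer unpack(wire, true);
    dst.pup(unpack, includeCallback);
    return dst;
}
```

In this model, an element migrated with `includeCallback` set to false arrives with `callbackTarget` still at its default, while `iteration` survives. That mirrors the situation described in the quoted text, although in this test program the freeze persists even with the callback packed.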

The test program I posted in my previous message sometimes freezes with
Greedy[Comm]LB, Refine[Comm]LB & MetisLB, but not with RotateLB, when
ckAboutToMigrate() & ckJustMigrated() are defined:

#if 1
void ckAboutToMigrate() {}
void ckJustMigrated() {}
#endif

Any idea what may be happening here?

Although my simulation code does not define these, I still experience
freezes at ResumeFromSync() after the code has run for about an hour and
after dozens of AtSync() calls. I cannot reproduce this behaviour in the
simple test code, but maybe it is related to the fact that the
production code moves a lot of data...

Any help will be of great value!

Cheers,
Evghenii



--
Evghenii Gaburov,
e-gaburov AT northwestern.edu