
charm - Re: [charm] [ppl] load balancer question (freeze/crash)

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: Evghenii Gaburov <e-gaburov AT northwestern.edu>
  • To: Pritish Jetley <pjetley2 AT illinois.edu>
  • Cc: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>, Gengbin Zheng <zhenggb AT gmail.com>
  • Subject: Re: [charm] [ppl] load balancer question (freeze/crash)
  • Date: Tue, 4 Oct 2011 18:48:26 +0000
  • Accept-language: en-US
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

> > Also look at possible race conditions in the code. For example, after
> > calling AtSync() (assuming you are using periodic load balancing), the
> > caller should not send new messages. It should wait for the
> > ResumeFromSync() call before sending again.
Regarding race conditions, is it possible to have any in the following code?

...

[threaded] void Main::startSimulation()
{
  ...
  CkWaitQD();
  arrayProxy.doAtSync();
  CkWaitQD();
  ...
}

void myChare::doAtSync()
{
  AtSync();
}

void myChare::ResumeFromSync()
{
}

...

I ask because this does not prevent the freezes either. I am continuing the bug hunt...
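
The rule quoted at the top (no new sends between AtSync() and ResumeFromSync()) can be illustrated with a small guard. This is only a toy model in plain C++, not Charm++ code: the class Element and its methods atSync, resumeFromSync, and send are hypothetical stand-ins, and nothing here claims to be where the freeze in the real program comes from.

```cpp
// Toy model only: plain C++, no Charm++ headers. Element, atSync,
// resumeFromSync, and send are hypothetical stand-ins for illustration.
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class Element {
public:
    // Enter the load-balancing barrier: from here on, sends are held back.
    void atSync() { inSync_ = true; }

    // Leave the barrier and flush everything that was held back.
    void resumeFromSync() {
        assert(inSync_ && "resumeFromSync without a matching atSync");
        inSync_ = false;
        for (std::string& m : deferred_) sent_.push_back(std::move(m));
        deferred_.clear();
    }

    // Send immediately when outside the barrier, otherwise queue the message.
    void send(const std::string& msg) {
        if (inSync_) deferred_.push_back(msg);
        else sent_.push_back(msg);
    }

    const std::vector<std::string>& sent() const { return sent_; }
    std::size_t deferredCount() const { return deferred_.size(); }

private:
    bool inSync_ = false;                // true between atSync and resumeFromSync
    std::vector<std::string> deferred_;  // messages held back during the barrier
    std::vector<std::string> sent_;      // messages that actually went out
};
```

In this model, a message passed to send() after atSync() only shows up in sent() once resumeFromSync() has run. In a real chare the same flag could be set just before AtSync() and cleared in ResumeFromSync(), assuming the element has somewhere to park the deferred work.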

Thanks!

Cheers,
Evghenii


> Okay, I will double check that.
>
> Thanks,
> Evghenii
>
> >
> > Gengbin
> >
> > On Tue, Oct 4, 2011 at 10:43 AM, Evghenii Gaburov
> > <e-gaburov AT northwestern.edu>
> > wrote:
> >>> This program does not PUP the MainCB callback member variable.
> >>> Variables which are not PUP'd will not retain their value after
> >>> migration. Therefore every migrated element will be calling an
> >>> uninitialized callback in ResumeFromSync.
> >> So, the freeze still occurs even after MainCB is passed to PUP.
> >>
> >> The test program I posted in the previous listing sometimes freezes
> >> with Greedy[Comm]LB, Refine[Comm]LB & MetisLB, but not with RotateLB,
> >> when ckAboutToMigrate() & ckJustMigrated() are defined:
> >>
> >> #if 1
> >> void ckAboutToMigrate() {}
> >> void ckJustMigrated() {}
> >> #endif
> >>
> >> Any idea what may be happening here?
> >>
> >> While I do not use these in my simulation code, I still experience
> >> freezes at ResumeFromSync() after the code has run for about an hour
> >> and after dozens of AtSync() calls. I cannot reproduce this behaviour
> >> in that simple test code, but maybe this is related to the fact that
> >> in the production code I move a lot of data...
> >>
> >> Any help will be of great value!
> >>
> >> Cheers,
> >> Evghenii
> >>
> >>
> >>
> >> --
> >> Evghenii Gaburov,
> >> e-gaburov AT northwestern.edu
> >>
> >> _______________________________________________
> >> charm mailing list
> >> charm AT cs.uiuc.edu
> >> http://lists.cs.uiuc.edu/mailman/listinfo/charm
> >> _______________________________________________
> >> ppl mailing list
> >> ppl AT cs.uiuc.edu
> >> http://lists.cs.uiuc.edu/mailman/listinfo/ppl
> >>
> --
> Pritish Jetley
> Doctoral Candidate, Computer Science
> University of Illinois at Urbana-Champaign
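
The quoted remark that members which are not PUP'd lose their value on migration can be seen in a small model. This is a hedged sketch in plain C++, not the Charm++ PUP framework: Buffer, Worker, callbackId, and migrate are hypothetical names, and the default value -1 stands in for what would be an uninitialized member in a real migrated element.

```cpp
// Toy model only: plain C++, no Charm++ headers. Buffer, Worker, and
// migrate are hypothetical names; real Charm++ code uses PUP::er instead.
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// A tiny byte buffer that packs values in, then unpacks them in order.
class Buffer {
public:
    // Switch from packing to unpacking and start reading from the front.
    void rewindForUnpack() { packing_ = false; pos_ = 0; }

    // Copy an int into the buffer when packing, out of it when unpacking.
    void pup(int& v) {
        if (packing_) {
            const unsigned char* p = reinterpret_cast<const unsigned char*>(&v);
            bytes_.insert(bytes_.end(), p, p + sizeof(v));
        } else {
            assert(pos_ + sizeof(v) <= bytes_.size());
            std::memcpy(&v, bytes_.data() + pos_, sizeof(v));
            pos_ += sizeof(v);
        }
    }

private:
    bool packing_ = true;
    std::size_t pos_ = 0;
    std::vector<unsigned char> bytes_;
};

struct Worker {
    int step = 0;         // serialized in pup()
    int callbackId = -1;  // deliberately NOT serialized, to show the effect

    void pup(Buffer& b) {
        b.pup(step);
        // callbackId is missing here, so it does not survive migrate().
    }
};

// "Migrate" by packing the source and unpacking into a fresh object.
Worker migrate(Worker& src) {
    Buffer b;
    src.pup(b);
    b.rewindForUnpack();
    Worker dst;
    dst.pup(b);
    return dst;
}
```

Setting step = 7 and callbackId = 42 on a Worker and passing it through migrate() yields a copy with step still 7 but callbackId back at its default, which mirrors why a callback member has to appear in the element's pup routine.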

--
Evghenii Gaburov,
e-gaburov AT northwestern.edu