
charm - Re: [charm] [EXTERNAL] Re: Detect restart from app

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: "White, Samuel T" <white67 AT illinois.edu>
  • To: Tom Quinn <trq AT astro.washington.edu>, Jozsef Bakosi <jbakosi AT lanl.gov>
  • Cc: "charm AT cs.illinois.edu" <charm AT cs.illinois.edu>
  • Subject: Re: [charm] [EXTERNAL] Re: Detect restart from app
  • Date: Thu, 26 Mar 2020 18:45:02 +0000

We also have a callback, defined for migratable objects as "virtual void ckJustRestored(void);", that the runtime invokes only when restoring from a checkpoint. It is akin to ckAboutToMigrate() and ckJustMigrated().

Regarding your question about mainchares: as a matter of implementation detail, the mainchare is currently guaranteed to reside on PE 0. We don't see a reason to break that now, but we could one day, so you might not want to rely on it.

-Sam
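
[Editor's sketch] The callback described above could be used to zero counters only on a restart, roughly as follows. This is a minimal sketch that compiles without Charm++: MigratableStandIn is a hypothetical stand-in for the real base class of migratable objects and mirrors only the signature quoted in the message, while Worker and simulateRestore() are invented for illustration. In a real application the runtime, not user code, would invoke ckJustRestored().

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the Charm++ base class of migratable objects.
// Only the hook quoted in the message is modeled here.
struct MigratableStandIn {
  virtual ~MigratableStandIn() = default;
  virtual void ckJustRestored(void) {}  // default: nothing to do after a restore
};

// Hypothetical application object that keeps a counter.
struct Worker : MigratableStandIn {
  std::size_t steps = 0;
  bool restored = false;

  // Invoked only after restoring from a checkpoint, so this is a natural
  // place to reset state that should not carry over a restart.
  void ckJustRestored(void) override {
    restored = true;
    steps = 0;
  }
};

// Hypothetical driver imitating the runtime calling the hook after a restore.
// Returns the counter value seen afterwards.
inline std::size_t simulateRestore(std::size_t stepsBefore) {
  Worker w;
  w.steps = stepsBefore;
  MigratableStandIn& base = w;  // call through the base, as a runtime would
  base.ckJustRestored();
  assert(w.restored);
  return w.steps;
}
```

Because the hook is not called after an ordinary checkpoint, no extra flag is needed to tell the two cases apart.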

From: Jozsef Bakosi <jbakosi AT lanl.gov>
Sent: Thursday, March 19, 2020 3:30 PM
To: Tom Quinn <trq AT astro.washington.edu>
Cc: charm AT cs.illinois.edu <charm AT cs.illinois.edu>
Subject: Re: [charm] [EXTERNAL] Re: Detect restart from app
 
I managed to pass an indicator from the Main ctor to where I need it via
a global variable. This is not ideal but I guess it works. With this
solution though, I need to ask the following:

Is the Main chare guaranteed to always reside on PE 0?

If that's the case, this solution is okay.

J

On 03.19.2020 11:13, Jozsef Bakosi wrote:
> Thanks, Tom,
>
> I noticed this too. But the "distance" between my
> Main::Main(CkMigrateMessage* m) and CkStartCheckpoint() is pretty large
> and I don't have a clean way of passing in an indicator, so I was hoping
> to get that out of Charm++ via some API call.
>
> Jozsef
>
> On 03.19.2020 09:21, Tom Quinn wrote:
> > What I do in ChaNGa is:
> >
> > The Main object has an attribute "bIsRestarting".   The Main::Main(CkArgMsg*
> > m) constructor sets this to "0", while the Main::Main(CkMigrateMessage* m)
> > constructor (which is only called if restarting from a checkpoint) sets it
> > to "1".  The callback method you pass to CkStartCheckpoint() can then test
> > on "bIsRestarting".
> >
> > Tom Quinn   Astronomy, University of Washington
> > Internet:   trq AT astro.washington.edu
> > Phone:              206-685-9009
> >
> > On Thu, 19 Mar 2020, Jozsef Bakosi wrote:
> >
> > > Hi folks,
> > >
> > > Is there a way to detect from a Charm++ app that it just came back from
> > > a restart and not just finished a checkpoint?
> > >
> > > According to the manual:
> > >
> > > cb will be invoked after the checkpoint is done, as well as when the
> > > restart is complete:
> > >
> > > CkCallback cb(CkIndex_Hello::SayHi(), helloProxy);
> > > CkStartCheckpoint("log", cb);
> > >
> > > When cb is called, I'd like to zero some counters if and only if it just
> > > came back from a restart.
> > >
> > > Thanks,
> > > Jozsef
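
[Editor's sketch] The flag approach Tom describes for ChaNGa can be sketched as below. The CkArgMsg and CkMigrateMessage types here are empty stand-ins so the example compiles without Charm++; afterCheckpoint() and counter are hypothetical names, and clearing the flag after its first use is an added design choice, not something stated in the thread. The sketch also assumes the flag is not overwritten when the object's state is unpacked on restart.

```cpp
#include <cassert>

// Empty stand-ins for the Charm++ message types named in the thread.
struct CkArgMsg {};
struct CkMigrateMessage {};

class Main {
 public:
  // Normal startup: not a restart.
  explicit Main(CkArgMsg*) : bIsRestarting(0) {}
  // Called only when restarting from a checkpoint.
  explicit Main(CkMigrateMessage*) : bIsRestarting(1) {}

  // Hypothetical method standing in for the callback passed to
  // CkStartCheckpoint(); it runs after a checkpoint and after a restart.
  void afterCheckpoint() {
    if (bIsRestarting) {
      counter = 0;        // zero counters only when coming back from a restart
      bIsRestarting = 0;  // assumption: later checkpoints in this run are not restarts
    }
  }

  int bIsRestarting;
  int counter = 7;  // arbitrary nonzero value for illustration
};
```

With this layout the callback needs no information from the runtime beyond which constructor was used to build Main.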


