
charm - Re: [charm] catching exceptions

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: Jozsef Bakosi <jbakosi AT lanl.gov>
  • To: Matthias Diener <mdiener AT illinois.edu>
  • Cc: charm AT lists.cs.illinois.edu
  • Subject: Re: [charm] catching exceptions
  • Date: Fri, 13 Jul 2018 16:18:39 -0600

Very nice! Thanks, Matthias, I will give it a try.

Jozsef

On 07.13.2018 14:07, Matthias Diener wrote:
> Hi Jozsef,
>
> We recently implemented a way to pass an optional exit code to CkExit().
> Using the current git version of Charm++, you can call CkExit(int
> exitcode) to pass a nonzero exit code during termination.
>
> Regards,
> Matthias
>
> On Thu, 17 May 2018 at 16:13, Matthias Diener <mdiener AT illinois.edu> wrote:
> >
> > Hi Jozsef,
> >
> > Unfortunately, I think there is currently no way to pass a custom exit
> > code to CkExit() or CkAbort(). We have an open bug report for this
> > (https://charm.cs.illinois.edu/redmine/issues/1584).
> >
> > As a temporary workaround, you could try calling abort() on an error
> > condition.
> >
> > Regards,
> > Matthias
> >
> > On Thu, 17 May 2018 at 16:07, Jozsef Bakosi <jbakosi AT lanl.gov> wrote:
> >
> > > Thanks, Matthias, and Nils, that works.
> > >
> > > If I don't call CkAbort(), is there any other way to terminate a Charm++
> > > program so that I get a nonzero exit code? CkAbort() returns 1, which I
> > > would like to avoid because I'm generating my own backtrace, and CkExit()
> > > returns 0. Ideally, I would like to pass a nonzero exit code to CkExit().
> > >
> > > Jozsef
> > >
> > > On 05.16.2018 12:10, Matthias Diener wrote:
> > > > Hi Jozsef,
> > > >
> > > > A bit of context regarding the second part of your question:
> > > > Since https://charm.cs.illinois.edu/gerrit/#/c/charm/+/3720/ , Charm++
> > > > calls CkAbort() when there are unhandled C++ exceptions in user code.
> > > > Charm++ itself does not use exceptions.
> > > >
> > > > You could override Charm++'s uncaught exception handler in the mainchare,
> > > > for example, by specifying your own set_terminate handler:
> > > >
> > > > // in Main():
> > > > std::set_terminate([](){ CkPrintf("Warning: Unhandled exception in user code.\n"); abort(); });
> > > >
> > > > Regards,
> > > > Matthias
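
The set_terminate suggestion above can be extended to report the exception and choose the exit code. What follows is a minimal sketch in standard C++ only, not Charm++'s own mechanism. The names reportAndExit, installTerminateHandler and kUnhandledExceptionExitCode are hypothetical, and the value 3 is arbitrary. It ends the process with std::_Exit() instead of abort(). Whether a parallel launcher passes that code on to the shell, and whether the handler has to be installed on every process rather than only in the mainchare, is not covered in this thread and would need checking.

```cpp
#include <cstdio>
#include <cstdlib>
#include <exception>

// Hypothetical exit code chosen for illustration; not a Charm++ constant.
constexpr int kUnhandledExceptionExitCode = 3;

// Terminate handler: report the active exception (if any), then end the
// process with a custom exit code instead of raising SIGABRT via abort().
[[noreturn]] void reportAndExit() {
  if (std::exception_ptr active = std::current_exception()) {
    try {
      std::rethrow_exception(active);
    } catch (const std::exception& e) {
      std::fprintf(stderr, "Unhandled exception in user code: %s\n", e.what());
    } catch (...) {
      std::fprintf(stderr, "Unhandled non-standard exception in user code.\n");
    }
  } else {
    std::fprintf(stderr, "std::terminate called without an active exception.\n");
  }
  std::fflush(stderr);
  // _Exit skips atexit handlers and static destructors, like abort() does.
  std::_Exit(kUnhandledExceptionExitCode);
}

// Call once early, e.g. at the start of the mainchare constructor.
// Returns the previously installed handler so it can be restored.
std::terminate_handler installTerminateHandler() {
  return std::set_terminate(reportAndExit);
}
```

A custom backtrace routine could be called from reportAndExit() before the process exits.
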
> > > >
> > > > On Wed, 16 May 2018 at 08:09, Jozsef Bakosi <jbakosi AT lanl.gov> wrote:
> > > >
> > > > > Hi folks,
> > > > >
> > > > > When exceptions are thrown from chare arrays or groups, how/where is one
> > > > > supposed to catch them?
> > > > >
> > > > > Ideally, I would like to catch exceptions at the outermost layer, e.g.,
> > > > > outside of some driver, from where everything else is instantiated and
> > > > > called, but that way not every exception is caught.
> > > > >
> > > > > I attach an example with the simplearrayhello test augmented and commented,
> > > > > highlighting the problem/question.
> > > > >
> > > > > In addition, I also set signal handlers to throw exceptions (and produce
> > > > > nice stack traces) when, e.g., segfaults happen, and redirect them to the
> > > > > same code that handles exceptions and produces stack traces, but when
> > > > > exceptions are not caught that is useless.
> > > > >
> > > > > Alternatively, is it possible to override Charm++'s exception handling,
> > > > > which sometimes produces output like the following:
> > > > >
> > > > > ------------- Processor 0 Exiting: Called CmiAbort ------------
> > > > > Reason: Unhandled C++ exception in user code.
> > > >
> > > > > [0] Stack Traceback:
> > > > > [0:0] [0x592cfe]
> > > > > [0:1] [0x58f9f8]
> > > > > [0:2] [0x41b908]
> > > > > [0:3] [0x41b8e9]
> > > > > [0:4] +0x22763 [0x7ffff736d763]
> > > > > [0:5] +0x25516 [0x7ffff7370516]
> > > > > [0:6] +0x254af [0x7ffff73704af]
> > > > > [0:7] [0x414829]
> > > > > [0:8] [0x413091]
> > > > > [0:9] [0x434f30]
> > > > > [0:10] [0x46f99a]
> > > > > [0:11] [0x440c2d]
> > > > > [0:12] [0x437491]
> > > > > [0:13] [0x436e11]
> > > > > [0:14] [0x596417]
> > > > > [0:15] [0x596744]
> > > > > [0:16] [0x59646a]
> > > > > [0:17] [0x5929d1]
> > > > > [0:18] [0x592283]
> > > > > [0:19] [0x417c95]
> > > > > [0:20] __libc_start_main+0xe7 [0x7ffff621aa87]
> > > > > [0:21] [0x411f1a]
> > > > >
> > --------------------------------------------------------------------------
> > > > > MPI_ABORT was invoked on rank 0 in communicator MPI COMMUNICATOR 3
> > > > > DUP
> > > > FROM 0
> > > > > with errorcode 1.
> > > >
> > > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> > > > > You may or may not see output from other processes, depending on
> > > > > exactly when Open MPI kills them.
> > > > >
> > --------------------------------------------------------------------------
> > > >
> > > > > Thanks,
> > > > > Jozsef
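
The original question notes that a try/catch around the outermost driver does not catch every exception. One likely reason is that entry methods of chare arrays and groups are invoked later by the runtime scheduler, outside the driver's call stack. A possible pattern, sketched here in plain C++ with no Charm++ calls, is to guard the body of each entry method and turn an exception into an error code plus message. GuardResult, runGuarded and exampleEntryMethod are hypothetical names, and the codes 2 and 3 are arbitrary.

```cpp
#include <exception>
#include <stdexcept>
#include <string>
#include <utility>

// Outcome of a guarded call: code 0 means success, nonzero means failure.
struct GuardResult {
  int code;
  std::string message;
};

// Runs `body` and converts any exception into a GuardResult, so that
// nothing propagates out of the caller (e.g. an entry method).
template <class Body>
GuardResult runGuarded(Body&& body) {
  try {
    std::forward<Body>(body)();
    return {0, ""};
  } catch (const std::exception& e) {
    return {2, e.what()};
  } catch (...) {
    return {3, "non-standard exception"};
  }
}

// Hypothetical stand-in for an entry method body that may throw.
GuardResult exampleEntryMethod(bool fail) {
  return runGuarded([fail] {
    if (fail) throw std::runtime_error("bad input");
  });
}
```

In a Charm++ program, a nonzero code obtained this way could presumably be handed to the CkExit(int exitcode) overload mentioned at the top of this thread, after printing a custom backtrace.
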


