charm - [charm] mis-matched client callbacks in reduction messages

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: Jozsef Bakosi <jbakosi AT lanl.gov>
  • To: "charm AT lists.cs.illinois.edu" <charm AT lists.cs.illinois.edu>
  • Subject: [charm] mis-matched client callbacks in reduction messages
  • Date: Fri, 27 Oct 2017 08:55:55 -0600

Hello Charm++ developers,

I wonder if I could request a feature.

As far as I understand, there can be no more than one reduction in flight at
a time that originates from a given chare array and targets the same single
chare (but different reduction targets). Multiple such reductions are okay as
long as they are always guaranteed to be invoked in the same order. As I
understand it, this is an intentional limitation, made for performance
reasons.

However, I find myself running into the runtime error in the subject line
(which I believe I get due to this issue) more often than I would like. I
believe this is because I initiate such multiple reductions based on SDAG
constructs becoming ready at different times, so I cannot always guarantee
the same order.
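
The ordering problem described above can be illustrated with a toy model. This
is a minimal sketch in plain C++, not Charm++ code. It assumes that the
runtime pairs contributions by a per-element sequence number rather than by
callback. The names Element, Contribution, callbacks_match, run, targetA and
targetB are all hypothetical and chosen only for illustration.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy model (NOT Charm++ code). Assumption: each array element numbers its
// own contributions 0, 1, 2, ... and contributions carrying the same number
// are combined into one reduction, whatever callback they name.
struct Contribution {
    std::size_t seq;       // per-element sequence number
    std::string callback;  // hypothetical name of the reduction target
};

class Element {
public:
    // Record a contribution to the named target and advance the counter.
    Contribution contribute(const std::string& callback) {
        return Contribution{next_seq_++, callback};
    }
private:
    std::size_t next_seq_ = 0;
};

// Returns true if every sequence number is associated with one callback only.
bool callbacks_match(const std::vector<Contribution>& all) {
    std::map<std::size_t, std::string> seen;
    for (const auto& c : all) {
        auto res = seen.emplace(c.seq, c.callback);
        if (!res.second && res.first->second != c.callback) {
            return false;  // same sequence number, different callbacks
        }
    }
    return true;
}

// Two elements each start two reductions. Element a always goes targetA then
// targetB; element b uses the same order or the reverse order.
bool run(bool same_order) {
    Element a, b;
    std::vector<Contribution> all;
    all.push_back(a.contribute("targetA"));
    all.push_back(a.contribute("targetB"));
    if (same_order) {
        all.push_back(b.contribute("targetA"));
        all.push_back(b.contribute("targetB"));
    } else {
        all.push_back(b.contribute("targetB"));
        all.push_back(b.contribute("targetA"));
    }
    return callbacks_match(all);
}
```

In this model, run(true) reports consistent callbacks, while run(false)
reports a mismatch because the two elements attach different targets to the
same sequence number. That is the situation that can arise when SDAG
constructs become ready in a different order on different elements.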

So far I have been able to work around this, and my solution is almost always
to add some synchronization (usually global reductions), which is of course
far from ideal. Also, even if such solutions appear correct, (1) this is
pretty hard to do on a complex SDAG, and (2) it is almost impossible to test
for correctness, as the behavior is pretty stochastic. I believe I have run
into such problems once every few months over the past couple of years.

To remedy this, I would like an option to create reductions for which this is
not a problem. Of course, I am willing to trade performance for correctness
in such cases.

Is such a special reduction feasible? Would that be a lot of work for you guys
to implement and maintain?

Thanks,
Jozsef


