charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

List archive

Re: [charm] mis-matched client callbacks in reduction messages

From: Phil Miller <mille121 AT illinois.edu>
To: Jozsef Bakosi <jbakosi AT lanl.gov>
Cc: "charm AT lists.cs.illinois.edu" <charm AT lists.cs.illinois.edu>
Subject: Re: [charm] mis-matched client callbacks in reduction messages
Date: Fri, 27 Oct 2017 11:02:54 -0500

On Fri, Oct 27, 2017 at 9:55 AM, Jozsef Bakosi <jbakosi AT lanl.gov> wrote:

Hello Charm++ developers,

Hi Jozsef,

I wonder if I could request a feature.

We've got an implementation technique that we use internally that should hopefully suffice. I'll describe it below.

As far as I understand, there can be no more than one simultaneous reduction in
flight originating from some chare array targeting the same single chare (but
different reduction targets). Also, multiple reductions in such fashion are okay
as long as the reductions are always guaranteed to be invoked in the same order.
As I understand, this is a limitation on purpose for performance reasons.

This overstates the restriction a bit. Within each chare array, all elements have to contribute to reductions in the same order. The rule says nothing about the reductions' targets, and nothing about how many reductions are in flight.
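
To make the ordering rule concrete, here is a toy C++ model. It is not Charm++ code and does not claim to reflect the runtime's actual implementation. `Element` and `callbacksMatch` are made-up names, and the model simply assumes each element stamps its contributions with a running sequence number, so that contributions sharing a number are combined into one reduction.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for one array element (an assumption, not a Charm++ type).
struct Element {
  int nextSeq = 0;                  // advances on every contribute()
  std::map<int, std::string> sent;  // sequence number -> callback name

  void contribute(const std::string& callback) { sent[nextSeq++] = callback; }
};

// True if, for every sequence number, each element named the same callback
// as the first element. False models a mis-matched client callback.
bool callbacksMatch(const std::vector<Element>& elems) {
  assert(!elems.empty());
  const Element& ref = elems.front();
  for (const Element& e : elems) {
    for (const auto& kv : e.sent) {
      auto it = ref.sent.find(kv.first);
      if (it != ref.sent.end() && it->second != kv.second) return false;
    }
  }
  return true;
}
```

In this model, two elements that both contribute "cbA" then "cbB" match, while one element contributing "cbA", "cbB" and another contributing "cbB", "cbA" do not, regardless of where the callbacks point.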

However, I find myself running into the above runtime error (which I believe I
get due to this issue) more than I would like to and this is because (I believe)
I initiate such multiple reductions based on SDAG constructs becoming ready at
different times, so I cannot always guarantee the same order.

So far I have been able to work around this and my solution is almost always
having to put in some synchronization (usually global reductions), which is of
course far from ideal. Also, even if such solutions appear correct, (1) this is
pretty hard to do on a complex SDAG, and (2) it is almost impossible to test its
correctness as the behavior is pretty stochastic. I believe I have run into such
problems once every few months over the past couple of years.

To remedy this I would like an option to be able to create reductions for which
this is not a problem. Of course, I am willing to trade performance for
correctness for such cases.

Is such a special reduction feasible? Would that be a lot of work for you guys
to implement and maintain?

We use an approach of creating bound 'shadow' arrays to act as independent reduction (sequencing) contexts to address this limitation. We've used this approach in a few places in our code, including the LiveViz in-situ visualization library and the collision detection library.

In a little more detail, when constructing a chare array, it's possible to specify that it should be bound to another existing chare array. That means that elements of the same index will always live on the same PE. So, you can instantiate some auxiliary arrays, one per reduction stream, and bind them to your main computation arrays. Since elements with corresponding indices are guaranteed to be co-located, the main element can get a pointer to each auxiliary via a ckLocal() call, and then call aux->contribute(...) rather than implicitly this->contribute(). So, the setup code gets a bit more complicated, and the code actually invoking the reductions gets just a little more involved.
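
As a rough illustration of why separate contexts help, here is a toy C++ model. It is not Charm++ code: `Context`, `MainElement`, `contributeOn`, and `streamsAgree` are hypothetical names, and a plain member vector stands in for the co-located auxiliary elements that a real program would reach through ckLocal(). The only point is that each stream keeps its own ordering, so interleaving across streams stops mattering.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for one auxiliary element's reduction sequencing state.
struct Context {
  std::vector<std::string> log;  // callbacks in contribution order
  void contribute(const std::string& callback) { log.push_back(callback); }
};

// Hypothetical main element with one co-located auxiliary per stream.
struct MainElement {
  explicit MainElement(std::size_t nStreams) : aux(nStreams) {}

  // Analogue of aux->contribute(...) instead of this->contribute(...).
  void contributeOn(std::size_t stream, const std::string& callback) {
    assert(stream < aux.size());
    aux[stream].contribute(callback);
  }

  std::vector<Context> aux;
};

// Two elements agree if every stream saw the same callback sequence.
bool streamsAgree(const MainElement& x, const MainElement& y) {
  if (x.aux.size() != y.aux.size()) return false;
  for (std::size_t s = 0; s < x.aux.size(); ++s)
    if (x.aux[s].log != y.aux[s].log) return false;
  return true;
}
```

If one element contributes on stream 0 and then stream 1 while another does the reverse, the two still agree in this model, because each stream is ordered independently. Order within a single stream still has to be consistent.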

Is that a clear description? Does that approach work for you?

Phil
