charm - Re: [charm] mis-matched client callbacks in reduction messages

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: Jozsef Bakosi <jbakosi AT lanl.gov>
  • To: Phil Miller <mille121 AT illinois.edu>
  • Cc: "charm AT lists.cs.illinois.edu" <charm AT lists.cs.illinois.edu>
  • Subject: Re: [charm] mis-matched client callbacks in reduction messages
  • Date: Fri, 27 Oct 2017 11:38:56 -0600
  • Authentication-results: illinois.edu; spf=pass smtp.mailfrom=jbakosi AT lanl.gov

On 10.27.2017 11:02, Phil Miller wrote:
> On Fri, Oct 27, 2017 at 9:55 AM, Jozsef Bakosi <jbakosi AT lanl.gov>
> wrote:
>
> Hello Charm++ developers,
>
> Hi Jozsef,
>
> I wonder if I could request a feature.
>
> We've got an implementation technique that we use internally that
> should hopefully suffice. I'll describe it below.
>
> As far as I understand, there can be no more than one reduction in
> flight at a time originating from a given chare array and targeting the
> same single chare (but different reduction targets). Multiple reductions
> of this kind are okay as long as they are always guaranteed to be
> invoked in the same order. As I understand it, this is a deliberate
> limitation for performance reasons.
>
> This overstates the restriction a bit. Within each chare array, all
> elements have to contribute to reductions in the same order. Nothing
> about their targets, nothing about multiple reductions in flight.
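
To make the ordering rule concrete, here is a minimal toy model in plain C++. It is not Charm++ code and makes no claim about the runtime's internals: the `ToyReducer` type, its per-element counter, and the callback names "A" and "B" are all assumptions invented for illustration. It only shows why pairing contributions by "k-th contribution from each element" mixes up callbacks once elements disagree on the order.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy model (hypothetical, not Charm++ internals): each element stamps its
// k-th contribution with sequence number k, and the array-wide reducer
// groups contributions by that number alone.
struct ToyReducer {
    std::vector<int> nextSeq;                       // per-element counter
    std::map<int, std::vector<std::string>> bySeq;  // seq -> callbacks seen

    explicit ToyReducer(std::size_t numElements) : nextSeq(numElements, 0) {}

    void contribute(std::size_t elem, const std::string& callback) {
        assert(elem < nextSeq.size());
        bySeq[nextSeq[elem]++].push_back(callback);
    }

    // True if, for every sequence number, all contributions named the
    // same callback.
    bool consistent() const {
        for (const auto& entry : bySeq)
            for (const auto& cb : entry.second)
                if (cb != entry.second.front()) return false;
        return true;
    }
};

// Both elements contribute to "A" and then "B": same order everywhere.
bool sameOrderIsConsistent() {
    ToyReducer r(2);
    r.contribute(0, "A"); r.contribute(0, "B");
    r.contribute(1, "A"); r.contribute(1, "B");
    return r.consistent();
}

// Element 1 contributes to "B" first, e.g. because an SDAG clause
// became ready earlier there.
bool swappedOrderIsConsistent() {
    ToyReducer r(2);
    r.contribute(0, "A"); r.contribute(0, "B");
    r.contribute(1, "B"); r.contribute(1, "A");
    return r.consistent();
}
```

In this model the first function returns true and the second returns false, which corresponds to the mis-matched client callback situation in the subject line.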

Okay, thanks for the clarification. I think even with this in mind, I may have
reductions that contribute in a different order. I see this behavior especially
when running with randomized message queues.

>
> However, I find myself running into the above runtime error (which I
> believe is caused by this issue) more often than I would like. I believe
> this is because I initiate such multiple reductions based on SDAG
> constructs becoming ready at different times, so I cannot always
> guarantee the same order.
> So far I have been able to work around this, and my solution almost
> always involves putting in some synchronization (usually global
> reductions), which is of course far from ideal. Also, even if such
> solutions appear correct, (1) this is pretty hard to do on a complex
> SDAG, and (2) it is almost impossible to test its correctness, as the
> behavior is pretty stochastic. I believe I have run into such problems
> once every few months over the past couple of years.
> To remedy this, I would like an option to create reductions for which
> this is not a problem. Of course, I am willing to trade performance for
> correctness in such cases.
> Is such a special reduction feasible? Would that be a lot of work for
> you guys to implement and maintain?
>
> We use an approach of creating bound 'shadow' arrays to act as
> independent reduction (sequencing) contexts to address this limitation.
> We've used this approach in a few places in our code, including the
> LiveViz in-situ visualization library and the collision detection
> library.
> In a little more detail, when constructing a chare array, it's possible
> to specify that it should be bound to another existing chare array.
> That means that elements of the same index will always live on the same
> PE. So, you can instantiate some auxiliary arrays, one per reduction
> stream, and bind them to your main computation arrays. Since elements
> with corresponding indices are guaranteed to be co-located, the main
> element can get a pointer to each auxiliary via a ckLocal() call, and
> then call aux->contribute(...) rather than implicitly
> this->contribute(). So, the setup code gets a bit more complicated, and
> the code actually invoking the reductions gets just a little more
> involved.
> Is that a clear description? Does that approach work for you?
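
The idea of one sequencing context per reduction stream can be sketched with a small plain C++ toy model. This is not Charm++ code: the `ShadowContexts` class, the stream names "A" and "B", and the counters are hypothetical stand-ins for the bound auxiliary arrays described above, each of which would be reached through ckLocal() and used via aux->contribute(...). The sketch only illustrates why separate contexts remove the ordering constraint between streams.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy model (hypothetical, not Charm++ internals): one sequencing context
// per reduction stream, standing in for one bound shadow array per stream.
class ShadowContexts {
public:
    explicit ShadowContexts(std::size_t numElements)
        : numElements_(numElements) {}

    // Element `elem` contributes to the stream named `stream`. The sequence
    // number comes from that stream's own counter, so contributions to
    // other streams cannot shift it.
    void contribute(std::size_t elem, const std::string& stream) {
        assert(elem < numElements_);
        std::vector<int>& counters = nextSeq_[stream];
        counters.resize(numElements_, 0);  // no-op once the stream exists
        int seq = counters[elem]++;
        ++arrived_[std::make_pair(stream, seq)];
    }

    // Number of reductions that received a contribution from every element.
    std::size_t completed() const {
        std::size_t n = 0;
        for (const auto& entry : arrived_)
            if (entry.second == numElements_) ++n;
        return n;
    }

private:
    std::size_t numElements_;
    // stream -> per-element sequence counter
    std::map<std::string, std::vector<int>> nextSeq_;
    // (stream, seq) -> number of contributions received so far
    std::map<std::pair<std::string, int>, std::size_t> arrived_;
};

// Two elements fire streams "A" and "B" in opposite orders.
std::size_t completedWithOppositeOrders() {
    ShadowContexts ctx(2);
    ctx.contribute(0, "A"); ctx.contribute(0, "B");
    ctx.contribute(1, "B"); ctx.contribute(1, "A");
    return ctx.completed();
}
```

Here both reductions complete even though the two elements contribute in opposite orders, because each stream is sequenced on its own. Within a single stream, elements still have to contribute in the same order.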

I think that would work and I do use bound arrays for a different purpose.

So how would I have to use this? Here is what I think I need to do: I have to
identify all reductions that can happen in an order that is not necessarily
guaranteed to be always the same and fire them from bound arrays instead (each
from a different chare array)?

J


