
Re: [charm] [ppl] Many individual chares vs chare array


  • From: Jonathan Lifflander <jliffl2 AT illinois.edu>
  • To: Jozsef Bakosi <jbakosi AT gmail.com>
  • Cc: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>
  • Subject: Re: [charm] [ppl] Many individual chares vs chare array
  • Date: Fri, 10 Jul 2015 14:45:24 -0500
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm/>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

The "reductiontarget" keyword enables the annotated entry method to be
the target of a reduction (the method that is called when the
reduction is finished). It does not change how aggregation happens
under the hood.
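
For reference, a minimal .ci fragment (a sketch; the names `Host` and `done` are illustrative, loosely borrowed from the question below):

```cpp
// host.ci -- interface sketch; "Host" and "done" are illustrative names.
chare Host {
  entry Host();
  // [reductiontarget] only permits this method to serve as the callback
  // invoked when a reduction completes; it adds no aggregation by itself.
  entry [reductiontarget] void done(int sum);
};
```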

In order to get a reduction tree + aggregation, you need to use a chare array:

array [1D] Worker { ... }

When you call contribute from the elements of the chare array, a
reduction tree is used along with local aggregation of individual
contributions inside the node.
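
As a sketch (assuming a `Worker` array and a `hostProxy` readonly; illustrative names, not a complete program):

```cpp
// workers.ci -- a 1D chare array (sketch)
array [1D] Worker {
  entry Worker();
  entry void compute();
};

// worker.C -- each element contributes a local value; the runtime
// combines contributions on each PE, then in each process, and
// sends one combined message up the reduction tree to the target.
void Worker::compute() {
  int local = 1;  // illustrative per-element contribution
  contribute(sizeof(int), &local, CkReduction::sum_int,
             CkCallback(CkReductionTarget(Host, done), hostProxy));
}
```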

Calling contribute on a chare array requires that every element of the
array contributes. If only a subset should participate, you will need
to create an array section.
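
A section can be built over a slice of the array, e.g. (sketch; the `Worker`/`workerProxy` names are illustrative, and section reductions additionally need the CkMulticast library associated with the section):

```cpp
// Sketch: a section over elements 0, 2, 4, ..., up to N-1, stride 2.
CProxySection_Worker sec =
    CProxySection_Worker::ckNew(workerProxy.ckGetArrayID(), 0, N - 1, 2);
sec.compute();  // broadcasts to the section's members only
```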

Thanks,

Jonathan

On 10 July 2015 at 09:51, Jozsef Bakosi
<jbakosi AT gmail.com>
wrote:
> Follow-up question:
>
> Does the aggregation happen only with reductions, via, e.g., contribute(
> CkCallback( CkReductionTarget( Host, hostfn ), hostinstance )), or also with
> simply calling the non-reductiontarget member function, hostfn_noreduct(),
> from the workers? The host .ci file in that case would be
>
> chare Host {
>   entry [reductiontarget] void hostfn();
>   entry void hostfn_noreduct();
> };
>
> Again, I suspect the non-reductiontarget call will not aggregate, but I might
> be wrong. Can you clarify?
>
> On Thu, Jul 9, 2015 at 1:01 PM, Jozsef Bakosi
> <jbakosi AT gmail.com>
> wrote:
>>
>> Thanks Phil, that's interesting. I guess that (at least partially)
>> explains (I hope) the pretty unsatisfactory weak scaling behavior I'm
>> getting with a simple particle (i.e., Monte Carlo) code.
>>
>> Thanks for the clarification,
>> J
>>
>> On Thu, Jul 9, 2015 at 12:35 PM, Phil Miller
>> <mille121 AT illinois.edu>
>> wrote:
>>>
>>> You're exactly right - reductions locally combine the contributions of
>>> all chare array elements on each PE, and then in each process, and
>>> transmit
>>> a single message up a process tree to the root. At large machine scales,
>>> this isn't just faster, it's the difference between the code running at
>>> all,
>>> and crashing due to a message overload.
>>>
>>> On Thu, Jul 9, 2015 at 1:10 PM, Jozsef Bakosi
>>> <jbakosi AT gmail.com>
>>> wrote:
>>>>
>>>> Hi folks,
>>>>
>>>> I suspect I know the answer to this question but I'd like some
>>>> clarification on it.
>>>>
>>>> What is the main difference between creating (a potentially large number
>>>> of) individual chares that call back to a single host proxy, versus
>>>> creating the workers as a chare array and using a reduction? I assume
>>>> the latter will do some kind of message aggregation under the hood (i.e.,
>>>> using a tree) and collect messages (in the form of an entry method
>>>> arguments) from individual array elements and send only aggregated
>>>> messages
>>>> to the single host. Is this correct? If so, I guess, I should get better
>>>> performance...
>>>>
>>>> Thanks,
>>>> Jozsef
>>>>
>>>> _______________________________________________
>>>> charm mailing list
>>>> charm AT cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/charm
>>>>
>>>
>>
>
>
> _______________________________________________
> ppl mailing list
> ppl AT cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/ppl
>



