charm - [charm] All-to-all or redn+bcast

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: Jozsef Bakosi <jbakosi AT lanl.gov>
  • To: charm AT lists.cs.illinois.edu
  • Subject: [charm] All-to-all or redn+bcast
  • Date: Fri, 14 May 2021 09:51:30 -0600

Hi folks,

I would like your expert opinion on the following.

We have an all-to-all, computing the min of a single scalar real value,
among many chares intended to run at large scales. This is the single
synchronization point within a time step.

I wonder if replacing the single all-to-all with a reduction + broadcast
targeting each chare may allow for more overlap. I believe a single
all-to-all is implemented as a redn+bcast to/from a single chare, so
the complexity of what I'm suggesting is probably worse, but it is
nevertheless worth asking.

In code, with DG being a chare array, I'm suggesting to replace

contribute( sizeof(double), &mindt, CkReduction::min_double,
CkCallback(CkReductionTarget(DG,solve), thisProxy) );

with

// nchare: total number of DG chares (hypothetical name, not from the code above)
for (int i = 0; i < nchare; ++i)
  contribute( sizeof(double), &mindt, CkReduction::min_double,
              CkCallback(CkReductionTarget(DG,solve), thisProxy[i]) );

Would this allow for more overlap by removing the global sync, or would
I throw the baby out with the bathwater because I am replacing the log(n)
algorithmic/parallel complexity with n due to the for loop?

Thanks,
Jozsef
--
Jozsef Bakosi, PhD, LANL CCS-2, o:505-665-0950, c:505-695-4523
