
charm - Re: [charm] Custom reduction

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: Jozsef Bakosi <jbakosi AT gmail.com>
  • To: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>
  • Subject: Re: [charm] Custom reduction
  • Date: Tue, 19 Jan 2016 15:33:47 -0700

Hmm. Fixed it. Here is how I faked myself out, for those interested (and for myself later):

The CkReduction::reducerType variable, which holds the result of registering the custom reducer with the runtime system and is needed at the contribute() call site, was stored in a static variable in a header file. Since static variables are local to the translation unit, and since the header file in question was included at least twice, there were two different "versions" of the variable at link time (same name, same namespace, but different memory locations). As a result, the registration wrote to one copy, while the contribute() call read the other. Hence the merger function was never called.
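To make the mechanism concrete, here is a minimal single-file sketch. It is only an analogy: two namespaces play the role of two translation units, a macro plays the role of the shared header, and a plain int stands in for CkReduction::reducerType so that it builds without Charm++. All names in it (ReducerId, REDUCER_HEADER, tu_register, tu_contribute, the value 7) are made up for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for CkReduction::reducerType (no Charm++ needed).
using ReducerId = int;

// Hypothetical stand-in for the header: a static variable, so every
// "translation unit" that pulls it in gets its own private copy.
#define REDUCER_HEADER static ReducerId mergerId = -1;

// Plays the role of the .C file whose initnode routine registers the reducer.
namespace tu_register {
REDUCER_HEADER
// 7 stands in for whatever CkReduction::addReducer() would return.
void registerReducer() { mergerId = 7; }
}

// Plays the role of the code that includes the same header and calls contribute().
namespace tu_contribute {
REDUCER_HEADER
ReducerId idSeenByContribute() { return mergerId; }
}

// Registration succeeds, yet the contribute() side never sees the result.
void demonstrateStaticCopies() {
    tu_register::registerReducer();
    assert(tu_contribute::idSeenByContribute() == -1);           // still unregistered
    assert(&tu_register::mergerId != &tu_contribute::mergerId);  // two distinct objects
}
```

Calling demonstrateStaticCopies() passes both assertions: the registration is stored in one copy while the "contribute" side keeps reading the other, still unregistered one.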

Interestingly, I have exactly the same setup for another custom reduction, identical up to the point described in the paragraph above, and that one works fine. The only difference is that in the working case the contribute() calls are in the .C file, while in the failing case all accesses to the variable are in a .h file (since that code lives inside a templated class). Thus, in the problematic case the contribute() call must have used an unregistered custom reducer, even though all accesses to the reducer variable are in the same header file. This is the subtle part: contribute() and the end function of the reduction used the same instance, but that instance was not the one registered by the initnode call during setup. I verified the latter by taking the CkReduction::addReducer() call out of the fixed code, which reproduced the same error. (That error, BTW, can be pretty hard to decipher because it varies with the number of PEs: anything from running fine to producing garbage to a segfault, i.e., a typical memory error and undefined behavior.)

I guess this is a nice example of how to shoot yourself in the foot with C++ templates. ;-)

In summary, I guess the easiest fix is to make sure that the variable receiving the result of CkReduction::addReducer() and the one used later in contribute() are the same object. I declared it extern in the header file and defined it (once) in a .C file.
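For the record, a minimal sketch of that fix, again collapsed into one file and with a plain int standing in for CkReduction::reducerType so that it compiles without Charm++. The file names in the comments and all identifiers (ReducerId, fixed, mergerId, the value 7) are hypothetical, not taken from my actual code.

```cpp
#include <cassert>

// Hypothetical stand-in for CkReduction::reducerType (no Charm++ needed).
using ReducerId = int;

namespace fixed {

// --- would live in the header (hypothetical reducer.h) ---
// Declaration only: every file that includes it refers to the same object.
extern ReducerId mergerId;

// --- would live in exactly one source file (hypothetical reducer.C) ---
// The single definition.
ReducerId mergerId = -1;

// Plays the role of the initnode routine; 7 stands in for the value
// that CkReduction::addReducer() would return.
void registerReducer() { mergerId = 7; }

// Plays the role of the templated header code that calls contribute().
ReducerId idSeenByContribute() { return mergerId; }

// The registered value is now the one the contribute() side reads.
void demonstrateSharedVariable() {
    registerReducer();
    assert(idSeenByContribute() == 7);
}

}  // namespace fixed
```

With one definition and only declarations in the header, the registration and the later read are guaranteed to touch the same memory location, no matter how many files include the header.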

On Tue, Jan 19, 2016 at 2:33 PM, Jozsef Bakosi <jbakosi AT gmail.com> wrote:
Hi folks,

I'm doing a custom reduction, very similar to what we discussed back in July: https://lists.cs.illinois.edu/lists/arc/charm/2015-07/msg00035.html. Following that example, I have successfully done custom reductions in my own code.

Now I'm trying to add another one, just like the above example, but my merge function, equivalent to findHigherClass() in the above example, is never called.

I have tested the registration part, equivalent to the initnode registerClassRed() in the example, and it is called fine. I have tested the contribute() call: also called fine. I have also tested the end of the reduction, equivalent to printMaxRank(CkReductionMsg *msg), which is called fine as well. Of course, the message arriving in my equivalent of that function is garbage, because my merger function is never called. I have also looked at the generated decl.h and def.h files, and they seem fine compared to those of the other custom reductions in my code. The only thing missing is that the merger function, apparently registered successfully in the initnode function, is never called.

Can someone think of a good way to debug this?

Thanks,
Jozsef



