
charm - Re: [charm] Use of uninitialised value of size 8 from CkReductionMsg::buildNew()

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system




  • From: Phil Miller <mille121 AT illinois.edu>
  • To: Jozsef Bakosi <jbakosi AT gmail.com>
  • Cc: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>
  • Subject: Re: [charm] Use of uninitialised value of size 8 from CkReductionMsg::buildNew()
  • Date: Tue, 22 Nov 2016 22:40:37 -0600

Some clarifying/exploratory questions (which should be pretty generally applicable):
- Which system do you observe this on?
- What build configuration are you using on the Cray system? gni-crayxc? mpi-crayxc? smp or no? The full ./build command line would be useful, along with output from 'module list'
- Outside of valgrind, do you otherwise observe failures on the Cray?
- Can you reproduce this with a maximally simplified build on the Cray? E.g. without smp, and on whichever network layer (gni or mpi) you're not currently using?
- How many nodes and PEs does this take to reproduce? How few can you use? Does it reproduce on just 1 PE?
- Can you reproduce this in a smaller, simplified test program? Alternately, can you point us to the code in your repository and a set of inputs and command line arguments that reproduces it?

On Tue, Nov 22, 2016 at 10:25 AM, Jozsef Bakosi <jbakosi AT gmail.com> wrote:
Hi folks,

I'm getting the following valgrind message only on Cray (no problem on, e.g., linux/mac):

==48771== Use of uninitialised value of size 8 
==48771==    at 0x21696E61: memcpy (memcpy.S:201)
==48771==    by 0x212E12E0: CkReductionMsg::buildNew(int, void const*, CkReduction::reducerType, CkReductionMsg*) 
==48771==    by 0x212EF68A: Group::contribute(int, void const*, CkReduction::reducerType, CkCallback const&, unsigned short)

This is from a chare group reduction of an array of doubles with CkReduction::sum_double.  There is a single memcpy() in src/ck-core/ckreduction.C:1501:

    memcpy(ret->data,srcData,NdataSize);

I suspect the memory allocated behind srcData is smaller (by a single double) than NdataSize, probably because I'm feeding the wrong data size. I pass the data size to the contribute call as static_cast<int>( vec.size() * sizeof(double) ) and the data pointer as vec.data(), which I assume ends up being passed on as srcData. Here vec is a std::vector<double>. I believe this should be correct, but for some reason this is only a segfault on Cray; valgrind does not even complain on Linux or Mac.
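
For illustration only, the size computation and copy described above might be sketched as follows, without Charm++ itself. The names contributionBytes and copyLikeBuildNew are hypothetical helpers, not Charm++ functions; the second is only a stand-in for the single memcpy() in CkReductionMsg::buildNew(), assuming the byte count and pointer reach it unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <limits>
#include <vector>

// Hypothetical helper (not part of Charm++): the byte count that would be
// passed as the int data-size argument of contribute(), with a check that
// the size_t -> int narrowing does not overflow.
int contributionBytes(const std::vector<double>& vec) {
    const std::size_t bytes = vec.size() * sizeof(double);
    assert(bytes <= static_cast<std::size_t>(std::numeric_limits<int>::max()));
    return static_cast<int>(bytes);
}

// Hypothetical stand-in for the memcpy in CkReductionMsg::buildNew():
// copies nBytes from srcData into a freshly allocated buffer.
std::vector<unsigned char> copyLikeBuildNew(int nBytes, const void* srcData) {
    std::vector<unsigned char> ret(static_cast<std::size_t>(nBytes));
    if (nBytes > 0) {
        // Reads exactly nBytes from srcData, so the source buffer must
        // hold at least that many initialised bytes.
        std::memcpy(ret.data(), srcData, static_cast<std::size_t>(nBytes));
    }
    return ret;
}
```

With a std::vector<double> of three elements, contributionBytes(vec) yields 3 * sizeof(double) bytes and copyLikeBuildNew(contributionBytes(vec), vec.data()) stays within the vector's storage, so a computation of this shape would not by itself read past the end of vec.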

Does anyone have an idea how I can debug this?

Thanks,
Jozsef





