charm - Re: [charm] [ppl] ibverbs won't run standalone

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: "Kale, Laxmikant V" <kale AT illinois.edu>
  • To: "Zheng, Gengbin" <gzheng AT illinois.edu>, Tom Quinn <trq AT astro.washington.edu>
  • Cc: "charm AT cs.illinois.edu" <charm AT cs.illinois.edu>
  • Subject: Re: [charm] [ppl] ibverbs won't run standalone
  • Date: Mon, 9 Jul 2012 23:11:07 +0000
  • Accept-language: en-US
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

I am curious about this: how did such a bug, in a feature (CmiFree) used by
probably every program, remain undiscovered?
I am looking for a "process"-level answer, i.e., was this machine layer not
in our standard autobuild suite?


--
Laxmikant (Sanjay) Kale http://charm.cs.uiuc.edu
Professor, Computer Science
kale AT illinois.edu
201 N. Goodwin Avenue Ph: (217) 244-0094
Urbana, IL 61801-2302 FAX: (217) 265-6582






On 7/8/12 6:03 PM, "Gengbin Zheng" <gzheng AT illinois.edu> wrote:

>Looks like you are running the SMP build in standalone mode.
>I don't think that mode is thoroughly tested, but I just fixed CmiFree
>for standalone mode. Please try again.
>
>Gengbin
>
>
>On Sun, Jul 8, 2012 at 5:49 PM, Tom Quinn <trq AT astro.washington.edu> wrote:
>> I'm running an ibverbs build standalone for debugging, but I don't get
>> very far because of a segfault. Valgrind reports the following.
>>
>> Line 2529 of machine-ibverbs.c has a simple
>>
>> if (Cmi_charmrun_fd == -1) return malloc(size);
>>
>> whereas infi_CmiFree() assumes the pointer has been advanced to allow
>> for the header.
>>
>> Charm++> Running on 1 unique compute nodes (16-way SMP).
>> ==10120== Invalid read of size 8
>> ==10120== at 0x77CB4E: infi_CmiFree (machine-ibverbs.c:2641)
>> ==10120== by 0x7843E6: CmiFree (convcore.c:2945)
>> ==10120== by 0x79E287: cpuTopoHandler(void*) (cputopology.C:265)
>> ==10120== by 0x784DB2: CmiSendReduce (convcore.c:2363)
>> ==10120== by 0x784C1B: CmiGlobalReduce (convcore.c:2383)
>> ==10120== by 0x78478C: CmiReduce (convcore.c:2405)
>> ==10120== by 0x79FE15: LrtsInitCpuTopo (cputopology.C:559)
>> ==10120== by 0x79F805: CmiInitCPUTopology (cputopology.C:629)
>> ==10120== by 0x693B4D: _initCharm(int, char**) (init.C:1264)
>> ==10120== by 0x780EC4: ConverseInit (machine.c:2881)
>> ==10120== by 0x6B0566: main (main.C:18)
>> ==10120== Address 0x5a38648 is 8 bytes before a block of size 68 alloc'd
>> ==10120== at 0x4A0776F: malloc (vg_replace_malloc.c:263)
>> ==10120== by 0x77CA1D: infi_CmiAlloc (machine-ibverbs.c:2529)
>> ==10120== by 0x784320: CmiAlloc (convcore.c:2844)
>> ==10120== by 0x79FDB0: LrtsInitCpuTopo (cputopology.C:550)
>> ==10120== by 0x79F805: CmiInitCPUTopology (cputopology.C:629)
>> ==10120== by 0x693B4D: _initCharm(int, char**) (init.C:1264)
>> ==10120== by 0x780EC4: ConverseInit (machine.c:2881)
>> ==10120== by 0x6B0566: main (main.C:18)
>>
>>
>> Tom Quinn Astronomy, University of Washington
>> Internet: trq AT astro.washington.edu
>> Phone: 206-685-9009
>> _______________________________________________
>> charm mailing list
>> charm AT cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/charm
>> _______________________________________________
>> ppl mailing list
>> ppl AT cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/ppl
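
The mismatch described in the report above (an allocator that returns a bare
malloc() pointer in standalone mode, paired with a free routine that steps
back over a header) can be sketched in isolation. This is only an
illustration under stated assumptions, not the Charm++ source and not the
fix that was committed: sketch_header, sketch_alloc, sketch_size,
sketch_free and sketch_roundtrip are hypothetical names, and the real header
layout in machine-ibverbs.c is not shown in this thread. The idea is that
the allocator reserves the header on every path, so the free path can step
back from the user pointer unconditionally.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the bookkeeping kept in front of each buffer.
 * The real ibverbs header is different; only the pattern matters here. */
typedef struct {
    size_t size; /* payload size requested by the caller */
} sketch_header;

/* Allocate header + payload on every path, including the one that would
 * correspond to "no charmrun" (standalone) mode, and return a pointer
 * that has been advanced past the header. */
static void *sketch_alloc(size_t size) {
    sketch_header *h = malloc(sizeof(sketch_header) + size);
    if (h == NULL) return NULL;
    h->size = size;
    return (void *)(h + 1);
}

/* Read the header that sits immediately before the user pointer. With a
 * bare malloc(size) pointer this would be the out-of-bounds read that
 * valgrind flags as "bytes before a block". */
static size_t sketch_size(const void *p) {
    const sketch_header *h = (const sketch_header *)p - 1;
    return h->size;
}

/* Step back to the start of the real allocation before freeing it. */
static void sketch_free(void *p) {
    if (p == NULL) return;
    free((sketch_header *)p - 1);
}

/* Allocate, touch the payload, check the recorded size, free.
 * Returns 0 on success, -1 on allocation failure, 1 on a size mismatch. */
static int sketch_roundtrip(size_t size) {
    char *msg = sketch_alloc(size);
    if (msg == NULL) return -1;
    memset(msg, 0xAB, size);
    int ok = (sketch_size(msg) == size);
    sketch_free(msg);
    return ok ? 0 : 1;
}
```

Calling sketch_roundtrip(68) mirrors the 68-byte block in the trace. A
production allocator would also need the header size to preserve the
payload's alignment, which this sketch does not attempt to guarantee.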





