charm - Re: [charm] [ppl] Failed to run the example code ---- Cjacobi3D

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: Gengbin Zheng <zhenggb AT gmail.com>
  • To: Xuehan Xu <xxhdx1985126 AT gmail.com>
  • Cc: charm AT cs.uiuc.edu
  • Subject: Re: [charm] [ppl] Failed to run the example code ---- Cjacobi3D
  • Date: Wed, 30 Mar 2011 09:53:43 -0500
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

This appears to be a bug in the SMP version of BigSim, which I have just fixed
in the development branch.
In general, though, BigSim already emulates multiple target nodes in one
process, which is by nature similar to SMP, so we recommend that you
don't build the SMP version of Charm++ for bigemulator (it is not tested
on SMP as often as other Charm++ versions).

In your case, build Charm++ as:

./build bgampi net-linux bigemulator -g -O0
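
For reference, the full rebuild-and-rerun sequence might look like the sketch below. Only the build line and the charmrun line come from this thread; the build directory name (the one in the stack trace minus "-smp"), the `make` step, and the `run` / `rebuild_and_run` helper names are assumptions for illustration. The script only prints the commands unless DRY_RUN=0 is set, and it is meant to be started from the top of the charm source tree.

```shell
#!/bin/sh
# Sketch only: rebuild Charm++ for BigSim without SMP, then rerun Cjacobi3D.
# By default the commands are printed, not executed; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  # Print the command, and execute it only when DRY_RUN is 0.
  printf '+ %s\n' "$*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

rebuild_and_run() {
  # Build line from this message (note: no "smp" option).
  run ./build bgampi net-linux bigemulator -g -O0 || return 1
  # ASSUMPTION: directory named like the SMP build in the trace, minus "-smp".
  run cd net-linux-bigemulator/examples/ampi/Cjacobi3D || return 1
  # ASSUMPTION: the example is rebuilt with its own Makefile.
  run make || return 1
  # Run line copied from the original report.
  run ./charmrun jacobi +x2 +y2 +z2 ++remote-shell ssh
}

rebuild_and_run
```

Each step stops the sequence if it fails, so a failed build is not followed by a run of a stale binary.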


Gengbin

On Wed, Mar 30, 2011 at 9:01 AM, Xuehan Xu <xxhdx1985126 AT gmail.com> wrote:
> Dear sirs:
>        I tried to run the example code Cjacobi3D on BigSim and ran into
> the following error:
>
> [root@localhost Cjacobi3D]# ./charmrun jacobi +x2 +y2 +z2 ++remote-shell ssh
> Charmrun> started all node programs in 10.196 seconds.
> Warning> Randomization of stack pointer is turned on in kernel, thread
> migration may not work! Run 'echo 0 > /proc/sys/kernel/randomize_va_space'
> as root to disable it, or try run with '+isomalloc_sync'.
> Charm++: scheduler running in netpoll mode.
> woailaopo
> woailaopo
> BG info> Simulating 2x2x2 nodes with 1 comm + 1 work threads each.
> BG info> Network type: bluegene.
> alpha: 1.000000e-07     packetsize: 1024
> CYCLE_TIME_FACTOR:1.000000e-03.
> CYCLES_PER_HOP: 5       CYCLES_PER_CORNER: 75.
> BG info> cpufactor is 1.000000.
> BG info> floating point factor is 0.000000.
> BG info> Using WallTimer for timing method.
> LB> Load balancer ignores processor background load.
> ------------- Processor 0 Exiting: Caught Signal ------------
> Signal: segmentation violation
> Suggestion: Try running with '++debug', or linking with '-memory paranoid'
> (memory paranoid requires '+netpoll' at runtime).
> [0] Stack Traceback:
>   [0:0] [0x968420]
>   [0:1] _ZN10CkArrayMap6homePeEiRK12CkArrayIndex+0x26  [0x81bb6c6]
>   [0:2] _ZNK8CkLocMgr6homePeERK12CkArrayIndex+0x39  [0x81bb7ff]
>   [0:3] _ZN8CkLocMgr10informHomeERK12CkArrayIndexi+0x19  [0x81b5ac3]
>   [0:4] _ZN8CkLocMgr11createLocalERK12CkArrayIndexbbb+0xc7  [0x81ba039]
>   [0:5]
> _ZN8CkLocMgr10addElementE9CkArrayIDRK12CkArrayIndexP12CkMigratableiPv+0x99
> [0x81ba4b1]
>   [0:6] _ZN7CkArray13insertElementEP9CkMessage+0x139  [0x81c52d9]
>   [0:7] _ZN7CkArray13insertInitialERK12CkArrayIndexPvi+0x5f  [0x81c55d5]
>   [0:8] _ZN8BlockMap15populateInitialEiR15CkArrayIndexMaxPvP8CkArrMgr+0x1c1
> [0x81beecb]
>   [0:9] _ZN8CkLocMgr15populateInitialER15CkArrayIndexMaxPvP8CkArrMgr+0x47
> [0x81c8b73]
>   [0:10]
> _ZN7CkArrayC1ER14CkArrayOptionsR19CkMarshalledMessage10_ckGroupID+0x31e
> [0x81c5f34]
>   [0:11] _ZN15CkIndex_CkArray23_call_CkArray_marshall1EPvP7CkArray+0xc0
> [0x81c6a88]
>   [0:12] CkDeliverMessageFree+0x44  [0x8199f06]
>   [0:13]
> /root/laoposim/charm/net-linux-bigemulator-smp/examples/ampi/Cjacobi3D/jacobi
> [0x8199fa6]
>   [0:14] CkCreateLocalGroup+0x319  [0x819bba9]
>   [0:15] _Z12_createGroup10_ckGroupIDP8envelope+0x22c  [0x819c268]
>   [0:16]
> /root/laoposim/charm/net-linux-bigemulator-smp/examples/ampi/Cjacobi3D/jacobi
> [0x819c32e]
>   [0:17] CkCreateGroup+0x153  [0x819c48b]
>   [0:18]
> _ZN14CProxy_CkArray5ckNewERK14CkArrayOptionsRK19CkMarshalledMessageRK10_ckGroupIDPK14CkEntryOptions+0x119
> [0x81c6f1d]
>   [0:19]
> _ZN16CProxy_ArrayBase13ckCreateArrayEP14CkArrayMessageiRK14CkArrayOptions+0xff
> [0x81c7053]
>   [0:20]
> _ZN19CProxy_ArrayElement13ckCreateArrayEP14CkArrayMessageiRK14CkArrayOptions+0x2a
> [0x814212e]
>   [0:21]
> _ZN13CProxy_TCharm13ckCreateArrayEP14CkArrayMessageiRK14CkArrayOptions+0x2a
> [0x816a94a]
>   [0:22] _ZN13CProxy_TCharm5ckNewEP13TCharmInitMsgRK14CkArrayOptions+0x2e
> [0x81661c6]
>   [0:23]
> /root/laoposim/charm/net-linux-bigemulator-smp/examples/ampi/Cjacobi3D/jacobi
> [0x8166e28]
>   [0:24] TCHARM_Create_data+0x82  [0x8169bdc]
>   [0:25] _Z14ampiCreateMainPFviPPcEPKci+0x77  [0x8136dc7]
>   [0:26] AMPI_Setup_Switch+0x40  [0x8136f9a]
>   [0:27] TCHARM_Call_fallback_setup+0x16  [0x81659bc]
>   [0:28] TCHARM_User_setup+0xb  [0x828e987]
>   [0:29] tcharm_user_setup_+0xb  [0x828e96f]
>   [0:30] _ZN10TCharmMainC1EP8CkArgMsg+0x31  [0x8117e2d]
>   [0:31]
> _ZN18CkIndex_TCharmMain25_call_TCharmMain_CkArgMsgEPvP10TCharmMain+0x2f
> [0x8117adf]
>   [0:32] _Z10_initCharmiPPc+0xda1  [0x818e24d]
>   [0:33] _ZN14workThreadInfo3runEv+0x347  [0x8174371]
>   [0:34] _Z10run_threadP10threadInfo+0x2e  [0x816b812]
>   [0:35]
> /root/laoposim/charm/net-linux-bigemulator-smp/examples/ampi/Cjacobi3D/jacobi
> [0x8187f04]
>   [0:36]
> /root/laoposim/charm/net-linux-bigemulator-smp/examples/ampi/Cjacobi3D/jacobi
> [0x828e7ad]
>   [37] Charm++ Runtime: Converse thread (qt_args+0x70  [0x828e83e])
> Fatal error on PE 0> segmentation violationde
>
> I built the system like this:
> ./build bgampi net-linux bigemulator smp -g -O0
>
> I debugged the program, and the problem seems to be at line 222
> of cklocation.C: the pointer Cpv_startedEvac_ is NULL when it is
> dereferenced.
> How should I deal with this? And what is this pointer used for?
>
> Thanks everyone.




