
charm - Re: [charm] Error regarding +bglog

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: Jeff Hammond <jeff.science AT gmail.com>
  • To: Ashmita Raju <ashmita.raju AT gmail.com>, charm AT cs.illinois.edu
  • Subject: Re: [charm] Error regarding +bglog
  • Date: Thu, 29 Jun 2017 01:30:06 +0000

The error message contains a tip:

Missing parameters for BlueGene machine size!
<tip> use command line options: +x, +y, or +z.
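A minimal sketch of what following that tip could look like, assuming the sizes attach to the flags the same way +cth1 and +wth1 do in the original command. The values 4, 4, 4 are hypothetical, chosen only so that their product is 64 like the +vp value; nothing in this thread says they have to match. The sketch also assumes the "p1" in the posted command was meant as +p1. It only prints the command so it can be checked before running it.

```shell
#!/bin/sh
# Hypothetical emulated machine dimensions (assumptions, not from the thread).
X=4
Y=4
Z=4

# Options named in the tip, with sizes attached like +cth1 / +wth1.
BG_SIZE="+x$X +y$Y +z$Z"

# Number of emulated nodes implied by those dimensions.
NODES=$((X * Y * Z))

# Original options plus the machine size; +p1 is assumed for the posted "p1".
CMD="./charmrun +p1 ./jacobi +vp 64 $BG_SIZE +cth1 +wth1 ++local +bglog"

echo "Emulated nodes: $NODES"
echo "$CMD"
```

With nonzero sizes, the numX/numY/numZ values in the startup line should no longer all be 0, which is what the "Missing parameters" message is complaining about.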

On Wed, Jun 28, 2017 at 12:19 PM Ashmita Raju <ashmita.raju AT gmail.com> wrote:
Hello,

I am a student from Bangalore, India.

I have been running some Charm++ programs in order to analyze the trace logs they produce. Any program I run without +bglog executes normally: the time taken for BigEmulation is shown, followed by a message that it has shut down gracefully.

When I run it with +bglog, the following error occurs:

$ ./charmrun p1 ./jacobi +vp 64 +cth1 +wth1 ++local +bglog
Charmrun> scalable start enabled.
Charmrun> started all node programs in 0.004 seconds.
Converse/Charm
+ Commit ID:
Warning> Randomization of stack pointer is turned on in kernel, thread migration may not work! Run 'echo 0 > /proc/sys/kernel/randomize_va_space' as root to disable it, or try run with '+isomalloc_sync'.
Charm++> scheduler running in netpoll mode.

Missing parameters for BlueGene machine size!
<tip> use command line options: +x, +y, or +z.
[0] Number is numX:0 numY:0 numZ:0 numCth:1 numWth:1 numEmulatingPes:1 totalWorkerProcs:0 bglog_ver:6
[0] Wrote to disk for 0 BG nodes.
------------- Processor 0 Exiting: Called CmiAbort ------------
Reason: Converse zero handler executed-- was a message corrupted?

[0] Stack Traceback:
[0:0] CmiAbort+0x5d [0x55beefac081e]
[0:1] +0x2cd7b5 [0x55beefac77b5]
[0:2] CmiHandleMessage+0x58 [0x55beefac7d8c]
[0:3] CsdScheduleForever+0xd6 [0x55beefac806c]
[0:4] CsdScheduler+0x16 [0x55beefac7f74]
[0:5] CmiDeliverMsgs+0x15 [0x55beefac7dab]

[0:6] BgShutdown+0x72 [0x55beef9ae9c6]
[0:7] +0x1ac505 [0x55beef9a6505]
[0:8] _Z6bgMainiPPc+0xf53 [0x55beef9a759a]
[0:9] +0x2cb9f1 [0x55beefac59f1]
[0:10] ConverseInit+0x3e3 [0x55beefac5fb2]
[0:11] main+0x2f [0x55beef9af319]
[0:12] __libc_start_main+0xf1 [0x7f87841923f1]
[0:13] _start+0x2a [0x55beef94cdba]
Fatal error on PE 0> Converse zero handler executed-- was a message corrupted?


Could you please give me some insight into what the problem could be?


Thank you,

Ashmita Raju

--
Jeff Hammond
jeff.science AT gmail.com
http://jeffhammond.github.io/


