
charm - [charm] Question about +bgrecord and +bgreplay

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system




  • From: Xuehan Xu <xxhdx1985126 AT gmail.com>
  • To: charm AT cs.uiuc.edu
  • Subject: [charm] Question about +bgrecord and +bgreplay
  • Date: Sat, 15 Oct 2011 21:27:01 +0800
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

Dear sirs:
       Does the BigSim emulator's "+bgrecord" argument record only the message ordering of the emulated application, or both the message ordering and the message contents?
       And does the "+bgreplay" argument make the emulator replay the execution of the target processor whose number follows the argument?

       I ran Cjacobi3D as follows, hoping that it would record the message contents:

./charmrun +p1 ./jacobi 2 2 2 +x2 +y2 +z2 ++remote-shell ssh +bgrecord

       The log files bgfullreplay* were generated.
       Then I ran the following command:
./charmrun +p1 ./jacobi 2 2 2 +x1 +y1 +z1 ++remote-shell ssh +bgreplay 0
       It seemed to finish normally, but when I ran "./charmrun +p1 ./jacobi 2 2 2 +x1 +y1 +z1 ++remote-shell ssh +bgreplay 2", the following error occurred:
[couple@node70 Cjacobi3D]$ ./charmrun +p1 ./jacobi 2 2 2 +x1 +y1 +z1 ++remote-shell ssh +bgreplay 2
Charmrun> started all node programs in 1.177 seconds.
Converse/Charm++ Commit ID: v6.3.0-549-g15fbdcd
Charm++> scheduler running in netpoll mode.
BG info> replay mode for target processor 2.
BG info> Simulating 1x1x1 nodes with 1 comm + 1 work threads each.
BG info> Network type: bluegene.
alpha: 1.000000e-07    packetsize: 1024    CYCLE_TIME_FACTOR:1.000000e-03.
CYCLES_PER_HOP: 5    CYCLES_PER_CORNER: 75.
BG info> cpufactor is 1.000000.
BG info> floating point factor is 0.000000.
BG info> Using WallTimer for timing method.
BgMessageReplay: PE => 2 NumPes => 8 wth:1
------------- Processor 0 Exiting: Caught Signal ------------
Signal: segmentation violation
Suggestion: Try running with '++debug', or linking with '-memory paranoid' (memory paranoid requires '+netpoll' at runtime).
[0] Stack Traceback:
  [0:0] [0x581400]
  [0:1]   [0x8275a71]
  [0:2] CcdCallBacks+0x9e  [0x8276032]
  [0:3] CsdScheduleForever+0xc9  [0x8272515]
  [0:4] CsdScheduler+0x11  [0x8272429]
  [0:5]   [0x827094b]
  [0:6] ConverseInit+0x342  [0x8270e5a]
  [0:7] main+0x34  [0x81a0cf0]
  [0:8] __libc_start_main+0xe6  [0x1a9cc6]
  [0:9]   [0x8135bf1]
Fatal error on PE 0> segmentation violation
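
To narrow down which target processors trigger the crash, the replay step could be repeated for every emulated PE (the log above reports NumPes => 8 for the 2x2x2 recording). The following is only a minimal sketch, assuming a POSIX shell: replay_cmd, replay_all, NUM_PES, and DRY_RUN are hypothetical names introduced here, and the command line uses only the flags already shown above.

```shell
#!/bin/sh
# Hypothetical helper: build the replay command for one target PE,
# using only the flags from the commands shown above.
replay_cmd() {
    echo "./charmrun +p1 ./jacobi 2 2 2 +x1 +y1 +z1 ++remote-shell ssh +bgreplay $1"
}

# The replay log above reports NumPes => 8, so target PEs run from 0 to 7.
NUM_PES=8

# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to run it.
replay_all() {
    pe=0
    while [ "$pe" -lt "$NUM_PES" ]; do
        cmd=$(replay_cmd "$pe")
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$cmd"
        else
            # Report which target PEs fail instead of stopping at the first crash.
            $cmd || echo "replay of PE $pe failed" >&2
        fi
        pe=$((pe + 1))
    done
}

replay_all
```

Run as-is, this only prints the eight replay commands; with DRY_RUN=0 it executes them in the directory that holds the bgfullreplay* files and notes each target PE whose replay exits with an error.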

   How should I run the program to record the message contents and then replay them?


