
charm - Re: [charm] [ppl] Error when running BigNetSim

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: "Mokos, Ryan" <mokos AT illinois.edu>
  • To: Balaji S <balaji.ceg.13 AT gmail.com>
  • Cc: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>
  • Subject: Re: [charm] [ppl] Error when running BigNetSim
  • Date: Wed, 29 Feb 2012 18:59:15 +0000
  • Accept-language: en-US
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

Hi Balaji,

Thanks for sending your trace files. They are definitely not valid: ring generated 4 separate time lines. Two of them contain no events (you can see this by running the LogAnalyzer tool in the same directory as the bgTrace files), and the other two each contain one strange event (e.g., one of them is the receipt of a message from PE 6, which doesn't exist).
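A check for messages that arrive from a nonexistent PE can be scripted once the time lines have been dumped as text. The following is a minimal sketch, not part of LogAnalyzer or BigSim: it assumes the event header format seen in the time line dumps quoted later in this message (`(srcpe:N msgID:M)`), and the function name `find_bad_sources` and the `num_pes` parameter are hypothetical. Whether an out-of-range `srcpe` is always an error (for instance on a start-up event) is also an assumption.

```python
import re

# Matches event headers such as:
#   [0] 0x618070 name:msgep (srcpe:3 msgID:0) ep:2 charm_ep:-1
EVENT_RE = re.compile(r"\(srcpe:(-?\d+) msgID:(-?\d+)\)")


def find_bad_sources(dump_text, num_pes):
    """Return (srcpe, msgID) pairs whose source PE is outside 0..num_pes-1."""
    bad = []
    for line in dump_text.splitlines():
        m = EVENT_RE.search(line)
        if m is None:
            continue  # not an event header line
        srcpe, msg_id = int(m.group(1)), int(m.group(2))
        if not 0 <= srcpe < num_pes:
            bad.append((srcpe, msg_id))
    return bad


# Two event headers copied from the dumps in this thread (a 2-PE run).
sample = """\
[0] 0x618070 name:msgep (srcpe:1 msgID:1) ep:2 charm_ep:-1
[0] 0x618070 name:msgep (srcpe:3 msgID:0) ep:2 charm_ep:-1
"""

print(find_bad_sources(sample, num_pes=2))
```

With `num_pes=2`, the second header is reported because PE 3 is outside the simulated range.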

I tried running ring the way you did.  One problem is that it appears to create a number of worker threads equal to x*y*z*<number of specified worker threads>.  So when you specify 2 workers, it generates 2x the number of time lines, and all the extras have no events in them.  Furthermore, even when I specify 1 worker, I still get bizarre time lines.  For example:

======================== Run ===========================

./ring 2 1 1 1 1 +bglog
Charm++: standalone mode (not using charmrun)
Converse/Charm++ Commit ID: v6.3.0-1228-g34cdf5e
Charm++> scheduler running in netpoll mode.
BG info> Simulating 2x1x1 nodes with 1 comm + 1 work threads each.
BG info> Network type: bluegene.
alpha: 1.000000e-07     packetsize: 1024        CYCLE_TIME_FACTOR:1.000000e-03.
CYCLES_PER_HOP: 5       CYCLES_PER_CORNER: 75.
BG info> cpufactor is 1.000000.
BG info> floating point factor is 0.000000.
BG info> Using WallTimer for timing method.
BG info> Generating timing log.
0 0 0 => 1 0 0
1 0 0 => 0 0 0
0 0 0 => 1 0 0
[0] Number is numX:2 numY:1 numZ:1 numCth:1 numWth:1 numEmulatingPes:1 totalWorkerProcs:2 bglog_ver:6
[0] Wrote to disk for 2 BG nodes.

BG> BigSim emulator shutdown gracefully!
BG> Emulation took 0.006090 seconds!
Program finished.

======================== Time Line 0 ===========================

[0] 0x618070 name:msgep (srcpe:1 msgID:1) ep:2 charm_ep:-1
 recvtime:0.000010 startTime:0.000010 endTime:-1.000000 execTime:0.000000
backward:
forward:

======================== Time Line 1 ===========================

[0] 0x618070 name:msgep (srcpe:3 msgID:0) ep:2 charm_ep:-1
 recvtime:0.000000 startTime:0.000000 endTime:0.000010 execTime:0.000010
-msgID:1 sent:0.000010 recvtime:0.000010 dstNode:0 tid:-1 size:60 group:1
backward:
forward:

It looks like the simulation is starting with time line 1 (i.e., PE 1), which shouldn't happen.  I'm copying the rest of the development group so someone who knows about this code can take a look at this and see what's wrong.  In any case, I wouldn't use the traces generated by this code right now.
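The thread-count observation above (x*y*z times the number of worker threads) can be compared against the summary line the emulator prints at the end of a run. This is only an illustrative sketch: it assumes the `key:value` layout of the "Number is ..." line shown in the run output, and the helper names `parse_summary` and `expected_time_lines` are hypothetical rather than anything provided by BigSim.

```python
import re


def parse_summary(line):
    """Parse the integer 'key:value' fields from the emulator's summary line."""
    return {key: int(value) for key, value in re.findall(r"(\w+):(\d+)", line)}


def expected_time_lines(fields):
    # Assumption taken from this thread: one time line per worker thread,
    # i.e. numX * numY * numZ * numWth.
    return fields["numX"] * fields["numY"] * fields["numZ"] * fields["numWth"]


# Summary line copied from the run output above.
line = ("[0] Number is numX:2 numY:1 numZ:1 numCth:1 numWth:1 "
        "numEmulatingPes:1 totalWorkerProcs:2 bglog_ver:6")

fields = parse_summary(line)
print(expected_time_lines(fields), fields["totalWorkerProcs"])
```

If the two printed numbers differ, or if the number of time lines reported by LogAnalyzer differs from both, the traces are probably worth a closer look before being fed to BigNetSim.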

If you just want some trace files to run, you can find some that work at BigNetSim/trunk/sampleTrace.  Alternatively, you could do an AMPI build of charm with the bigemulator target and run one of several examples in charm/net-linux-x86_64/examples/ampi.  The build line (if you're using 64-bit Linux) should look something like:

./build AMPI net-linux-x86_64 bigemulator -j8 -g

Ryan



From: Balaji S [balaji.ceg.13 AT gmail.com]
Sent: Friday, February 24, 2012 9:50 PM
To: Mokos, Ryan
Subject: Re: [ppl] [charm] Error when running BigNetSim

Hi sir,
     I have attached the trace files. I got them when I ran
 ./ring 2 1 1 2 2 +bglog

in the charm/examples/bigsim/emulator directory.
Kindly help, sir.

On Fri, Feb 24, 2012 at 2:33 PM, Mokos, Ryan <mokos AT illinois.edu> wrote:
Hi Balaji,

For some reason, it looks like it wasn't able to load the traces properly.  Please send me your bgTrace files so I can see what they contain.

Ryan






