Re: [charm] [ppl] About Statistics for Links Using BigNetSim


  • From: "Mokos, Ryan" <mokos AT illinois.edu>
  • To: "guangjunster AT gmail.com" <guangjunster AT gmail.com>
  • Cc: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>
  • Subject: Re: [charm] [ppl] About Statistics for Links Using BigNetSim
  • Date: Thu, 16 Aug 2012 17:33:51 +0000
  • Accept-language: en-US
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

Hi Qin,

I'm a little surprised that printf works for you while CkPrintf doesn't, as I
thought CkPrintf was basically just a wrapper around printf, but I don't know
that much about the internal workings of the Ck machinery.
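
If I had to guess, the usual culprit for output like yours is a format
specifier/argument-width mismatch in a varargs call rather than CkPrintf
itself. A minimal sketch of that failure mode (plain C, purely illustrative,
not code from BigNetSim or Converse; the byte and tick values are made up):

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint64_t bytes = 640;    /* e.g., 10 messages x 64 bytes */
    double ticks = 18675.0;  /* elapsed virtual time in GVT ticks */

    /* Specifiers match the argument types: prints sane values. */
    printf("ok:  %llu bytes -> %f\n",
           (unsigned long long)bytes, (double)bytes / ticks);

    /* Mismatch: %d consumes only 32 of the 64 bits pushed for 'bytes',
       so the %f after it reads misaligned varargs and typically prints
       enormous or negative garbage. */
    printf("bad: %d bytes -> %f\n", bytes, (double)bytes / ticks);
    return 0;
}

On a 64-bit ABI the first few arguments travel in registers, so a mismatch
like this can appear to work there while producing exactly the kind of
astronomical values you saw on your 32-bit build.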

As for your nicStart question...

Each network object in BigNetSim (proc, node, NIC, switch, channel) is
represented by a POSE object (called a poser). All POSE objects for a
simulation are stored in a single 1D array (POSE_Objects). nicStart (along
with NodeStart, ChannelStart, and switchStart, all defined in Main/util.h) is
an offset for indexing the NICs in the POSE_Objects array. BigNetSim has two
modes of operation:

(1) Trace-based simulation: bgTrace files (PE time lines) are used to drive
the simulation. In this mode, all network components are placed in
POSE_Objects in this order:

all BGprocs
all BGnodes
all NICs
all Switches
all Channels

(2) Traffic pattern simulation: each node is replaced by a Transceiver poser,
which injects messages into the network. There are no processors in these
simulations, so the POSE_Objects array looks like this:

all Transceivers
all NICs
all Switches
all Channels

As you probably know, you are using mode (2). Since there is one Transceiver
per node, the number of nodes (config.numNodes) is used to initialize the
offset that indicates where in the POSE_Objects array the NICs begin
(config.nicStart).
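
To make the indexing concrete, here is a sketch of how the offsets line up
in mode (2). The nicStart assignment is the one from pgm.c; the switch and
channel lines (and the numNICs/numSwitches names) are illustrative
assumptions, not copied from the source:

// POSE_Objects layout in a traffic pattern simulation:
//   [Transceivers][NICs][Switches][Channels]
config.nicStart     = config.numNodes;                  // NICs begin after the Transceivers
config.switchStart  = config.nicStart + numNICs;        // switches after the NICs
config.channelStart = config.switchStart + numSwitches; // channels after the switches

// The poser for NIC i is POSE_Objects[config.nicStart + i], which is
// why nicStart is numNodes (one Transceiver per node) rather than zero.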

Ryan

________________________________________
From:
guangjunster AT gmail.com

[guangjunster AT gmail.com]
Sent: Wednesday, August 15, 2012 10:02 PM
To: Mokos, Ryan
Subject: RE: [ppl] [charm] About Statistics for Links Using BigNetSim

Hi Ryan,
Thank you!
You are right. I spelled it wrong in step #2; it should be net-linux.
I traced the running process.
1. POSE_GlobalClock is always zero, so the utilization % calculations
come out as "nan"s and "inf"s. You explained the reasons.
2. About the throughput overflow: I found out it is all caused by
"CkPrintf". If I change it to "printf", the output is fine. I guess the
reason is my 32-bit system, just as you say.
I will try it on a 64-bit system.

My other question is why "nicStart" is initialized as
config.nicStart = config.numNodes in BigNetSim/trunk/main/pgm.c
rather than zero.
Thank you again.
Yours sincerely,
Qin
On Wed, 2012-08-15 at 22:09 +0000, Mokos, Ryan wrote:
> Hi Guangjun,
>
> I'm a little confused by your charm build lines in step #2. I don't think
> net-linux-x86 is a valid build target, and the target for both build lines
> should be the same. What system are you using?
>
> I pretty much followed the same procedure you did and it worked fine for
> me. These are my exact steps:
>
> charm build (I added -j8 to speed up the charm build since the machine I
> used has 8 cores)
> ---------------
> git clone charmgit:charm
> cd charm
> ./build bgampi net-linux-x86_64 -j8 -O2
> ./build pose net-linux-x86_64 -j8 -O2
>
> BigNetSim build
> ---------------------
> svn co https://charm.cs.uiuc.edu/svn/repos/BigNetSim
> modify BigNetSim/trunk/Makefile.common to point CHARMBASE to the charm
> build above (/charm/net-linux-x86_64)
> cd BigNetSim/trunk/BlueGene
> make
> cd ../tmp
> cp ../BlueGene/netconfig . [I would use netconfig instead of netconfig.vc
> as there are more options in it]
> in netconfig, change:
> USE_TRANSCEIVER 0->1
> COLLECTION_INTERVAL 1000000->10000000
> DISPLAY_LINK_STATS 0->1
> DISPLAY_MESSAGE_DELAY 0->1 [I'm not sure this does anything in the
> current version]
> ./bigsimulator 1 2 1 10 64 0.1
>
> Charm++: standalone mode (not using charmrun)
> Converse/Charm++ Commit ID: v6.4.0-633-g8310d51
> Charm++> scheduler running in netpoll mode.
> CharmLB> Load balancer assumes all CPUs are same.
> Error> Open failed with bgTrace.
> Charm++> Running on 1 unique compute nodes (8-way SMP).
> Charm++> cpu topology info is gathered in 0.000 seconds.
> ================= Simulation Configuration =================
> Number of physical PEs: 1
> POSE mode: Parallel
> Network model: BlueGene
> Command line: ./bigsimulator 1 2 1 10 64 0.1
> Timing factor: 1.000000e+08 (i.e., 1 GVT tick = 10 ns)
> cpufactor: 1.000000
> Transceiver parameters: mode=contention traffic=poisson pattern=1
> numMessages=10 msgSize=64 loadFactor=0.100000
> Simulation mode: traffic pattern generator (Transceiver)
> Simulation network mode: full contention
> Initializing POSE...
> POSE initialization complete.
> Using Inactivity Detection for termination.
> Network parameters:
> Max packet size: 256
> Number of buffers per port in each switch: 12
> Switch buffer size: 1024
> Channel bandwidth: 1.000000
> Channel delay: 0
> Link stats collection interval: 10000000 GVT ticks
> Link stats on: yes
> Message stats on: yes
> Adaptive routing on: yes
> Header size: 16 bytes
> Processor send overhead: 0 GVT ticks
> Processor receive overhead: 0 GVT ticks
> Number of simulated nodes: 64
> ============================================================
> Simulation inactive at time: 45427
> Final GVT = 45427
>
> Channel 0: final ovt: 0, utilization time: 0, utilization %: nan, number of
> times utilized: 0
> Channel 1: final ovt: 0, utilization time: 0, utilization %: nan, number of
> times utilized: 0
> Channel 2: final ovt: 0, utilization time: 0, utilization %: nan, number of
> times utilized: 0
> Channel 3: final ovt: 34910, utilization time: 800, utilization %: inf,
> number of times utilized: 10
> Channel 4: final ovt: 35242, utilization time: 800, utilization %: inf,
> number of times utilized: 10
> Channel 5: final ovt: 18675, utilization time: 800, utilization %: inf,
> number of times utilized: 10
> Channel 6: final ovt: 18675, utilization time: 800, utilization %: inf,
> number of times utilized: 10
> Channel 7: final ovt: 5963, utilization time: 800, utilization %: inf,
> number of times utilized: 10
> Channel 8: final ovt: 0, utilization time: 0, utilization %: nan, number of
> times utilized: 0
> Channel 9: final ovt: 0, utilization time: 0, utilization %: nan, number of
> times utilized: 0
> Channel 10: final ovt: 34910, utilization time: 1600, utilization %: inf,
> number of times utilized: 20
> Channel 11: final ovt: 17461, utilization time: 800, utilization %: inf,
> number of times utilized: 10
> Channel 12: final ovt: 22475, utilization time: 800, utilization %: inf,
> number of times utilized: 10
> Channel 13: final ovt: 22475, utilization time: 800, utilization %: inf,
> number of times utilized: 10
> Channel 14: final ovt: 12075, utilization time: 1600, utilization %: inf,
> number of times utilized: 20
> Channel 15: final ovt: 0, utilization time: 0, utilization %: nan, number
> of times utilized: 0
> Channel 16: final ovt: 0, utilization time: 0, utilization %: nan, number
> of times utilized: 0
> Channel 17: final ovt: 20237, utilization time: 800, utilization %: inf,
> number of times utilized: 10
> Channel 18: final ovt: 20788, utilization time: 800, utilization %: inf,
> number of times utilized: 10
> Channel 19: final ovt: 27903, utilization time: 800, utilization %: inf,
> number of times utilized: 10
> Channel 20: final ovt: 27903, utilization time: 800, utilization %: inf,
> number of times utilized: 10
> Channel 21: final ovt: 12075, utilization time: 800, utilization %: inf,
> number of times utilized: 10
> ...
> Time 18675 The throughput at node 0 is 0.034270 MB/sec
> Time 22475 The throughput at node 1 is 0.028476 MB/sec
> Time 34830 The throughput at node 2 is 0.018375 MB/sec
> Time 29793 The throughput at node 3 is 0.021482 MB/sec
> Time 17980 The throughput at node 4 is 0.035595 MB/sec
> Time 33212 The throughput at node 5 is 0.019270 MB/sec
> Time 35162 The throughput at node 6 is 0.018201 MB/sec
> Time 26156 The throughput at node 7 is 0.024469 MB/sec
> Time 37176 The throughput at node 8 is 0.017215 MB/sec
> Time 26310 The throughput at node 9 is 0.024325 MB/sec
> Time 18467 The throughput at node 10 is 0.034656 MB/sec
> ...
> Final basic stats: Commits: 23728 Rollbacks: 20
> Final basic stats: GVT iterations: 2797
> 1 PE Simulation finished at 1.843324.
>
>
> Maybe there's an issue if you're using a 32-bit system? Most of the
> development work on BigNetSim in the last several years has been done on
> 64-bit systems, and most print statements are designed for use on 64-bit
> systems.
>
> I also find it odd that we get different ending virtual times: 25,762 GVT
> ticks for you and 45,427 for me. They should be the same.
>
> The utilization % calculation tries to get the final GVT for the
> calculation in a lazy way that doesn't work with the parallel BigNetSim
> build, hence the "nan"s and "inf"s. You can calculate it yourself (=
> utilization time / final GVT), or you can build the sequential version of
> BigNetSim by adding SEQUENTIAL=1 to the make line like so:
>
> cd BigNetSim/trunk/BlueGene
> make SEQUENTIAL=1
>
> This can only be run on one physical core, but it will give you the
> utilization percentages.
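>
> Written out, the by-hand calculation is just this (a sketch using
> Channel 3 from the run above, not actual BigNetSim code):
>
> /* utilization % = 100 * utilization time / final GVT
>    e.g., Channel 3: 100.0 * 800 / 45427 = about 1.76% */
> double utilizationPct(double utilTime, double finalGVT) {
>     return 100.0 * utilTime / finalGVT;
> }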
>
> Please try building and running exactly as I did above if you have access
> to a 64-bit Linux machine and see if you still have problems.
>
> Ryan
>
>
> ________________________________________
> From:
> ppl-bounces AT cs.uiuc.edu
>
> [ppl-bounces AT cs.uiuc.edu]
> on behalf of
> guangjunster AT gmail.com
>
> [guangjunster AT gmail.com]
> Sent: Saturday, August 11, 2012 4:59 AM
> To:
> charm AT cs.uiuc.edu
> Subject: [ppl] [charm] About Statistics for Links Using BigNetSim
>
> Hello all,
>
> I want to collect statistics about links, such as throughput, utilization
> rate, and latency, but I cannot get them.
>
> I installed BigNetSim as follows.
> 1. Cloned charm and BigNetSim via svn
> 2. ./build bgampi net-linux -O2
> ./build pose net-linux-x86 -O2
> 3. make BlueGene in the directory
> BigNetSim/trunk/BlueGene
> All of these steps succeeded.
> I copied the netconfig.vc file to the tmp directory and named it
> netconfig. Then I ran a simulation:
> ./bigsimulator 1 2 1 10 64 0.1
> It ran OK, with no errors.
> I then modified the netconfig file:
> COLLECTION_INTERVAL 1000000->10000000
> DISPLAY_LINK_STATS 0->1
> DISPLAY_MESSAGE_DELAY 0->1
> Everything else was left unchanged.
>
> I ran again, but got the following results.
>
> Charm++: standalone mode (not using charmrun)
> Converse/Charm++ Commit ID: v6.4.0-632-gd54645a
> Charm++> scheduler running in netpoll mode.
> CharmLB> Load balancer assumes all CPUs are same.
> Error> Open failed with bgTrace.
> Charm++> Running on 1 unique compute nodes (2-way SMP).
> Charm++> cpu topology info is gathered in 0.001 seconds.
> ================= Simulation Configuration =================
> Number of physical PEs: 1
> POSE mode: Parallel
> Network model: BlueGene
> Command line: ./bigsimulator 1 2 1 10 64 0.1
> Timing factor: 1.000000e+08 (i.e., 1 GVT tick = 10 ns)
> cpufactor: 1.000000
> Transceiver parameters: mode=contention traffic=poisson pattern=1
> numMessages=10 msgSize=64 loadFactor=0.100000
> Simulation mode: traffic pattern generator (Transceiver)
> Simulation network mode: full contention
> Initializing POSE...
> POSE initialization complete.
> Using Inactivity Detection for termination.
> Network parameters:
> Max packet size: 256
> Number of buffers per port in each switch: 16
> Switch buffer size: 256
> Channel bandwidth: 1.750000
> Channel delay: 9
> Link stats collection interval: 10000000 GVT ticks
> Link stats on: yes
> Message stats on: yes
> Adaptive routing on: yes
> Header size: 16 bytes
> Processor send overhead: 50 GVT ticks
> Processor receive overhead: 50 GVT ticks
> Number of simulated nodes: 64
> ============================================================
> ......
> Channel 30: final ovt: 0, utilization time: 0, utilization %: -nan,
> number of times utilized: 0
> Channel 31: final ovt: 19997, utilization time: 460, utilization %: inf,
> number of times utilized: 10
> Channel 32: final ovt: 10796, utilization time: 460, utilization %: inf,
> number of times utilized: 10
> Channel 33: final ovt: 9854, utilization time: 460, utilization %: inf,
> number of times utilized: 10
> Channel 34: final ovt: 9867, utilization time: 460, utilization %: inf,
> number of times utilized: 10
> Channel 35: final ovt: 10568, utilization time: 460, utilization %: inf,
> number of times utilized: 10
> Channel 36: final ovt: 0, utilization time: 0, utilization %: -nan,
> number of times utilized: 0
> Channel 37: final ovt: 0, utilization time: 0, utilization %: -nan,
> number of times utilized: 0
> Channel 38: final ovt: 19984, utilization time: 920, utilization %: inf,
> number of times utilized: 20
> Channel 39: final ovt: 9339, utilization time: 460, utilization %: inf,
> number of times utilized: 10
> Channel 40: final ovt: 13418, utilization time: 460, utilization %: inf,
> number of times utilized: 10
> Channel 41: final ovt: 13431, utilization time: 460, utilization %: inf,
> number of times utilized: 10
> Channel 42: final ovt: 18937, utilization time: 920, utilization %: inf,
> number of times utilized: 20
> Channel 43: final ovt: 3852, utilization time: 460, utilization %: inf,
> number of times utilized: 10
> Channel 44: final ovt: 0, utilization time: 0, utilization %: -nan,
> number of times utilized: 0
> Channel 45: final ovt: 10170, utilization time: 460, utilization %: inf,
> number of times utilized: 10
> ......
> Time 11029 The throughput at node 0 is -0.000000 MB/sec
> Time 13120 The throughput at node 0 is -0.000000 MB/sec
> Time 19720 The throughput at node 0 is -0.000000 MB/sec
> Time 17171 The throughput at node 0 is -0.000000 MB/sec
> Time 10488 The throughput at node 0 is -0.000000 MB/sec
> Time 18857 The throughput at node 0 is -0.000000 MB/sec
> Time 19904 The throughput at node 0 is -0.000000 MB/sec
> Time 15164 The throughput at node 0 is
> 2619573476544695804731547824400380198912.000000 MB/sec
> Time 21233 The throughput at node 0 is
> 6169089787803774459785667406587215792904297853929178820378078740480.000000
> MB/sec
> Time 15018 The throughput at node 0 is
> -1780094247336737777013341204587896966615702908491058732716495790871608899146945661692816927517573704518205440.000000
> MB/sec
> Time 10690 The throughput at node 0 is -0.000000 MB/sec
> Time 9233 The throughput at node 0 is 44821195682102730752.000000 MB/sec
> Time 19944 The throughput at node 0 is -0.000000 MB/sec
> Time 11994 The throughput at node 0 is
> -338228915617507132833854399247497671041418654640972592171596655032947275169119427816506440781096810222174101147367102083708078270434789388123125866695820022438578371058087380741055533591297634197655197290612786319533799084036838129664.000000
> MB/sec
> ......
> Time 13577 The throughput at node 0 is 0.000000 MB/sec
> Simulation inactive at time: 25762
> Final basic stats: Commits: 26562 Rollbacks: 23
> Final basic stats: GVT iterations: 11184
> 1 PE Simulation finished at 4.148911.
> Program finished.
>
>
> My intuition tells me that something must be overflowing. I guess it
> is possible that I have a wrong configuration, but I have no idea how
> to explain these results.
>
> I hope someone can give me some advice.
>
> Yours sincerely,
>
> Guangjun.Qin
>
>
>