
charm - Re: [charm] [ppl] exploiting multi-link bandwidth of on Blue Gene

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: Sameer Kumar <sameerk AT us.ibm.com>
  • To: Edgar Solomonik <solomon AT eecs.berkeley.edu>
  • Cc: charm AT cs.illinois.edu, ppl-bounces AT cs.uiuc.edu
  • Subject: Re: [charm] [ppl] exploiting multi-link bandwidth of on Blue Gene
  • Date: Tue, 3 Jan 2012 22:45:57 +0530
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>


I think we disabled rzv (rendezvous) messages for some reason; Eric should know best.  Compile Charm++ with -DOPT_RZV=1 to get full six-link throughput.  Regular Charm++ messages of about 1 MB should achieve close to full throughput.
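One way to see why message size matters for reaching full multi-link throughput is a simple latency/bandwidth cost model. The sketch below is only an illustration under stated assumptions: `LinkModel`, `transferSeconds`, and `efficiency` are hypothetical names, the model ignores contention and protocol details, and any parameter values plugged in are placeholders rather than BG/P measurements.

```cpp
#include <cassert>

// Simple latency/bandwidth cost model. This is a textbook simplification,
// not a description of how the Blue Gene messaging layer actually behaves.
struct LinkModel {
    double startupSeconds;      // fixed per-message cost (hypothetical value)
    double linkBytesPerSecond;  // bandwidth of one torus link (hypothetical value)
};

// Time to move `bytes` when the payload is striped evenly across `links` links.
double transferSeconds(const LinkModel& m, double bytes, int links) {
    assert(links > 0);
    return m.startupSeconds + bytes / (links * m.linkBytesPerSecond);
}

// Fraction of the ideal `links`-link bandwidth achieved for this message size.
// Approaches 1.0 as the message grows and the fixed startup cost is amortized.
double efficiency(const LinkModel& m, double bytes, int links) {
    const double ideal = bytes / (links * m.linkBytesPerSecond);
    return ideal / transferSeconds(m, bytes, links);
}
```

In this model, small messages are dominated by the fixed startup cost, so using more links barely helps them, while messages in the megabyte range spend almost all of their time on the wire and benefit from every additional link.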

-Sameer.



From: Edgar Solomonik <solomon AT eecs.berkeley.edu>
Sent by: ppl-bounces AT cs.uiuc.edu
Date: 12/30/2011 04:08 AM
To: PPL <ppl AT cs.uiuc.edu>, charm AT cs.illinois.edu
Subject: [ppl] exploiting multi-link bandwidth of on Blue Gene

Hello,

I've been working on an algorithm that tries to exploit the bandwidth of every link on a torus network.  Basically, each node sends data to neighbors in each dimension of the torus, rather than a single dimension.  The target is to achieve injection bandwidth rather than link bandwidth on torus networks (e.g. on BG/P injection bandwidth is 6x link bandwidth, and on BG/Q injection bandwidth is 10x link bandwidth).
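The idea of sending along every dimension at once can be sketched in plain C++. This is a minimal illustration, not the implementation discussed in this thread: it assumes a 3D torus (as on BG/P), and `Coord`, `torusNeighbors`, `Chunk`, and `splitBuffer` are hypothetical names introduced here. It only computes the six neighbors and the slice of the send buffer assigned to each link; the actual sends are left out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Coordinates of a node in a 3D torus (BG/P-style; BG/Q would need 5 dimensions).
using Coord = std::array<int, 3>;

// Return the 2 * 3 = 6 torus neighbors of `self`: the -1 and +1 neighbor along
// each dimension, with wraparound at the edges. If a dimension has size 1 or 2,
// some of the returned neighbors coincide.
std::vector<Coord> torusNeighbors(const Coord& self, const Coord& dims) {
    std::vector<Coord> out;
    for (std::size_t d = 0; d < self.size(); ++d) {
        for (int step : {-1, +1}) {
            Coord n = self;
            n[d] = (self[d] + step + dims[d]) % dims[d];
            out.push_back(n);
        }
    }
    return out;
}

// A contiguous slice [offset, offset + length) of the send buffer.
struct Chunk {
    std::size_t offset;
    std::size_t length;
};

// Split `total` bytes into `parts` nearly equal chunks, one per outgoing link,
// so that every link carries roughly the same amount of data.
std::vector<Chunk> splitBuffer(std::size_t total, std::size_t parts) {
    assert(parts > 0);
    std::vector<Chunk> chunks;
    const std::size_t base = total / parts;
    const std::size_t extra = total % parts;
    std::size_t offset = 0;
    for (std::size_t i = 0; i < parts; ++i) {
        const std::size_t len = base + (i < extra ? 1 : 0);
        chunks.push_back({offset, len});
        offset += len;
    }
    return chunks;
}
```

With one chunk per neighbor, the payload leaving a node is spread over all six links instead of one, which is what lets the achievable rate approach the injection bandwidth rather than the bandwidth of a single link.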

I have been able to employ this idea to get a significant performance improvement on BG/P for an implementation of matrix multiplication that uses MPI_Put.  I also wanted virtualization, so I implemented the same algorithm in Charm++.  The topology-aware mapping works and the Charm++ version performs almost as well as the original MPI version.  However, I've been unable to get the Charm++ version to saturate multiple links on BG/P.  I even tried running in virtual node mode on BG/P and having chares on different processes within the same node send messages along different torus directions.

To summarise, I want to have a chare send multiple simultaneous messages to chares located on torus neighbors in different directions.  Can I use CkDirect or some other technique to achieve this in Charm++?  Basically, I need an asynchronous (one-sided) send implementation on BG/P.

Thanks,
Edgar
_______________________________________________
ppl mailing list
ppl AT cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/ppl



