charm - Re: [charm] Fwd: CHARM++ related questions

  • From: Neha Gholkar <ngholka AT ncsu.edu>
  • To: Harshitha Menon <gplkrsh2 AT illinois.edu>
  • Cc: Charm Mailing List <charm AT cs.illinois.edu>
  • Subject: Re: [charm] Fwd: CHARM++ related questions
  • Date: Mon, 3 Mar 2014 16:15:29 -0500
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm/>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

Thanks. That was helpful.

How do I link a shared library with a Charm++ application?

Thanks
Neha


On Fri, Feb 28, 2014 at 10:51 PM, Harshitha Menon <gplkrsh2 AT illinois.edu> wrote:
That snippet of code in the load balancer is there to turn the local barrier on or off. If load balancing is to be performed in AtSync mode, then all the chares present on a PE need to reach a local barrier (not a global barrier, but one within a PE) before load balancing. Once the local barrier is reached, the AtSync function of the load balancer on each PE is called, which then collects the stats and performs load balancing.

AddLocalBarrierClient is used to register the objects residing on a PE with the load balancer. Whenever an array element calls AtSync, it calls AtLocalBarrier, which keeps a count of how many objects have reached the local barrier. Once this count equals the total number of registered clients, the barrier is reached and the necessary actions (such as calling the registered receivers) are taken.
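
For reference, the application side of this usually looks roughly like the following (a minimal sketch - MyArray and doWork are placeholder names; usesAtSync, AtSync and ResumeFromSync are the standard array-element API):

// In the array element's constructor, opt in to AtSync-mode load balancing.
MyArray::MyArray() {
  usesAtSync = true;
}

void MyArray::doWork() {
  // ... compute one phase ...
  AtSync();   // join the local barrier; the LB runs once all elements on this PE arrive
}

// Called by the runtime once load balancing (and any migration) has finished.
void MyArray::ResumeFromSync() {
  thisProxy[thisIndex].doWork();   // continue with the next phase
}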

Usually, after the load balancing decision is made and all the chares within a PE have migrated, they can resume their work. But if you would like to ensure that computation is resumed (ResumeFromSync is called) only after all the migrations on all the PEs are done, you can use +LBSyncResume, which enforces a global barrier beforehand. This global barrier is implemented as a reduction:
// Each PE contributes to an empty reduction whose callback is CentralLB::ResumeClients,
// so the clients resume only after every PE has finished its migrations.
CkCallback cb(CkIndex_CentralLB::ResumeClients((CkReductionMsg*)NULL), thisProxy);
contribute(0, NULL, CkReduction::sum_int, cb);

If you are writing a new load balancer and would like to send some user data along with the load balancing stats, there is functionality available in the latest version of Charm. You can take a look at the example in tests/charm++/load_balancing/lb_userdata_test/.


On Fri, Feb 28, 2014 at 5:39 PM, Neha Gholkar <ngholka AT ncsu.edu> wrote:
I am trying to use the LB framework to send some stats across nodes, just the way the LB framework sends the load-related stats. While doing so, I intend to establish a barrier across a set of nodes. So, in short, I am experimenting with the load balancing framework.


On Fri, Feb 28, 2014 at 6:13 PM, Phil Miller <mille121 AT illinois.edu> wrote:
On Fri, Feb 28, 2014 at 4:02 PM, Neha Gholkar <ngholka AT ncsu.edu> wrote:
4. I intend to implement something like a point-to-point blocking send and receive in the Charm++ framework. I understand that Charm++ is based on the message-passing paradigm, but is it possible to do this?

We implement constructs that act like blocking send and receive in several different ways, in various parts of the provided infrastructure:
1. Futures (sketched below)
2. Entry methods with the 'sync' attribute returning a value
3. AMPI_Ssend
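
To make the futures option concrete, a rough sketch (Worker, ResultMsg and workerProxy are placeholder names; the CkFuture calls are the standard API, and the caller must run in a context that is allowed to block, e.g. a [threaded] entry method):

// Caller side:
CkFuture f = CkCreateFuture();                  // create a future handle
workerProxy[0].compute(f);                      // hand it to another chare
ResultMsg *m = (ResultMsg *)CkWaitFuture(f);    // block until a value is deposited
// ... use m ...
CkReleaseFuture(f);

// Callee side:
void Worker::compute(CkFuture f) {
  ResultMsg *m = new ResultMsg(/* result */);
  CkSendToFuture(f, m);                         // wakes up the waiting caller
}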

In general, however, they're not something to be sought out in Charm++ application design. Asynchronous message passing satisfies data dependences of interacting parallel objects. Synchronization is only desirable for parallel algorithms that gain additional information from it - e.g. Hoefler's Dynamic Sparse Data Exchange, Langer et al. on AMR [1], and so forth.


I have been looking into the load balancing framework of Charm++. I have a few questions.

1. How does Charm++ ensure barriers when sending and receiving data?

2. How are barriers in general implemented in Charm++?

With the above in mind, barriers take the synchronization expense of blocking communication and generalize it to the entire parallel job. Again, there needs to be something gained from them to make them worthwhile.

As such, Charm++ provides some such constructs, but their use should be minimized:
1. Array reductions carrying no data - if you need to synchronize multiple chare arrays, do a reduction over each, and only proceed after getting a signal from each of them (see the sketch after this list)
2. Quiescence Detection - this is a heavy hammer, in that it globally senses when the entire system has gone idle. Thus, it's utterly non-composable across modules or across multiple instances of a single module.
3. Termination detection library - this is a more localized library (relative to QD) that can be used in Charm++ programs, though I don't think we've included it in the main repository. If you have a need for it, I can make sure the code is available.
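
For the first of these, the usual idiom is an empty contribution from every array element, with the reduction callback acting as the "barrier done" signal (a rough sketch - MyArray, Main and allDone are placeholder names):

// Every element contributes zero bytes; the callback fires exactly once,
// after all elements of the array have contributed.
void MyArray::phaseDone() {
  CkCallback cb(CkIndex_Main::allDone((CkReductionMsg *)NULL), mainProxy);
  contribute(0, NULL, CkReduction::nop, cb);
}

// Entry method on the main chare, declared to take a CkReductionMsg* in the .ci file.
void Main::allDone(CkReductionMsg *msg) {
  delete msg;
  // Safe to start the next phase: every element of MyArray has reached phaseDone().
}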
 

3. I came across this part of the code, which implements blocking. Could you please help me understand what happens exactly when RegisteringObjects and UnregisteringObjects are used?
 // enforce the barrier to wait until centralLB says no
  LDOMHandle h;
  h.id.id.idx = 0;      /////// necessary
  theLbdb->getLBDB()->RegisteringObjects(h);

One of the developers who hacks on the load balancing infrastructure can comment on this if they wish. I'm curious what, exactly, led you to this snippet of code? Are you working on application development, load balancer strategy development, or making the Charm++ load balancing framework itself do something new? I can only see an answer to this providing meaningful insight in the last case. If you're just writing a new load balancer, note that we use a modular plugin design, so each individual kind of load balancer need not concern itself with actually gathering data about the parallel objects or migrating them around, only with determining where each of them should live and work (see the sketch below).
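
To illustrate that last point: a centralized strategy plugin essentially just fills in a new object-to-PE mapping. Roughly (a sketch only - MyLB is a placeholder, the exact LDStats field names live in BaseLB.h, and GreedyLB and friends in the source tree are the real references):

// A CentralLB-derived strategy overrides work(); the framework gathers the
// stats beforehand and performs the migrations afterwards.
void MyLB::work(LDStats *stats) {
  for (int obj = 0; obj < stats->n_objs; obj++) {
    if (!stats->objData[obj].migratable) continue;
    // Placeholder policy: spread migratable objects round-robin over the PEs.
    stats->to_proc[obj] = obj % CkNumPes();
  }
}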





