
Re: [charm] [ppl] Heterogeneous Load Balancing


  • From: Abhinav Bhatele <bhatele AT illinoisalumni.org>
  • To: Justin Luitjens <jluitjens AT nvidia.com>
  • Cc: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>, "Kale, Laxmikant V" <kale AT illinois.edu>
  • Subject: Re: [charm] [ppl] Heterogeneous Load Balancing
  • Date: Thu, 13 Oct 2011 09:06:55 -0700
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

For a quick test, you could use the "available" flag in struct processorInfo and set it to false for the master process. However, all this flag does is exclude those processors from load balancing decisions; the objects allocated to them by the initial mapping are not offloaded. So if you want less work (or no work) on the master processes, the initial mapping code that assigns work to processors would also have to be changed.
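
Something like this, as a rough sketch (untested; it assumes the strategy's processorInfo array is named processors, and that the master is the first PE on each node, neither of which may match your setup):

    // Mark each node's master PE as unavailable before the strategy
    // runs, so no migratable objects are placed on it.
    for (int p = 0; p < numProcs; p++) {
      if (CkNodeFirst(CkNodeOf(p)) == p)    // first PE on its node
        processors[p].available = false;    // skipped by the balancer
    }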

- Abhinav


On Thu, Oct 13, 2011 at 8:57 AM, Justin Luitjens <jluitjens AT nvidia.com> wrote:

Thank you for your reply.  Here is what we are doing.  We are using the GPU-enabled version of NAMD.  In this version, processes on the same node collocate their data onto a master process (within the node).  Once the master process receives data from all the other processes, it submits the work for computeNonbondedForces to the GPU.  After this finishes, the master process sends the results back to the workers.  The problem we are seeing is that the master process also has work scheduled on it.  As a result, the master process tends to be overloaded, slowing down the entire computation.  What we want is a way to give the master process less of its own work, or perhaps no work at all, so that it can focus solely on managing the GPU.

 

I was looking at the TorusLB.C and RefineTorusLB.C code yesterday, and it wasn't clear what changes need to be made to have them respect the processor speed.  Right now we are just looking for a quick and dirty way to test the performance as we shift workload away from the master process.  This could potentially be done by weighting some of the formulas in TorusLB and RefineTorusLB, but which changes would be needed is not clear.  Any help you could give me here would be greatly appreciated.

 

Thanks,

Justin

 

From: bhatele AT gmail.com [mailto:bhatele AT gmail.com] On Behalf Of Abhinav Bhatele
Sent: Wednesday, October 12, 2011 10:55 PM
To: Justin Luitjens
Cc: Kale, Laxmikant V; charm AT cs.uiuc.edu
Subject: Re: [ppl] [charm] Heterogeneous Load Balancing

 

Hi Justin,

 

NAMD uses a specialized load balancer, so if you want this for NAMD, you will have to modify the NAMD source code. The struct processorInfo in elements.h is what it uses; you'll need to add a pe_speed variable there.

 

Then you will need to adapt the load balancers in NAMD (the source is in TorusLB.C and RefineTorusLB.C) to respect the pe_speed values and assign load accordingly.
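
For instance, a speed-normalized placement test might look roughly like this (a sketch only; procs and nprocs are illustrative names, and the actual TorusLB code structures its search differently):

    // Pick the destination with the lowest speed-normalized load:
    // a PE with twice the pe_speed appears half as loaded.
    processorInfo *best = NULL;
    double bestCost = 1e30;
    for (int p = 0; p < nprocs; p++) {
      if (!procs[p].available) continue;    // honor the available flag
      double cost = procs[p].load / (double) procs[p].pe_speed;
      if (cost < bestCost) { bestCost = cost; best = &procs[p]; }
    }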

 

We can provide more assistance depending on what you want to do.

 

- Abhinav

 

On Wed, Oct 12, 2011 at 12:26 PM, Justin Luitjens <jluitjens AT nvidia.com> wrote:

Yes, I am working with NAMD.  I found the function LDProcessorSpeed() in the Charm++ framework.  This looks like a place where I could edit Charm++ to do what I want, though it would be better if there were an API for it.  Thank you for your help on this.  In our case the measured processor speed will be the same everywhere, but we have one process on each node that is specialized.  We essentially want that process to be assigned a reduced amount of work so that it has more time for other activities.
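
One quick hack I am considering (not a supported API; measureProcessorSpeed is a made-up name standing in for the benchmark loop Charm++ already runs, and the 4x factor and the master-detection test are arbitrary assumptions):

    // Report a reduced speed on each node's master PE so that a
    // speed-aware balancer assigns it proportionally less work.
    int LDProcessorSpeed()
    {
      static int speed = -1;
      if (speed < 0) {
        speed = measureProcessorSpeed();    // existing benchmark loop
        if (CkNodeFirst(CkNodeOf(CkMyPe())) == CkMyPe())
          speed /= 4;                       // master appears 4x slower
      }
      return speed;
    }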

 

Justin

 

From: Kale, Laxmikant V [mailto:kale AT illinois.edu]
Sent: Wednesday, October 12, 2011 11:58 AM
To: Justin Luitjens; charm AT cs.uiuc.edu
Subject: Re: [charm] Heterogeneous Load Balancing

 

Yes, it is very much possible.  (It has been in the design of Charm++ load balancers for a long time.)

 

pe_speed is the correct variable. However, you also need to use a balancer (strategy) that pays attention to this variable.

Also, this variable is currently set by running a standard loop at the beginning of the computation. So make sure it is correctly normalized by printing the value for all (or at least several) processors.
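
For example, a few lines like these inside the strategy would show the values (a sketch; processors and numProcs are illustrative names for the strategy's processorInfo array):

    // Dump each PE's measured speed and load so the normalization
    // can be checked across the whole machine.
    for (int p = 0; p < numProcs; p++)
      CkPrintf("PE %d: pe_speed = %g, load = %g\n",
               p, (double) processors[p].pe_speed,
               (double) processors[p].load);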

 

Are you doing this for NAMD? There may be a more specialized answer for the NAMD load balancer.

 

-- 

Laxmikant (Sanjay) Kale         http://charm.cs.uiuc.edu

Professor, Computer Science     kale AT illinois.edu

201 N. Goodwin Avenue           Ph:  (217) 244-0094

Urbana, IL  61801-2302          FAX: (217) 265-6582

 

On 10/12/11 1:23 PM, "Justin Luitjens" <jluitjens AT nvidia.com> wrote:

 

Hi,

 

I’ve been digging into the load balancer, and it seems that if we were able to set the pe_speed for each process, we would be able to accomplish this.  Is there an API available to set the CPU speed manually?

 

Thanks,

Justin

 

From: charm-bounces AT cs.uiuc.edu [mailto:charm-bounces AT cs.uiuc.edu] On Behalf Of Justin Luitjens
Sent: Wednesday, October 12, 2011 8:23 AM
To: charm AT cs.uiuc.edu
Subject: [charm] Heterogeneous Load Balancing

 

Hello,

 

Is it possible to specify processor weights in Charm++ for heterogeneous systems, or does it assume homogeneous processors?  We would like to designate a certain subset of PEs so that less work is distributed to them.  Is this possible?

 

Thanks,

Justin







 

--
Abhinav Bhatele, people.llnl.gov/bhatele1
Center for Applied Scientific Computing, Lawrence Livermore National Laboratory



