charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

RE: [charm] Charm++ load balancing and GPUs

From: "Choi, Jaemin" <jchoi157 AT illinois.edu>
To: Jozsef Bakosi <jbakosi AT lanl.gov>
Cc: "charm AT lists.cs.illinois.edu" <charm AT lists.cs.illinois.edu>
Subject: RE: [charm] Charm++ load balancing and GPUs
Date: Fri, 12 Oct 2018 21:00:04 +0000

Hi Jozsef,

GPU Manager is mainly designed to support asynchronous execution on the GPU.
Because overdecomposition maps multiple chares to each PE, one chare should
not block another's use of the GPU (which is what happens when data transfers
and kernel executions are synchronous). GPU Manager provides an API named
HAPI (short for Hybrid API) that lets the user attach a Charm++ callback to a
CUDA stream; the callback is invoked once all operations previously enqueued
in that stream have completed. This allows multiple chares to offload GPU
kernels concurrently without hindering each other. There are some convenience
features as well, such as error checking and memory pooling.
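
To make the stream-ordered callback idea concrete, here is a toy,
single-threaded C++ model of it. It uses neither CUDA nor HAPI: ToyStream and
run_demo are hypothetical names invented for this sketch, and the "device" is
simply a FIFO queue that the scheduler loop advances one operation at a time.
It shows only the ordering guarantee, namely that each chare's callback fires
after that chare's own enqueued work, while the two chares interleave.

```cpp
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a CUDA stream: operations run strictly in FIFO order.
// Hypothetical type for illustration only; not part of HAPI or CUDA.
class ToyStream {
public:
  // Enqueue an asynchronous operation (copy or kernel); returns immediately.
  void enqueue(std::function<void()> op) { ops_.push_back(std::move(op)); }

  // Enqueue a completion callback. Because the queue is FIFO, it runs only
  // after every operation enqueued before it has finished.
  void addCallback(std::function<void()> cb) { ops_.push_back(std::move(cb)); }

  // Advance the simulated device by one operation; returns false when idle.
  bool progress() {
    if (ops_.empty()) return false;
    auto op = std::move(ops_.front());
    ops_.pop_front();
    op();
    return true;
  }

private:
  std::deque<std::function<void()>> ops_;
};

// Two "chares" on one PE offload work without waiting for each other.
std::vector<std::string> run_demo() {
  std::vector<std::string> log;
  ToyStream streams[2];
  const std::string names[2] = {"A", "B"};

  for (int i = 0; i < 2; ++i) {
    const std::string n = names[i];
    streams[i].enqueue([&log, n] { log.push_back(n + ":h2d"); });
    streams[i].enqueue([&log, n] { log.push_back(n + ":kernel"); });
    streams[i].enqueue([&log, n] { log.push_back(n + ":d2h"); });
    // The chare returns to the scheduler here instead of blocking.
    streams[i].addCallback([&log, n] { log.push_back(n + ":callback"); });
  }

  // Scheduler loop: interleave progress on both streams until both are idle.
  bool busy = true;
  while (busy) {
    busy = false;
    for (auto& s : streams) busy = s.progress() || busy;
  }
  return log;
}
```

In the returned log, the entries for A and B alternate, and each chare's
callback entry comes after its own h2d, kernel, and d2h entries. With
synchronous offloads, all of A's entries would instead precede all of B's.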

You can find more information regarding the recent changes to GPU Manager
here:
https://charm.cs.illinois.edu/gerrit/#/c/charm/+/3330/16/doc/libraries/gpumanager.tex
This has been merged into mainline and will be part of the 6.9.0 release.

As for load balancing, the user is expected to ensure that all GPU work is
complete before performing load balancing. Currently, GPU load is not taken
into account unless the offloads are synchronous (because the CPU waits for
the GPU work, GPU load shows up as part of the CPU load). We are working on
including GPU load for asynchronous offloads as well. As an alternative, you
can use the model-driven load feature, where the user measures the GPU load
on their own and provides it to the runtime.
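
A minimal sketch of the bookkeeping side of that alternative, in plain C++
with no Charm++ or CUDA dependency. MeasuredLoad and example_loads are
hypothetical names for this illustration, and the timings are made-up inputs
rather than measurements. How the total is handed to the runtime is left as a
comment, because the exact hook should be checked against the manual for your
Charm++ version.

```cpp
#include <vector>

// Hypothetical per-chare accumulator for user-measured load.
class MeasuredLoad {
public:
  void addCpuSeconds(double s) { cpu_ += s; }

  // GPU timings often arrive in milliseconds (for example, CUDA's
  // cudaEventElapsedTime reports milliseconds), so convert to seconds.
  void addGpuMilliseconds(float ms) { gpu_ += static_cast<double>(ms) * 1e-3; }

  // Combined load for the current load balancing period, in seconds.
  double total() const { return cpu_ + gpu_; }

  // Call after each load balancing step to start a new period.
  void reset() { cpu_ = 0.0; gpu_ = 0.0; }

private:
  double cpu_ = 0.0;
  double gpu_ = 0.0;
};

// Two chares with equal CPU time but very different GPU time.
// The numbers are invented inputs for the example.
std::vector<double> example_loads() {
  MeasuredLoad a, b;
  a.addCpuSeconds(0.010);
  a.addGpuMilliseconds(40.0f);
  b.addCpuSeconds(0.010);
  b.addGpuMilliseconds(5.0f);

  // In a Charm++ array element, this is the value to report from the
  // user-supplied load hook. Assumption to verify: the manual's section on
  // model-based load balancing describes disabling automatic measurement
  // and supplying the object time from an overridden method.
  return {a.total(), b.total()};
}
```

CPU-only instrumentation would see these two chares as equally loaded; with
the GPU time added, the first one reports more than three times the load of
the second.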

Please let me know if you have any other questions.
Thanks,

Jaemin Choi
Ph.D. Candidate in Computer Science
Research Assistant, Parallel Programming Laboratory
University of Illinois Urbana-Champaign
________________________________________
From: Jozsef Bakosi [jbakosi AT lanl.gov]
Sent: Friday, October 12, 2018 9:13 AM
To: charm AT lists.cs.illinois.edu
Subject: [charm] Charm++ load balancing and GPUs

Hi folks,

Apologies in advance for not doing my homework on reading the manual,
papers, and the code before asking these questions, but I'm wondering
what the current state of Charm++ interoperation with GPUs is. In
particular:

(1) I know Charm++ has a GPU manager but I don't know what it is used
for.

(2) If I use, e.g., Kokkos, and thus use GPUs, can Charm++ still measure
load imbalances on work being done on GPUs and migrate load accordingly
just as with CPUs?

Thanks and please point me to additional information on this.
Jozsef
