Re: [charm] Scalable creation of chare array elements, passing a different portion of a potentially large array to their constructors


  • From: Jozsef Bakosi <jbakosi AT gmail.com>
  • To: "Kale, Laxmikant V" <kale AT illinois.edu>
  • Cc: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>
  • Subject: Re: [charm] Scalable creation of chare array elements, passing a different portion of a potentially large array to their constructors
  • Date: Sat, 31 Oct 2015 14:36:22 -0600

I should probably have been more specific. The data (alldata) stores element indices of a computational mesh and comes from a file. First I read the mesh graph on PE0 and send it to Zoltan along with the number of chares I want, which gives me back a coloring, i.e., the indices of the mesh cells to be operated on by each chare array element. (BTW: The solver on this mesh already works in parallel, advancing a simple PDE in 3D in time, using Hypre, another MPI library, though this one is called from within Charm++ group elements. I'm now trying to scale this up to a larger number of CPUs.)

On Sat, Oct 31, 2015 at 2:02 PM, Kale, Laxmikant V <kale AT illinois.edu> wrote:
Is the data for each chare array element “arbitrary” (i.e., you get it from a file, for example), or is it a function of its index? I assume it's arbitrary (otherwise, each element could construct it itself based on thisIndex).

So yes, the data is from an input file and the particular parts are results of graph partitioning. Currently, I only have this result on PE0 and need to distribute it. This also tells me that another potential way is to call Zoltan on an already distributed (but perhaps not optimally distributed) mesh, which might yield this data already sitting on different PEs. I just didn't want to go there yet.
 
Secondly, is the data available already on processor 0? Or is it in a file? If it is in a file, one can use some parallel I/O abstraction (we have the ckio library, but will need to check its readiness), and have each element read from a specific offset in the file, assuming the records are of fixed size.

Believe it or not, I have also thought about writing this data (the result of the partitioning) to disk and having the chares read it back. That would be straightforward to do, and since this is only done during an initialization stage (not during time-stepping), it might not be too bad in terms of performance. (Even if it is bad, it could be improved later with ckio.)
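
For the record, the read-back would be as simple as the following (a minimal sketch in plain C++; the fixed-size records per chare, the 64-bit ids, and the function name are my assumptions, not from the actual code):

#include <cstddef>
#include <cstdint>
#include <fstream>
#include <vector>

// Each chare array element reads only its own slice of the partitioning
// results, assuming chare i's ids are written contiguously at offset
// i * idsPerChare * sizeof(id).
std::vector< std::int64_t >
readMyChunk( const char* filename, int thisIndex, std::size_t idsPerChare ) {
  std::ifstream f( filename, std::ios::binary );
  // seek straight to this chare's record; no other PE is involved
  f.seekg( static_cast< std::streamoff >( thisIndex ) *
           static_cast< std::streamoff >( idsPerChare * sizeof(std::int64_t) ) );
  std::vector< std::int64_t > ids( idsPerChare );
  f.read( reinterpret_cast< char* >( ids.data() ),
          ids.size() * sizeof(std::int64_t) );
  return ids;
}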
 
You can also scatter the data to a distributed table and have each chare pick up what it needs from the table via an explicit request. The distributed table as an abstraction was removed years ago, but that's because it can be easily implemented via a group: just hash the array index to a PE, and that is where you store all the key-value pairs that map to it (key: index, value: data for the index'th array element). I have a feeling this last one is what you want. The key-value table is a tutorial example (in the sdag section, I think). We can dig that out.


That sounds pretty interesting and probably the best way to go. I will look at the examples. Can you recall the name of the example? I can only see jacobi2d-sdag, jacobi3d-sdag, and jacobi3d-sdag-constrain as sdag-related. Any of these?
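
In the meantime, this is roughly how I picture the group-based table (a minimal sketch; the names TableStore, Worker, and takeData are mine, not from any tutorial, and the hash is simply key % CkNumPes()):

/* .ci sketch:
 *   group TableStore {
 *     entry TableStore();
 *     entry void insert( int key, const std::vector<long>& data );
 *     entry void request( int key );
 *   };
 */

#include <unordered_map>
#include <vector>
#include "pup_stl.h"           // marshalling of std::vector entry parameters
#include "tablestore.decl.h"   // generated from the .ci sketch above

extern CProxy_Worker workers;  // readonly proxy to the existing chare array

// One branch per PE; the pair (key, data) lives on PE key % CkNumPes().
class TableStore : public CBase_TableStore {
  std::unordered_map< int, std::vector<long> > table;
  public:
    TableStore() {}
    // PE0 scatters the partitioning result, one insert per chare index
    void insert( int key, const std::vector<long>& data ) { table[key] = data; }
    // element `key` pulls its slice; the reply is a plain entry method call
    void request( int key ) { workers[key].takeData( table[key] ); }
};

PE0 would then do store[i % CkNumPes()].insert(i, ...) for each chare index i, and each array element would call store[thisIndex % CkNumPes()].request(thisIndex) and pick up its data in takeData (or with an SDAG when clause) before continuing.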

Thanks,
Jozsef


