
charm - Re: [charm] Parallel read

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system


  • From: Jozsef Bakosi <jbakosi AT gmail.com>
  • To: Ronak Buch <rabuch2 AT illinois.edu>
  • Cc: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>, "rspavel AT lanl.gov" <rspavel AT lanl.gov>
  • Subject: Re: [charm] Parallel read
  • Date: Mon, 6 Mar 2017 08:52:24 -0700

Hi Ronak,

Thanks for doing the merge. Please let us know when you are done, because we are eager to give it a try. We already wrote a toy code that implements the first two options I outlined earlier, so we would like to compare it with ckio's read, not only for performance but also for usability.

Thanks again,
Jozsef

On Fri, Feb 24, 2017 at 5:36 PM, Ronak Buch <rabuch2 AT illinois.edu> wrote:
Hi Jozsef,

I agree with your assessment, but I'd warn you that it may take some time for the CkIO read to get to a level where it performs well. If I recall correctly, we were going to rearchitect the design we had when we were making it, so it may take some work to get it to a production-ready state.

In any case, I'll take a look in the next few days, bring it up to date, test it a bit, and get back to you with a version you can play around with.

Thanks,
Ronak

On Wed, Feb 22, 2017 at 3:48 PM, Jozsef Bakosi <jbakosi AT gmail.com> wrote:
Hi Ronak,

We would like to give the existing parallel read of ckio a try. 

The way I see it at this point, our options are as follows:

1. From a chare group, do a parallel read in which not all group elements participate (the number of participants, e.g., as a percentage, is controlled by the user),

2. Do the same as 1, but from a node group, using a similar user-adjustable parameter specifying the number of nodes that should participate in the read.

3. Use ckio's read.
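
A minimal sketch of the reader selection behind options 1 and 2, in plain C++ with no Charm++ calls. The helper names `numReaders` and `readerRange` are hypothetical and do not come from the thread or from Charm++. The sketch assumes the user gives the participating fraction as a number in (0, 1] and that each reader gets one contiguous byte range of the file.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <utility>

// Hypothetical helper: number of group (or node group) elements that take
// part in the read, given a user-chosen fraction. At least one element
// reads whenever the group is non-empty.
inline std::size_t numReaders(std::size_t total, double fraction) {
  if (total == 0) return 0;
  const double f = std::min(1.0, std::max(0.0, fraction));
  const auto n =
      static_cast<std::size_t>(std::ceil(f * static_cast<double>(total)));
  return std::max<std::size_t>(1, std::min(n, total));
}

// Hypothetical helper: half-open byte range [begin, end) owned by reader r
// out of nReaders. The first (fileSize % nReaders) readers get one extra
// byte, so the ranges tile the file with no gaps or overlaps.
inline std::pair<std::uint64_t, std::uint64_t>
readerRange(std::uint64_t fileSize, std::uint64_t nReaders, std::uint64_t r) {
  assert(nReaders > 0 && r < nReaders);
  const std::uint64_t base = fileSize / nReaders;
  const std::uint64_t rem = fileSize % nReaders;
  const std::uint64_t begin = r * base + std::min(r, rem);
  const std::uint64_t end = begin + base + (r < rem ? 1 : 0);
  return {begin, end};
}
```

For option 1, `total` would be the number of group elements; for option 2, the number of node group elements. A real mesh reader would probably round each range to a record boundary rather than split on raw bytes.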

Of these three options, I assume option 3 would also be able to overlap communication and computation, so ultimately we would probably like that best. As I see it, the options are also listed in order of increasing complexity, but potentially also in order of increasing return on investment.

I think we would like to give ckio's reader a try just to learn more about option 3 and to be able to better scope out what we can get out of it for our purposes using a small test application.

Could you please merge master into that branch and let us know when it is ready for us to try?

Thanks,
Jozsef

On Sat, Feb 11, 2017 at 6:13 PM, Ronak Buch <rabuch2 AT illinois.edu> wrote:
Hi Jozsef,

There was some work on adding read capabilities to CkIO a while back; it's on the gerrit/rohan/ckioread branch of charm. The reading facility added there should work, but nobody has been using it in a production application, so it's out of date (and it hasn't been rebased in a long time). Now that there's some interest in using parallel file reading, I'll update the branch, do some performance tests, and get back to you.

Otherwise, a node group level file I/O scheme should work well in the meantime.  Your sketch on how things should be distributed makes sense to me.

Thanks,
Ronak

On Fri, Feb 10, 2017 at 9:37 PM, Jozsef Bakosi <jbakosi AT gmail.com> wrote:
Hi folks,

Currently, I have a file/PE-style read of a mesh that is stored in a single file of a few tens of GB, containing a few hundred million cells and their node coordinates. This is okay but does not scale very well beyond a few thousand cores.

I have read the paper on ckio from 2011 and looked at the source and it is definitely something I would like to explore for writing checkpoints in the future, but I wonder what my options are for input. ckio does not seem to have a capability at this point for reading.

Another option that seems appealing is to use Charm++'s node groups and/or the "Physical Node API" to simply replace my file/PE read with a node/PE read. After reading my input file in that fashion, a second step would distribute the data to group and/or chare array elements.
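
A rough sketch of that two-step idea in plain C++, with no Charm++ calls. Both helpers, `readSlice` and `splitAmongPEs`, are hypothetical names. The sketch assumes one reader per node that reads a contiguous byte range and then cuts its buffer into near-equal pieces, one per PE on that node. In an actual application the pieces would be sent to group or chare array elements, and the cuts would likely follow mesh record boundaries rather than raw bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Step 1 (hypothetical helper): the node-level reader loads the bytes
// [begin, end) of the input file into memory.
inline std::vector<char> readSlice(const std::string& path,
                                   std::uint64_t begin, std::uint64_t end) {
  if (end < begin) throw std::invalid_argument("end < begin");
  std::ifstream in(path, std::ios::binary);
  if (!in) throw std::runtime_error("cannot open " + path);
  std::vector<char> buf(static_cast<std::size_t>(end - begin));
  in.seekg(static_cast<std::streamoff>(begin));
  in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
  if (static_cast<std::size_t>(in.gcount()) != buf.size())
    throw std::runtime_error("short read from " + path);
  return buf;
}

// Step 2 (hypothetical helper): split the node's buffer into nPEs
// near-equal contiguous pieces; the first (size % nPEs) pieces get one
// extra byte.
inline std::vector<std::vector<char>>
splitAmongPEs(const std::vector<char>& buf, std::size_t nPEs) {
  if (nPEs == 0) throw std::invalid_argument("nPEs must be positive");
  std::vector<std::vector<char>> parts(nPEs);
  const std::size_t base = buf.size() / nPEs;
  const std::size_t rem = buf.size() % nPEs;
  std::size_t pos = 0;
  for (std::size_t p = 0; p < nPEs; ++p) {
    const std::size_t len = base + (p < rem ? 1 : 0);
    const auto first = buf.begin() + static_cast<std::ptrdiff_t>(pos);
    parts[p].assign(first, first + static_cast<std::ptrdiff_t>(len));
    pos += len;
  }
  return parts;
}
```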

What do you suggest? At this point I am interested in exploring both quick/dirty as well as good/longer-term solutions.

Thanks,
Jozsef





