Re: [charm] [ppl] Problem building charm++ on Intel platform


  • From: "Van Der Wijngaart, Rob F" <rob.f.van.der.wijngaart AT intel.com>
  • To: Nikhil Jain <nikhil.jain AT acm.org>
  • Cc: "charm AT cs.illinois.edu" <charm AT cs.illinois.edu>, David Kunzman <dmkunzman AT gmail.com>
  • Subject: Re: [charm] [ppl] Problem building charm++ on Intel platform
  • Date: Fri, 12 Sep 2014 22:09:12 +0000
  • Accept-language: en-US
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm/>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

Thanks, Nikhil. I just had a long conversation with Dave, who confirmed this
intent.

Rob

-----Original Message-----
From: nikhil.life AT gmail.com [mailto:nikhil.life AT gmail.com] On Behalf Of Nikhil Jain
Sent: Friday, September 12, 2014 1:13 PM
To: Van Der Wijngaart, Rob F
Cc: David Kunzman; charm AT cs.illinois.edu
Subject: Re: [charm] [ppl] Problem building charm++ on Intel platform

Yes, it needs to be the total number of worker threads (#nodes *
threads per node).

I agree that one would expect it to work the way you describe, but
changing historical artifacts is tough.

Mind you, if you do not use charmrun and launch your job using mpirun
directly, your intuitive values will work, i.e.

mpirun -np 1 ./jacobi2d 20 +ppn 4 +setcpuaffinity +pemap 0-3

will create 1 charm SMP node with 4 worker threads and 1 communication
thread, equivalent to

./charmrun +p4 ./jacobi2d 20 +ppn 4 +setcpuaffinity +pemap 0-3
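
For example (if I am reading the option handling right), scaling the
same setup to two SMP nodes with 4 worker threads each would be

mpirun -np 2 ./jacobi2d 20 +ppn 4

or, with charmrun,

./charmrun +p8 ./jacobi2d 20 +ppn 4

i.e. +p8 because 2 nodes * 4 worker threads per node = 8.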

--Nikhil

On Fri, Sep 12, 2014 at 3:03 PM, Van Der Wijngaart, Rob F <rob.f.van.der.wijngaart AT intel.com> wrote:
> We love nosiness!! Very helpful, indeed, Dave. Moving +ppn after the app
> name does the trick, though charmrun still complains when I say
>
> [rfvander@bar1 jacobi2d]$ ./charmrun +p1 ./jacobi2d 20 +ppn 4 +setcpuaffinity +pemap 0-3
>
> p = 1 should be a multiple of ppn = 4
>
> So it appears that in +pN, N should be the total number of threads, i.e. number
> of nodes * number of threads per node. Is that intended? I would expect N
> simply to be the number of nodes. Thanks.
>
>
>
> Rob
>
>
>
>
>
> From: David Kunzman [mailto:dmkunzman AT gmail.com]
> Sent: Friday, September 12, 2014 12:17 PM
> To: Van Der Wijngaart, Rob F
> Cc: Phil Miller; charm AT cs.illinois.edu
> Subject: Re: [ppl] [charm] Problem building charm++ on Intel platform
>
>
>
> Yeah, I'm nosey. :) I thought it might be helpful to know someone local who
> could help.
>
>
>
> Try putting the "+ppn" after the app name, so it's a parameter to the
> executable. As I recall, it's processed by the runtime on each node
> individually. I seem to remember that charmrun does this for you (binary
> versions), but as Phil pointed out, when you use MPI things are just passed
> off from charmrun to mpirun. If you do that, your command should be portable
> across builds. If I remember that correctly, perhaps it would be a good
> idea if the MPI build could be modified to recognize and move it as well (so
> parameters work the same way for all builds; and perhaps drop ++verbose if
> it is present).
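>
> (Concretely, with your jacobi2d run that would be something like
>
> ./charmrun +p4 ./jacobi2d 200 +ppn 4
>
> rather than "./charmrun +p4 +ppn4 ./jacobi2d 200", if I recall the
> option handling correctly.)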
>
>
>
> This is all based off my memory, which is at least several years old now...
> So I might have some details wrong.
>
>
>
> Thanks,
>
> Dave K
>
>
>
>
> On Sep 12, 2014, at 11:47 AM, "Van Der Wijngaart, Rob F" <rob.f.van.der.wijngaart AT intel.com> wrote:
>
> Thanks, Phil! In the meantime Dave Kunzman (at Intel now, and co-located
> with me in Santa Clara!) started butting in on a separate thread. I have now
> built it this way to enable multi-threading:
>
> ./build charm++ mpi-linux-x86_64 ifort mpicxx smp -j8
> -DMPICH_IGNORE_CXX_SEEK -DMPICH_SKIP_MPICXX
>
> Now funny things start to happen, even before I specify your additional
> flags:
>
> [rfvander@bar1 jacobi2d]$ ./charmrun +p1 +ppn4 ./jacobi2d 200
>
> p = 1 should be a multiple of ppn = 4
>
> [rfvander@bar1 jacobi2d]$ ./charmrun +p4 +ppn4 ./jacobi2d 200
>
>
>
> Running on 1 processors: +ppn4 ./jacobi2d 200
>
> charmrun> /usr/bin/setarch x86_64 -R mpirun -np 1 +ppn4 ./jacobi2d 200
>
> [proxy:0:0@bar1] HYDU_create_process (../../utils/launch/launch.c:590): execvp error on file +ppn4 (No such file or directory)
>
>
>
> The first reported error is funny. It looks like a bug. The second
> apparently comes about because +ppn4 is passed to mpirun verbatim, which
> thus chokes (using "+ppn 4" gives the same result). If I change it to "-ppn
> 4", mpirun is happy, but charm++ doesn't realize I am asking for multiple
> threads. I guess you cannot have multiple threads and run them, too.
>
>
>
> Rob
>
>
>
> From: unmobile AT gmail.com [mailto:unmobile AT gmail.com] On Behalf Of Phil Miller
> Sent: Friday, September 12, 2014 10:54 AM
> To: Van Der Wijngaart, Rob F
> Cc: charm AT cs.illinois.edu
> Subject: Re: [charm] Problem building charm++ on Intel platform
>
>
>
> Ooh, I see. I forgot you were on an MPI build of Charm++, rather than the
> IP-based network layer (net-* or netlrts-*) we normally use in development.
> The charmrun utility is different for each network layer, and I think only
> the net-* version of charmrun supports that option. On other layers, since
> it's not responsible for process launching, it has nothing to report.
>
> The other flags I mentioned should be more helpful, though.
>
>
>
>
>



--
Nikhil Jain,
nikhil.jain AT acm.org,
http://charm.cs.uiuc.edu/people/nikhil
Doctoral Candidate @ CS, UIUC




