
charm - Re: [charm] Trouble compiling charm++ on ANL-Theta

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: Phil Miller <mille121 AT illinois.edu>
  • To: Jeff Hammond <jeff.science AT gmail.com>
  • Cc: Brian Radak <brian.radak AT gmail.com>, Evan Ramos <evan AT hpccharm.com>, "Bak, Seonmyeong" <sbak5 AT illinois.edu>, "charm AT cs.illinois.edu" <charm AT cs.illinois.edu>
  • Subject: Re: [charm] Trouble compiling charm++ on ANL-Theta
  • Date: Tue, 5 Dec 2017 12:41:31 -0600
  • Authentication-results: illinois.edu; spf=softfail smtp.mailfrom=unmobile AT gmail.com

On Tue, Dec 5, 2017 at 12:38 PM, Jeff Hammond <jeff.science AT gmail.com> wrote:
I'm curious to know what benefit you expected from AVX-512 code generation inside of the Charm++ runtime.  Does Charm++ contain vectorizable floating-point?

It's mostly a matter of being able to compile the runtime system with the same compiler options as whatever application, rather than having to fiddle around and trim things back. It's an ease of use concern, much more than one of performance.
 

If binary size isn't an issue, you can just use -xCORE-AVX2 -axMIC-AVX512 to generate object code that will run on Haswell or later and specialize for KNL.

Jeff
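
A minimal sketch of how the flags suggested above could stand in for -xMIC-AVX512 in the build command quoted later in this thread. Passing them through ./build in the same position is an assumption and has not been tested on Theta. The snippet only prints the command so it can be inspected before running.

```shell
#!/bin/sh
# -x sets the baseline instruction set required by all generated code;
# -ax adds an alternate, auto-dispatched code path for the named instruction set.
ISA_FLAGS="-xCORE-AVX2 -axMIC-AVX512"

# Print the build command instead of running it (assumed flag placement).
build_cmd() {
    printf '%s\n' "./build charm++ gni-crayxc-persistent-smp -j8 --no-build-shared --with-production $ISA_FLAGS"
}

build_cmd
```

The trade-off is the one Jeff mentions: the object code carries both code paths, so binaries are larger.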

On Mon, Dec 4, 2017 at 2:39 PM, Brian Radak <brian.radak AT gmail.com> wrote:
Hi Evan,

Thanks for the update. I'll apply that patch and try compiling again later tonight when theta comes back from weekly maintenance. 

Brian 


On Dec 4, 2017 5:30 PM, "Evan Ramos" <evan AT hpccharm.com> wrote:
Hi Brian,

I've investigated further and I believe I've found and addressed the root cause of the issue.


-Evan

On Fri, Dec 1, 2017 at 1:13 PM, Evan Ramos <evan AT hpccharm.com> wrote:
Ah, it sounds like we'll need to figure out how to avoid including `-xMIC-AVX512` when compiling for the host.
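
One way to picture this: filter the target-only flag out of the flag list before building tools such as charmxi, which must run on the login node. The host_flags function below is a hypothetical illustration under that assumption, not the code in the actual patch.

```shell
#!/bin/sh
# Hypothetical helper: return the given compiler flags minus any -xMIC-* flag,
# so that host-side tools are not built with KNL-only instructions.
host_flags() {
    out=""
    for flag in "$@"; do
        case "$flag" in
            -xMIC-*) ;;                          # target-only flag: skip it
            *) out="${out:+$out }$flag" ;;       # keep everything else, space-separated
        esac
    done
    printf '%s\n' "$out"
}

# Example with flags taken from the compile lines in this thread.
host_flags -O2 -xMIC-AVX512 -U_FORTIFY_SOURCE
```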

On Fri, Dec 1, 2017 at 1:06 PM, Brian Radak <brian.radak AT gmail.com> wrote:
Hi Evan,

Thanks for spending time on this.
The new patch does seem to eliminate the unintended use of gcc, but now I'm getting the following different error:

gmake[1]: Leaving directory '/gpfs/mira-home/radak/Software/namd/charm/gni-crayxc-persistent-smp/tmp'
/usr/bin/gmake headerlinks
gmake[1]: Entering directory '/gpfs/mira-home/radak/Software/namd/charm/gni-crayxc-persistent-smp/tmp'
../bin/charmc -intrinsic -optimize -production  -xMIC-AVX512  ckcallback.ci && touch ckcallback.ci.stamp
../bin/charmc -intrinsic -optimize -production  -xMIC-AVX512  DummyLB.ci && touch DummyLB.ci.stamp
../bin/charmc -intrinsic -optimize -production  -xMIC-AVX512  GreedyLB.ci && touch GreedyLB.ci.stamp
../bin/charmc -intrinsic -optimize -production  -xMIC-AVX512  GreedyRefineLB.ci && touch GreedyRefineLB.ci.stamp
../bin/charmc -intrinsic -optimize -production  -xMIC-AVX512  CommLB.ci && touch CommLB.ci.stamp
../bin/charmc -intrinsic -optimize -production  -xMIC-AVX512  RandCentLB.ci && touch RandCentLB.ci.stamp
../bin/charmc -intrinsic -optimize -production  -xMIC-AVX512  RefineLB.ci && touch RefineLB.ci.stamp
../bin/charmc -intrinsic -optimize -production  -xMIC-AVX512  RefineCommLB.ci && touch RefineCommLB.ci.stamp
../bin/charmc: line 199: 108663 Illegal instruction     (core dumped) ../bin/charmxi -intrinsic -orig-file RandCentLB.ci
../bin/charmc: line 199: 108656 Illegal instruction     (core dumped) ../bin/charmxi -intrinsic -orig-file RefineCommLB.ci
../bin/charmc: line 199: 108644 Illegal instruction     (core dumped) ../bin/charmxi -intrinsic -orig-file ckcallback.ci
../bin/charmc: line 199: 108659 Illegal instruction     (core dumped) ../bin/charmxi -intrinsic -orig-file GreedyLB.ci
../bin/charmc: line 199: 108652 Illegal instruction     (core dumped) ../bin/charmxi -intrinsic -orig-file CommLB.ci
../bin/charmc: line 199: 108668 Illegal instruction     (core dumped) ../bin/charmxi -intrinsic -orig-file GreedyRefineLB.ci
Fatal Error by charmc in directory /home/radak/Software/namd/charm/gni-crayxc-persistent-smp/tmp
Fatal Error by charmc in directory /home/radak/Software/namd/charm/gni-crayxc-persistent-smp/tmp
Fatal Error by charmc in directory /home/radak/Software/namd/charm/gni-crayxc-persistent-smp/tmp
Fatal Error by charmc in directory /home/radak/Software/namd/charm/gni-crayxc-persistent-smp/tmp
   Command ../bin/charmxi -intrinsic -orig-file GreedyLB.ci returned error code 132
   Command ../bin/charmxi -intrinsic -orig-file ckcallback.ci returned error code 132
charmc exiting...

Brian
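
The error code 132 in the log above is consistent with the "Illegal instruction" messages: by shell convention, a status above 128 means the process was killed by signal (status - 128), and signal 4 is SIGILL on Linux. A small sketch for decoding such a status, where signal_from_status is a hypothetical helper name:

```shell
#!/bin/sh
# Decode a shell exit status into the name of the signal that ended the process.
# Prints "none" when the status does not indicate death by signal.
signal_from_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # kill -l <number> prints the signal name without the SIG prefix.
        kill -l "$((status - 128))"
    else
        printf '%s\n' "none"
    fi
}

signal_from_status 132   # prints ILL on Linux
```

This matches charmxi having been compiled with -xMIC-AVX512 and then executed on a host CPU that lacks those instructions.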


On Fri, Dec 1, 2017 at 1:50 PM, Evan Ramos <evan AT hpccharm.com> wrote:
Hi Brian,

In testing on Edison, I encountered the same error with the build command you provided. Please try the latest update to my patch, which resolves the problem with gni-crayxc on my end. https://charm.cs.illinois.edu/gerrit/3339

--
Evan Ramos
Software Engineer
Charmworks, Inc.

On Thu, Nov 30, 2017 at 7:56 PM, Brian Radak <brian.radak AT gmail.com> wrote:
We all have better things to do...

I can compile just fine with Intel 18.0 and the charm++ 6.8.2 release. Maybe there was some weird change to the PrgEnv-intel during one of the recent maintenance periods?

On Thu, Nov 30, 2017 at 2:01 PM, Brian Radak <brian.radak AT gmail.com> wrote:
Yes - this was all done from a fresh clone.

I'm going to go back to intel 16 and see if anything changes.


On Thu, Nov 30, 2017 at 1:25 PM, Phil Miller <mille121 AT illinois.edu> wrote:
Did you try removing the gni-crayxc-persistent-smp directory and re-building from scratch?

On Thu, Nov 30, 2017 at 12:23 PM, Brian Radak <brian.radak AT gmail.com> wrote:
Thanks for the fast response.

I checked out patch 3339:

$ git pull https://charm.cs.illinois.edu/gerrit/charm refs/changes/39/3339/2

(I also tried)
$ git fetch https://charm.cs.illinois.edu/gerrit/charm refs/changes/39/3339/2 && git checkout FETCH_HEAD

and made sure of my environment

$ module load PrgEnv-intel

but still seem to get the same error. charmrun is also not produced. Sorry if I'm making a naive mistake here.

Brian


On Thu, Nov 30, 2017 at 12:50 PM, Bak, Seonmyeong <sbak5 AT illinois.edu> wrote:
Oh, I missed some messages in the beginning. 
Charm++ is built correctly. The message doesn't mean that you failed to build Charm++.

Seonmyeong Bak


On Nov 30, 2017, at 11:47 AM, Bak, Seonmyeong <sbak5 AT illinois.edu> wrote:

Hello, Brian, 

It seems that you tried to build Charm++ with gcc, and -xMIC-AVX512 is a compile option that is only available with icc.
Also, on Cray machines we use the compiler wrappers 'cc'/'CC', which pick up the compiler from whichever 'PrgEnv-*' module is loaded.
Please check that you have loaded the 'PrgEnv-intel' module correctly and try to build Charm++ again.

Bak, Seonmyeong
Ph.D student | Department of Computer Science
University of Illinois at Urbana-Champaign

On Nov 30, 2017, at 11:33 AM, Brian Radak <brian.radak AT gmail.com> wrote:

Hello,

I recently pulled the latest charm++ in order to recompile NAMD on the Argonne Theta supercomputer - the prescribed process (from Jim Phillips) now fails.

The specific build command is
$ ./build charm++ gni-crayxc-persistent-smp -j8 --no-build-shared --with-production -xMIC-AVX512

We previously had issues with icc version 17, but this happens with 17, 16, and even a beta of 18. The particular (abridged) errors that stick out are:

gmake[1]: Entering directory 'charm/gni-crayxc-persistent-smp/tmp/topomanager'
gcc: error: language MIC-AVX512 not recognized
Fatal Error by charmc in directory charm/gni-crayxc-persistent-smp/tmp
   Command gcc -D_REENTRANT -I../bin/../include -D__CHARMC__=1 -U_FORTIFY_SOURCE -D_REENTRANT -I./../include -D__CHARMC__=1 -xMIC-AVX512 -O2 -U_FORTIFY_SOURCE -c conv-cpm.c -o conv-cpm.o returned error code 1

and later

global-elfgot.C(78): error: #error directive: "Global-elfgot won't work properly under smp version: -swapglobals disabled"
  #  error "Global-elfgot won't work properly under smp version: -swapglobals disabled"
compilation aborted for global-elfgot.C (code 2)
Fatal Error by charmc in directory /home/radak/Software/namd/charm/gni-crayxc-persistent-smp/tmp
   Command CC -D_REENTRANT -I../bin/../include -D__CHARMC__=1 -xMIC-AVX512 -O2 -U_FORTIFY_SOURCE -std=c++11 -D_REENTRANT -I./../include -D__CHARMC__=1 -xMIC-AVX512 -O2 -U_FORTIFY_SOURCE -c global-elfgot.C -o global-elfgot.o returned error code 2
charmc exiting...
Warning: building shared library is not supported, recompile charm++ with '--build-shared'.

Can you offer any suggestions? Should I go back to an older release just to be sure this is not an issue with updates on Theta? (I haven't recompiled for several months.)

Regards,
Brian Radak

former Theta ESP postdoc at ALCF
currently on staff at TCBG













