
charm - RE: [charm] libmpich-gnu linkage in libconv-util.so on Cray Titan

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system


RE: [charm] libmpich-gnu linkage in libconv-util.so on Cray Titan


  • From: Benjamin Welton <welton AT cs.wisc.edu>
  • To: Phil Miller <mille121 AT illinois.edu>
  • Cc: "charm AT lists.cs.illinois.edu" <charm AT lists.cs.illinois.edu>
  • Subject: RE: [charm] libmpich-gnu linkage in libconv-util.so on Cray Titan
  • Date: Sat, 10 Sep 2016 19:54:29 +0000

Hey Phil,

 

Thanks for the quick reply, and for confirming that libmpich is not supposed to be linked in. Its inclusion was pretty mystifying, and even a grep of the charm directory for “-lmpich” turned up nothing. I had a feeling it was something weird like a compiler wrapper issue, but I figured I'd check just to be safe (and get this onto the mailing list in case someone else runs into it). I am going to ask someone at Cray why this is happening, because if it's happening with Charm, it's only a matter of time before I run into it again with some other piece of software (or the software I maintain).

 

As for dynamic linking, I have a “toolkit” (I use this term very loosely since it’s a bit specific in nature) that collects performance/statistical information from applications. I have been collecting information from a wide variety of applications (up until now mainly MPI applications), and I wanted to branch out into a few Charm++ applications for this study. In the case of ChaNGa, my collection framework would like to be notified when CkWaitQD() [and a select set of other calls] has been called. With dynamic linkage, I can record this call without modifying the application or the framework. Mainly this is a time saver for me, in that I do not have to make massive changes to an existing collection framework (and all of its associated tools/etc.) to support Charm++ applications.

 

Ben

 

 

 

From: unmobile AT gmail.com [mailto:unmobile AT gmail.com] On Behalf Of Phil Miller
Sent: Saturday, September 10, 2016 1:12 PM
To: Benjamin Welton <welton AT cs.wisc.edu>
Cc: charm AT lists.cs.illinois.edu
Subject: Re: [charm] libmpich-gnu linkage in libconv-util.so on Cray Titan

 

Hi Ben,

My first guess at the cause of the behavior you've observed is simply that the Cray compiler wrapper is being somewhat presumptuous when asked to build shared libraries. Unloading the module while building Charm++ seems to be a perfectly reasonable workaround. I don't see why it should have been there in the first place. I'm not sure we can automate detection or removal of the offending module so that you and others don't stumble on this again in the future, but we'll look into it.

As for the reasons for dynamic linking, is it just because you're building to use CUDA? Or is there some additional interesting twist that we may be able to support better?

Phil

 

 

On Sat, Sep 10, 2016 at 1:03 PM, Benjamin Welton <welton AT cs.wisc.edu> wrote:

Hello All,

 

Recently, when compiling and running the application ChaNGa built with dynamic linkage* to Charm++ (git rev: 6e41e5), I ran into a segfault in libconv-util.so involving libmpich-gnu. I am building Charm++ with the following build line:

 

>  ./build ChaNGa gni-crayxe-cuda hugepages --with-production -j8 --build-shared

 

During the build process, libconv-util.so would be created and would have a dynamic link to libmpich_gnu_49.so (included in the default module cray-mpich/7.4.0). I was able to resolve the segfault by unloading the module prior to compilation of Charm++ (resulting in libconv-util.so not being linked with libmpich and ChaNGa launching/running correctly).
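
One way to confirm whether a freshly built libconv-util.so picked up the MPICH dependency is to list its DT_NEEDED entries. The following is only a minimal sketch in Python, assuming the GNU `readelf` tool is available on the build host. The helper names and the sample text are hypothetical and not part of Charm++ or the Cray toolchain; the sample merely imitates the layout that `readelf -d` prints.

```python
import re
import subprocess

# Matches the "(NEEDED)" rows printed by `readelf -d`, for example:
#  0x0000000000000001 (NEEDED)   Shared library: [libc.so.6]
_NEEDED = re.compile(r"\(NEEDED\)\s+Shared library:\s+\[([^\]]+)\]")


def needed_libraries(readelf_output):
    """Return the shared libraries named in DT_NEEDED entries, in order."""
    return _NEEDED.findall(readelf_output)


def mpich_dependencies(readelf_output):
    """Return only the needed libraries whose name starts with 'libmpich'."""
    return [lib for lib in needed_libraries(readelf_output)
            if lib.startswith("libmpich")]


def read_dynamic_section(path):
    """Run `readelf -d` on a shared object and return its text output.

    Hypothetical helper: assumes `readelf` is on PATH.
    """
    result = subprocess.run(["readelf", "-d", path],
                            capture_output=True, text=True, check=True)
    return result.stdout


# Illustrative sample only; not captured from a real Titan build.
SAMPLE = """\
 0x0000000000000001 (NEEDED)             Shared library: [libmpich_gnu_49.so]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000e (SONAME)             Library soname: [libconv-util.so]
"""

if __name__ == "__main__":
    print(mpich_dependencies(SAMPLE))
```

On a real build, passing the output of `read_dynamic_section` for the built library to `mpich_dependencies` should return an empty list once the cray-mpich module has been unloaded before compilation.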

 

I am wondering if libmpich was supposed to be linked at all with libconv-util.so. Since I am not building AMPI, I was a bit surprised to see this linked anywhere in this build. If libmpich was in fact supposed to be linked on this build and shouldn’t have been removed, what version(s) of cray-mpich (or another mpich library) are known to work with Charm++?  

 

Thanks for your time,

Ben

 

* I fully understand that building applications with dynamic linkage on Cray machines is generally not advised. There is a specific reason that I have modified ChaNGa to link dynamically rather than statically against Charm++.

 



