charm - Re: [charm] disable hwloc?

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: Jozsef Bakosi <jbakosi AT lanl.gov>
  • To: Evan Ramos <evan AT hpccharm.com>
  • Cc: charm <charm AT lists.cs.illinois.edu>
  • Subject: Re: [charm] disable hwloc?
  • Date: Tue, 8 May 2018 07:56:26 -0600

Hi Evan,

The fix below appears to work. Here is the output of running a Charm++
executable.

============================
$ export HWLOC_COMPONENTS_VERBOSE=1
$ Main/meshconv
Registered cpu discovery component `no_os' with priority 40 (statically build)
Registered global discovery component `xml' with priority 30 (statically build)
Registered global discovery component `synthetic' with priority 30 (statically build)
Registered global discovery component `custom' with priority 30 (statically build)
Registered cpu discovery component `linux' with priority 50 (statically build)
Registered cpu discovery component `x86' with priority 45 (statically build)
Enabling cpu discovery component `linux'
Enabling cpu discovery component `x86'
Enabling cpu discovery component `no_os'
Excluding global discovery component `xml', conflicts with excludes 0x2
Excluding global discovery component `synthetic', conflicts with excludes 0x2
Excluding global discovery component `custom', conflicts with excludes 0x2
Final list of enabled discovery components: linux,x86,no_os
Disabling cpu discovery component `linux'
Disabling cpu discovery component `x86'
Disabling cpu discovery component `no_os'
Registered cpu discovery component `no_os' with priority 40 (statically build)
Registered global discovery component `xml' with priority 30 (statically build)
Registered global discovery component `synthetic' with priority 30 (statically build)
Registered global discovery component `custom' with priority 30 (statically build)
Registered cpu discovery component `linux' with priority 50 (statically build)
Registered misc discovery component `linuxpci' with priority 19 (statically build)
Registered cpu discovery component `x86' with priority 45 (statically build)
Enabling cpu discovery component `linux'
Enabling cpu discovery component `x86'
Enabling cpu discovery component `no_os'
Excluding global discovery component `xml', conflicts with excludes 0x2
Excluding global discovery component `synthetic', conflicts with excludes 0x2
Excluding global discovery component `custom', conflicts with excludes 0x2
Enabling misc discovery component `linuxpci'
Final list of enabled discovery components: linux,x86,no_os,linuxpci
Charm++> Running on MPI version: 3.0
Charm++> level of thread support used: MPI_THREAD_SINGLE (desired: MPI_THREAD_SINGLE)
Charm++> Running in non-SMP mode: numPes 1
Converse/Charm++ Commit ID:
Warning> Randomization of virtual memory (ASLR) is turned on in the kernel, thread migration may not work! Run 'echo 0 > /proc/sys/kernel/randomize_va_space' as root to disable it, or try running with '+isomalloc_sync'.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (36-way SMP).
Charm++> cpu topology info is gathered in 0.000 seconds.

meshconv Command-line Parameters:
-h, --help             Display one-liner help on all command-line arguments
-H, --helpkw string    Display verbose help on a single keyword
-i, --input string     Specify the input file
-o, --output string    Specify the output file
-r, --reorder string   Reorder mesh nodes
-v, --verbose          Select verbose screen output
[Partition 0][Node 0] End of program
Disabling cpu discovery component `linux'
Disabling cpu discovery component `x86'
Disabling cpu discovery component `no_os'
Disabling misc discovery component `linuxpci'
============================

Will you commit this soon so that I can pull it from the 'charm' branch of the
GitHub mirror?

Thanks a lot,
Jozsef

On 05.07.2018 14:41, Evan Ramos wrote:
> Hi Jozsef,
>
> Please try the following:
>
> 1. Adding this to hwloc/include/hwloc/rename.h in place of the
> one-line change I suggested previously, and recompiling Charm++:
>
> #define hwloc_aix_component HWLOC_NAME(aix_component)
> #define hwloc_bgq_component HWLOC_NAME(bgq_component)
> #define hwloc_cuda_component HWLOC_NAME(cuda_component)
> #define hwloc_custom_component HWLOC_NAME(custom_component)
> #define hwloc_darwin_component HWLOC_NAME(darwin_component)
> #define hwloc_fake_component HWLOC_NAME(fake_component)
> #define hwloc_freebsd_component HWLOC_NAME(freebsd_component)
> #define hwloc_gl_component HWLOC_NAME(gl_component)
> #define hwloc_hpux_component HWLOC_NAME(hpux_component)
> #define hwloc_linux_component HWLOC_NAME(linux_component)
> #define hwloc_linuxpci_component HWLOC_NAME(linuxpci_component)
> #define hwloc_netbsd_component HWLOC_NAME(netbsd_component)
> #define hwloc_noos_component HWLOC_NAME(noos_component)
> #define hwloc_nvml_component HWLOC_NAME(nvml_component)
> #define hwloc_opencl_component HWLOC_NAME(opencl_component)
> #define hwloc_osf_component HWLOC_NAME(osf_component)
> #define hwloc_pci_component HWLOC_NAME(pci_component)
> #define hwloc_solaris_component HWLOC_NAME(solaris_component)
> #define hwloc_synthetic_component HWLOC_NAME(synthetic_component)
> #define hwloc_windows_component HWLOC_NAME(windows_component)
> #define hwloc_x86_component HWLOC_NAME(x86_component)
> #define hwloc_xml_libxml_component HWLOC_NAME(xml_libxml_component)
> #define hwloc_xml_nolibxml_component HWLOC_NAME(xml_nolibxml_component)
>
> 2. Running `export HWLOC_COMPONENTS_VERBOSE=1` before launching your binary.
>
> Regards,
> --
> Evan A. Ramos
> Software Engineer
> Charmworks, Inc.
>
>
> On Wed, May 2, 2018 at 12:31 AM, Jozsef Bakosi
> <jbakosi AT lanl.gov>
> wrote:
> > Hi Evan,
> >
> > With the patch below applied, linking succeeds, but I get a segfault at
> > runtime (running in serial):
> >
> > (gdb) run
> > Starting program: /home/quinoa/quinoa/build/Main/meshconv Main/meshconv
> > [New LWP 18798]
> >
> > Thread 1 "meshconv" received signal SIGSEGV, Segmentation fault.
> > 0x00007ffff79b4c10 in opal_hwloc191_hwloc_components_init ()
> > (gdb) where
> > #0 0x00007ffff79b4c10 in opal_hwloc191_hwloc_components_init ()
> > #1 0x00007ffff79a0077 in opal_hwloc191_hwloc_topology_init ()
> > #2 0x00007ffff7983174 in opal_hwloc_base_get_topology ()
> > #3 0x00007ffff774b686 in ompi_mpi_init ()
> > #4 0x00007ffff7760660 in PMPI_Init_thread ()
> > #5 0x00007ffff76beeee in LrtsInit (argc=0x7fffffffebdc, argv=0x7fffffffebd0, numNodes=0x7ffff7feb398 <_Cmi_numnodes>, myNodeID=0x7ffff7feb2e0 <_Cmi_mynode>) at machine.c:1440
> > #6 0x00007ffff76bd130 in ConverseInit (argc=2, argv=0x7fffffffeca8, fn=0x7ffff75c4b0b <_initCharm(int, char**)>, usched=0, initret=0) at machine-common-core.c:1286
> > #7 0x00007ffff75c2a11 in main (argc=2, argv=0x7fffffffeca8) at main.C:9
> >
> > Thanks for looking into this,
> > Jozsef
> >
> > On 05.01.2018 18:07, Evan Ramos wrote:
> >> It is not possible to disable hwloc, since we rely on it to query
> >> hardware topology and set affinities. We also cannot rely on whatever
> >> version may be linked into OpenMPI due to potential mismatches with
> >> our code. However, it looks like this issue may have a simple fix.
> >> Could you test this change:
> >>
> >>
> >> diff --git a/contrib/hwloc/include/hwloc/rename.h b/contrib/hwloc/include/hwloc/rename.h
> >> index 9a0c5fae5..39660f4d3 100644
> >> --- a/contrib/hwloc/include/hwloc/rename.h
> >> +++ b/contrib/hwloc/include/hwloc/rename.h
> >> @@ -489,6 +489,8 @@ extern "C" {
> >> #define hwloc_component_type_t HWLOC_NAME(component_type_t)
> >> #define hwloc_component HWLOC_NAME(component)
> >>
> >> +#define hwloc_linux_component HWLOC_NAME(linux_component)
> >> +
> >> #define hwloc_plugin_check_namespace HWLOC_NAME(plugin_check_namespace)
> >>
> >> #define hwloc_insert_object_by_cpuset HWLOC_NAME(insert_object_by_cpuset)
> >>
> >>
> >> If this resolves the issue, I will fix it in our tree and report it
> >> upstream so that this commit can be partially reverted:
> >> https://github.com/open-mpi/hwloc/commit/93abf09fee121c55b99f578d62e3ea21decdfbed
> >>
> >> Regards,
> >> --
> >> Evan A. Ramos
> >> Software Engineer
> >> Charmworks, Inc.
> >>
> >>
> >> On Tue, May 1, 2018 at 10:56 AM, Jozsef Bakosi
> >> <jbakosi AT lanl.gov>
> >> wrote:
> >> > Hi folks,
> >> >
> >> > Is it possible to disable hwloc in Charm++? I'm getting:
> >> >
> >> > /opt/openmpi/lib/libopen-pal.a(topology-linux.o):(.data.rel.ro.local+0x40): multiple definition of `hwloc_linux_component'
> >> > <charm-install-dir>/charm/bin/../lib/libhwloc_embedded.a(topology-linux.o):(.data.rel.ro.local+0x0): first defined here
> >> >
> >> > Thanks,
> >> > Jozsef
