charm - Re: [charm] disable hwloc?

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: Jozsef Bakosi <jbakosi AT lanl.gov>
  • To: Evan Ramos <evan AT hpccharm.com>
  • Cc: charm <charm AT lists.cs.illinois.edu>
  • Subject: Re: [charm] disable hwloc?
  • Date: Tue, 8 May 2018 10:53:48 -0600

Great! Thanks, Evan.
Jozsef

On 05.08.2018 11:47, Evan Ramos wrote:
> Hi Jozsef,
>
> I'm glad the fix worked. I am now in the process of pushing it to
> Charm++, as well as requesting to upstream it to hwloc:
>
> https://charm.cs.illinois.edu/gerrit/4145
>
> https://github.com/open-mpi/hwloc/pull/311
>
> Regards,
> --
> Evan A. Ramos
> Software Engineer
> Charmworks, Inc.
>
>
> On Tue, May 8, 2018 at 8:56 AM, Jozsef Bakosi <jbakosi AT lanl.gov> wrote:
> > Hi Evan,
> >
> > The fix below appears to work. Here is the output of running a Charm++
> > executable.
> >
> > ============================
> > $ export HWLOC_COMPONENTS_VERBOSE=1
> > $ Main/meshconv
> > Registered cpu discovery component `no_os' with priority 40 (statically build)
> > Registered global discovery component `xml' with priority 30 (statically build)
> > Registered global discovery component `synthetic' with priority 30 (statically build)
> > Registered global discovery component `custom' with priority 30 (statically build)
> > Registered cpu discovery component `linux' with priority 50 (statically build)
> > Registered cpu discovery component `x86' with priority 45 (statically build)
> > Enabling cpu discovery component `linux'
> > Enabling cpu discovery component `x86'
> > Enabling cpu discovery component `no_os'
> > Excluding global discovery component `xml', conflicts with excludes 0x2
> > Excluding global discovery component `synthetic', conflicts with excludes 0x2
> > Excluding global discovery component `custom', conflicts with excludes 0x2
> > Final list of enabled discovery components: linux,x86,no_os
> > Disabling cpu discovery component `linux'
> > Disabling cpu discovery component `x86'
> > Disabling cpu discovery component `no_os'
> > Registered cpu discovery component `no_os' with priority 40 (statically build)
> > Registered global discovery component `xml' with priority 30 (statically build)
> > Registered global discovery component `synthetic' with priority 30 (statically build)
> > Registered global discovery component `custom' with priority 30 (statically build)
> > Registered cpu discovery component `linux' with priority 50 (statically build)
> > Registered misc discovery component `linuxpci' with priority 19 (statically build)
> > Registered cpu discovery component `x86' with priority 45 (statically build)
> > Enabling cpu discovery component `linux'
> > Enabling cpu discovery component `x86'
> > Enabling cpu discovery component `no_os'
> > Excluding global discovery component `xml', conflicts with excludes 0x2
> > Excluding global discovery component `synthetic', conflicts with excludes 0x2
> > Excluding global discovery component `custom', conflicts with excludes 0x2
> > Enabling misc discovery component `linuxpci'
> > Final list of enabled discovery components: linux,x86,no_os,linuxpci
> > Charm++> Running on MPI version: 3.0
> > Charm++> level of thread support used: MPI_THREAD_SINGLE (desired: MPI_THREAD_SINGLE)
> > Charm++> Running in non-SMP mode: numPes 1
> > Converse/Charm++ Commit ID:
> > Warning> Randomization of virtual memory (ASLR) is turned on in the kernel, thread migration may not work! Run 'echo 0 > /proc/sys/kernel/randomize_va_space' as root to disable it, or try running with '+isomalloc_sync'.
> > CharmLB> Load balancer assumes all CPUs are same.
> > Charm++> Running on 1 unique compute nodes (36-way SMP).
> > Charm++> cpu topology info is gathered in 0.000 seconds.
> >
> > meshconv Command-line Parameters:
> >   -h, --help             Display one-liner help on all command-line arguments
> >   -H, --helpkw string    Display verbose help on a single keyword
> >   -i, --input string     Specify the input file
> >   -o, --output string    Specify the output file
> >   -r, --reorder string   Reorder mesh nodes
> >   -v, --verbose          Select verbose screen output
> > [Partition 0][Node 0] End of program
> > Disabling cpu discovery component `linux'
> > Disabling cpu discovery component `x86'
> > Disabling cpu discovery component `no_os'
> > Disabling misc discovery component `linuxpci'
> > ============================
> >
> > Will you commit this soon so that I can pull it from the GitHub mirror's
> > 'charm' branch?
> >
> > Thanks a lot,
> > Jozsef
> >
> > On 05.07.2018 14:41, Evan Ramos wrote:
> >> Hi Jozsef,
> >>
> >> Please try the following:
> >>
> >> 1. Adding this to hwloc/include/hwloc/rename.h in place of the
> >> one-line change I suggested previously, and recompiling Charm++:
> >>
> >> #define hwloc_aix_component HWLOC_NAME(aix_component)
> >> #define hwloc_bgq_component HWLOC_NAME(bgq_component)
> >> #define hwloc_cuda_component HWLOC_NAME(cuda_component)
> >> #define hwloc_custom_component HWLOC_NAME(custom_component)
> >> #define hwloc_darwin_component HWLOC_NAME(darwin_component)
> >> #define hwloc_fake_component HWLOC_NAME(fake_component)
> >> #define hwloc_freebsd_component HWLOC_NAME(freebsd_component)
> >> #define hwloc_gl_component HWLOC_NAME(gl_component)
> >> #define hwloc_hpux_component HWLOC_NAME(hpux_component)
> >> #define hwloc_linux_component HWLOC_NAME(linux_component)
> >> #define hwloc_linuxpci_component HWLOC_NAME(linuxpci_component)
> >> #define hwloc_netbsd_component HWLOC_NAME(netbsd_component)
> >> #define hwloc_noos_component HWLOC_NAME(noos_component)
> >> #define hwloc_nvml_component HWLOC_NAME(nvml_component)
> >> #define hwloc_opencl_component HWLOC_NAME(opencl_component)
> >> #define hwloc_osf_component HWLOC_NAME(osf_component)
> >> #define hwloc_pci_component HWLOC_NAME(pci_component)
> >> #define hwloc_solaris_component HWLOC_NAME(solaris_component)
> >> #define hwloc_synthetic_component HWLOC_NAME(synthetic_component)
> >> #define hwloc_windows_component HWLOC_NAME(windows_component)
> >> #define hwloc_x86_component HWLOC_NAME(x86_component)
> >> #define hwloc_xml_libxml_component HWLOC_NAME(xml_libxml_component)
> >> #define hwloc_xml_nolibxml_component HWLOC_NAME(xml_nolibxml_component)
> >>
> >> 2. Running `export HWLOC_COMPONENTS_VERBOSE=1` before your binary.
> >>
> >> Regards,
> >> --
> >> Evan A. Ramos
> >> Software Engineer
> >> Charmworks, Inc.
> >>
> >>
> >> On Wed, May 2, 2018 at 12:31 AM, Jozsef Bakosi <jbakosi AT lanl.gov> wrote:
> >> > Hi Evan,
> >> >
> >> > Applying the patch below allows linking fine, but I get a segfault at
> >> > runtime (running in serial):
> >> >
> >> > (gdb) run
> >> > Starting program: /home/quinoa/quinoa/build/Main/meshconv Main/meshconv
> >> > [New LWP 18798]
> >> >
> >> > Thread 1 "meshconv" received signal SIGSEGV, Segmentation fault.
> >> > 0x00007ffff79b4c10 in opal_hwloc191_hwloc_components_init ()
> >> > (gdb) where
> >> > #0 0x00007ffff79b4c10 in opal_hwloc191_hwloc_components_init ()
> >> > #1 0x00007ffff79a0077 in opal_hwloc191_hwloc_topology_init ()
> >> > #2 0x00007ffff7983174 in opal_hwloc_base_get_topology ()
> >> > #3 0x00007ffff774b686 in ompi_mpi_init ()
> >> > #4 0x00007ffff7760660 in PMPI_Init_thread ()
> >> > #5 0x00007ffff76beeee in LrtsInit (argc=0x7fffffffebdc, argv=0x7fffffffebd0, numNodes=0x7ffff7feb398 <_Cmi_numnodes>, myNodeID=0x7ffff7feb2e0 <_Cmi_mynode>) at machine.c:1440
> >> > #6 0x00007ffff76bd130 in ConverseInit (argc=2, argv=0x7fffffffeca8, fn=0x7ffff75c4b0b <_initCharm(int, char**)>, usched=0, initret=0) at machine-common-core.c:1286
> >> > #7 0x00007ffff75c2a11 in main (argc=2, argv=0x7fffffffeca8) at main.C:9
> >> >
> >> > Thanks for looking into this,
> >> > Jozsef
> >> >
> >> > On 05.01.2018 18:07, Evan Ramos wrote:
> >> >> It is not possible to disable hwloc, since we rely on it to query
> >> >> hardware topology and set affinities. We also cannot rely on whatever
> >> >> version may be linked into OpenMPI due to potential mismatches with
> >> >> our code. However, it looks like this issue may have a simple fix.
> >> >> Could you test this change:
> >> >>
> >> >>
> >> >> diff --git a/contrib/hwloc/include/hwloc/rename.h b/contrib/hwloc/include/hwloc/rename.h
> >> >> index 9a0c5fae5..39660f4d3 100644
> >> >> --- a/contrib/hwloc/include/hwloc/rename.h
> >> >> +++ b/contrib/hwloc/include/hwloc/rename.h
> >> >> @@ -489,6 +489,8 @@ extern "C" {
> >> >> #define hwloc_component_type_t HWLOC_NAME(component_type_t)
> >> >> #define hwloc_component HWLOC_NAME(component)
> >> >>
> >> >> +#define hwloc_linux_component HWLOC_NAME(linux_component)
> >> >> +
> >> >> #define hwloc_plugin_check_namespace HWLOC_NAME(plugin_check_namespace)
> >> >>
> >> >> #define hwloc_insert_object_by_cpuset HWLOC_NAME(insert_object_by_cpuset)
> >> >>
> >> >>
> >> >> If this resolves the issue, I will fix it in our tree and report it
> >> >> upstream so that this commit can be partially reverted:
> >> >> https://github.com/open-mpi/hwloc/commit/93abf09fee121c55b99f578d62e3ea21decdfbed
> >> >>
> >> >> Regards,
> >> >> --
> >> >> Evan A. Ramos
> >> >> Software Engineer
> >> >> Charmworks, Inc.
> >> >>
> >> >>
> >> >> On Tue, May 1, 2018 at 10:56 AM, Jozsef Bakosi <jbakosi AT lanl.gov> wrote:
> >> >> > Hi folks,
> >> >> >
> >> >> > Is it possible to disable hwloc in Charm++? I'm getting:
> >> >> >
> >> >> > /opt/openmpi/lib/libopen-pal.a(topology-linux.o):(.data.rel.ro.local+0x40): multiple definition of `hwloc_linux_component'
> >> >> > <charm-install-dir>/charm/bin/../lib/libhwloc_embedded.a(topology-linux.o):(.data.rel.ro.local+0x0): first defined here
> >> >> >
> >> >> > Thanks,
> >> >> > Jozsef


