charm - Re: [charm] [ppl] megatest fails for ibverbs and --with-production option (charm-6.4.0)

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: Gengbin Zheng <gzheng AT illinois.edu>
  • To: Thomas Albers <talbers AT binghamton.edu>
  • Cc: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>
  • Subject: Re: [charm] [ppl] megatest fails for ibverbs and --with-production option (charm-6.4.0)
  • Date: Fri, 30 Mar 2012 21:09:49 -0500
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

Hi,

This looks like an issue with the Charm/Converse user-level threads.
I notice your gcc version is very new, and I don't think we have
thoroughly tested against the latest gcc versions yet.
Can you try adding the following at link time:

-thread context

and see if it makes a difference?

Gengbin
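
For readers unfamiliar with the failure: in the quoted backtrace, setJcontext ends up in glibc's __longjmp_chk, the fortified longjmp, which aborts when the jump target does not look like a live frame on the current stack. A jump onto another user-level thread's stack can look exactly like that. The sketch below is not Charm++ code. It only illustrates, assuming a Linux/glibc system, how a user-level thread switch can be written with the POSIX ucontext calls (getcontext, makecontext, swapcontext) instead of setjmp/longjmp, which is presumably the kind of implementation that -thread context selects. The names worker, demo_switches and STACK_SIZE are made up for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)  /* hypothetical stack size for the sketch */

static ucontext_t main_ctx;    /* saved context of the scheduler (caller) */
static ucontext_t worker_ctx;  /* saved context of the user-level thread  */
static int steps_run;          /* number of steps the worker has executed */

/* Body of the user-level thread: run one step, then yield to the caller. */
static void worker(void)
{
    for (int i = 0; i < 3; i++) {
        steps_run++;
        swapcontext(&worker_ctx, &main_ctx);  /* yield back to the scheduler */
    }
    /* Returning from here resumes uc_link, i.e. main_ctx. */
}

/* Create one user-level thread on its own heap-allocated stack and
 * resume it until it finishes. Returns the number of steps it ran. */
int demo_switches(void)
{
    void *stack = malloc(STACK_SIZE);
    assert(stack != NULL);

    steps_run = 0;
    int rc = getcontext(&worker_ctx);         /* initialise the context */
    assert(rc == 0);
    worker_ctx.uc_stack.ss_sp = stack;        /* give it a separate stack */
    worker_ctx.uc_stack.ss_size = STACK_SIZE;
    worker_ctx.uc_link = &main_ctx;           /* where to go when worker returns */
    makecontext(&worker_ctx, worker, 0);

    /* Three resumes are answered by a yield; the fourth lets worker return. */
    for (int resumes = 0; resumes < 4; resumes++)
        swapcontext(&main_ctx, &worker_ctx);

    free(stack);                              /* worker has finished */
    return steps_run;
}
```

Calling demo_switches() from main switches onto the separate stack and back several times without going through longjmp, so the fortified longjmp check is never involved.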

On Fri, Mar 30, 2012 at 7:11 PM, Thomas Albers <talbers AT binghamton.edu> wrote:
> Hello!
>
> I am trying to build the recent release of NAMD (2.9b2) because the
> website offers only the Linux-x86_64-ibverbs-smp-CUDA version for
> download, not the Linux-x86_64-ibverbs-CUDA version that we need.
>
> However, I'm having trouble building the charm++ (6.4.0) that comes
> with it - when it's built with ibverbs and with the --with-production
> option the megatest test suite fails.  Output below.
>
> That may be the same bug from my message on 30 October 2011 that made
> it into charm-6.4.0.
>
> System information:
> ta@michelin
> ~ $ uname -a
> Linux michelin 2.6.39-gentoo-r3 #7 SMP Mon Oct 17 19:48:44 EDT 2011
> x86_64 AMD Phenom(tm) II X6 1090T Processor AuthenticAMD GNU/Linux
> ta@michelin
> ~ $ gcc --version
> gcc (Gentoo 4.5.3-r1 p1.0, pie-0.4.5) 4.5.3
>
> Please let me know if anything else is needed.
>
> Regards,
> Thomas
>
> ta@michelin
> ~/NAMD_2.9b2_Source/charm-6.4.0/net-linux-x86_64-ibverbs/tests/charm++/megatest $ ./charmrun +p8 pgm
> Charmrun> IBVERBS version of charmrun
> Charmrun> started all node programs in 3.283 seconds.
> Converse/Charm++ Commit ID: v6.4.0-beta1-0-g5776d21
> Charm++> scheduler running in netpoll mode.
> CharmLB> Load balancer assumes all CPUs are same.
> Charm++> Running on 2 unique compute nodes (6-way SMP).
> Charm++> cpu topology info is gathered in 0.011 seconds.
> Megatest is running on 8 nodes 8 processors.
> test 0: initiated [completion_test (phil)]
> Starting test
> Created detector, starting first detection
> Started first test
> Finished second test
> Started third test
> test 0: completed (0.00 sec)
> test 1: initiated [inlineem (phil)]
> test 1: completed (0.00 sec)
> test 2: initiated [callback (olawlor)]
> *** longjmp causes uninitialized stack frame ***: /home/ta/NAMD_2.9b2_Source/charm-6.4.0/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm terminated
> ======= Backtrace: =========
> /lib64/libc.so.6(__fortify_fail+0x37)[0x7f12339e4c07]
> /lib64/libc.so.6(+0xeab99)[0x7f12339e4b99]
> /lib64/libc.so.6(__longjmp_chk+0x33)[0x7f12339e4b03]
> /home/ta/NAMD_2.9b2_Source/charm-6.4.0/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm(setJcontext+0x31)[0x52ad78]
> /home/ta/NAMD_2.9b2_Source/charm-6.4.0/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm(swapJcontext+0x20)[0x52ae28]
> /home/ta/NAMD_2.9b2_Source/charm-6.4.0/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm(CthResume+0xc9)[0x52b027]
> /home/ta/NAMD_2.9b2_Source/charm-6.4.0/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm(CthResumeNormalThread+0x1f)[0x58b255]
> /home/ta/NAMD_2.9b2_Source/charm-6.4.0/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm(CmiHandleMessage+0x21)[0x58c455]
> /home/ta/NAMD_2.9b2_Source/charm-6.4.0/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm(CsdScheduleForever+0x4e)[0x58c5bf]
> /home/ta/NAMD_2.9b2_Source/charm-6.4.0/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm(CsdScheduler+0xd)[0x58c76a]
> /home/ta/NAMD_2.9b2_Source/charm-6.4.0/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm(ConverseInit+0x1160)[0x58afbc]
> /home/ta/NAMD_2.9b2_Source/charm-6.4.0/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm(main+0x2d)[0x53a7f1]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f123391be9d]
> /home/ta/NAMD_2.9b2_Source/charm-6.4.0/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm[0x4c0699]
> ======= Memory map: ========
> 00400000-00612000 r-xp 00000000 09:00 75965435
>  
> /home/ta/NAMD_2.9b2_Source/charm-6.4.0/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm
> 00812000-00813000 r--p 00212000 09:00 75965435
>  
> /home/ta/NAMD_2.9b2_Source/charm-6.4.0/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm
> 00813000-00824000 rw-p 00213000 09:00 75965435
>  
> /home/ta/NAMD_2.9b2_Source/charm-6.4.0/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm
> 00824000-00843000 rw-p 00000000 00:00 0
> 026ce000-029d9000 rw-p 00000000 00:00 0                                  
> [heap]
> 7f12306f1000-7f1230704000 r-xp 00000000 08:03 9707836
>  /lib64/libresolv-2.12.2.so
> 7f1230704000-7f1230904000 ---p 00013000 08:03 9707836
>  /lib64/libresolv-2.12.2.so
> 7f1230904000-7f1230905000 r--p 00013000 08:03 9707836
>  /lib64/libresolv-2.12.2.so
> 7f1230905000-7f1230906000 rw-p 00014000 08:03 9707836
>  /lib64/libresolv-2.12.2.so
> 7f1230906000-7f1230908000 rw-p 00000000 00:00 0
> 7f1230908000-7f123090d000 r-xp 00000000 08:03 9707609
>  /lib64/libnss_dns-2.12.2.so
> 7f123090d000-7f1230b0c000 ---p 00005000 08:03 9707609
>  /lib64/libnss_dns-2.12.2.so
> 7f1230b0c000-7f1230b0d000 r--p 00004000 08:03 9707609
>  /lib64/libnss_dns-2.12.2.so
> 7f1230b0d000-7f1230b0e000 rw-p 00005000 08:03 9707609
>  /lib64/libnss_dns-2.12.2.so
> 7f1230b0e000-7f1230b1a000 r-xp 00000000 08:03 9707825
>  /lib64/libnss_files-2.12.2.so
> 7f1230b1a000-7f1230d19000 ---p 0000c000 08:03 9707825
>  /lib64/libnss_files-2.12.2.so
> 7f1230d19000-7f1230d1a000 r--p 0000b000 08:03 9707825
>  /lib64/libnss_files-2.12.2.so
> 7f1230d1a000-7f1230d1b000 rw-p 0000c000 08:03 9707825
>  /lib64/libnss_files-2.12.2.so
> 7f1230d28000-7f12334d4000 rw-p 00000000 00:00 0
> 7f12334d4000-7f12334dc000 r-xp 00000000 08:03 344761
>  /usr/lib64/libmthca-rdmav2.so
> 7f12334dc000-7f12336db000 ---p 00008000 08:03 344761
>  /usr/lib64/libmthca-rdmav2.so
> 7f12336db000-7f12336dc000 r--p 00007000 08:03 344761
>  /usr/lib64/libmthca-rdmav2.so
> 7f12336dc000-7f12336dd000 rw-p 00008000 08:03 344761
>  /usr/lib64/libmthca-rdmav2.so
> 7f12336dd000-7f12336f5000 r-xp 00000000 08:03 9707538
>  /lib64/libpthread-2.12.2.so
> 7f12336f5000-7f12338f4000 ---p 00018000 08:03 9707538
>  /lib64/libpthread-2.12.2.so
> 7f12338f4000-7f12338f5000 r--p 00017000 08:03 9707538
>  /lib64/libpthread-2.12.2.so
> 7f12338f5000-7f12338f6000 rw-p 00018000 08:03 9707538
>  /lib64/libpthread-2.12.2.so
> 7f12338f6000-7f12338fa000 rw-p 00000000 00:00 0
> 7f12338fa000-7f1233a5c000 r-xp 00000000 08:03 9707551
>  /lib64/libc-2.12.2.so
> 7f1233a5c000-7f1233c5b000 ---p 00162000 08:03 9707551
>  /lib64/libc-2.12.2.so
> 7f1233c5b000-7f1233c5f000 r--p 00161000 08:03 9707551
>  /lib64/libc-2.12.2.so
> 7f1233c5f000-7f1233c60000 rw-p 00165000 08:03 9707551
>  /lib64/libc-2.12.2.so
> 7f1233c60000-7f1233c65000 rw-p 00000000 00:00 0
> 7f1233c65000-7f1233c7a000 r-xp 00000000 08:03 9707601
>  /lib64/libgcc_s.so.1
> 7f1233c7a000-7f1233e79000 ---p 00015000 08:03 9707601
>  /lib64/libgcc_s.so.1
> 7f1233e79000-7f1233e7a000 r--p 00014000 08:03 9707601
>  /lib64/libgcc_s.so.1
> 7f1233e7a000-7f1233e7b000 rw-p 00015000 08:03 9707601
>  /lib64/libgcc_s.so.1
> 7f1233e7b000-7f1233efc000 r-xp 00000000 08:03 9707527
>  /lib64/libm-2.12.2.so
> 7f1233efc000-7f12340fb000 ---p 00081000 08:03 9707527
>  /lib64/libm-2.12.2.so
> 7f12340fb000-7f12340fc000 r--p 00080000 08:03 9707527
>  /lib64/libm-2.12.2.so
> 7f12340fc000-7f12340fd000 rw-p 00081000 08:03 9707527
>  /lib64/libm-2.12.2.so
> 7f12340fd000-7f12341e9000 r-xp 00000000 08:03 15385417
>  /usr/lib64/gcc/x86_64-pc-linux-gnu/4.5.3/libstdc++.so.6.0.14
> 7f12341e9000-7f12343e8000 ---p 000ec000 08:03 15385417
>  /usr/lib64/gcc/x86_64-pc-linux-gnu/4.5.3/libstdc++.so.6.0.14
> 7f12343e8000-7f12343f0000 r--p 000eb000 08:03 15385417
>  /usr/lib64/gcc/x86_64-pc-linux-gnu/4.5.3/libstdc++.so.6.0.14
> 7f12343f0000-7f12343f2000 rw-p 000f3000 08:03 15385417
>  /usr/lib64/gcc/x86_64-pc-linux-gnu/4.5.3/libstdc++.so.6.0.14
> 7f12343f2000-7f1234407000 rw-p 00000000 00:00 0
> 7f1234407000-7f1234409000 r-xp 00000000 08:03 9707829
>  /lib64/libdl-2.12.2.so
> 7f1234409000-7f1234609000 ---p 00002000 08:03 9707829
>  /lib64/libdl-2.12.2.so
> 7f1234609000-7f123460a000 r--p 00002000 08:03 9707829
>  /lib64/libdl-2.12.2.so
> 7f123460a000-7f123460b000 rw-p 00003000 08:03 9707829
>  /lib64/libdl-2.12.2.so
> 7f123460b000-7f1234619000 r-xp 00000000 08:03 344819
>  /usr/lib64/libibverbs.so.1.0.0
> 7f1234619000-7f1234818000 ---p 0000e000 08:03 344819
>  /usr/lib64/libibverbs.so.1.0.0
> 7f1234818000-7f1234819000 r--p 0000d000 08:03 344819
>  /usr/lib64/libibverbs.so.1.0.0
> 7f1234819000-7f123481a000 rw-p 0000e000 08:03 344819
>  /usr/lib64/libibverbs.so.1.0.0
> 7f123481a000-7f1234838000 r-xp 00000000 08:03 9707833
>  /lib64/ld-2.12.2.so
> 7f123483a000-7f1234a29000 rw-p 00000000 00:00 0
> 7f1234a35000-7f1234a36000 -w-s d3804000 00:10 1201
>  /dev/infiniband/uverbs0
> 7f1234a36000-7f1234a37000 rw-p 00000000 00:00 0
> 7f1234a37000-7f1234a38000 r--p 0001d000 08:03 9707833
>  /lib64/ld-2.12.2.so
> 7f1234a38000-7f1234a39000 rw-p 0001e000 08:03 9707833
>  /lib64/ld-2.12.2.so
> 7f1234a39000-7f1234a3a000 rw-p 00000000 00:00 0
> 7fffa851d000-7fffa853e000 rw-p 00000000 00:00 0                          
> [stack]
> 7fffa857a000-7fffa857b000 r-xp 00000000 00:00 0                          
> [vdso]
> ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0
>  [vsyscall]
> ------------- Processor 0 Exiting: Caught Signal ------------
> Signal: unknown signal
> [0] Stack Traceback:
>  [0:0] +0x35960  [0x7f123392f960]
>  [0:1] gsignal+0x35  [0x7f123392f8e5]
>  [0:2] abort+0x186  [0x7f1233930d66]
>  [0:3] +0x70483  [0x7f123396a483]
>  [0:4] __fortify_fail+0x37  [0x7f12339e4c07]
>  [0:5] +0xeab99  [0x7f12339e4b99]
>  [0:6] __longjmp_chk+0x33  [0x7f12339e4b03]
>  [0:7] setJcontext+0x31  [0x52ad78]
>  [0:8] swapJcontext+0x20  [0x52ae28]
>  [9] Charm++ Runtime: Resumed thread (CthResume+0xc9  [0x52b027])
> Fatal error on PE 0> unknown signal
> _______________________________________________
> charm mailing list
> charm AT cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/charm
> _______________________________________________
> ppl mailing list
> ppl AT cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/ppl




