

[charm] megatest fails for ibverbs and --with-production option


  • From: Thomas Albers <talbers AT binghamton.edu>
  • To: charm AT cs.uiuc.edu
  • Subject: [charm] megatest fails for ibverbs and --with-production option
  • Date: Sun, 30 Oct 2011 23:24:19 -0400
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

Hello!

When charm++ is built with ibverbs and with the --with-production
option, like so
ta@michelin ~/NAMD_2.8_Source/charm-6.3.2 $ ./build charm++ net-linux-x86_64 ibverbs --with-production
the megatest suite fails. Please see the output below.

This is on a recent Gentoo Linux system with gcc.
ta@michelin ~ $ uname -a
Linux michelin 2.6.39-gentoo-r3 #7 SMP Mon Oct 17 19:48:44 EDT 2011 x86_64 AMD Phenom(tm) II X6 1090T Processor AuthenticAMD GNU/Linux
ta@michelin ~ $ gcc --version
gcc (Gentoo 4.5.3-r1 p1.0, pie-0.4.5) 4.5.3

When charm++ is built without --with-production the test suite
completes uneventfully. The net-linux-x86_64 (without ibverbs) version
works fine as well, both with and without --with-production.

Is this more likely to be a compiler bug, or a bug within Charm?
Please let me know if any further information is needed.

Regards,
Thomas



ta@michelin ~/NAMD_2.8_Source/charm-6.3.2/net-linux-x86_64-ibverbs/tests/charm++/megatest $ ./charmrun +p4 pgm
Charmrun> IBVERBS version of charmrun
Charmrun> started all node programs in 1.251 seconds.
Charm++> scheduler running in netpoll mode.
Charm++> Running on 4 unique compute nodes (6-way SMP).
Charm++> cpu topology info is gathered in 0.001 seconds.
Megatest is running on 4 nodes 4 processors.
test 0: initiated [completion_test (phil)]
Starting test
Created detector, starting first detection
Started first test
Finished second test
Started third test
test 0: completed (0.00 sec)
test 1: initiated [inlineem (phil)]
test 1: completed (0.00 sec)
test 2: initiated [callback (olawlor)]
*** longjmp causes uninitialized stack frame ***: /home/ta/NAMD_2.8_Source/charm-6.3.2/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm terminated
======= Backtrace: =========
/lib64/libc.so.6(__fortify_fail+0x37)[0x7f9a2b271c07]
/lib64/libc.so.6(+0xeab99)[0x7f9a2b271b99]
/lib64/libc.so.6(__longjmp_chk+0x33)[0x7f9a2b271b03]
/home/ta/NAMD_2.8_Source/charm-6.3.2/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm(setJcontext+0x31)[0x51567a]
/home/ta/NAMD_2.8_Source/charm-6.3.2/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm(swapJcontext+0x20)[0x51572a]
/home/ta/NAMD_2.8_Source/charm-6.3.2/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm(CthResume+0xc9)[0x515929]
/home/ta/NAMD_2.8_Source/charm-6.3.2/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm(CthResumeNormalThread+0x1f)[0x571155]
/home/ta/NAMD_2.8_Source/charm-6.3.2/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm(CmiHandleMessage+0x21)[0x5722f8]
/home/ta/NAMD_2.8_Source/charm-6.3.2/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm(CsdScheduleForever+0x4e)[0x57245c]
/home/ta/NAMD_2.8_Source/charm-6.3.2/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm(CsdScheduler+0xd)[0x57260a]
/home/ta/NAMD_2.8_Source/charm-6.3.2/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm(ConverseInit+0x1153)[0x570eb7]
/home/ta/NAMD_2.8_Source/charm-6.3.2/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm(main+0x2d)[0x524da1]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f9a2b1a8e9d]
/home/ta/NAMD_2.8_Source/charm-6.3.2/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm[0x4b2e89]
======= Memory map: ========
00400000-005f0000 r-xp 00000000 09:00 70035716 /home/ta/NAMD_2.8_Source/charm-6.3.2/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm
007ef000-007f0000 r--p 001ef000 09:00 70035716 /home/ta/NAMD_2.8_Source/charm-6.3.2/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm
007f0000-00801000 rw-p 001f0000 09:00 70035716 /home/ta/NAMD_2.8_Source/charm-6.3.2/net-linux-x86_64-ibverbs/tests/charm++/megatest/pgm
00801000-00820000 rw-p 00000000 00:00 0
01941000-01c08000 rw-p 00000000 00:00 0 [heap]
7f9a285a6000-7f9a285b2000 r-xp 00000000 08:03 9707825 /lib64/libnss_files-2.12.2.so
7f9a285b2000-7f9a287b1000 ---p 0000c000 08:03 9707825 /lib64/libnss_files-2.12.2.so
7f9a287b1000-7f9a287b2000 r--p 0000b000 08:03 9707825 /lib64/libnss_files-2.12.2.so
7f9a287b2000-7f9a287b3000 rw-p 0000c000 08:03 9707825 /lib64/libnss_files-2.12.2.so
7f9a287bd000-7f9a2ad61000 rw-p 00000000 00:00 0
7f9a2ad61000-7f9a2ad69000 r-xp 00000000 08:03 344761 /usr/lib64/libmthca-rdmav2.so
7f9a2ad69000-7f9a2af68000 ---p 00008000 08:03 344761 /usr/lib64/libmthca-rdmav2.so
7f9a2af68000-7f9a2af69000 r--p 00007000 08:03 344761 /usr/lib64/libmthca-rdmav2.so
7f9a2af69000-7f9a2af6a000 rw-p 00008000 08:03 344761 /usr/lib64/libmthca-rdmav2.so
7f9a2af6a000-7f9a2af82000 r-xp 00000000 08:03 9707538 /lib64/libpthread-2.12.2.so
7f9a2af82000-7f9a2b181000 ---p 00018000 08:03 9707538 /lib64/libpthread-2.12.2.so
7f9a2b181000-7f9a2b182000 r--p 00017000 08:03 9707538 /lib64/libpthread-2.12.2.so
7f9a2b182000-7f9a2b183000 rw-p 00018000 08:03 9707538 /lib64/libpthread-2.12.2.so
7f9a2b183000-7f9a2b187000 rw-p 00000000 00:00 0
7f9a2b187000-7f9a2b2e9000 r-xp 00000000 08:03 9707551 /lib64/libc-2.12.2.so
7f9a2b2e9000-7f9a2b4e8000 ---p 00162000 08:03 9707551 /lib64/libc-2.12.2.so
7f9a2b4e8000-7f9a2b4ec000 r--p 00161000 08:03 9707551 /lib64/libc-2.12.2.so
7f9a2b4ec000-7f9a2b4ed000 rw-p 00165000 08:03 9707551 /lib64/libc-2.12.2.so
7f9a2b4ed000-7f9a2b4f2000 rw-p 00000000 00:00 0
7f9a2b4f2000-7f9a2b507000 r-xp 00000000 08:03 9707530 /lib64/libgcc_s.so.1
7f9a2b507000-7f9a2b706000 ---p 00015000 08:03 9707530 /lib64/libgcc_s.so.1
7f9a2b706000-7f9a2b707000 r--p 00014000 08:03 9707530 /lib64/libgcc_s.so.1
7f9a2b707000-7f9a2b708000 rw-p 00015000 08:03 9707530 /lib64/libgcc_s.so.1
7f9a2b708000-7f9a2b789000 r-xp 00000000 08:03 9707527 /lib64/libm-2.12.2.so
7f9a2b789000-7f9a2b988000 ---p 00081000 08:03 9707527 /lib64/libm-2.12.2.so
7f9a2b988000-7f9a2b989000 r--p 00080000 08:03 9707527 /lib64/libm-2.12.2.so
7f9a2b989000-7f9a2b98a000 rw-p 00081000 08:03 9707527 /lib64/libm-2.12.2.so
7f9a2b98a000-7f9a2ba76000 r-xp 00000000 08:03 15385417 /usr/lib64/gcc/x86_64-pc-linux-gnu/4.5.3/libstdc++.so.6.0.14
7f9a2ba76000-7f9a2bc75000 ---p 000ec000 08:03 15385417 /usr/lib64/gcc/x86_64-pc-linux-gnu/4.5.3/libstdc++.so.6.0.14
7f9a2bc75000-7f9a2bc7d000 r--p 000eb000 08:03 15385417 /usr/lib64/gcc/x86_64-pc-linux-gnu/4.5.3/libstdc++.so.6.0.14
7f9a2bc7d000-7f9a2bc7f000 rw-p 000f3000 08:03 15385417 /usr/lib64/gcc/x86_64-pc-linux-gnu/4.5.3/libstdc++.so.6.0.14
7f9a2bc7f000-7f9a2bc94000 rw-p 00000000 00:00 0
7f9a2bc94000-7f9a2bc96000 r-xp 00000000 08:03 9707829 /lib64/libdl-2.12.2.so
7f9a2bc96000-7f9a2be96000 ---p 00002000 08:03 9707829 /lib64/libdl-2.12.2.so
7f9a2be96000-7f9a2be97000 r--p 00002000 08:03 9707829 /lib64/libdl-2.12.2.so
7f9a2be97000-7f9a2be98000 rw-p 00003000 08:03 9707829 /lib64/libdl-2.12.2.so
7f9a2be98000-7f9a2bea6000 r-xp 00000000 08:03 344819 /usr/lib64/libibverbs.so.1.0.0
7f9a2bea6000-7f9a2c0a5000 ---p 0000e000 08:03 344819 /usr/lib64/libibverbs.so.1.0.0
7f9a2c0a5000-7f9a2c0a6000 r--p 0000d000 08:03 344819 /usr/lib64/libibverbs.so.1.0.0
7f9a2c0a6000-7f9a2c0a7000 rw-p 0000e000 08:03 344819 /usr/lib64/libibverbs.so.1.0.0
7f9a2c0a7000-7f9a2c0c5000 r-xp 00000000 08:03 9707833 /lib64/ld-2.12.2.so
7f9a2c0ca000-7f9a2c2b9000 rw-p 00000000 00:00 0
7f9a2c2c2000-7f9a2c2c3000 -w-s d3803000 00:10 4351 /dev/infiniband/uverbs0
7f9a2c2c3000-7f9a2c2c4000 rw-p 00000000 00:00 0
7f9a2c2c4000-7f9a2c2c5000 r--p 0001d000 08:03 9707833 /lib64/ld-2.12.2.so
7f9a2c2c5000-7f9a2c2c6000 rw-p 0001e000 08:03 9707833 /lib64/ld-2.12.2.so
7f9a2c2c6000-7f9a2c2c7000 rw-p 00000000 00:00 0
7fffa8110000-7fffa8131000 rw-p 00000000 00:00 0 [stack]
7fffa81ff000-7fffa8200000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
------------- Processor 0 Exiting: Caught Signal ------------
Signal: unknown signal
[0] Stack Traceback:
[0:0] +0x35960 [0x7f9a2b1bc960]
[0:1] gsignal+0x35 [0x7f9a2b1bc8e5]
[0:2] abort+0x186 [0x7f9a2b1bdd66]
[0:3] +0x70483 [0x7f9a2b1f7483]
[0:4] __fortify_fail+0x37 [0x7f9a2b271c07]
[0:5] +0xeab99 [0x7f9a2b271b99]
[0:6] __longjmp_chk+0x33 [0x7f9a2b271b03]
[0:7] setJcontext+0x31 [0x51567a]
[0:8] swapJcontext+0x20 [0x51572a]
[9] Charm++ Runtime: Resumed thread (CthResume+0xc9 [0x515929])
Fatal error on PE 0> unknown signal



