
charm - Re: [charm] Charm++ tests failure

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: Gengbin Zheng <zhenggb AT gmail.com>
  • To: pellegrini <pellegrini AT ill.fr>
  • Cc: Abhinav S Bhatele <bhatele AT illinois.edu>, Mark JOHNSON <johnson AT ill.fr>, charm AT cs.uiuc.edu, mudingay AT ill.fr
  • Subject: Re: [charm] Charm++ tests failure
  • Date: Tue, 15 Jun 2010 08:13:13 -0500
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

For some reason, the error shows that some part of your charm is still
compiled with ibverbs (note the "IBVERBS version of charmrun" line in
your output).
You should probably do a fresh build of charm.
You don't need MPICXX to be defined to build a net version of charm.
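
One quick way to see which flavor of charmrun a test directory actually picked up is to search the binary for the banner it prints. This is only a minimal sketch: the function name charmrun_flavor is made up for illustration, and it assumes the banner text from the log ("IBVERBS version of charmrun") is stored as a plain string inside the executable.

```shell
# Hypothetical helper, not part of Charm++.
# Assumption: an ibverbs charmrun embeds the banner it prints at startup.
charmrun_flavor() {
    bin="$1"
    if [ ! -f "$bin" ]; then
        echo "missing"
        return 2
    fi
    # -F matches the banner as a fixed string, -q suppresses output
    if grep -F -q "IBVERBS version of charmrun" "$bin"; then
        echo "ibverbs"
    else
        echo "other"
    fi
}
```

For example, running charmrun_flavor ./charmrun in the megatest directory and getting "ibverbs" back after a net-only build would suggest that an old binary is still in use.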

When you have multiple charm builds and you cd to a test directory,
make sure you use the one inside the right build directory, i.e.

cd net-linux-x86_64-pgf90-pgcc/tests/charm++/megatest

instead of

cd tests/charm++/megatest

to avoid mixing different builds of charm.
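
The same idea can be written as a small shell function that resolves the test directory inside one named build tree and refuses to continue if that tree does not exist. This is a sketch under assumptions: megatest_dir is a made-up name, and the build directory name passed to it is whatever ./build created on your machine.

```shell
# Hypothetical helper, not part of Charm++.
# Prints the megatest directory inside one specific build tree, so that
# make and charmrun run there use that build's binaries.
megatest_dir() {
    charm_root="$1"   # e.g. the charm-6.1.3 source directory
    build_name="$2"   # e.g. net-linux-x86_64-pgf90-pgcc (assumed name)
    dir="$charm_root/$build_name/tests/charm++/megatest"
    if [ -d "$dir" ]; then
        echo "$dir"
    else
        echo "no such build test directory: $dir" >&2
        return 1
    fi
}
```

Typical use would be something like: cd "$(megatest_dir . net-linux-x86_64-pgf90-pgcc)" && make clean && make pgm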

Gengbin

On Tue, Jun 15, 2010 at 5:32 AM, pellegrini <pellegrini AT ill.fr> wrote:
> Hi Abhinav,
>
> I tried to rebuild charm++ without the ibverbs flag, this time using
> the following instructions:
>
> 1 cd charm-6.1.3
> 2 env MPICXX=mpicxx ./build charm++ net-linux-x86_64 pgcc
> --basedir=/usr/pgi/linux86-64/2010/mpi/mvapich --no-build-shared pgf90
> --no-build-shared -j32 -O2 -DCMK_OTIMIZE
> 3 cd tests/charm++/megatests
> 4 make clean
> 5 make pgm
> 6 charmrun ++local ++verbose ./pgm
>
> now we get the following error:
>
> master charm++/megatest> charmrun ++local ++verbose ./pgm
> Charmrun> charmrun started...
> Charmrun> adding client 0: "127.0.0.1", IP:127.0.0.1
> Charmrun> Charmrun = 127.0.0.1, port = 59250
> Charmrun> IBVERBS version of charmrun
> Charmrun> start 0 node program on localhost.
> Charmrun> node programs all started
> Charmrun> Waiting for 0-th client to connect.
> charmrun: charmrun.c:1958: req_one_client_partinit: Assertion
> `(__extension__ (__builtin_constant_p (8) && ((__builtin_constant_p
> (partStartMsg.header.type) && strlen (partStartMsg.header.type) <
> ((size_t) (8))) || (__builtin_constant_p ("partinit") && strlen
> ("partinit") < ((size_t) (8)))) ? __extension__ ({ size_t __s1_len,
> __s2_len; (__builtin_constant_p (partStartMsg.header.type) &&
> __builtin_constant_p ("partinit") && (__s1_len = strlen
> (partStartMsg.header.type), __s2_len = strlen ("partinit"),
> (!((size_t)(const void *)((partStartMsg.header.type) + 1) -
> (size_t)(const void *)(partStartMsg.header.type) == 1) || __s1_len >= 4)
> && (!((size_t)(const void *)(("partinit") + 1) - (size_t)(const void
> *)("partinit") == 1) || __s2_len >= 4)) ? __builtin_strcmp
> (partStartMsg.header.type, "partinit") : (__builtin_constant_p
> (partStartMsg.header.type) && ((size_t)(const void
> *)((partStartMsg.header.type) + 1) - (size_t)(const void
> *)(partStartMsg.header.type) == 1) && (__s1_len = strlen
> (partStartMsg.header.type), __s1_len < 4) ? (__builtin_constant_p
> ("partinit") && ((size_t)(const void *)(("partinit") + 1) -
> (size_t)(const void *)("partinit") == 1) ? __builtin_strcmp
> (partStartMsg.header.type, "partinit") : (__extension__ ({ __const
> unsigned char *__s2 = (__const unsigned char *) (__const char *)
> ("partinit"); register int __result = (((__const unsigned char *)
> (__const char *) (partStartMsg.header.type))[0] - __s2[0]); if (__s1_len
>  > 0 && __result == 0) { __result = (((__const unsigned char *) (__const
> char *) (partStartMsg.header.type))[1] - __s2[1]); if (__s1_len > 1 &&
> __result == 0) { __result = (((__const unsigned char *) (__const char *)
> (partStartMsg.header.type))[2] - __s2[2]); if (__s1_len > 2 && __result
> == 0) __result = (((__const unsigned char *) (__const char *)
> (partStartMsg.header.type))[3] - __s2[3]); } } __result; }))) :
> (__builtin_constant_p ("partinit") && ((size_t)(const void
> *)(("partinit") + 1) - (size_t)(const void *)("partinit") == 1) &&
> (__s2_len = strlen ("partinit"), __s2_len < 4) ? (__builtin_constant_p
> (partStartMsg.header.type) && ((size_t)(const void
> *)((partStartMsg.header.type) + 1) - (size_t)(const void
> *)(partStartMsg.header.type) == 1) ? __builtin_strcmp
> (partStartMsg.header.type, "partinit") : (__extension__ ({ __const
> unsigned char *__s1 = (__const unsigned char *) (__const char *)
> (partStartMsg.header.type); register int __result = __s1[0] - ((__const
> unsigned char *) (__const char *) ("partinit"))[0]; if (__s2_len > 0 &&
> __result == 0) { __result = (__s1[1] - ((__const unsigned char *)
> (__const char *) ("partinit"))[1]); if (__s2_len > 1 && __result == 0) {
> __result = (__s1[2] - ((__const unsigned char *) (__const char *)
> ("partinit"))[2]); if (__s2_len > 2 && __result == 0) __result =
> (__s1[3] - ((__const unsigned char *) (__const char *)
> ("partinit"))[3]); } } __result; }))) : __builtin_strcmp
> (partStartMsg.header.type, "partinit")))); }) : strncmp
> (partStartMsg.header.type, "partinit", 8))) == 0' failed.
> Abort
>
> Does this tell you anything?
>
> Thank you very much for your support.
>
> Eric
>
>
>
> Abhinav S Bhatele a écrit :
>> Hello Eric,
>>
>> The ibverbs issue you pointed out might take some time before it gets
>> resolved.
>>
>> In the meantime, if you want to get NAMD up and running, I would
>> recommend removing
>> ibverbs from your build line (at step 2) and compiling afresh.
>>
>> - Abhinav
>>
>>
>> On Thu, Jun 10, 2010 at 11:11 AM, pellegrini <pellegrini AT ill.fr> wrote:
>>
>>     Hello,
>>
>>     We are trying to install NAMD on our new DALCO cluster made of 43
>>     8-core Intel Xeon CPUs (Nehalem) connected through InfiniBand. We
>>     followed the installation notes but failed to make charm++ work. The
>>     build was successful, but when performing some tests we got no results.
>>     We are using charm-6.1.3, which comes along
>>     with NAMD-2.7b2. We did the following from the NAMD directory:
>>
>>     1 cd charm-6.1.3
>>     2 env MPICXX=mpicxx ./build charm++ net-linux-x86_64 pgcc
>>     --basedir=/usr/pgi/linux86-64/2010/mpi/mvapich --no-build-shared pgf90
>>     ibverbs --no-build-shared -j32 -O2 -DCMK_OTIMIZE --> the build was
>>     successful
>>     3 cd tests/charm++/megatests
>>     4 make clean
>>     5 make pgm --> the make was successful
>>     6 charmrun ++local +p4 ./pgm and from here we get:
>>
>>        Charmrun> IBVERBS version of charmrun
>>        Charmrun: Bad initnode data length. Aborting
>>
>>     Based on this, the build of NAMD works, but once again the tests are
>>     failing. We think that both failures are related.
>>
>>     Do you have any idea what is going wrong, or whether we made a
>>     mistake when building charm++?
>>
>>     thank you very much
>>
>>     best regards
>>
>>     Eric Pellegrini
>>
>>
>>
>>
>>     _______________________________________________
>>     charm mailing list
>>     charm AT cs.uiuc.edu
>>     http://lists.cs.uiuc.edu/mailman/listinfo/charm
>>
>>
>>
>>
>> --
>> Abhinav S Bhatele
>> My work: http://charm.cs.illinois.edu/~bhatele/phd