

[charm] Errors when running charm++ v6.6 with ibverbs for the QLogic InfiniBand interface.


  • From: "Low, John J." <jlow AT mcs.anl.gov>
  • To: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>
  • Subject: [charm] Errors when running charm++ v6.6 with ibverbs for the QLogic InfiniBand interface.
  • Date: Wed, 7 May 2014 14:04:37 +0000
  • Accept-language: en-US
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm/>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

Charm++ developers,

I have made several attempts to build Charm++ on a Xeon-based cluster with a QLogic QDR InfiniBand network. I built Charm++ with the following command:

./build charm++ net-linux-x86_64 ibverbs icc --with-production

When I test this build with the hello command, I get the following errors:

************************************************************************
Charmrun> IBVERBS version of charmrun
Charmrun> started all node programs in 0.104 seconds.
------------- Processor 0 Exiting: Called CmiAbort ------------
Reason: Failed to change qp state to RTS: you may need some device-specific parameters in machine-ibevrbs
------------- Processor 0 Exiting: Called CmiAbort ------------
Reason: Failed to change qp state to RTS: you may need some device-specific parameters in machine-ibevrbs
------------- Processor 0 Exiting: Called CmiAbort ------------
Reason: Failed to change qp state to RTS: you may need some device-specific parameters in machine-ibevrbs
------------- Processor 0 Exiting: Called CmiAbort ------------
Reason: Failed to change qp state to RTS: you may need some device-specific parameters in machine-ibevrbs
------------- Processor 0 Exiting: Called CmiAbort ------------
Reason: Failed to change qp state to RTS: you may need some device-specific parameters in machine-ibevrbs
[0] Stack Traceback:
  [0:0] CmiAbort+0x4c  [0x51d92c]
  [0:1] initInfiOtherNodeData+0x180  [0x51d100]
  [0:2]   [0x511d48]
  [0:3] ConverseInit+0x13a6  [0x51ab66]
  [0:4] main+0x57  [0x46e567]
  [0:5] __libc_start_main+0xfd  [0x34cc41ed1d]
  [0:6]   [0x46a199]
[0] Stack Traceback:
  [0:0] CmiAbort+0x4c  [0x51d92c]
  [0:1] initInfiOtherNodeData+0x180  [0x51d100]
  [0:2]   [0x511d48]
  [0:3] ConverseInit+0x13a6  [0x51ab66]
  [0:4] main+0x57  [0x46e567]
  [0:5] __libc_start_main+0xfd  [0x34cc41ed1d]
  [0:6]   [0x46a199]
------------- Processor 0 Exiting: Called CmiAbort ------------
Reason: Failed to change qp state to RTS: you may need some device-specific parameters in machine-ibevrbs
------------- Processor 0 Exiting: Called CmiAbort ------------
Reason: Failed to change qp state to RTS: you may need some device-specific parameters in machine-ibevrbs
Fatal error on PE 0> Failed to change qp state to RTS: you may need some device-specific parameters in machine-ibevrbs
************************************************************************
I am using version 13.1.3 of the Intel compilers.

Any suggestions on how to build a working ibverbs version of Charm++ for the QLogic PSM interface would be helpful. We find that the apoa1 benchmark for NAMD 2.9 and Charm++ over MVAPICH2 does not scale past a few hundred cores on this machine. We would like to see good scaling up to a few thousand cores for NAMD, and I think an ibverbs build of Charm++ would help.

Thanks,

John J. Low
Principal Computational Science Specialist
Computing, Environment and Life Sciences
Building 240, 2143
9700 South Cass Avenue
Argonne National Laboratory
Argonne, IL 60439.
630-252-0045
www.linkedin.com/pub/john-low/15/8b0/5aa/




