Charm++ parallel programming system



Re: [charm] Charm++ on a Cray XE6


  • From: Jeffrey Poznanovic <poznanovic AT cscs.ch>
  • To: <gzheng AT illinois.edu>
  • Cc: charm AT cs.uiuc.edu
  • Subject: Re: [charm] Charm++ on a Cray XE6
  • Date: Wed, 13 Oct 2010 17:49:26 +0200
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

Hello Abhinav and Gengbin, 


Ah yes, the version number and output are always useful.  :)

Looks like I have charm-6.2.2


Here's the output from a 4 process "megatest":


Charm++> Running on MPI version: 2.2 multi-thread support: 0 (max supported: -1)
------------- Processor 0 Exiting: Caught Signal ------------
Signal: 11
------------- Processor 2 Exiting: Caught Signal ------------
Signal: 11
------------- Processor 1 Exiting: Caught Signal ------------
Signal: 11
------------- Processor 3 Exiting: Caught Signal ------------
Signal: 11
[0] Stack Traceback:
  [0:0] [0x55e770]
  [0:1] [0x4b5cac]
  [0:2] [0x4582a5]
  [0:3] [0x4a5093]
  [0:4] [0x462b8d]
  [0:5] [0x5bed30]
  [0:6] [0x400209]
[2] Stack Traceback:
  [2:0] [0x55e770]
  [2:1] [0x4b5cac]
  [2:2] [0x4582a5]
  [2:3] [0x4a5093]
  [2:4] [0x462b8d]
  [2:5] [0x5bed30]
  [2:6] [0x400209]
[1] Stack Traceback:
  [1:0] [0x55e770]
  [1:1] [0x4b5cac]
  [1:2] [0x4582a5]
  [1:3] [0x4a5093]
  [1:4] [0x462b8d]
  [1:5] [0x5bed30]
  [1:6] [0x400209]
[3] Stack Traceback:
  [3:0] [0x55e770]
  [3:1] [0x4b5cac]
  [3:2] [0x4582a5]
  [3:3] [0x4a5093]
  [3:4] [0x462b8d]
  [3:5] [0x5bed30]
  [3:6] [0x400209]
Rank 3 [Wed Oct 13 11:29:35 2010] [c0-0c1s0n3] application called MPI_Abort(MPI_COMM_WORLD, 1) - process 3
Rank 2 [Wed Oct 13 11:29:35 2010] [c0-0c1s0n3] application called MPI_Abort(MPI_COMM_WORLD, 1) - process 2
Rank 0 [Wed Oct 13 11:29:35 2010] [c0-0c1s0n3] application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
Rank 1 [Wed Oct 13 11:29:35 2010] [c0-0c1s0n3] application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1
[NID 00033] 2010-10-13 11:29:35 Apid 61790: initiated application termination
Application 61790 exit codes: 1
Application 61790 resources: utime ~0s, stime ~0s
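
All four ranks appear to print the same seven frames, which suggests the crash happens at the same point in every process. Here is a minimal Python sketch of that check, assuming the traceback format shown above. The helper name `group_tracebacks` and the shortened `sample` log are illustrative only and are not part of Charm++.

```python
import re
from collections import defaultdict

# Matches frame lines such as "  [2:5] [0x5bed30]" (rank:depth, then address).
FRAME = re.compile(r"^\s*\[(\d+):(\d+)\]\s+\[(0x[0-9a-fA-F]+)\]")


def group_tracebacks(log_text):
    """Map each distinct stack (tuple of addresses) to the ranks that reported it."""
    frames = defaultdict(dict)  # rank -> {depth: address}
    for line in log_text.splitlines():
        m = FRAME.match(line)
        if m:
            rank, depth, addr = int(m.group(1)), int(m.group(2)), m.group(3)
            frames[rank][depth] = addr

    groups = defaultdict(list)
    for rank in sorted(frames):
        stack = tuple(frames[rank][d] for d in sorted(frames[rank]))
        groups[stack].append(rank)
    return dict(groups)


# Shortened copy of the output above (ranks 0 and 2 only).
sample = """\
------------- Processor 0 Exiting: Caught Signal ------------
Signal: 11
[0] Stack Traceback:
  [0:0] [0x55e770]
  [0:1] [0x4b5cac]
  [0:2] [0x4582a5]
  [0:3] [0x4a5093]
  [0:4] [0x462b8d]
  [0:5] [0x5bed30]
  [0:6] [0x400209]
[2] Stack Traceback:
  [2:0] [0x55e770]
  [2:1] [0x4b5cac]
  [2:2] [0x4582a5]
  [2:3] [0x4a5093]
  [2:4] [0x462b8d]
  [2:5] [0x5bed30]
  [2:6] [0x400209]
"""

for stack, ranks in group_tracebacks(sample).items():
    print("ranks", ranks, "->", " ".join(stack))
```

If the binary was built with debug symbols, the grouped addresses could then be handed to a symbolizer such as `addr2line -e <binary>` to recover function names. Whether that yields useful names for this particular build is an assumption.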





On Oct 13, 2010, at 5:34 PM, Gengbin Zheng wrote:


Hi Jeff,

Can you send us the error output from running "megatest"? If there is no output, or just a couple of lines about charm collecting topology, then we may already have a fix in the development branch (as Abhinav said in his email).
 
Gengbin

On Wed, Oct 13, 2010 at 5:09 AM, Jeffrey Poznanovic <poznanovic AT cscs.ch> wrote:
Hello Charm-ers, 

Do you have any advice on how to configure Charm++ to build for a Cray XE6 target?  

I "successfully" built Charm++ using the pre-defined "mpi-crayxt" arch on the XE6.  Then, I attempted to test it with the included "megatest", but a simple 4-proc job fails (with no useful error messages).  We have been able to build Charm++ for our XT5 without problems using the "mpi-crayxt" arch.

The main differences between the XT5 and XE6 are the interconnect and processors.  If you are interested, see this short blurb for some basic info about the system:  http://www.cscs.ch/489.0.html?&tx_ttnews[tt_news]=29&tx_ttnews[backPid]=488&cHash=31ce85ee7b

Which files and parameters should I focus on changing?  I'm assuming that I should copy the arch/mpi-crayxt directory and probably modify the "conv-mach" files, but any insight that you can provide will be very helpful.  I'll be happy to test it and report back the results.
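
The copy-and-modify approach described above can be sketched in Python as follows. This is only a sketch under stated assumptions: the target name "mpi-crayxe" and the helper `clone_arch` are hypothetical, and `arch_root` stands for wherever the arch directories live in a given Charm++ tree.

```python
import shutil
from pathlib import Path


def clone_arch(arch_root, source="mpi-crayxt", target="mpi-crayxe"):
    """Copy an arch directory under a new (hypothetical) name.

    Returns the names of the copied "conv-mach" files, which are the ones
    to review and adapt for the new machine.
    """
    src = Path(arch_root) / source
    dst = Path(arch_root) / target
    # copytree raises FileExistsError if the target already exists,
    # so an earlier attempt is never silently overwritten.
    shutil.copytree(src, dst)
    return sorted(p.name for p in dst.glob("conv-mach*"))
```

Calling `clone_arch("arch")` would leave the original "mpi-crayxt" directory untouched, so the XT5 build keeps working while the copy is adjusted for the XE6.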

Best regards, 

Jeff




--
Jeff Poznanovic
Scientific Computing Specialist
CSCS - Swiss National Supercomputing Center
Email: poznanovic AT cscs.ch


_______________________________________________
charm mailing list
charm AT cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/charm







