
charm - [charm] how to run SimpleArrayHello on multiple PCs

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: Chiara Orsini <chiara.orsini AT iet.unipi.it>
  • To: charm <charm AT cs.uiuc.edu>
  • Subject: [charm] how to run SimpleArrayHello on multiple PCs
  • Date: Tue, 29 Jun 2010 16:10:07 +0200
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

Dear charm++ users,


I tried to run the SimpleArrayHello program on multiple PCs, but it ends with this error:

Charmrun> node programs all connected
Charmrun: error on request socket--
Socket closed before recv.


I read the FAQ, but linking with -memory paranoid did not give me any useful information.

Consider two hosts, A and B. I configured them so that each can reach the other through ssh without a password.
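
A minimal sketch of how the passwordless ssh setup could be double-checked from either host. The function name `check_passwordless_ssh` and the `SSH_BIN` variable are hypothetical, introduced only for this illustration; `BatchMode` and `ConnectTimeout` are standard OpenSSH client options. Host addresses are passed as arguments.

```shell
# Hypothetical helper: succeeds only if a non-interactive ssh login to "$1" works.
# SSH_BIN is an assumption of this sketch so the ssh binary can be swapped out.
check_passwordless_ssh() {
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Check every host given on the command line, e.g. the addresses of A and B.
for host in "$@"; do
    if check_passwordless_ssh "$host"; then
        echo "$host: passwordless ssh ok"
    else
        echo "$host: passwordless ssh FAILED"
    fi
done
```

Running this on host A and again on host B, with both addresses as arguments each time, would cover every direction, including each host connecting to itself.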

I can run BasicHelloWorld on host A (launching it from host A) successfully.
I can run BasicHelloWorld on host B (launching it from host A) successfully.

However, I cannot run BasicHelloWorld on hosts A and B together (launching it from host A). This is my nodelist file:

group main ++shell ssh
host IP_B
host IP_A
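
A minimal sketch of how a nodelist like this could be written from the shell, assuming it lives in the current directory as ./nodelist (the location charmrun reports in its verbose output). IP_A and IP_B remain placeholders, exactly as in the message, and would need to be replaced with real addresses.

```shell
# Write a two-host nodelist; IP_A and IP_B are placeholders, not real addresses.
cat > nodelist <<'EOF'
group main ++shell ssh
host IP_B
host IP_A
EOF

# Print the file so its contents can be checked before launching charmrun.
cat nodelist
```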

This is the output I get when running this command:  ./charmrun PROGRAM_NAME +p2 ++verbose

Charmrun> charmrun started...
Charmrun> using ./nodelist as nodesfile
Charmrun> adding client 0: "IP_B", IP:IP_B
Charmrun> adding client 1: "IP_A", IP:IP_A
Charmrun> Charmrun = IP_A, port = 50501
Charmrun> Sending "0 IP_A 50501 2547 0" to client 0.
Charmrun> find the node program "/PATH/PROGRAM_NAME/prefix" at "/PATH/PROGRAM_NAME" for 0.
Charmrun> node 0: xterm is xterm
Charmrun> Starting ssh IP_B -l user /bin/sh -f
Charmrun> remote shell (IP_B:0) started
Charmrun> Sending "1 IP_A 50501 2547 0" to client 1.
Charmrun> find the node program "/PATH/PROGRAM_NAME/prefix" at "/PATH/PROGRAM_NAME" for 1.
Charmrun> node 1: xterm is xterm
Charmrun> Starting ssh IP_A -l user /bin/sh -f
Charmrun> remote shell (IP_A:1) started
Charmrun> node programs all started
Charmrun remote shell(IP_A.1)> remote responding...
Charmrun remote shell(IP_A.1)> using xterm /usr/X11R6/bin/xterm
Charmrun remote shell(IP_A.1)> starting node-program...
Charmrun remote shell(IP_A.1)> rsh phase successful.
Charmrun remote shell(IP_B.0)> remote responding...
Charmrun remote shell(IP_B.0)> using xterm /usr/X11R6/bin/xterm
Charmrun remote shell(IP_B.0)> starting node-program...
Charmrun remote shell(IP_B.0)> rsh phase successful.
Charmrun> Waiting for 0-th client to connect.
Charmrun> Waiting for 1-th client to connect.
Charmrun> client 1 connected (IP=IP_A data_port=54659)
Charmrun> client 0 connected (IP=IP_B data_port=34817)
Charmrun> All clients connected.
Charmrun> IP tables sent.
Charmrun> node programs all connected
Charmrun: error on request socket--
Socket closed before recv.


Could anyone explain how to solve this problem?

Any information you can give me would be greatly appreciated.
Thank you in advance.

Best regards,

Chiara







