
charm - [charm] Migration error with AMPI + ibverbs

charm AT lists.cs.illinois.edu



[charm] Migration error with AMPI + ibverbs


  • From: Rafael Keller Tesser <rafael.tesser AT inf.ufrgs.br>
  • To: charm AT cs.uiuc.edu
  • Subject: [charm] Migration error with AMPI + ibverbs
  • Date: Thu, 21 Mar 2013 09:38:14 -0000
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm/>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

Hello,

I ported a geophysics application to AMPI in order to experiment with
its load-balancing features.
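
(For context: the port is built with AMPI's compiler wrapper, so each MPI
rank runs as a migratable user-level thread. The line below is only an
illustrative sketch of that kind of build, with isomalloc heap migration
and the stock load balancers linked in; it is not the exact command used
for ondes3d.)

# illustrative AMPI link line, not the actual ondes3d build
ampicc -o ondes3d *.c -memory isomalloc -module CommonLBs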

Without load balancing, the application runs without any error on both
Gigabit Ethernet and InfiniBand. With load balancing, it also runs fine
on Gigabit Ethernet. With the ibverbs version of Charm++, however, I get
the following error during the first load-balancing step:

--
...
CharmLB> GreedyLB: PE [0] Memory: LBManager: 921 KB CentralLB: 87 KB
CharmLB> GreedyLB: PE [0] #Objects migrating: 247, LBMigrateMsg size: 0.02 MB
CharmLB> GreedyLB: PE [0] strategy finished at 55.669918 duration 0.007592 s
[0] Starting ReceiveMigration step 0 at 55.672409
Charmrun: error on request socket--
Socket closed before recv.

--

The full output is attached to this message (output.txt).

The error also happens with the AMPI migration test program that comes
with Charm++ (located in tests/ampi/migration). Its outputs are also
attached to this message.
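
After building the test program (pgm) in that directory, the failing
cases can be reproduced with the same invocations the test harness uses,
for example:

# two-PE case from the attached test output; this is one that fails
./charmrun ./pgm +p2 +vp2 +x2 +y1 +z1 ++remote-shell oarsh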

I get this error both with Charm-6.4.0 and with the development
version from the Git repository.

Charm++ and AMPI were built with:
./build charm++ net-linux-x86_64 ibverbs --with-production -j16
./build AMPI net-linux-x86_64 ibverbs --with-production -j16
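
At run time, the number of virtual processors and the balancer are
selected on the charmrun command line, roughly as follows (an
illustrative invocation, not the exact one used; the attached output
shows 32 processes started via ++remote-shell oarsh with a ./nodelist
file and GreedyLB active):

# illustrative run line; the +vp value is an arbitrary example
./charmrun ./ondes3d +p32 +vp256 +balancer GreedyLB ++nodelist ./nodelist ++remote-shell oarsh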


Do you have any ideas on what may be causing this error?

--
Best regards,
Rafael Keller Tesser

GPPD - Grupo de Processamento Paralelo e Distribuído
Instituto de Informática / UFRGS

-- Attached output: ondes3d run --
Charmrun> charmrun started...
Charmrun> using ./nodelist as nodesfile
Warning> Invalid cpus 8 in nodelist ignored.
Charmrun> remote shell (edel-21.grenoble.grid5000.fr:0) started
Charmrun> remote shell (edel-23.grenoble.grid5000.fr:1) started
Charmrun> remote shell (edel-26.grenoble.grid5000.fr:2) started
Charmrun> remote shell (edel-9.grenoble.grid5000.fr:3) started
Charmrun> remote shell (edel-21.grenoble.grid5000.fr:4) started
Charmrun> adding client 0: "edel-21.grenoble.grid5000.fr", IP:172.16.16.71
Charmrun> adding client 1: "edel-23.grenoble.grid5000.fr", IP:172.16.16.73
Charmrun> adding client 2: "edel-26.grenoble.grid5000.fr", IP:172.16.16.76
Charmrun> adding client 3: "edel-9.grenoble.grid5000.fr", IP:172.16.16.59
Charmrun> adding client 4: "edel-21.grenoble.grid5000.fr", IP:172.16.16.71
Charmrun> adding client 5: "edel-23.grenoble.grid5000.fr", IP:172.16.16.73
Charmrun> adding client 6: "edel-26.grenoble.grid5000.fr", IP:172.16.16.76
Charmrun> adding client 7: "edel-9.grenoble.grid5000.fr", IP:172.16.16.59
Charmrun> adding client 8: "edel-21.grenoble.grid5000.fr", IP:172.16.16.71
Charmrun> adding client 9: "edel-23.grenoble.grid5000.fr", IP:172.16.16.73
Charmrun> adding client 10: "edel-26.grenoble.grid5000.fr", IP:172.16.16.76
Charmrun> adding client 11: "edel-9.grenoble.grid5000.fr", IP:172.16.16.59
Charmrun> adding client 12: "edel-21.grenoble.grid5000.fr", IP:172.16.16.71
Charmrun> adding client 13: "edel-23.grenoble.grid5000.fr", IP:172.16.16.73
Charmrun> adding client 14: "edel-26.grenoble.grid5000.fr", IP:172.16.16.76
Charmrun> adding client 15: "edel-9.grenoble.grid5000.fr", IP:172.16.16.59
Charmrun> adding client 16: "edel-21.grenoble.grid5000.fr", IP:172.16.16.71
Charmrun> adding client 17: "edel-23.grenoble.grid5000.fr", IP:172.16.16.73
Charmrun> adding client 18: "edel-26.grenoble.grid5000.fr", IP:172.16.16.76
Charmrun> adding client 19: "edel-9.grenoble.grid5000.fr", IP:172.16.16.59
Charmrun> adding client 20: "edel-21.grenoble.grid5000.fr", IP:172.16.16.71
Charmrun> adding client 21: "edel-23.grenoble.grid5000.fr", IP:172.16.16.73
Charmrun> adding client 22: "edel-26.grenoble.grid5000.fr", IP:172.16.16.76
Charmrun> adding client 23: "edel-9.grenoble.grid5000.fr", IP:172.16.16.59
Charmrun> adding client 24: "edel-21.grenoble.grid5000.fr", IP:172.16.16.71
Charmrun> adding client 25: "edel-23.grenoble.grid5000.fr", IP:172.16.16.73
Charmrun> adding client 26: "edel-26.grenoble.grid5000.fr", IP:172.16.16.76
Charmrun> adding client 27: "edel-9.grenoble.grid5000.fr", IP:172.16.16.59
Charmrun> adding client 28: "edel-21.grenoble.grid5000.fr", IP:172.16.16.71
Charmrun> adding client 29: "edel-23.grenoble.grid5000.fr", IP:172.16.16.73
Charmrun> adding client 30: "edel-26.grenoble.grid5000.fr", IP:172.16.16.76
Charmrun> adding client 31: "edel-9.grenoble.grid5000.fr", IP:172.16.16.59
Charmrun> Charmrun = 172.16.16.71, port = 50371
Charmrun> IBVERBS version of charmrun
start_nodes_rsh
Charmrun> Sending "0 172.16.16.71 50371 6683 0" to client 0.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 0.
Charmrun> Starting oarsh edel-21.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "1 172.16.16.71 50371 6683 0" to client 1.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 1.
Charmrun> Starting oarsh edel-23.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "2 172.16.16.71 50371 6683 0" to client 2.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 2.
Charmrun> Starting oarsh edel-26.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "3 172.16.16.71 50371 6683 0" to client 3.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 3.
Charmrun> Starting oarsh edel-9.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "4 172.16.16.71 50371 6683 0" to client 4.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 4.
Charmrun> Starting oarsh edel-21.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "5 172.16.16.71 50371 6683 0" to client 5.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtesCharmrun> remote shell
(edel-23.grenoble.grid5000.fr:5) started
Charmrun> remote shell (edel-26.grenoble.grid5000.fr:6) started
Charmrun> remote shell (edel-9.grenoble.grid5000.fr:7) started
Charmrun> remote shell (edel-21.grenoble.grid5000.fr:8) started
Charmrun> remote shell (edel-23.grenoble.grid5000.fr:9) started
Charmrun> remote shell (edel-26.grenoble.grid5000.fr:10) started
Charmrun> remote shell (edel-9.grenoble.grid5000.fr:11) started
Charmrun> remote shell (edel-21.grenoble.grid5000.fr:12) started
Charmrun> remote shell (edel-23.grenoble.grid5000.fr:13) started
Charmrun> remote shell (edel-26.grenoble.grid5000.fr:14) started
Charmrun> remote shell (edel-9.grenoble.grid5000.fr:15) started
Charmrun> remote shell (edel-21.grenoble.grid5000.fr:16) started
Charmrun> remote shell (edel-23.grenoble.grid5000.fr:17) started
Charmrun> remote shell (edel-26.grenoble.grid5000.fr:18) started
Charmrun> remote shell (edel-9.grenoble.grid5000.fr:19) started
t" for 5.
Charmrun> Starting oarsh edel-23.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "6 172.16.16.71 50371 6683 0" to client 6.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 6.
Charmrun> Starting oarsh edel-26.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "7 172.16.16.71 50371 6683 0" to client 7.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 7.
Charmrun> Starting oarsh edel-9.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "8 172.16.16.71 50371 6683 0" to client 8.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 8.
Charmrun> Starting oarsh edel-21.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "9 172.16.16.71 50371 6683 0" to client 9.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 9.
Charmrun> Starting oarsh edel-23.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "10 172.16.16.71 50371 6683 0" to client 10.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 10.
Charmrun> Starting oarsh edel-26.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "11 172.16.16.71 50371 6683 0" to client 11.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 11.
Charmrun> Starting oarsh edel-9.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "12 172.16.16.71 50371 6683 0" to client 12.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 12.
Charmrun> Starting oarsh edel-21.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "13 172.16.16.71 50371 6683 0" to client 13.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 13.
Charmrun> Starting oarsh edel-23.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "14 172.16.16.71 50371 6683 0" to client 14.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 14.
Charmrun> Starting oarsh edel-26.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "15 172.16.16.71 50371 6683 0" to client 15.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 15.
Charmrun> Starting oarsh edel-9.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "16 172.16.16.71 50371 6683 0" to client 16.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 16.
Charmrun> Starting oarsh edel-21.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "17 172.16.16.71 50371 6683 0" to client 17.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 17.
Charmrun> Starting oarsh edel-23.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "18 172.16.16.71 50371 6683 0" to client 18.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 18.
Charmrun> Starting oarsh edel-26.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "19 172.16.16.71 50371 6683 0" to client 19.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 19.
Charmrun> Starting oarsh edel-9.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "20 172.16.16.71 50371 6683 0" to client 20.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 20.
Charmrun> remote shell (edel-21.grenoble.grid5000.fr:20) started
Charmrun> remote shell (edel-23.grenoble.grid5000.fr:21) started
Charmrun> remote shell (edel-26.grenoble.grid5000.fr:22) started
Charmrun> remote shell (edel-9.grenoble.grid5000.fr:23) started
Charmrun> remote shell (edel-21.grenoble.grid5000.fr:24) started
Charmrun> remote shell (edel-23.grenoble.grid5000.fr:25) started
Charmrun> remote shell (edel-26.grenoble.grid5000.fr:26) started
Charmrun> remote shell (edel-9.grenoble.grid5000.fr:27) started
Charmrun> remote shell (edel-21.grenoble.grid5000.fr:28) started
Charmrun> remote shell (edel-23.grenoble.grid5000.fr:29) started
Charmrun> remote shell (edel-26.grenoble.grid5000.fr:30) started
Charmrun> remote shell (edel-9.grenoble.grid5000.fr:31) started
Charmrun> node programs all started
Charmrun remote shell(edel-23.grenoble.grid5000.fr.13)> remote responding...
Charmrun remote shell(edel-23.grenoble.grid5000.fr.13)> starting
node-program...
Charmrun remote shell(edel-23.grenoble.grid5000.fr.13)> rsh phase successful.
Charmrun remote shell(edel-26.grenoble.grid5000.fr.2)> remote responding...
Charmrun remote shell(edel-26.grenoble.grid5000.fr.2)> starting
node-program...
Charmrun remote shell(edel-26.grenoble.grid5000.fr.2)> rsh phase successful.
Charmrun remote shell(edel-21.grenoble.grid5000.fr.4)> remote responding...
Charmrun remote shell(edel-26.grenoble.grid5000.fr.18)> remote responding...
Charmrun remote shell(edel-21.grenoble.grid5000.fr.4)> starting
node-program...
Charmrun remote shell(edel-21.grenoble.grid5000.fr.4)> rsh phase successful.
Charmrun remote shell(edel-26.grenoble.grid5000.fr.18)> starting
node-program...
Charmrun remote shell(edel-26.grenoble.grid5000.fr.18)> rsh phase successful.
Charmrun remote shell(edel-21.grenoble.grid5000.fr.0)> remote responding...
Charmrun remote shell(edel-21.grenoble.grid5000.fr.0)> starting
node-program...
Charmrun remote shell(edel-21.grenoble.grid5000.fr.0)> rsh phase successful.
Charmrun remote shell(edel-21.grenoble.grid5000.fr.8)> remote responding...
Charmrun remote shell(edel-21.grenoble.grid5000.fr.8)> starting
node-program...
Charmrun remote shell(edel-21.grenoble.grid5000.fr.8)> rsh phase successful.
Charmrun remote shell(edel-26.grenoble.grid5000.fr.10)> remote responding...
Charmrun remote shell(edel-26.grenoble.grid5000.fr.10)> starting
node-program...
Charmrun remote shell(edel-26.grenoble.grid5000.fr.10)> rsh phase successful.
Charmrun remote shell(edel-23.grenoble.grid5000.fr.9)> remote responding...
Charmrun remote shell(edel-23.grenoble.grid5000.fr.9)> starting
node-program...
Charmrun remote shell(edel-23.grenoble.grid5000.fr.9)> rsh phase successful.
Charmrun remote shell(edel-9.grenoble.grid5000.fr.7)> remote responding...
Charmrun remote shell(edel-9.grenoble.grid5000.fr.7)> starting node-program...
Charmrun remote shell(edel-9.grenoble.grid5000.fr.7)> rsh phase successful.
Charmrun remote shell(edel-26.grenoble.grid5000.fr.6)> remote responding...
Charmrun remote shell(edel-21.grenoble.grid5000.fr.16)> remote responding...
Charmrun remote shell(edel-26.grenoble.grid5000.fr.6)> starting
node-program...
Charmrun remote shell(edel-26.grenoble.grid5000.fr.6)> rsh phase successful.
Charmrun remote shell(edel-21.grenoble.grid5000.fr.16)> starting
node-program...
Charmrun remote shell(edel-21.grenoble.grid5000.fr.16)> rsh phase successful.
Charmrun remote shell(edel-9.grenoble.grid5000.fr.3)> remote responding...
Charmrun remote shell(edel-9.grenoble.grid5000.fr.3)> starting node-program...
Charmrun remote shell(edel-9.grenoble.grid5000.fr.3)> rsh phase successful.
Charmrun remote shell(edel-23.grenoble.grid5000.fr.29)> remote responding...
Charmrun remote shell(edel-23.grenoble.grid5000.fr.29)> starting
node-program...
Charmrun remote shell(edel-23.grenoble.grid5000.fr.29)> rsh phase successful.
Charmrun remote shell(edel-9.grenoble.grid5000.fr.11)> remote responding...
Charmrun remote shell(edel-9.grenoble.grid5000.fr.11)> starting
node-program...
Charmrun remote shell(edel-9.grenoble.grid5000.fr.11)> rsh phase successful.
Charmrun remote shell(edel-26.grenoble.grid5000.fr.26)> remote responding...
Charmrun remote shell(edel-26.grenoble.grid5000.fr.26)> starting
node-program...
Charmrun remote shell(edel-26.grenoble.grid5000.fr.26)> rsh phase successful.
Charmrun remote shell(edel-21.grenoble.grid5000.fr.12)> remote responding...
Charmrun remote shell(edel-21.grenoble.grid5000.fr.12)> starting
node-program...
Charmrun remote shell(edel-21.grenoble.grid5000.fr.12)> rsh phase successful.
Charmrun remote shell(edel-9.grenoble.grid5000.fr.27)> remote responding...
Charmrun remote shell(edel-9.grenoble.grid5000.fr.27)> starting
node-program...
Charmrun remote shell(edel-9.grenoble.grid5000.fr.27)> rsh phase successful.
Charmrun remote shell(edel-23.grenoble.grid5000.fr.1)> remote responding...
Charmrun remote shell(edel-23.grenoble.grid5000.fr.1)> starting
node-program...
Charmrun remote shell(edel-23.grenoble.grid5000.fr.1)> rsh phase successful.
Charmrun remote shell(edel-9.grenoble.grid5000.fr.19)> remote responding...
Charmrun remote shell(edel-26.grenoble.grid5000.fr.30)> remote responding...
Charmrun remote shell(edel-9.grenoble.grid5000.fr.19)> starting
node-program...
Charmrun remote shell(edel-9.grenoble.grid5000.fr.19)> rsh phase successful.
Charmrun remote shell(edel-26.grenoble.grid5000.fr.30)> starting
node-program...
Charmrun remote shell(edel-26.grenoble.grid5000.fr.30)> rsh phase successful.
Charmrun remote shell(edel-21.grenoble.grid5000.fr.24)> remote responding...
Charmrun remote shell(edel-21.grenoble.grid5000.fr.24)> starting
node-program...
Charmrun remote shell(edel-21.grenoble.grid5000.fr.24)> rsh phase successful.
Charmrun remote shell(edel-9.grenoble.grid5000.fr.31)> remote responding...
Charmrun remote shell(edel-9.grenoble.grid5000.fr.31)> starting
node-program...
Charmrun remote shell(edel-9.grenoble.grid5000.fr.31)> rsh phase successful.
Charmrun remote shell(edel-9.grenoble.grid5000.fr.15)> remote responding...
Charmrun remote shell(edel-23.grenoble.grid5000.fr.25)> remote responding...
Charmrun remote shell(edel-9.grenoble.grid5000.fr.15)> starting
node-program...
Charmrun remote shell(edel-9.grenoble.grid5000.fr.15)> rsh phase successful.
Charmrun remote shell(edel-23.grenoble.grid5000.fr.25)> starting
node-program...
Charmrun remote shell(edel-23.grenoble.grid5000.fr.25)> rsh phase successful.
Charmrun remote shell(edel-26.grenoble.grid5000.fr.22)> remote responding...
Charmrun remote shell(edel-26.grenoble.grid5000.fr.22)> starting
node-program...
Charmrun remote shell(edel-26.grenoble.grid5000.fr.22)> rsh phase successful.
Charmrun remote shell(edel-21.grenoble.grid5000.fr.28)> remote responding...
Charmrun remote shell(edel-21.grenoble.grid5000.fr.28)> starting
node-program...
Charmrun remote shell(edel-21.grenoble.grid5000.fr.28)> rsh phase successful.
Charmrun remote shell(edel-9.grenoble.grid5000.fr.23)> remote responding...
Charmrun remote shell(edel-9.grenoble.grid5000.fr.23)> starting
node-program...
Charmrun remote shell(edel-9.grenoble.grid5000.fr.23)> rsh phase successful.
Charmrun remote shell(edel-23.grenoble.grid5000.fr.17)> remote responding...
Charmrun remote shell(edel-23.grenoble.grid5000.fr.17)> starting
node-program...
Charmrun remote shell(edel-23.grenoble.grid5000.fr.17)> rsh phase successful.
Charmrun remote shell(edel-26.grenoble.grid5000.fr.14)> remote responding...
Charmrun remote shell(edel-26.grenoble.grid5000.fr.14)> starting
node-program...
Charmrun remote shell(edel-26.grenoble.grid5000.fr.14)> rsh phase successful.
Charmrun remote shell(edel-21.grenoble.grid5000.fr.20)> remote responding...
Charmrun remote shell(edel-21.grenoble.grid5000.fr.20)> starting
node-program...
Charmrun remote shell(edel-21.grenoble.grid5000.fr.20)> rsh phase successful.
Charmrun remote shell(edel-23.grenoble.grid5000.fr.21)> remote responding...
Charmrun remote shell(edel-23.grenoble.grid5000.fr.21)> starting
node-program...
Charmrun remote shell(edel-23.grenoble.grid5000.fr.21)> rsh phase successful.
Charmrun remote shell(edel-23.grenoble.grid5000.fr.5)> remote responding...
Charmrun remote shell(edel-23.grenoble.grid5000.fr.5)> starting
node-program...
Charmrun remote shell(edel-23.grenoble.grid5000.fr.5)> rsh phase successful.
ram "/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 20.
Charmrun> Starting oarsh edel-21.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "21 172.16.16.71 50371 6683 0" to client 21.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 21.
Charmrun> Starting oarsh edel-23.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "22 172.16.16.71 50371 6683 0" to client 22.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 22.
Charmrun> Starting oarsh edel-26.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "23 172.16.16.71 50371 6683 0" to client 23.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 23.
Charmrun> Starting oarsh edel-9.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "24 172.16.16.71 50371 6683 0" to client 24.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 24.
Charmrun> Starting oarsh edel-21.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "25 172.16.16.71 50371 6683 0" to client 25.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 25.
Charmrun> Starting oarsh edel-23.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "26 172.16.16.71 50371 6683 0" to client 26.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 26.
Charmrun> Starting oarsh edel-26.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "27 172.16.16.71 50371 6683 0" to client 27.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 27.
Charmrun> Starting oarsh edel-9.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "28 172.16.16.71 50371 6683 0" to client 28.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 28.
Charmrun> Starting oarsh edel-21.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "29 172.16.16.71 50371 6683 0" to client 29.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 29.
Charmrun> Starting oarsh edel-23.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "30 172.16.16.71 50371 6683 0" to client 30.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 30.
Charmrun> Starting oarsh edel-26.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "31 172.16.16.71 50371 6683 0" to client 31.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest/ondes3d" at
"/home/rktesser/Ondes3d-eval/Ondes3d-ibtest" for 31.
Charmrun> Starting oarsh edel-9.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Waiting for 0-th client to connect.
Charmrun> Waiting for 1-th client to connect.
Charmrun> Waiting for 2-th client to connect.
Charmrun> Waiting for 3-th client to connect.
Charmrun> Waiting for 4-th client to connect.
Charmrun> Waiting for 5-th client to connect.
Charmrun> Waiting for 6-th client to connect.
Charmrun> Waiting for 7-th client to connect.
Charmrun> Waiting for 8-th client to connect.
Charmrun> Waiting for 9-th client to connect.
Charmrun> Waiting for 10-th client to connect.
Charmrun> Waiting for 11-th client to connect.
Charmrun> Waiting for 12-th client to connect.
Charmrun> Waiting for 13-th client to connect.
Charmrun> Waiting for 14-th client to connect.
Charmrun> Waiting for 15-th client to connect.
Charmrun> Waiting for 16-th client to connect.
Charmrun> Waiting for 17-th client to connect.
Charmrun> node programs all connected
Charmrun> started all node programs in 3.721 seconds.
Charmrun> Waiting for 18-th client to connect.
Charmrun> Waiting for 19-th client to connect.
Charmrun> Waiting for 20-th client to connect.
Charmrun> Waiting for 21-th client to connect.
Charmrun> Waiting for 22-th client to connect.
Charmrun> Waiting for 23-th client to connect.
Charmrun> Waiting for 24-th client to connect.
Charmrun> Waiting for 25-th client to connect.
Charmrun> Waiting for 26-th client to connect.
Charmrun> Waiting for 27-th client to connect.
Charmrun> Waiting for 28-th client to connect.
Charmrun> Waiting for 29-th client to connect.
Charmrun> Waiting for 30-th client to connect.
Charmrun> Waiting for 31-th client to connect.
Charmrun> All clients connected.
Charmrun> IP tables sent.
Converse/Charm++ Commit ID:
Charm++> scheduler running in netpoll mode.
CharmLB> Verbose level 2, load balancing period: 0.5 seconds
CharmLB> Topology torus_nd_5 alpha: 3.500000e-05s beta: 8.500000e-09s.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 4 unique compute nodes (8-way SMP).
Charm++> cpu topology info is gathered in 0.018 seconds.
[0] GreedyLB created
USING BLOCK_MAP
## LAYER medium representation ##
## FREE SURFACE on top ##
## CPML absorbing layers ##
## Elastic medium ##

Dimension of FDM order ... 4

Parameter File ... ./DATA/essai.prm
Source Model based on ... ./DATA/essai.map
Rupture History from ... ./DATA/essai.hist
Station Position at ... ./DATA/station.map
Output directory ... ./Essai/

spatial grid DS = 100.000000[m]
time step dt = 0.008000[s]

Visualisation of plane (y,z) at X = -2.00 [km]
Visualisation of plane (x,z) at Y = 0.00 [km]
Visualisation of plane (x,y) at Z = -0.10 [km]

Model Region (-300:300, -300:300, -200:0)
( -30.00: 30.00, -30.00: 30.00, -20.00: 0.00) [km]

CPML absorbing boundary, dumping 0.010362, width 10, ratio 0.001000,
frequency 1.000000

structure model
depth[km] vp vs rho
0.00 6000.00 3460.00 2700.00

NUMBER OF SOURCE 1
Hypocenter ... (-2000.000000, 0.000000, -4000.000000)
.............. (-19, 1, -39)
Source 1 .... (-2000.000000, 0.000000, -4000.000000)
.............. (-19, 1, -39)

Source duration 3.992000 sec
fault segment 1.000000 m, 0.008000 s
1 ( -19 1 -39 ) : 50.000000 20.000000 30.000000
Mw = 5.933320; Mo = 9.999550e+17 [N m]

Stations coordinates :

Number of points in the CPML : 187490

CharmLB> GreedyLB: PE [0] step 0 starting at 57.117580 Memory: 667.598740 MB
CharmLB> GreedyLB: PE [0] strategy starting at 57.760158
[0] In GreedyLB strategy
[0] 241 objects migrating.
CharmLB> Min obj: 3.794199 Max obj: 7.347045
CharmLB> PE speed:
1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
CharmLB> PE Load:
34.625787 (0.014718) 34.627293 (0.015379) 34.630942 (0.014127) 34.627532
(0.014945) 34.629892 (0.015443) 34.637683 (0.017478) 34.638207 (0.013936)
34.637003 (0.015647) 34.629083 (0.014560) 34.636479 (0.016448) 34.637139
(0.022019) 36.095877 (0.011574) 34.638828 (0.017856) 35.830960 (0.005400)
36.096840 (0.007085) 34.639563 (0.017979) 35.804287 (0.010712) 36.092994
(0.011396) 35.804507 (0.011372) 34.637205 (0.016912) 36.096706 (0.009483)
36.086792 (0.010292) 34.663272 (0.020654) 36.101812 (0.013037) 36.098300
(0.013247) 34.639992 (0.013322) 36.101129 (0.011877) 36.096732 (0.013090)
36.098290 (0.008623) 36.098129 (0.009518) 36.097679 (0.009355) 36.085479
(0.009732)
CharmLB> GreedyLB: PE [0] Memory: LBManager: 921 KB CentralLB: 87 KB
CharmLB> GreedyLB: PE [0] #Objects migrating: 241, LBMigrateMsg size: 0.02 MB
CharmLB> GreedyLB: PE [0] strategy finished at 57.762843 duration 0.002685 s
[0] Starting ReceiveMigration step 0 at 57.763744
Charmrun: error on request socket--
Socket closed before recv.

-- Attached output: tests/ampi/migration --
./charmrun ./pgm +p1 +vp1 +x1 +y1 +z1 ++remote-shell oarsh
Warning> Invalid cpus 8 in nodelist ignored.
Charmrun> started all node programs in 1.288 seconds.
Charmrun> IBVERBS version of charmrun
Converse/Charm++ Commit ID: v6.5.0-beta1-322-gbd247e5
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (8-way SMP).
Charm++> cpu topology info is gathered in 0.002 seconds.

begin migrating
Done with step 0
Done with step 1
done migrating
All tests passed
./charmrun ./pgm +p1 +vp2 +x1 +y1 +z1 ++remote-shell oarsh
Warning> Invalid cpus 8 in nodelist ignored.
Charmrun> started all node programs in 1.247 seconds.
Charmrun> IBVERBS version of charmrun
Converse/Charm++ Commit ID: v6.5.0-beta1-322-gbd247e5
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (8-way SMP).
Charm++> cpu topology info is gathered in 0.003 seconds.

begin migrating

begin migrating
Trying to migrate partition 1 from pe 0 to 0
Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
migrate_test is 0
Leaving TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0, migrate_test
is 0
Done with step 0
Done with step 0
Trying to migrate partition 1 from pe 0 to 0
Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
migrate_test is 0
Leaving TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0, migrate_test
is 0
Done with step 1
Done with step 1
done migrating
done migrating
All tests passed
./charmrun ./pgm +p1 +vp4 +x1 +y1 +z1 ++remote-shell oarsh
Warning> Invalid cpus 8 in nodelist ignored.
Charmrun> started all node programs in 1.285 seconds.
Charmrun> IBVERBS version of charmrun
Converse/Charm++ Commit ID: v6.5.0-beta1-322-gbd247e5
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (8-way SMP).
Charm++> cpu topology info is gathered in 0.002 seconds.

begin migrating

begin migrating
Trying to migrate partition 1 from pe 0 to 0
Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
migrate_test is 0
Leaving TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0, migrate_test
is 0

begin migrating

begin migrating
Done with step 0
Done with step 0
Trying to migrate partition 1 from pe 0 to 0
Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
migrate_test is 0
Leaving TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0, migrate_test
is 0
Done with step 0
Done with step 0
Done with step 1
Done with step 1
Done with step 1
Done with step 1
done migrating
done migrating
done migrating
done migrating
All tests passed
./charmrun ./pgm +p2 +vp2 +x2 +y1 +z1 ++remote-shell oarsh
Warning> Invalid cpus 8 in nodelist ignored.
Charmrun> started all node programs in 1.305 seconds.
Charmrun> IBVERBS version of charmrun
Converse/Charm++ Commit ID: v6.5.0-beta1-322-gbd247e5
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 2 unique compute nodes (8-way SMP).
Charm++> cpu topology info is gathered in 0.003 seconds.

begin migrating

begin migrating
Trying to migrate partition 1 from pe 1 to 0
Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 1,
migrate_test is 1
Charmrun: error on request socket--
Socket closed before recv.
make: *** [bgtest] Error 1
./charmrun ./pgm +p1 +vp1 +x1 +y1 +z1 ++remote-shell oarsh ++verbose
Charmrun> charmrun started...
Charmrun> using ./nodelist as nodesfile
Warning> Invalid cpus 8 in nodelist ignored.
Charmrun> remote shell (edel-51.grenoble.grid5000.fr:0) started
Charmrun> node programs all started
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> remote responding...
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> starting
node-program...
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> rsh phase successful.
Charmrun> node programs all connected
Charmrun> started all node programs in 1.272 seconds.
Charmrun> adding client 0: "edel-51.grenoble.grid5000.fr", IP:172.16.16.101
Charmrun> Charmrun = 172.16.16.101, port = 39694
Charmrun> IBVERBS version of charmrun
start_nodes_rsh
Charmrun> Sending "0 172.16.16.101 39694 6824 0" to client 0.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration/./pgm" at
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration" for 0.
Charmrun> Starting oarsh edel-51.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Waiting for 0-th client to connect.
Charmrun> All clients connected.
Charmrun> IP tables sent.
Converse/Charm++ Commit ID: v6.5.0-beta1-322-gbd247e5
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (8-way SMP).
Charm++> cpu topology info is gathered in 0.003 seconds.

begin migrating
Done with step 0
Done with step 1
done migrating
All tests passed
Charmrun> Graceful exit.
./charmrun ./pgm +p1 +vp2 +x1 +y1 +z1 ++remote-shell oarsh ++verbose
Charmrun> charmrun started...
Charmrun> using ./nodelist as nodesfile
Warning> Invalid cpus 8 in nodelist ignored.
Charmrun> remote shell (edel-51.grenoble.grid5000.fr:0) started
Charmrun> node programs all started
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> remote responding...
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> starting
node-program...
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> rsh phase successful.
Charmrun> node programs all connected
Charmrun> started all node programs in 1.269 seconds.
Charmrun> adding client 0: "edel-51.grenoble.grid5000.fr", IP:172.16.16.101
Charmrun> Charmrun = 172.16.16.101, port = 50131
Charmrun> IBVERBS version of charmrun
start_nodes_rsh
Charmrun> Sending "0 172.16.16.101 50131 6844 0" to client 0.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration/./pgm" at
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration" for 0.
Charmrun> Starting oarsh edel-51.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Waiting for 0-th client to connect.
Charmrun> All clients connected.
Charmrun> IP tables sent.
Converse/Charm++ Commit ID: v6.5.0-beta1-322-gbd247e5
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (8-way SMP).
Charm++> cpu topology info is gathered in 0.002 seconds.

begin migrating

begin migrating
Trying to migrate partition 1 from pe 0 to 0
Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
migrate_test is 0
Leaving TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0, migrate_test
is 0
Done with step 0
Done with step 0
Trying to migrate partition 1 from pe 0 to 0
Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
migrate_test is 0
Leaving TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0, migrate_test
is 0
Done with step 1
Done with step 1
done migrating
done migrating
All tests passed
Charmrun> Graceful exit.
./charmrun ./pgm +p1 +vp4 +x1 +y1 +z1 ++remote-shell oarsh ++verbose
Charmrun> charmrun started...
Charmrun> using ./nodelist as nodesfile
Warning> Invalid cpus 8 in nodelist ignored.
Charmrun> remote shell (edel-51.grenoble.grid5000.fr:0) started
Charmrun> node programs all started
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> remote responding...
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> starting
node-program...
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> rsh phase successful.
Charmrun> node programs all connected
Charmrun> started all node programs in 1.279 seconds.
Charmrun> adding client 0: "edel-51.grenoble.grid5000.fr", IP:172.16.16.101
Charmrun> Charmrun = 172.16.16.101, port = 40819
Charmrun> IBVERBS version of charmrun
start_nodes_rsh
Charmrun> Sending "0 172.16.16.101 40819 6864 0" to client 0.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration/./pgm" at
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration" for 0.
Charmrun> Starting oarsh edel-51.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Waiting for 0-th client to connect.
Charmrun> All clients connected.
Charmrun> IP tables sent.
Converse/Charm++ Commit ID: v6.5.0-beta1-322-gbd247e5
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (8-way SMP).
Charm++> cpu topology info is gathered in 0.002 seconds.

begin migrating

begin migrating
Trying to migrate partition 1 from pe 0 to 0
Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
migrate_test is 0
Leaving TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0, migrate_test
is 0

begin migrating

begin migrating
Done with step 0
Done with step 0
Trying to migrate partition 1 from pe 0 to 0
Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
migrate_test is 0
Leaving TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0, migrate_test
is 0
Done with step 0
Done with step 0
Done with step 1
Done with step 1
Done with step 1
Done with step 1
done migrating
done migrating
done migrating
done migrating
All tests passed
Charmrun> Graceful exit.
./charmrun ./pgm +p2 +vp2 +x2 +y1 +z1 ++remote-shell oarsh ++verbose
Charmrun> charmrun started...
Charmrun> using ./nodelist as nodesfile
Warning> Invalid cpus 8 in nodelist ignored.
Charmrun> remote shell (edel-51.grenoble.grid5000.fr:0) started
Charmrun> remote shell (edel-67.grenoble.grid5000.fr:1) started
Charmrun> node programs all started
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> remote responding...
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> starting
node-program...
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> rsh phase successful.
Charmrun remote shell(edel-67.grenoble.grid5000.fr.1)> remote responding...
Charmrun remote shell(edel-67.grenoble.grid5000.fr.1)> starting
node-program...
Charmrun remote shell(edel-67.grenoble.grid5000.fr.1)> rsh phase successful.
Charmrun> node programs all connected
Charmrun> started all node programs in 1.304 seconds.
Charmrun> adding client 0: "edel-51.grenoble.grid5000.fr", IP:172.16.16.101
Charmrun> adding client 1: "edel-67.grenoble.grid5000.fr", IP:172.16.16.117
Charmrun> Charmrun = 172.16.16.101, port = 60741
Charmrun> IBVERBS version of charmrun
start_nodes_rsh
Charmrun> Sending "0 172.16.16.101 60741 6884 0" to client 0.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration/./pgm" at
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration" for 0.
Charmrun> Starting oarsh edel-51.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "1 172.16.16.101 60741 6884 0" to client 1.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration/./pgm" at
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration" for 1.
Charmrun> Starting oarsh edel-67.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Waiting for 0-th client to connect.
Charmrun> Waiting for 1-th client to connect.
Charmrun> All clients connected.
Charmrun> IP tables sent.
Converse/Charm++ Commit ID: v6.5.0-beta1-322-gbd247e5
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 2 unique compute nodes (8-way SMP).
Charm++> cpu topology info is gathered in 0.003 seconds.

begin migrating

begin migrating
Trying to migrate partition 1 from pe 1 to 0
Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 1,
migrate_test is 1
Charmrun: error on request socket--
Socket closed before recv.
make: *** [bgtest] Error 1
./charmrun ./pgm +p1 +vp1 ++remote-shell oarsh ++verbose
Charmrun> charmrun started...
Charmrun> using ./nodelist as nodesfile
Warning> Invalid cpus 8 in nodelist ignored.
Charmrun> remote shell (edel-51.grenoble.grid5000.fr:0) started
Charmrun> node programs all started
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> remote responding...
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> starting
node-program...
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> rsh phase successful.
Charmrun> node programs all connected
Charmrun> started all node programs in 1.283 seconds.
Charmrun> adding client 0: "edel-51.grenoble.grid5000.fr", IP:172.16.16.101
Charmrun> Charmrun = 172.16.16.101, port = 58602
Charmrun> IBVERBS version of charmrun
start_nodes_rsh
Charmrun> Sending "0 172.16.16.101 58602 6712 0" to client 0.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration/./pgm" at
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration" for 0.
Charmrun> Starting oarsh edel-51.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Waiting for 0-th client to connect.
Charmrun> All clients connected.
Charmrun> IP tables sent.
Converse/Charm++ Commit ID: v6.5.0-beta1-322-gbd247e5
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (8-way SMP).
Charm++> cpu topology info is gathered in 0.003 seconds.

begin migrating
Done with step 0
Done with step 1
done migrating
All tests passed
Charmrun> Graceful exit.
./charmrun ./pgm +p1 +vp4 ++remote-shell oarsh ++verbose
Charmrun> charmrun started...
Charmrun> using ./nodelist as nodesfile
Warning> Invalid cpus 8 in nodelist ignored.
Charmrun> remote shell (edel-51.grenoble.grid5000.fr:0) started
Charmrun> node programs all started
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> remote responding...
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> starting
node-program...
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> rsh phase successful.
Charmrun> node programs all connected
Charmrun> started all node programs in 1.250 seconds.
Charmrun> adding client 0: "edel-51.grenoble.grid5000.fr", IP:172.16.16.101
Charmrun> Charmrun = 172.16.16.101, port = 34827
Charmrun> IBVERBS version of charmrun
start_nodes_rsh
Charmrun> Sending "0 172.16.16.101 34827 6733 0" to client 0.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration/./pgm" at
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration" for 0.
Charmrun> Starting oarsh edel-51.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Waiting for 0-th client to connect.
Charmrun> All clients connected.
Charmrun> IP tables sent.
Converse/Charm++ Commit ID: v6.5.0-beta1-322-gbd247e5
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (8-way SMP).
Charm++> cpu topology info is gathered in 0.002 seconds.

begin migrating

begin migrating
Trying to migrate partition 1 from pe 0 to 0
Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
migrate_test is 0
Leaving TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0, migrate_test
is 0

begin migrating

begin migrating
Done with step 0
Done with step 0
Trying to migrate partition 1 from pe 0 to 0
Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
migrate_test is 0
Leaving TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0, migrate_test
is 0
Done with step 0
Done with step 0
Done with step 1
Done with step 1
Done with step 1
Done with step 1
done migrating
done migrating
done migrating
done migrating
All tests passed
Charmrun> Graceful exit.
./charmrun ./pgm +p1 +vp20 ++remote-shell oarsh ++verbose
Charmrun> charmrun started...
Charmrun> using ./nodelist as nodesfile
Warning> Invalid cpus 8 in nodelist ignored.
Charmrun> remote shell (edel-51.grenoble.grid5000.fr:0) started
Charmrun> node programs all started
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> remote responding...
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> starting
node-program...
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> rsh phase successful.
Charmrun> node programs all connected
Charmrun> started all node programs in 1.231 seconds.
Charmrun> adding client 0: "edel-51.grenoble.grid5000.fr", IP:172.16.16.101
Charmrun> Charmrun = 172.16.16.101, port = 48967
Charmrun> IBVERBS version of charmrun
start_nodes_rsh
Charmrun> Sending "0 172.16.16.101 48967 6753 0" to client 0.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration/./pgm" at
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration" for 0.
Charmrun> Starting oarsh edel-51.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Waiting for 0-th client to connect.
Charmrun> All clients connected.
Charmrun> IP tables sent.
Converse/Charm++ Commit ID: v6.5.0-beta1-322-gbd247e5
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (8-way SMP).
Charm++> cpu topology info is gathered in 0.002 seconds.

begin migrating

begin migrating
Trying to migrate partition 1 from pe 0 to 0
Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
migrate_test is 0
Leaving TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0, migrate_test
is 0

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating
Done with step 0
Done with step 0
Trying to migrate partition 1 from pe 0 to 0
Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
migrate_test is 0
Leaving TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0, migrate_test
is 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
All tests passed
Charmrun> Graceful exit.
./charmrun ./pgm +p2 +vp1 ++remote-shell oarsh ++verbose
Charmrun> charmrun started...
Charmrun> using ./nodelist as nodesfile
Warning> Invalid cpus 8 in nodelist ignored.
Charmrun> remote shell (edel-51.grenoble.grid5000.fr:0) started
Charmrun> remote shell (edel-67.grenoble.grid5000.fr:1) started
Charmrun> node programs all started
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> remote responding...
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> starting
node-program...
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> rsh phase successful.
Charmrun remote shell(edel-67.grenoble.grid5000.fr.1)> remote responding...
Charmrun remote shell(edel-67.grenoble.grid5000.fr.1)> starting
node-program...
Charmrun remote shell(edel-67.grenoble.grid5000.fr.1)> rsh phase successful.
Charmrun> node programs all connected
Charmrun> started all node programs in 1.303 seconds.
Charmrun> adding client 0: "edel-51.grenoble.grid5000.fr", IP:172.16.16.101
Charmrun> adding client 1: "edel-67.grenoble.grid5000.fr", IP:172.16.16.117
Charmrun> Charmrun = 172.16.16.101, port = 41055
Charmrun> IBVERBS version of charmrun
start_nodes_rsh
Charmrun> Sending "0 172.16.16.101 41055 6773 0" to client 0.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration/./pgm" at
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration" for 0.
Charmrun> Starting oarsh edel-51.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "1 172.16.16.101 41055 6773 0" to client 1.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration/./pgm" at
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration" for 1.
Charmrun> Starting oarsh edel-67.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Waiting for 0-th client to connect.
Charmrun> Waiting for 1-th client to connect.
Charmrun> All clients connected.
Charmrun> IP tables sent.
Converse/Charm++ Commit ID: v6.5.0-beta1-322-gbd247e5
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 2 unique compute nodes (8-way SMP).
Charm++> cpu topology info is gathered in 0.003 seconds.

begin migrating
Done with step 0
Done with step 1
Done with step 2
done migrating
All tests passed
Charmrun> Graceful exit.
./charmrun ./pgm +p2 +vp4 ++remote-shell oarsh ++verbose
Charmrun> charmrun started...
Charmrun> using ./nodelist as nodesfile
Warning> Invalid cpus 8 in nodelist ignored.
Charmrun> remote shell (edel-51.grenoble.grid5000.fr:0) started
Charmrun> remote shell (edel-67.grenoble.grid5000.fr:1) started
Charmrun> node programs all started
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> remote responding...
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> starting
node-program...
Charmrun remote shell(edel-51.grenoble.grid5000.fr.0)> rsh phase successful.
Charmrun remote shell(edel-67.grenoble.grid5000.fr.1)> remote responding...
Charmrun remote shell(edel-67.grenoble.grid5000.fr.1)> starting
node-program...
Charmrun remote shell(edel-67.grenoble.grid5000.fr.1)> rsh phase successful.
Charmrun> node programs all connected
Charmrun> started all node programs in 1.302 seconds.
Charmrun> adding client 0: "edel-51.grenoble.grid5000.fr", IP:172.16.16.101
Charmrun> adding client 1: "edel-67.grenoble.grid5000.fr", IP:172.16.16.117
Charmrun> Charmrun = 172.16.16.101, port = 58969
Charmrun> IBVERBS version of charmrun
start_nodes_rsh
Charmrun> Sending "0 172.16.16.101 58969 6797 0" to client 0.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration/./pgm" at
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration" for 0.
Charmrun> Starting oarsh edel-51.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Sending "1 172.16.16.101 58969 6797 0" to client 1.
Charmrun> find the node program
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration/./pgm" at
"/home/rktesser/Ondes3d-eval/charm.git/tests/ampi/migration" for 1.
Charmrun> Starting oarsh edel-67.grenoble.grid5000.fr -l rktesser /bin/sh -f
Charmrun> Waiting for 0-th client to connect.
Charmrun> Waiting for 1-th client to connect.
Charmrun> All clients connected.
Charmrun> IP tables sent.
Converse/Charm++ Commit ID: v6.5.0-beta1-322-gbd247e5
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 2 unique compute nodes (8-way SMP).
Charm++> cpu topology info is gathered in 0.003 seconds.

begin migrating

begin migrating
Trying to migrate partition 1 from pe 0 to 1

begin migrating

begin migrating
Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
migrate_test is 0
Charmrun: error on request socket--
Socket closed before recv.
make: *** [test] Error 1
./charmrun ./pgm +p1 +vp1 ++remote-shell oarsh
Warning> Invalid cpus 8 in nodelist ignored.
Charmrun> started all node programs in 1.221 seconds.
Charmrun> IBVERBS version of charmrun
Converse/Charm++ Commit ID: v6.5.0-beta1-322-gbd247e5
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (8-way SMP).
Charm++> cpu topology info is gathered in 0.002 seconds.

begin migrating
Done with step 0
Done with step 1
done migrating
All tests passed
./charmrun ./pgm +p1 +vp4 ++remote-shell oarsh
Warning> Invalid cpus 8 in nodelist ignored.
Charmrun> started all node programs in 1.265 seconds.
Charmrun> IBVERBS version of charmrun
Converse/Charm++ Commit ID: v6.5.0-beta1-322-gbd247e5
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (8-way SMP).
Charm++> cpu topology info is gathered in 0.003 seconds.

begin migrating

begin migrating
Trying to migrate partition 1 from pe 0 to 0
Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
migrate_test is 0
Leaving TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0, migrate_test
is 0

begin migrating

begin migrating
Done with step 0
Done with step 0
Trying to migrate partition 1 from pe 0 to 0
Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
migrate_test is 0
Leaving TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0, migrate_test
is 0
Done with step 0
Done with step 0
Done with step 1
Done with step 1
Done with step 1
Done with step 1
done migrating
done migrating
done migrating
done migrating
All tests passed
./charmrun ./pgm +p1 +vp20 ++remote-shell oarsh
Warning> Invalid cpus 8 in nodelist ignored.
Charmrun> started all node programs in 1.288 seconds.
Charmrun> IBVERBS version of charmrun
Converse/Charm++ Commit ID: v6.5.0-beta1-322-gbd247e5
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (8-way SMP).
Charm++> cpu topology info is gathered in 0.003 seconds.

begin migrating

begin migrating
Trying to migrate partition 1 from pe 0 to 0
Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
migrate_test is 0
Leaving TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0, migrate_test
is 0

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating

begin migrating
Done with step 0
Done with step 0
Trying to migrate partition 1 from pe 0 to 0
Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
migrate_test is 0
Leaving TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0, migrate_test
is 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 0
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
Done with step 1
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
done migrating
All tests passed
./charmrun ./pgm +p2 +vp1 ++remote-shell oarsh
Warning> Invalid cpus 8 in nodelist ignored.
Charmrun> started all node programs in 1.306 seconds.
Charmrun> IBVERBS version of charmrun
Converse/Charm++ Commit ID: v6.5.0-beta1-322-gbd247e5
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 2 unique compute nodes (8-way SMP).
Charm++> cpu topology info is gathered in 0.003 seconds.

begin migrating
Done with step 0
Done with step 1
Done with step 2
done migrating
All tests passed
./charmrun ./pgm +p2 +vp4 ++remote-shell oarsh
Warning> Invalid cpus 8 in nodelist ignored.
Charmrun> started all node programs in 1.303 seconds.
Charmrun> IBVERBS version of charmrun
Converse/Charm++ Commit ID: v6.5.0-beta1-322-gbd247e5
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 2 unique compute nodes (8-way SMP).
Charm++> cpu topology info is gathered in 0.003 seconds.

begin migrating

begin migrating
Trying to migrate partition 1 from pe 0 to 1

begin migrating

begin migrating
Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
migrate_test is 0
Charmrun: error on request socket--
Socket closed before recv.
make: *** [test] Error 1

