Re: [charm] Migration error with AMPI + ibverbs (+ SMP)


  • From: "Jain, Nikhil" <nikhil AT illinois.edu>
  • To: Rafael Keller Tesser <rktesser AT inf.ufrgs.br>
  • Cc: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>
  • Subject: Re: [charm] Migration error with AMPI + ibverbs (+ SMP)
  • Date: Thu, 28 Mar 2013 00:33:03 +0000


Randomization of the stack address has an impact in two ways:
1. synchronization at startup via the file system (the +isomalloc_sync step), and
2. a restricted address space available for isomalloc.

Barring that initial synchronization, however, it has no performance
impact. If you can, I recommend switching off the randomization.
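For reference, and assuming you have administrative access to the compute
nodes (which may not be the case on a shared cluster), the setting can be
checked and turned off with something like:

  cat /proc/sys/kernel/randomize_va_space    # 2 means full randomization
  sysctl -w kernel.randomize_va_space=0      # disable; requires root

If you cannot change it, keep passing +isomalloc_sync as you do now.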

As for the mapping of workers to comm threads, the only setting under the
user's control is the number of workers per comm thread. Based on the
application's pattern, one may want more worker threads per comm thread if
most of the time is spent in compute. If an application is communication
intensive, having more comm threads will be useful.
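As a rough illustration only (the numbers below are placeholders, not a
recommendation), on the 32-core allocation discussed below you could
compare, say,

  ./charmrun +p30 ++ppn 15 ./pgm <pgm_params> +isomalloc_sync +setcpuaffinity
  ./charmrun +p28 ++ppn 7 ./pgm <pgm_params> +isomalloc_sync +setcpuaffinity

The first gives 2 comm threads for 30 workers (better if the run is
compute-bound); the second gives 4 comm threads for 28 workers (better if
it is communication-bound). Timing your application with both is the
simplest way to decide.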

--Nikhil

--
Nikhil Jain,
nikhil AT illinois.edu,
http://charm.cs.uiuc.edu/people/nikhil
Doctoral Candidate @ CS, UIUC






On 3/23/13 7:50 AM, "Rafael Keller Tesser" <rktesser AT inf.ufrgs.br> wrote:

>Hello.
>
>/proc/sys/kernel/randomize_va_space is set to 2, so the address
>space is randomized.
>I just did some executions with +isomalloc_sync, and this seems to
>solve the segmentation violation problem.
>Does this option incur some penalty in terms of performance or
>memory usage? Would it be better to disable address randomization?
>
>Regarding the mapping of workers to communication threads, do you know
>if there's any information on how this impacts performance? If so,
>where can I find the results?
>
>Thanks for your help. This will be very useful in my experiments.
>
>---
>Best regards,
> Rafael Keller Tesser
>
>GPPD - Grupo de Processamento Paralelo e Distribuído
>Instituto de Informática / UFRGS
>Porto Alegre, RS - Brasil
>
>
>On Fri, Mar 22, 2013 at 10:50 PM, Jain, Nikhil <nikhil AT illinois.edu> wrote:
>> Answers to the remaining questions follow:
>>
>>
>>
>>>On another topic, I am also interested in using the smp build. At
>>>first I was getting a migration error; then I found out I needed to
>>>pass the option +CmiNoProcForComThread to the runtime.
>>
>> For net-verbs, SMP mode makes use of an extra communication thread in
>> addition to the worker threads that perform the real work. The syntax
>> for a job run is:
>>
>> ./charmrun +p<NUM_WORKERS> ++ppn<WORKERS_PER_COMM_THREAD> ./pgm <pgm_params> +vp<num_ranks> +isomalloc_sync +setcpuaffinity
>>
>> NUM_WORKERS is the total number of worker threads that will do real
>> work. WORKERS_PER_COMM_THREAD is the number of workers for which 1 comm
>> thread will be created. The sum of these two (the number of worker
>> threads plus the number of comm threads) determines the number of
>> threads Charm++ creates. If this number is greater than the total number
>> of cores allocated for a job run, you will need to pass
>> +CmiNoProcForComThread so that threads share a core. A real example:
>> given a job allocation of 32 cores, I use the following to run:
>>
>> ./charmrun +p30 ++ppn 15 ./pgm 10 +vp100 +isomalloc_sync +setcpuaffinity
>>
>> This will create 2 processes, each of which will create 16 threads (15
>> workers and 1 comm thread). +isomalloc_sync is needed if the stack
>> pointer is randomized, which interferes with AMPI working correctly.
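>> (For illustration only: under the same 32-core allocation, a run such as
>>
>> ./charmrun +p32 ++ppn 16 ./pgm <pgm_params> +vp100 +isomalloc_sync +setcpuaffinity
>>
>> would create 2 processes of 17 threads each, i.e. 34 threads on 32
>> cores, and would therefore additionally need +CmiNoProcForComThread so
>> that threads can share cores.)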
>>
>>
>>>So, now I can execute my application with ibverbs OR smp, but not with
>>>ibverbs AND smp together! When I try to run the application on
>>>Charm++/AMPI built with ibverbs and smp, I get a segmentation
>>>violation error. When I compile the program with "-memory paranoid",
>>>the error disappears.
>>>
>>>Commands used to build Charm++ and AMPI:
>>> ./build charm++ net-linux-x86_64 ibverbs smp -j16 --with-production -thread context
>>> ./build AMPI net-linux-x86_64 ibverbs smp -j16 --with-production -thread context
>>>
>>>I am passing the following options to charmrun (on 4 nodes x 8 cores
>>>per node):
>>> ./charmrun ondes3d +p32 +vp 128 +mapping BLOCK_MAP ++remote-shell ssh +setcpuaffinity +balancer GreedyLB +CmiNoProcForComThread
>>>
>>>I also tested with the migration test program that comes with Charm++
>>>(in the subdirectory tests/ampi/migration). It doesn't give a
>>>segmentation violation, but it sometimes hangs during migration. I have
>>>included the output below this message.
>>
>> I tried this combination, and it executes well for me with -thread
>> context compilation and +isomalloc_sync as a runtime parameter. Depending
>> on the content of your /proc/sys/kernel/randomize_va_space,
>> +isomalloc_sync may also be required for your run. Can you tell me the
>> content of that file, and also try running with this flag?
>>
>> --Nikhil
>>
>>>Any idea on what the problem may be?
>>>
>>>--
>>>Best regards,
>>> Rafael Keller Tesser
>>>
>>>GPPD - Grupo de Processamento Paralelo e Distribuído
>>>Instituto de Informática / UFRGS
>>>Porto Alegre, RS - Brasil
>>>
>>>-------------------
>>>****Output of the migration test program (until it hangs):****
>>>./charmrun ./pgm +p2 +vp4 +CmiNoProcForComThread
>>>Charmrun> IBVERBS version of charmrun
>>>Charmrun> started all node programs in 1.198 seconds.
>>>Converse/Charm++ Commit ID:
>>>Charm++> scheduler running in netpoll mode.
>>>CharmLB> Load balancer assumes all CPUs are same.
>>>Charm++> Running on 1 unique compute nodes (8-way SMP).
>>>Charm++> cpu topology info is gathered in 0.002 seconds.
>>>
>>>begin migrating
>>>
>>>begin migrating
>>>
>>>begin migrating
>>>
>>>begin migrating
>>>Trying to migrate partition 1 from pe 0 to 1
>>>Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
>>>migrate_test is 0
>>>Leaving TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 1,
>>>migrate_test is 0
>>>Done with step 0
>>>Done with step 0
>>>Done with step 0
>>>Done with step 0
>>>Trying to migrate partition 1 from pe 1 to 0
>>>Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 1,
>>>migrate_test is 1
>>>Leaving TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
>>>migrate_test is 1
>>>Done with step 1
>>>Done with step 1
>>>Done with step 1
>>>Done with step 1
>>>Trying to migrate partition 1 from pe 0 to 1
>>>Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
>>>migrate_test is 0
>>>Leaving TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 1,
>>>migrate_test is 0
>>>Done with step 2
>>>Done with step 2
>>>Done with step 2
>>>Done with step 2
>>>done migrating
>>>done migrating
>>>done migrating
>>>done migrating
>>>All tests passed
>>>./charmrun ./pgm +p2 +vp20 +CmiNoProcForComThread
>>>Charmrun> IBVERBS version of charmrun
>>>Charmrun> started all node programs in 1.174 seconds.
>>>Converse/Charm++ Commit ID:
>>>Charm++> scheduler running in netpoll mode.
>>>CharmLB> Load balancer assumes all CPUs are same.
>>>Charm++> Running on 1 unique compute nodes (8-way SMP).
>>>Charm++> cpu topology info is gathered in 0.002 seconds.
>>>
>>>begin migrating
>>>
>>>begin migrating
>>>Trying to migrate partition 1 from pe 0 to 1
>>>
>>>begin migrating
>>>Entering TCHARM_Migrate_to, FEM_My_partition is 1, CkMyPe() is 0,
>>>migrate_test is 0
>>>
>>>
>>>
>>>--------------------------------------------------------------
>>>****My previous message:****
>>>From: Rafael Keller Tesser <rafael.tesser AT inf.ufrgs.br>
>>>Date: Thu, Mar 21, 2013 at 10:38 AM
>>>Subject: Migration error with AMPI + ibverbs
>>>To: charm AT cs.uiuc.edu
>>>
>>>
>>>Hello,
>>>
>>>I ported a geophysics application to AMPI in order to experiment with
>>>its load-balancing features.
>>>
>>>Without load balancing, the application runs without any error on both
>>>Gigabit Ethernet and InfiniBand. With load balancing, the application
>>>runs fine on Gigabit Ethernet. With the ibverbs version of Charm++,
>>>however, I am getting the following error during the first
>>>load-balancing step:
>>>
>>>--
>>>...
>>>CharmLB> GreedyLB: PE [0] Memory: LBManager: 921 KB CentralLB: 87 KB
>>>CharmLB> GreedyLB: PE [0] #Objects migrating: 247, LBMigrateMsg size:
>>>0.02 MB
>>>CharmLB> GreedyLB: PE [0] strategy finished at 55.669918 duration
>>>0.007592 s
>>>[0] Starting ReceiveMigration step 0 at 55.672409
>>>Charmrun: error on request socket--
>>>Socket closed before recv.
>>>
>>>--
>>>
>>>I am attaching the full output to this message (output.txt).
>>>
>>>The error also happens with the AMPI migration test program that comes
>>>with Charm++ (located in tests/ampi/migration). The outputs are
>>>attached to this message.
>>>
>>>I get this error both with Charm-6.4.0 and with the development
>>>version from the Git repository.
>>>
>>>AMPI was built with:
>>>./build charm++ net-linux-x86_64 ibverbs --with-production -j16
>>>./build AMPI net-linux-x86_64 ibverbs --with-production -j16
>>>
>>>
>>>Do you have any ideas on what may be causing this error?
>>>
>>>--
>>>Best regards,
>>> Rafael Keller Tesser
>>>
>>>GPPD - Grupo de Processamento Paralelo e Distribuído
>>>Instituto de Informática / UFRGS
>>





