Re: [charm] Weird behaviour using MPI_Alloc_mem and MPI_Free_mem


  • From: Phil Miller <mille121 AT illinois.edu>
  • To: Roberto de Quadros Gomes <rqg.gomes AT gmail.com>
  • Cc: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>
  • Subject: Re: [charm] Weird behaviour using MPI_Alloc_mem and MPI_Free_mem
  • Date: Fri, 25 Oct 2013 14:58:44 -0700
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm/>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

I've incorporated a fix for this in the repository. If you could confirm that your application code is no longer impacted by this, I'll close the bug report.


On Thu, Oct 24, 2013 at 3:33 PM, Phil Miller <mille121 AT illinois.edu> wrote:
This is indeed a bug in our implementation of AMPI_Alloc_mem/AMPI_Free_mem. The issue is that we have some infrastructure that does a sort of 'mode switch' inside the implementation of AMPI, when a call crosses the boundary from client application code. One thing this mode switch does is turn off the isomalloc migratable memory allocator.
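A minimal sketch of what that mode switch looks like conceptually (all names here are hypothetical stand-ins, not the actual AMPI internals):

    #include <stdlib.h>

    /* Hypothetical stand-ins for the runtime's heap-switching calls. */
    static void isomalloc_disable(void) { /* switch to the plain, non-migratable heap */ }
    static void isomalloc_enable(void)  { /* switch back to the migratable isomalloc heap */ }

    static int ampi_depth = 0;  /* per-thread in a real runtime */

    static void ampi_begin_internal(void) {
        if (++ampi_depth == 1)
            isomalloc_disable();
    }

    static void ampi_end_internal(void) {
        if (ampi_depth-- == 1)
            isomalloc_enable();
    }

    /* Every AMPI entry point brackets its body like this so that
     * runtime-internal allocations stay out of the migratable heap.
     * The bug: AMPI_Alloc_mem gets the same bracket, so the buffer it
     * hands back to the user also lands in the non-migratable heap. */
    int AMPI_Alloc_mem_buggy(long size, void *baseptr) {
        ampi_begin_internal();
        *(void **)baseptr = malloc((size_t)size);  /* wrong heap for user data */
        ampi_end_internal();
        return 0;
    }

Because the buffer returned from inside that bracket lives in the non-migratable heap, the pointer no longer refers to valid memory after the load balancer migrates the rank, which matches the segfault reported below.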

I think the correct solution in this case is simply to exclude those two functions from the mode switch, rather than doing something more involved, since they're explicitly wrappers around malloc/free, which we implement internally anyway. The alternative design would be to augment the mode-switch infrastructure to let these functions behave differently.
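Concretely, the first option would look something like this (a sketch under the assumption that AMPI's internal malloc/free already draw from the migratable heap; this is not the actual patch):

    #include <stdlib.h>
    #include <mpi.h>

    /* These two calls skip the mode-switch bracket entirely, so the
     * buffer comes from the migratable isomalloc heap, exactly as if
     * the user had called malloc/free directly, and survives migration. */
    int AMPI_Alloc_mem(MPI_Aint size, MPI_Info info, void *baseptr) {
        (void)info;  /* no allocation hints honored in this sketch */
        /* note: no ampi_begin_internal()/ampi_end_internal() here */
        *(void **)baseptr = malloc((size_t)size);
        return (size == 0 || *(void **)baseptr) ? MPI_SUCCESS : MPI_ERR_NO_MEM;
    }

    int AMPI_Free_mem(void *base) {
        free(base);
        return MPI_SUCCESS;
    }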


On Wed, Oct 23, 2013 at 3:43 PM, Roberto de Quadros Gomes <rqg.gomes AT gmail.com> wrote:
Hi,

Last week I ran into a problem very similar to the one Nicolas Bock reported: a segmentation fault when the LB was invoked. I noticed that the problem started when I began using MPI_Alloc_mem and MPI_Free_mem in my AMPI code. Whenever a migration happens between those two calls, I get these messages:

CharmLB> GreedyLB: PE [0] step 0 finished at 2.566594 duration 0.008998 s

------------- Processor 1 Exiting: Caught Signal ------------
Signal: segmentation violation
Suggestion: Try running with '++debug', or linking with '-memory paranoid' (memory paranoid requires '+netpoll' at runtime).
------------- Processor 7 Exiting: Caught Signal ------------
Signal: segmentation violation
Suggestion: Try running with '++debug', or linking with '-memory paranoid' (memory paranoid requires '+netpoll' at runtime).
------------- Processor 6 Exiting: Caught Signal ------------
Signal: segmentation violation
Suggestion: Try running with '++debug', or linking with '-memory paranoid' (memory paranoid requires '+netpoll' at runtime).
Charmrun: error on request socket--
Socket closed before recv.

If no migration is needed, no problem appears.

But if I go back to the plain "malloc" and "free" functions, the LB works fine.

I am not sure whether these functions are supposed to behave identically, but my application also works if MPI_Migrate is never called.

I attached the code where this problem happens.
You can reproduce it by building with

$ ampicc mpiteste.c -o mpiteste -module GreedyLB -memory isomalloc -D_Problem

and running it.
If you remove -D_Problem, it works. 
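For reference, the failing pattern boils down to something like the following (a hypothetical minimal sketch, not the original attachment):

    #include <stdlib.h>
    #include <mpi.h>

    /* With -D_Problem the buffer goes through MPI_Alloc_mem/MPI_Free_mem
     * and the free crashes after a migration; without it, plain
     * malloc/free works. MPI_Migrate() is AMPI's extension that lets
     * the load balancer migrate ranks at this point. */
    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
    #ifdef _Problem
        double *buf;
        MPI_Alloc_mem(1000 * sizeof(double), MPI_INFO_NULL, &buf);
    #else
        double *buf = malloc(1000 * sizeof(double));
    #endif
        buf[0] = 1.0;       /* touch the buffer before migrating */
        MPI_Migrate();      /* migration point; LB may move this rank */
    #ifdef _Problem
        MPI_Free_mem(buf);  /* segfaults here after a migration */
    #else
        free(buf);
    #endif
        MPI_Finalize();
        return 0;
    }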








