
charm - Re: [charm] [ppl] multi-thread in taub

  • From: Fernando Stump <fernando.stump AT gmail.com>
  • To: Gengbin Zheng <zhenggb AT gmail.com>
  • Cc: Phil Miller <mille121 AT illinois.edu>, Charm Mailing List <charm AT cs.illinois.edu>
  • Subject: Re: [charm] [ppl] multi-thread in taub
  • Date: Fri, 21 Oct 2011 19:04:19 -0500
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

Hi Gengbin,


Are there pros and cons to each option? How do I choose?

Thanks
Fernando
On Oct 21, 2011, at 4:56 PM, Gengbin Zheng wrote:

> Basically yes, you can add __thread to your variables.
> However, since Converse threads are user-level threads rather than the
> pthreads that TLS is designed for, it won't work automatically. You
> have two options (after adding __thread):
>
> 1. Have Converse threads implemented as pthreads; to do that, link
> with -thread pthreads.
>
> 2. Use a special implementation of Converse threads that swaps the TLS
> segment when these user-level threads context switch; to do that,
> compile and link your program with the charmc flag -tlsglobals.
>
>
> Gengbin
>
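[Editorial sketch, not part of Gengbin's message: a minimal illustration of what the two options above look like in practice. The variable name iteration_count and the object/output file names are placeholders (pfem.out is the binary mentioned later in the thread); only the __thread keyword and the flags -thread pthreads and -tlsglobals come from the message above.]

    // Before: one copy shared by every driver() on the processor.
    //   static int iteration_count = 0;
    // After: one copy per thread, once the program is built one of two ways.
    static __thread int iteration_count = 0;

    // Option 1: back Converse threads with pthreads, so ordinary TLS applies.
    //   charmc -o pfem.out pfem.o -thread pthreads
    //
    // Option 2: keep user-level Converse threads and let the runtime swap the
    // TLS segment at each context switch.
    //   charmc -o pfem.out pfem.o -tlsglobals
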
> On Fri, Oct 21, 2011 at 12:02 PM, Fernando Stump
> <fernando.stump AT gmail.com>
> wrote:
>> Hi,
>> I have another question related to multi-threading. Here it is:
>> My problem: my original serial code uses a lot of static variables /
>> static member functions. To preserve the semantics of the code, I would
>> like these static variables to be local to each thread (i.e., if I create
>> a static variable inside driver(), each driver will see its own static
>> variable).
>> Solutions:
>> Aaron proposed this solution:
>> You're right, this is a potential source of errors--all drivers on a
>> processor will share any static members. One workaround to this issue,
>> if you need to keep the static members, is to replace your static
>> property_set with a static map<int, property_set> which has an entry
>> for each chare on the processor. So, instead of accessing
>> set_.whatever(), you would instead access set_[CkMyPE()].whatever().
>> This has potentially bad cache behavior, but it does allow you to
>> retain the static variables and still get correct results.
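[Editorial sketch of the map-per-PE workaround Aaron describes above, not part of his message. The class name MaterialModel and the member whatever() are placeholders; property_set and set_ are the names used in the message, and the processor-index call is spelled CkMyPe() in the Charm++ headers.]

    #include <map>
    #include "charm++.h"           // provides CkMyPe()

    struct property_set {           // stand-in for the real property class
      double whatever() const { return 0.0; }
    };

    struct MaterialModel {          // hypothetical owner of the former static
      // Before: one instance shared by all drivers on the processor.
      //   static property_set set_;
      // After: one entry per PE, keyed by the processor index.
      static std::map<int, property_set> set_;

      double query() {
        // Each driver looks up the entry for the PE it is running on.
        return set_[CkMyPe()].whatever();
      }
    };

    std::map<int, property_set> MaterialModel::set_;
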
>> While googling, I found another potential solution, and I would like to
>> hear your opinions:
>> Solution: Use the specifier __thread
>> As in
>>
>> static __thread char *p;
>>
>> Here is the documentation of this specifier in GCC:
>> http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Thread_002dLocal.html
>> Is there a chance it will work? I do not understand very well how
>> ParFUM/Charm++ creates threads or assigns chares/driver() to threads.
>> Thanks
>> Fernando
>>
>>
>>
>> On Oct 7, 2011, at 2:40 PM, Phil Miller wrote:
>>
>> On Fri, Oct 7, 2011 at 14:22, Fernando Stump
>> <fernando.stump AT gmail.com>
>> wrote:
>>
>> Hi,
>>
>> I'm running the ParFUMized version of my code on the taub cluster at UIUC.
>> Each node contains 12 processors. I'm running on one node with the option
>> +p2, but I have the feeling that the code is running on a single processor.
>> My clue is that this seems related to this "warning":
>>
>> Charm++> Running on MPI version: 2.2 multi-thread support: 0 (max supported: -1)
>>
>> This is a detail of the underlying MPI implementation. It doesn't mean
>> Charm++ is running on only 1 thread.
>>
>> My question is:
>>
>> Where is the issue? Is it in how MPI was compiled, in how Charm++ was
>> compiled, or in how I call charmrun?
>>
>> Here is the full call.
>>
>> [fstump2@taubh2 io]$ ../yafeq/build/debug/yafeq/charmrun ../yafeq/build/debug/yafeq/pfem.out +p2
>>
>> Running on 2 processors: ../yafeq/build/debug/yafeq/pfem.out
>>
>> charmrun> /usr/bin/setarch x86_64 -R mpirun -np 2 ../yafeq/build/debug/yafeq/pfem.out
>>
>> Charm++> Running on MPI version: 2.2 multi-thread support: 0 (max supported: -1)
>>
>> Charm++> Running on 1 unique compute nodes (12-way SMP).
>>
>> Charm++> Cpu topology info:
>>
>> PE to node map: 0 0
>>
>> Node to PE map:
>>
>> Chip #0: 0 1
>>
>> Charm++> cpu topology info is gathered in 0.003 seconds.
>>
>> This output seems to indicate that things are working correctly. You
>> got two cores, 0 and 1, on chip 0 of node 0. Did some other indication
>> lead you to the conclusion that only one core was doing work?
>>
>> Phil
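
[Editorial sketch, not from the thread: if it helps to confirm directly how many PEs charmrun actually started, a minimal stand-alone Charm++ check is shown below. The file names check.ci and check.C are placeholders; the mainchare pattern and the CkMyPe()/CkNumPes()/CkPrintf calls are standard Charm++.]

    // check.ci -- Charm++ interface file
    mainmodule check {
      mainchare Main {
        entry Main(CkArgMsg *m);
      };
    };

    // check.C
    #include "check.decl.h"

    class Main : public CBase_Main {
     public:
      Main(CkArgMsg *m) {
        delete m;
        // Reports which PE the mainchare runs on and how many PEs exist.
        CkPrintf("Main on PE %d of %d PEs\n", CkMyPe(), CkNumPes());
        CkExit();
      }
    };

    #include "check.def.h"

    // Build and run (paths assumed):
    //   charmc check.ci
    //   charmc -c check.C
    //   charmc -o check check.o
    //   ./charmrun ./check +p2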





