Skip to Content.
Sympa Menu

charm - [charm] recursive strassen

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

List archive

[charm] recursive strassen


Chronological Thread 
  • From: Nicolas Bock <nicolasbock AT gmail.com>
  • To: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>
  • Subject: [charm] recursive strassen
  • Date: Mon, 8 Jul 2013 17:08:35 -0600
  • List-archive: <http://lists.cs.uiuc.edu/pipermail/charm/>
  • List-id: CHARM parallel programming system <charm.cs.uiuc.edu>

Hi,

I am trying to write a Strassen-like matrix-matrix multiply. I chose to express a matrix as a quadtree of chares, which at the leaf nodes, store the actual matrix elements. To get started, I implemented the standard O(N^3) MM multiply. The multiply is implemented "recursively", and at each tier, the multiply method is called 8 times until at the leaf tier, the matrix elements are multiplied. For synchronization, I use CkFuture, i.e. I create 8 futures, pass those to the next recursion level, and then wait until all 8 are done with CkWaitFuture(). The matmul() method essentially looks like the following:

  1 void Node::matmul (CProxy_Node A, CProxy_Node B, CkFuture f)
  2 {
  3   int width = iUpper-iLower;
  4 
  5   if(width == 1) { data += AData->data*BData->data; }
  6   else
  7   {
  8     CkFuture product_f[8];
  9     int numberProducts = 0;
 10     for(int i = 0; i < 2; i++) {
 11       for(int j = 0; j < 2; j++) {
 12         for(int k = 0; k < 2; k++)
 13         {
 14           product_f[numberProducts] = CkCreateFuture();
 15           child[i*2+j]->matmul(*AInfo->child[i*2+k],
 16               *BInfo->child[k*2+j],
 17               product_f[numberProducts]);
 18           numberProducts++;
 19         } 
 20       } 
 21     } 
 22     
 23     for(int i = 0; i < numberProducts; i++)
 24     {
 25       EmptyMsg *m = (EmptyMsg*) CkWaitFuture(product_f[i]); delete m;
 26       CkReleaseFuture(product_f[i]);
 27     } 
 28   } 
 29   CkSendToFuture(f, new EmptyMsg());
 30 } 

For smaller matrices, this works great, however, for somewhat larger ones I get incorrect results. I am having a tough time debugging what the problem is. For example, when I CkPrintf() something in the innermost loop before calling child->matmul, and count how many times this line was printed, I find fewer lines than the expected \sum_{i = 1}^{d} 8^d, where d is the depth of the matrix tree: N = { 2, 4, 8, 16 } results in { 8, 72, 548, 4023 } lines printed, respectively.

What confuses me is that it seems as if I am not properly synchronizing execution and the code prematurely ends up in the main method and CkExit()'s. Is the problem that CkWaitFuture() suspends execution, making the matmul() method effectively reentrant, causing the product_f[[] array to be overwritten in the process? Or why else would line 25 not properly synchronize execution?

Thanks already,

nick


  • [charm] recursive strassen, Nicolas Bock, 07/08/2013

Archive powered by MHonArc 2.6.16.

Top of Page