charm - [charm] Charm++ v6.8.0 Release Candidate

charm AT lists.cs.illinois.edu

Subject: Charm++ parallel programming system

  • From: "White, Samuel T" <white67 AT illinois.edu>
  • To: "charm AT lists.cs.illinois.edu" <charm AT lists.cs.illinois.edu>
  • Subject: [charm] Charm++ v6.8.0 Release Candidate
  • Date: Wed, 26 Jul 2017 14:26:16 +0000

Hello everyone,

We have just tagged a release candidate for the next stable release, version
6.8.0, of Charm++. This version has passed all of our automated tests, and
has been tested by various Charm++ users already. If there are no major
regressions found in it, we will tag the 6.8.0 release next week. You can
check out the release in the charm git repo on the branch 'charm-6.8' or at
the git tag 'v6.8.0-rc1'.

Version 6.8.0 is a feature release for Charm++ and AMPI that includes over
600 bug fixes, improvements, and cleanups. Here's a list of some of the major
ones:


Charm++ Features

- Calls to entry methods taking a single fixed-size parameter can now
automatically be aggregated and routed through the TRAM library by marking
them with the [aggregate] attribute.

- Calls to parameter-marshalled entry methods with large array arguments can
ask for asynchronous zero-copy send behavior with a `nocopy' tag in the
parameter's declaration.

- Calls to chare array element entry methods with the [inline] tag now avoid
copying their arguments when the called method takes its parameters by
const&, offering a substantial reduction in overhead.

- Synchronous entry methods that block until completion (marked with the
[sync] attribute) can now return any type that defines a PUP method, rather
than only message types.

- The runtime system now integrates an OpenMP runtime library so that code
using OpenMP parallelism will dispatch work to idle worker threads within the
Charm++ process.

- Applications can ask the runtime system to perform automatic high-level
end-of-run performance analysis by linking with `-tracemode perfReport'.

- Added a new dynamic remapping/load-balancing strategy, GreedyRefineLB, that
offers high result quality and well-bounded execution time.

- Improved and expanded topology-aware spanning tree generation strategies,
including support for runs on a torus with holes, such as Blue Waters and
other Cray XE/XK systems.

- Charm++ programs can now define their own main() function, rather than
using a generated implementation from a mainmodule/mainchare. This extends
our existing support for Charm++/MPI interoperation.

- Added support for malleable jobs that can dynamically shrink and expand the
set of compute nodes hosting Charm++ processes.

- Greatly expanded and improved reduction operations:

* Added built-in reductions for all logical and bitwise operations on
integer and boolean input.

* Reductions over groups and chare arrays that apply commutative,
associative operations (e.g. MIN, MAX, SUM, AND, OR, XOR) are now processed
in a streaming fashion. This reduces the memory footprint of reductions.
User-defined reductions can opt into this mode as well.

* Added a new `Tuple' reducer that allows combining multiple reductions of
different input data and operations from a common set of source objects to a
single target callback.

* Added a new `Summary Statistics' reducer that provides count, mean, and
standard deviation using a numerically-stable streaming algorithm.

- Improvements to Sections:

* The array sections API has been simplified, and array sections are now
automatically delegated to CkMulticastMgr.

* Group sections can now be delegated to CkMulticastMgr, offering improved
performance for multicasts and reductions over them. Unlike array sections,
group sections must be delegated manually.

- The GPU Manager now creates one instance per OS process and scales the
pre-allocated memory pool size according to the GPU memory size and the
number of GPU Manager instances on a physical node.

- Several GPU Manager API changes including:

* Replaced references to global variables in the GPU Manager API with calls
to functions.

* The user is no longer required to specify a bufferID in the dataInfo struct.

* Replaced calls to kernelSelect with direct invocation of functions passed
via the work request object (allows CUDA to be built with all programs).

- Added a `++quiet' option to suppress charmrun and charm++ non-error
messages at startup.

- Static (non-generated) header files are now warning-free for gcc -Wall
-Wextra -pedantic.

- Deprecated setReductionClient and CkSetReductionClient in favor of
explicitly passing callbacks to contribute calls.

- On C++ standard library implementations with support for
std::is_constructible (e.g. GCC libstdc++ >4.5), chare array elements only
need to define a constructor taking CkMigrateMessage* if they will actually
be migrated.

- The PUP serialization framework gained support for some C++11 library
classes, including unique_ptr and unordered_map, when the underlying types
have PUP operators.
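
The std::is_constructible item above can be illustrated with a small
standalone C++11 sketch. This is not Charm++ code and does not show how the
runtime is implemented: MigrateMessage is a hypothetical stand-in for
CkMigrateMessage, and Migratable, Pinned, and make_for_migration are names
invented for this example. It only shows how a library could, under those
assumptions, call a migration constructor when one exists and still compile
for types that omit it.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical stand-in for Charm++'s CkMigrateMessage (assumption).
struct MigrateMessage {};

// An element type that supports migration: it defines the extra constructor.
struct Migratable {
    Migratable() : restored(false) {}
    explicit Migratable(MigrateMessage*) : restored(true) {}
    bool restored;  // true when built through the migration constructor
};

// An element type that is never migrated and omits that constructor.
struct Pinned {
    Pinned() {}
};

// Selected when T can be constructed from a MigrateMessage*.
template <typename T>
typename std::enable_if<std::is_constructible<T, MigrateMessage*>::value,
                        std::unique_ptr<T>>::type
make_for_migration(MigrateMessage* msg) {
    assert(msg != nullptr);
    return std::unique_ptr<T>(new T(msg));
}

// Selected otherwise: still compiles for T, and returns an empty pointer
// so the caller can report the unsupported migration at run time.
template <typename T>
typename std::enable_if<!std::is_constructible<T, MigrateMessage*>::value,
                        std::unique_ptr<T>>::type
make_for_migration(MigrateMessage*) {
    return std::unique_ptr<T>();
}
```

In this sketch, make_for_migration<Migratable>(&msg) returns an object built
through the migration constructor, while make_for_migration<Pinned>(&msg)
returns an empty pointer instead of failing to compile.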

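For readers unfamiliar with streaming statistics, the idea behind the
`Summary Statistics' reducer listed above can be sketched in plain C++11. The
announcement does not say which algorithm Charm++ uses, so this is an
assumption: the sketch uses Welford's update for single values and the
pairwise merge of Chan et al. for partial results. SummaryStats and its
members are hypothetical names, not the Charm++ API, and the sketch reports
the population (divide by n) standard deviation, which may differ from what
the reducer returns.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Running state: count, mean, and sum of squared deviations from the mean.
struct SummaryStats {
    std::size_t count = 0;
    double mean = 0.0;
    double m2 = 0.0;

    // Welford update: fold one value into the running state.
    void add(double x) {
        ++count;
        const double delta = x - mean;
        mean += delta / static_cast<double>(count);
        m2 += delta * (x - mean);
    }

    // Merge another partial state into this one (Chan et al.), which is
    // what lets partial results be combined as they arrive.
    void merge(const SummaryStats& other) {
        if (other.count == 0) return;
        if (count == 0) { *this = other; return; }
        const double na = static_cast<double>(count);
        const double nb = static_cast<double>(other.count);
        const double total = na + nb;
        const double delta = other.mean - mean;
        mean += delta * nb / total;
        m2 += other.m2 + delta * delta * na * nb / total;
        count += other.count;
    }

    // Population variance; undefined for an empty sample.
    double variance() const {
        assert(count > 0);
        return m2 / static_cast<double>(count);
    }

    double stddev() const { return std::sqrt(variance()); }
};
```

Folding 2, 4, 4, 4 into one state and 5, 5, 7, 9 into another, then merging
them, gives a count of 8, a mean of 5, and a population standard deviation
of 2, without either side keeping its input values.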

AMPI Features

- More efficient implementations of message matching infrastructure, multiple
completion routines, and all varieties of reductions and gathers.

- Support for user-defined non-commutative reductions, MPI_BOTTOM, cancelling
receive requests, MPI_THREAD_FUNNELED, PSCW synchronization for RMA, and more.

- Fixes to AMPI's extensions for load balancing and to Isomalloc on SMP
builds.

- More robust derived datatype support, and optimizations for truly
contiguous types.

- ROMIO is now built on AMPI and linked in by ampicc by default.
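
As background for the non-commutative reductions item above: in MPI, a
user-defined operation created with MPI_Op_create must be associative, and
its commute flag tells the library whether operands may be reordered. When
the operation is not commutative, contributions have to be combined in rank
order. The plain C++11 sketch below illustrates that constraint only. It is
not AMPI's implementation, and ordered_tree_reduce is a hypothetical name
invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Reduce per-rank contributions with a pairwise tree. Only neighbours are
// combined, and the lower rank always stays on the left, so an associative
// but non-commutative op yields the same result as a rank-order fold.
template <typename T, typename Op>
T ordered_tree_reduce(std::vector<T> values, Op op) {
    assert(!values.empty());
    while (values.size() > 1) {
        std::vector<T> next;
        for (std::size_t i = 0; i + 1 < values.size(); i += 2) {
            next.push_back(op(values[i], values[i + 1]));
        }
        // An unpaired last element moves up a level unchanged.
        if (values.size() % 2 == 1) {
            next.push_back(values.back());
        }
        values.swap(next);
    }
    return values.front();
}
```

With string concatenation as the operation, contributions "a", "b", "c",
"d", "e" from ranks 0 through 4 reduce to "abcde". Swapping the operands of
any pair would change the result, which is why the ordering matters.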


Platforms and Portability

- The runtime system code now requires compiler support for C++11 rvalue
references and move constructors. All currently supported compilers are
expected to meet this requirement.

- The next feature release (anticipated to be 6.9.0 or 7.0) will require full
C++11 support from the compiler and standard library.

- Added support for IBM POWER8 systems with the PAMI communication API.
Contributed by Sameer Kumar of IBM.

- Mac OS (darwin) builds now default to the modern libc++ standard library
instead of the older libstdc++.

- Blue Gene/Q build targets have been added for the `bgclang' compiler.

- Charm++ can now be built on Cray's CCE 8.5.4+.

- Charm++ will now build without custom configuration on Arch Linux.

- Charmrun can automatically detect rank and node count from Slurm/srun
environment variables.

- Many obsolete architecture, network, and compiler support files have been
removed. These include:

* IBM Blue Gene/P

* Sony/Toshiba/IBM Cell (including PlayStation 3)

* Cray XT

* Intel IA-64 (Itanium)

* Intel x86-32 for Windows, Mac OS X (darwin), and Solaris

* Cygwin for Windows

* Older IBM AIX/POWER configurations

* GCC 3 and KAI compilers

* Sun/Oracle Solaris


Thank you,
Sam White
white67 AT illinois.edu

