Mapping 
Introduction

PPM topologies implicitly define a data-to-processor assignment. Mapping routines provide the functionality of sending particles and field blocks to the proper processor, i.e. the one that 'owns' the corresponding subdomain(s) of the computational space. Three different mapping types are provided for both particles and field data: global mappings, partial mappings, and ghost mappings.
In addition, a special ring shift mapping is provided for particle data on the ring topology, and a connection mapping is provided to take into account links between particles (e.g. chemical bonds, triangular surface elements, unstructured mesh elements). All mapping types are organized as stacks. A mapping operation consists of four steps: defining the mapping, pushing the data onto the send stack, sending the stack, and popping the data from the receive stack.
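The four-step stack discipline can be sketched in plain Python. This is a minimal illustration of the define/push/send/pop idea only; the class and method names are invented for the sketch and are not the PPM API.

```python
# Illustrative sketch of a stack-based mapping: define which
# elements leave the local processor, push several data arrays,
# send them all in one step, then pop them in reverse order.
# (Names are invented for illustration, not the PPM API.)

class MappingStack:
    def __init__(self):
        self.send_stack = []   # arrays pushed before the send
        self.recv_stack = []   # arrays available after the send
        self.index_list = []

    def define(self, index_list):
        # Step 1: define the mapping, i.e. record which elements
        # are to be sent away from the local processor.
        self.index_list = index_list

    def push(self, data):
        # Step 2: push one data array onto the send stack.
        self.send_stack.append([data[i] for i in self.index_list])

    def send(self):
        # Step 3: a single communication step moves the whole
        # stack, so several arrays share one network latency.
        self.recv_stack = self.send_stack
        self.send_stack = []

    def pop(self):
        # Step 4: pop arrays in reverse order of the pushes.
        return self.recv_stack.pop()

# Map two arrays describing three "particles" in one send:
m = MappingStack()
m.define([2, 0])          # elements 2 and 0 are mapped
m.push([10, 11, 12])      # e.g. positions
m.push([20, 21, 22])      # e.g. properties
m.send()
props = m.pop()           # last pushed, first popped
pos = m.pop()
```

Because the stack is sent as a whole, any number of arrays pushed between the definition and the send travel in a single message.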
This architecture allows data stored in different arrays to be sent together to minimize network latency, and mapping definitions to be reused by repeatedly calling the push/send/pop sequence for the same persistent mapping definition. The individual mapping types differ only in their definition step, while push, send, and pop are identical. All mapping subroutines of PPM are available in separate optimized versions for scalar and vector data. Supported data types for particles and fields are: single and double precision floating point, single and double precision complex numbers, integer numbers, and logical values. Different data types can be mixed within the same stack, in which case they are converted to the stack data type as defined by the PPM initialization routine. Mappings of field data can be masked: an optional binary mask selects which mesh points are to be mapped and which are not. The values of non-mapped points remain unaffected by the mapping operation.

Global mapping

The global mapping in the PPM library involves communication of all processors with all others. The individual communication steps are synchronous and scheduled such that no conflicts occur: in communication round r, each processor i sends to processor i + r (mod Nproc) while it receives from processor i − r (mod Nproc). This is repeated for all r = 1, ..., Nproc − 1. This mapping type is used to perform the initial data-to-processor assignment or to switch from the current topology to another one. The definition of a global mapping involves the creation of an index list of particles or mesh blocks that fall outside the local processor, together with a list of their new processor affiliations.
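The conflict-free schedule of the global mapping can be sketched as follows. This is a plain Python illustration of the round structure, not the PPM implementation; the function name is invented.

```python
# Sketch of the conflict-free global communication schedule:
# in round r, processor i sends to i + r (mod Nproc) and
# receives from i - r (mod Nproc), so in every round each
# processor is the target of exactly one message.
# (Function name invented for illustration.)

def global_schedule(nproc):
    """Return, for each round r = 1 .. nproc - 1, the list of
    (sender, receiver) pairs active in that round."""
    rounds = []
    for r in range(1, nproc):
        rounds.append([(i, (i + r) % nproc) for i in range(nproc)])
    return rounds

rounds = global_schedule(4)

# In every round each processor appears exactly once as a sender
# and exactly once as a receiver, i.e. the steps are conflict-free:
for pairs in rounds:
    senders = sorted(s for s, _ in pairs)
    receivers = sorted(t for _, t in pairs)
    assert senders == receivers == list(range(4))
```

Over all Nproc − 1 rounds, every ordered processor pair communicates exactly once, which is the all-to-all exchange the global mapping requires.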
As an example, the global mapping of Np particles with locations xp (an ndims x Np floating point array) and vector properties wp (an nprop x Np floating point array) consists of a sequence of calls to the PPM particle mapping routines: the mapping is defined for xp, wp is pushed, the stack is sent, and wp and xp are popped. Here, Np is the number of non-ghost particles on the local processor before the mapping, Mp is the new number of particles after the mapping, and tid is the unique identification number of the target topology for the mapping. This target topology needs to have been defined earlier using the PPM topology creation routine. The error status of each step is returned in info; this status argument is present in all PPM routines. In the scalar version of the mapping routines, the second argument is absent and the first one is a rank-one array. For reasons of efficiency, the first routine call not only defines the mapping (i.e. creates all the lists), but also directly pushes the particle locations onto the send stack. The pop action thus needs to be issued once more than the push action.

Mapping an ndim-dimensional vector field fld, defined on the mesh mid of the current topology, requires an analogous sequence of routine calls. mid identifies the target mesh, which needs to be defined on the target topology tid. The width of the ghost layer, in numbers of mesh points, is given in gs for each spatial direction. The scalar version of the routine lacks the second argument, and the rank of the field array fld is reduced by one.

Partial mapping

During a partial mapping, each processor only communicates with those processors whose subdomains are adjacent to any of its own. The optimal communication sequence is one in which no conflicts occur and the minimum number of communication steps is needed. The problem can be formulated in a graph representation where each processor is a node of the graph. An edge in the graph denotes a "neighborhood" relation, i.e. a necessary communication.
The goal is to find a coloring of the edges such that no two edges of the same color meet in any node, and such that the minimum number of different colors (corresponding to communication steps) is used. This minimum edge coloring problem is NP-complete (Holyer, 1981). An efficient approximate solution can be found using the algorithm of Vizing (Vizing, 1964), which guarantees that at most one color more than the minimum number is used (Caprara, 1998). PPM uses this algorithm to precompute and store the communication protocol for each defined topology. The partial mapping is mainly used to account for particle motion during a simulation. Periodic boundary conditions at the outer faces of the computational domain are automatically taken into account. Defining a partial mapping is similar to defining a global mapping, except that the parameter ppm_param_map_partial is used in the first call to the corresponding mapping routine.

Receiving ghosts

Subdomains are extended by a layer of ghost mesh points, ghost particles, or both. These ghosts are copies of true mesh points or particles that either reside on a neighboring processor or account for periodic boundary conditions. Ghost particles or mesh points are needed for all local operations such as finite-support particle-particle interactions, finite difference stencils, or particle-mesh interpolations. PPM provides mapping routines (ppm_map_part_ghost and ppm_map_field_ghost) for handling such ghost layers. Receiving ghosts and obtaining the copied values is done using the ppm_param_map_ghost_get parameter. For particles, ghosts can have both a position and values. Notice that in the case of stationary particles, the stack architecture of PPM's mappings allows the ghost values to be updated without redefining the lists or resending the ghost positions. When performing symmetric particle-particle interactions or using particle-to-mesh interpolation, the value at the location of a ghost may change.
This gives rise to ghost contributions to the corresponding real particle or mesh point on the source processor. In order to add these contributions back onto the proper real element, PPM provides a ghost sending mechanism (ppm_param_map_ghost_put). The library keeps track of which real element is the preimage of which ghosts, and of possible periodic images in the case of periodic boundary conditions.

Ring topology

When the PPM ring topology is used to compute direct interactions or to distribute data of not globally known processor affiliation, the ring shift mapping is used. In this mapping, each processor keeps a local copy of its dataset while a second copy is sent around the ring. This means that processor i receives a dataset from processor i − 1 (mod Nproc) while sending its previous set to processor i + 1 (mod Nproc). The ring shift mapping performs one such step upon each call. For the traveling set to visit all processors, the ring shift has to be executed Nproc − 1 times. After each ring shift, each processor can perform certain operations using its original local dataset as well as the current traveling set.

Particle connections

To allow disjoint initialization or input of particles and the corresponding connections, the latter are typically initialized or read separately and are not sorted by processor. The PPM connection mapping is provided to properly distribute the connections among the processors. The ring topology is conveniently used for this mapping: each processor picks those connections from the data being transmitted on the ring that have entries corresponding to one of its particles. After the connection mapping, every processor holds all connections that are associated with any of its particles. This allows for a nonsymmetric calculation of the interactions. If symmetry is used, however, only one processor should perform the calculation.
The mapping is thus followed by a pruning of the connections to ensure that only the processor that has all the member particles of a connection keeps it. This is a sufficient condition since the ghost layers in PPM are non-overlapping. As the particles move to other processors during the course of the simulation, the connections are updated according to the partial mapping of the particles. In the case of symmetric interaction evaluations, each such update is again followed by the pruning step described above to remove redundant connections.
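The selection and pruning rules for connections can be sketched as follows. The data layout (connections as tuples of particle identifiers, local particles as a set) is invented for this illustration and is not the PPM representation.

```python
# Sketch of the connection mapping followed by the pruning step.
# During the ring pass, each processor picks every connection that
# references at least one of its particles (nonsymmetric case);
# the pruning for symmetric interactions then keeps a connection
# only where ALL member particles are held locally.
# (Data layout invented for illustration, not the PPM one.)

def map_connections(connections, my_particles):
    # Connection mapping: keep any connection touching a local particle.
    return [c for c in connections if any(p in my_particles for p in c)]

def prune_connections(connections, my_particles):
    # Pruning: keep only connections whose members are all local.
    return [c for c in connections if all(p in my_particles for p in c)]

bonds = [(1, 2), (2, 5), (5, 6)]    # e.g. chemical bonds
local = {1, 2, 3}                   # particles on this processor

mapped = map_connections(bonds, local)
kept = prune_connections(mapped, local)
```

In this example the bond (2, 5) is picked up by the connection mapping because particle 2 is local, but is pruned again for symmetric evaluation since particle 5 is held by another processor.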
