SSP Project Summary:
MPI Datatypes Toolset

One of the major areas of difficulty in using MPI is defining MPI derived datatypes that match application data structures. Because MPI is a library, it has no knowledge of how the compiler lays out program data. Layout information therefore has to be acquired explicitly and fed into the MPI datatype constructors, which can be laborious and error-prone. These problems can be alleviated by developing an MPI datatypes toolset. Two ways in which assistance can be provided are:

The first is a compiler-level tool. Even though reasonably clear examples are given in the MPI standard and elsewhere, defining datatypes to match C structures is inelegant. The procedure is formulaic, which suggests developing a pre-processing tool that generates functions or macros to create MPI datatypes for marked structure definitions, perhaps similar to the way the rpcgen tool automatically generates XDR (eXternal Data Representation) encoding and decoding code. An example of a marked C structure is given below. In principle, this technique can also be applied to Fortran 77 COMMON blocks and Fortran 90 derived types. It will require a parser that locates the marked sections and parses the datatype definition. To create a matching MPI derived datatype, the tool needs the name of the type, and the name and C type of each of its fields. A sensible initial limitation would be to disallow recursive types.
  #pragma MPIDT_STRUCT begin
  struct MsgObj
  {
    unsigned char dest;
    char data[MSG_LEN];
    int crc;
  };
  #pragma MPIDT_STRUCT end

  int MPIDT_create_struct_MsgObj(MPI_Datatype *newtype);
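
As a rough illustration of what such a tool might emit for MsgObj, the sketch below gathers the per-field layout with offsetof and, when MPI is available, passes it to the standard constructor. It is one possible shape for the generated code, not the project's actual output. The value of MSG_LEN, the helper MsgObj_layout, and the MPIDT_USE_MPI build switch are assumptions made for this example. It also uses the MPI-2 names MPI_Type_create_struct and MPI_Type_create_resized rather than the MPI-1 calls that were current when the project was proposed.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */

#define MSG_LEN 16    /* assumed value; the source leaves MSG_LEN undefined */

struct MsgObj {
    unsigned char dest;
    char data[MSG_LEN];
    int crc;
};

#define MSGOBJ_NFIELDS 3

/* Layout facts a generator could emit for MsgObj: one block length
 * (element count) and one byte displacement per field, in field order.
 * MsgObj_layout is a hypothetical helper name. */
static void MsgObj_layout(int blocklens[MSGOBJ_NFIELDS],
                          size_t disps[MSGOBJ_NFIELDS])
{
    assert(blocklens != NULL && disps != NULL);
    blocklens[0] = 1;        disps[0] = offsetof(struct MsgObj, dest);
    blocklens[1] = MSG_LEN;  disps[1] = offsetof(struct MsgObj, data);
    blocklens[2] = 1;        disps[2] = offsetof(struct MsgObj, crc);
}

#ifdef MPIDT_USE_MPI   /* hypothetical switch: define it when building with mpicc */
#include <mpi.h>

int MPIDT_create_struct_MsgObj(MPI_Datatype *newtype)
{
    int          blocklens[MSGOBJ_NFIELDS];
    size_t       offsets[MSGOBJ_NFIELDS];
    MPI_Aint     disps[MSGOBJ_NFIELDS];
    MPI_Datatype types[MSGOBJ_NFIELDS] = { MPI_UNSIGNED_CHAR, MPI_CHAR, MPI_INT };
    MPI_Datatype tmp;
    int i, rc;

    MsgObj_layout(blocklens, offsets);
    for (i = 0; i < MSGOBJ_NFIELDS; i++)
        disps[i] = (MPI_Aint)offsets[i];

    rc = MPI_Type_create_struct(MSGOBJ_NFIELDS, blocklens, disps, types, &tmp);
    if (rc != MPI_SUCCESS) return rc;

    /* Force the extent to sizeof(struct MsgObj) so that arrays of MsgObj
     * can be sent with count > 1 even if the compiler adds tail padding. */
    rc = MPI_Type_create_resized(tmp, 0, (MPI_Aint)sizeof(struct MsgObj), newtype);
    MPI_Type_free(&tmp);
    if (rc != MPI_SUCCESS) return rc;

    return MPI_Type_commit(newtype);
}
#endif
```

The layout helper compiles without MPI, so the displacements can be inspected on their own. The caller of the generated function would be responsible for releasing the datatype with MPI_Type_free.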
The second is an extended library of datatype constructors. It is possible to provide a portable library of calls that create datatypes for regular structures, such as a representation of an array column that can be used as though it were a basic type. (It is relatively easy to define an MPI derived datatype with a stride that covers the correct distance in an array, but for this datatype to be composed with others, its extent must be adjusted. This adjustment is possible but requires a detailed understanding of MPI datatype construction.) The library should use only standard MPI datatype constructors, so that it is portable to any MPI implementation on any platform.
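
The extent adjustment can be made concrete with a small sketch, assuming a row-major (C order) array. For a 3 x 4 array, a plain strided column type spans 9 elements, so a second consecutive instance would start at element 9 instead of element 1; resizing the extent to a single element fixes this. The function names below, including MPIDT_create_column, and the MPIDT_USE_MPI build switch are hypothetical. The MPI part uses the MPI-2 call MPI_Type_create_resized, which postdates the MPI-1 markers available when this project was proposed.

```c
#include <assert.h>
#include <stddef.h>

/* Extent, in elements, of the plain strided column type for a row-major
 * nrows x ncols array: it spans from the first to the last selected element. */
static size_t column_natural_extent(size_t nrows, size_t ncols)
{
    assert(nrows > 0 && ncols > 0);
    return (nrows - 1) * ncols + 1;
}

/* Array index of row `row` in the k-th consecutive instance of a column
 * type whose extent is `extent` elements (instances are laid end to end,
 * `extent` elements apart, as MPI does when count > 1). */
static size_t column_element_index(size_t k, size_t row, size_t ncols,
                                   size_t extent)
{
    return k * extent + row * ncols;
}

#ifdef MPIDT_USE_MPI   /* hypothetical switch: define it when building with mpicc */
#include <mpi.h>

/* Hypothetical library call: a column type that composes like a basic type. */
int MPIDT_create_column(MPI_Datatype eltype, int nrows, int ncols,
                        MPI_Datatype *newtype)
{
    MPI_Datatype vec;
    MPI_Aint lb, el_extent;
    int rc;

    rc = MPI_Type_get_extent(eltype, &lb, &el_extent);
    if (rc != MPI_SUCCESS) return rc;

    /* nrows blocks of 1 element, ncols elements apart. */
    rc = MPI_Type_vector(nrows, 1, ncols, eltype, &vec);
    if (rc != MPI_SUCCESS) return rc;

    /* Shrink the extent to one element so instance k starts at column k. */
    rc = MPI_Type_create_resized(vec, lb, el_extent, newtype);
    MPI_Type_free(&vec);
    if (rc != MPI_SUCCESS) return rc;

    return MPI_Type_commit(newtype);
}
#endif
```

With the resized type, sending count = 2 starting at the address of element [0][j] would cover columns j and j+1, which is the kind of composition the library is meant to make easy.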

The direction of the library development may benefit from a survey of how MPI datatypes are used in various application areas. As an example, the prototype of a datatype constructor for data arrays, analogous to the Process Topologies sub-space partition (MPI_CART_SUB), is given below.

MPIDT_ARRAY_SUB(eltype, ndims, dims, remain, newtype)

        IN      eltype     basic datatype of element (handle)
        IN      ndims      number of array dimensions (integer)
        IN      dims       integer array of size ndims specifying
                           the array size in each dimension
        IN      remain     logical array of size ndims specifying
                           the dimensions covered by newtype
        OUT     newtype    new datatype (handle)
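
One way such a constructor might be realised in C is sketched below. It is only an illustration under stated assumptions: the C binding name MPIDT_Array_sub, the helpers mpidt_strides and mpidt_sub_count, the MPIDT_MAX_DIMS limit, and the MPIDT_USE_MPI build switch are all hypothetical, and the array is assumed to be stored in row-major (C) order. The idea is to wrap one strided layer around the element type for each dimension flagged in remain, using MPI_Type_create_hvector.

```c
#include <assert.h>
#include <stddef.h>

#define MPIDT_MAX_DIMS 8   /* hypothetical limit for this sketch */

/* Row-major (C order) stride of each dimension, in elements. */
static void mpidt_strides(int ndims, const int dims[], size_t strides[])
{
    size_t s = 1;
    int d;
    assert(ndims > 0 && ndims <= MPIDT_MAX_DIMS);
    for (d = ndims - 1; d >= 0; d--) {
        assert(dims[d] > 0);
        strides[d] = s;
        s *= (size_t)dims[d];
    }
}

/* Number of elements selected by newtype: the product of the sizes of
 * the dimensions flagged in remain. */
static size_t mpidt_sub_count(int ndims, const int dims[], const int remain[])
{
    size_t n = 1;
    int d;
    for (d = 0; d < ndims; d++)
        if (remain[d])
            n *= (size_t)dims[d];
    return n;
}

#ifdef MPIDT_USE_MPI   /* hypothetical switch: define it when building with mpicc */
#include <mpi.h>

int MPIDT_Array_sub(MPI_Datatype eltype, int ndims, const int dims[],
                    const int remain[], MPI_Datatype *newtype)
{
    size_t strides[MPIDT_MAX_DIMS];
    MPI_Datatype cur, next;
    MPI_Aint lb, el_extent;
    int d, rc;

    if (ndims <= 0 || ndims > MPIDT_MAX_DIMS) return MPI_ERR_ARG;
    mpidt_strides(ndims, dims, strides);

    rc = MPI_Type_get_extent(eltype, &lb, &el_extent);
    if (rc != MPI_SUCCESS) return rc;
    rc = MPI_Type_dup(eltype, &cur);
    if (rc != MPI_SUCCESS) return rc;

    /* Wrap one strided layer per covered dimension, innermost first. */
    for (d = ndims - 1; d >= 0; d--) {
        if (!remain[d]) continue;
        rc = MPI_Type_create_hvector(dims[d], 1,
                                     (MPI_Aint)strides[d] * el_extent,
                                     cur, &next);
        MPI_Type_free(&cur);
        if (rc != MPI_SUCCESS) return rc;
        cur = next;
    }
    *newtype = cur;
    return MPI_Type_commit(newtype);
}
#endif
```

For a 2 x 3 x 4 array, flagging only the middle dimension selects 3 elements that are 4 elements apart. This sketch does not adjust the extent of the result, so composing several instances would still need the extent handling described above.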
The suggested schedule for this project is as follows:

Expertise Required

The emphasis of the project will be on development of the compiler-level tools, requiring development of a parser. This will likely involve use of lex and yacc parser generation tools, and so will require C programming skills. Knowledge of compiler techniques would be useful, suggesting a student with a Computer Science background.

The extended datatype constructor library could be implemented in either C or Fortran. If this project were to involve a survey of how applications use MPI datatypes, some knowledge of computational science would be useful. This part of the project could be made more suitable for a student from Physical Sciences; however, restricting the project to the library development part alone might leave a shortfall in project effort.

Resources Required

The tools will be developed on workstations and can be tested on any MPI platform (EPCC or X-lab workstations, Meikos, T3D). There is no requirement for visualisation capability. It is likely that the compiler-level project will require UNIX tools lex (or flex) and yacc (or bison), which are available on all EPCC workstations at least.

Resources Supplied

The compiler-level tool may benefit from publicly available grammars for C, Fortran 77 and Fortran 90 data structure definitions; there are known sources for at least C and Fortran 77. The language syntax definitions will be required. Access to a set of existing MPI applications (perhaps resulting from other SSP projects) will be needed for a survey of MPI datatype use in particular, and would be useful for testing any part of the project.

References

Werner Augustin worked on this project.

Compressed PostScript of the project's final report is available here (53 kbytes).

Webpage maintained by mario@epcc.ed.ac.uk