
Derived Datatypes in MPI

MPI defines the following basic data types, from which all other data types can be derived:

  MPI_CHAR: This is the traditional ASCII character, numbered by integers between 0 and 127.
  MPI_UNSIGNED_CHAR: This is the extended character, numbered by integers between 0 and 255.
  MPI_BYTE: This is an 8-bit unsigned integer between 0 and 255, i.e., a byte.
  MPI_WCHAR: This is a wide character, e.g., a 16-bit character such as a Chinese ideogram.
  MPI_SHORT: This is a 16-bit integer between -32,768 and 32,767.
  MPI_UNSIGNED_SHORT: This is a 16-bit unsigned integer between 0 and 65,535.
  MPI_INT: This is a 32-bit integer between -2,147,483,648 and 2,147,483,647.
  MPI_UNSIGNED: This is a 32-bit unsigned integer, i.e., a number between 0 and 4,294,967,295.
  MPI_LONG: This is the same as MPI_INT on IA32.
  MPI_UNSIGNED_LONG: This is the same as MPI_UNSIGNED on IA32.
  MPI_FLOAT: This is a single precision, 32-bit floating point number.
  MPI_DOUBLE: This is a double precision, 64-bit floating point number.
  MPI_LONG_DOUBLE: This is a quadruple precision, 128-bit floating point number.
  MPI_LONG_LONG_INT: This is a 64-bit signed integer, i.e., an integer between -9,223,372,036,854,775,808 and 9,223,372,036,854,775,807 (this reads: 9 quintillion 223 quadrillion 372 trillion 36 billion 854 million 775 thousand 807, not a large sum of money by Microsoft standards).
  MPI_FLOAT_INT: This is a pair of a 32-bit floating point number followed by a 32-bit integer.
  MPI_DOUBLE_INT: This is a pair of a 64-bit floating point number followed by a 32-bit integer.
  MPI_LONG_INT: This is a pair of a long integer (which under IA32 is just a 32-bit integer) followed by a 32-bit integer.
  MPI_SHORT_INT: This is a pair of a 16-bit short integer followed by a 32-bit integer.
  MPI_2INT: This is a pair of two 32-bit integers.
  MPI_LONG_DOUBLE_INT: This is a pair of a quadruple precision, 128-bit floating point number followed by a 32-bit integer.
  MPI_LB: The lower bound marker.
  MPI_UB: The upper bound marker.

These basic types can then be used to construct new, more elaborate data structures, by calling MPI type constructors. The type constructors characterize all these new data structures simply in terms of a type map, which is a list of pairs, each of which comprises a type and a displacement:

\begin{displaymath}
\textrm{type map} = ( (\textrm{type}_0, \textrm{displacement}_0), \ldots, (\textrm{type}_n, \textrm{displacement}_n) )
\end{displaymath}

where the displacement is the number of bytes from the beginning of the structure at which the given data item appears in it.

This is a very low-level characterization of the data structure, and it is likely to lack portability between, e.g., 32-bit and 64-bit systems, because types such as "long" and "long long" may have different lengths on these systems.

The secret to writing portable programs is never to assume any lengths and instead to call sizeof whenever you need the length of a type in your code. You can then call the MPI address and extent functions to check the exact locations of the items in a structure. The locations may not always be what you expect, because data may have to be aligned on word or half-word boundaries, in which case the compiler pads the structure, i.e., inserts empty space between, e.g., a character and the next half-word boundary. Some padding may also follow the last data item if the structure as a whole has to end on a word (or half-word) boundary.
Consider the following example:

[Figure: the example data object drawn as a row of 20 one-byte boxes.]
Here we have a data object that begins with four bytes of padding, i.e., the first four bytes hold nothing. Then come two characters, a and b, followed by two more bytes of padding. Next are two adjacent 32-bit floats, f and g, followed immediately by a 16-bit short integer h and then another two bytes of padding.

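Let us call this object

MPI_Datatype new_type;

This new MPI type comprises five blocks of the following items:

  one MPI_LB marker at displacement 0,
  two MPI_CHAR items at displacement 4,
  two MPI_FLOAT items at displacement 8,
  one MPI_SHORT item at displacement 16,
  one MPI_UB marker at displacement 20.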

Now we convert this verbal description into the following call to function MPI_Type_create_struct:

int          array_of_block_lengths[5] = {1, 2, 2, 1, 1};
MPI_Aint     array_of_displacements[5] = {0, 4, 8, 16, 20};
MPI_Datatype array_of_types[5]         = {MPI_LB, MPI_CHAR, MPI_FLOAT, MPI_SHORT, MPI_UB};

MPI_Type_create_struct(5, array_of_block_lengths, array_of_displacements,
                       array_of_types, &new_type);

Observe that if this MPI structure corresponds to some C-language structure in your program, it is your responsibility, as the programmer, to ensure that the two are indeed the same.

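Function MPI_Type_create_struct is an MPI-2 function. The same function is called MPI_Type_struct in MPI-1. MPI_Type_struct was deprecated in MPI-2 and removed in MPI-3.0, as were the MPI_LB and MPI_UB markers used above; current code sets the bounds of a type with MPI_Type_create_resized instead.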

To find out how a C-language structure is laid out in the memory of your computer, you can use function MPI_Get_address in combination with sizeof.

Consider, for example, the following C-language (not an MPI) structure:

typedef struct {
   char   flavor;
   char   color;
   int    charge;
   double mass;
   double x, y, z;
   double px, py, pz;
} Quark;

Quark tau;
To find how this structure should be described to MPI, you could do something like:
MPI_Aint tau_address, flavor_address, color_address, charge_address, mass_address,
         x_address, y_address, z_address, px_address, py_address, pz_address;
MPI_Aint flavor_offset, color_offset, charge_offset, mass_offset, ...
/* MPI_Get_address stores each address in an MPI_Aint, not in an int */
MPI_Get_address(&tau, &tau_address);
MPI_Get_address(&tau.flavor, &flavor_address);
MPI_Get_address(&tau.color, &color_address);
MPI_Get_address(&tau.charge, &charge_address);
MPI_Get_address(&tau.mass, &mass_address);
flavor_offset = flavor_address - tau_address;
color_offset = color_address - tau_address;
charge_offset = charge_address - tau_address;
mass_offset = mass_address - tau_address;
Then having these numbers in hand and using sizeof to get sizes of char, double, etc., you would describe this structure to MPI by calling MPI_Type_create_struct with appropriate parameters.
Function MPI_Get_address is an MPI-2 function. Its MPI-1 counterpart is called MPI_Address; it was deprecated in MPI-2 and removed in MPI-3.0.

MPI_Type_create_struct is one of the more general type constructors in MPI. There are more specific constructors, of which we have already encountered MPI_Type_contiguous and MPI_Type_vector.

A type constructor that lies somewhere between MPI_Type_create_struct and MPI_Type_vector is MPI_Type_indexed. Like MPI_Type_vector, it works with data items of a single type, but, as with MPI_Type_create_struct, the blocks can be placed at quite arbitrary locations in memory, not necessarily with a regular stride and a regular block length.

Zdzislaw Meglicki