
File Hints

Parallel files may be stored and accessed in a great many ways, depending on the operating system, the particular devices used for storage, and any middleware that sits in between. To optimize file access, the programmer may wish to give MPI additional information, in the hope that MPI will know what to do with it. Such information is referred to as hints, and a special MPI construct called the info object collects them. Once constructed, the info object can be passed to MPI_File_open, MPI_File_delete, MPI_File_set_view and MPI_File_set_info. It should be understood, though, that any hints you give MPI this way are only advisory, and what MPI does with them is implementation dependent.

The info object is opaque. MPI provides functions for its manipulation and inspection, and these are the only way you can (or should) deal with it. The information contained in the info object consists of (key, value) pairs, both of which are strings. The maximum lengths allowed by an MPI implementation are given by the constants MPI_MAX_INFO_KEY and MPI_MAX_INFO_VAL, which can be looked up in mpi.h. Their values on the AVIDD cluster are:

gustav@bh1 $ grep MAX_INFO mpi.h
#define MPI_MAX_INFO_KEY       255
#define MPI_MAX_INFO_VAL      1024
gustav@bh1 $

An empty info object can be created with a call to MPI_Info_create, whose synopsis is simply:

int MPI_Info_create(MPI_Info *info)

Once created, it can be populated with (key, value) pairs by calling MPI_Info_set, whose synopsis is:

int MPI_Info_set(MPI_Info info, char *key, char *value)

An info object obtained, e.g., from an MPI file, can be interrogated in various ways. First, you can find how many (key, value) pairs there are in the object by calling the function MPI_Info_get_nkeys, which returns the number of keys currently defined within the object in its second argument:

int MPI_Info_get_nkeys(MPI_Info info, int *nkeys)

Once you know how many (key, value) pairs there are, you can inspect them by first calling MPI_Info_get_nthkey to get the string that corresponds to the nth key, where keys are numbered from 0. The synopsis of this function is:

int MPI_Info_get_nthkey(MPI_Info info, int n, char *key)

and then calling MPI_Info_get to get the value associated with the key. The synopsis of MPI_Info_get is:

int MPI_Info_get(MPI_Info info, char *key, int valuelen, char *value, int *exists)

where valuelen is the maximum number of characters to be copied into value (in C it should be one less than the size of the buffer, to leave room for the terminating null character), and exists is a Boolean flag that is set to TRUE (1) if the corresponding (key, value) pair exists and to FALSE (0) otherwise.

An info object can be duplicated with MPI_Info_dup and released with MPI_Info_free. Individual (key, value) pairs can be deleted from an existing info object with MPI_Info_delete.

MPI reserves certain hints, which are described in the section Reserved File Hints of the MPI-2 Standard. Unfortunately, the current MPICH2 implementation of MPI-2 ignores most of them, including file_perm, which would come in handy in some of our example programs. The reserved hints cover parameters such as the striping factor, the striping unit, the number of IO nodes (there are systems where not all nodes have IO capability), the list of IO devices to store the file on, various chunking parameters, the size of the buffers to be used for reading and writing, and the like.

Because GPFS on the AVIDD system is accessed only through its UFS interface (MPI doesn't really know that GPFS is GPFS), few of the reserved hints would be of much use anyway, even if MPICH2 did recognize them.

When a file is opened with MPI_INFO_NULL passed in place of an info object, MPI uses a default info object. The following program retrieves this info object and queries it, which shows us some of the (key, value) pairs that MPICH2 currently implements on UFS.

Zdzislaw Meglicki