
Manipulating MPI Files

MPI files are not like ordinary files and so MPI provides special functions for opening them, closing them and doing other things with them.

The first thing you always want to do with a file is to open it. The call to do that in C is:

int MPI_File_open (MPI_Comm comm, char *filename, int amode, MPI_Info info,
                   MPI_File *fh);
and in Fortran
MPI_FILE_OPEN(INTEGER COMM, CHARACTER FILENAME(*), INTEGER AMODE, &
              INTEGER INFO, INTEGER FH, INTEGER IERROR)
This call is collective: it must be issued by all processes participating in the communicator comm. It is a blocking call and, in practice, acts as a synchronization point for those processes. The call also sets a default view of the file, about which more later.

There are three arguments in this call the likes of which we haven't encountered yet. The first one is amode, which stands for the access mode. All processes opening the file must open it in the same access mode, and exactly one of MPI_MODE_RDONLY, MPI_MODE_RDWR and MPI_MODE_WRONLY must be given. The access modes can be as follows:

MPI_MODE_RDONLY
read only
MPI_MODE_RDWR
read and write
MPI_MODE_WRONLY
write only
MPI_MODE_CREATE
create the file if it does not exist
MPI_MODE_EXCL
raise an error if the file already exists and MPI_MODE_CREATE is specified
MPI_MODE_DELETE_ON_CLOSE
delete file on close
MPI_MODE_UNIQUE_OPEN
file will not be opened concurrently elsewhere
MPI_MODE_SEQUENTIAL
file will only be accessed sequentially
MPI_MODE_APPEND
set initial position of all file pointers to end of file
They can be combined like this in C:
MPI_MODE_WRONLY | MPI_MODE_CREATE | MPI_MODE_EXCL
and in Fortran you can use:
IOR(MPI_MODE_WRONLY, MPI_MODE_CREATE)
The second argument that is specific to MPI_FILE_OPEN is info. If you use GPFS to write on then this parameter should be left empty. Simply specify:
info = MPI_INFO_NULL
If you use HPSS then there is quite a lot that you can put in info. The info object can be loaded with information about how the file should be striped, what permissions it should be created with, how many parallel processes will typically access it, and so on. You can also load it with HPSS specific information of which the most important is HPSS class of service, HPSS type of storage class, and hints about the anticipated size of the file.

The info object is created by calling function

int MPI_Info_create(MPI_Info *info);
in C and... there is no support for Fortran interface in the HPSS version of MPI/IO. Once you have created your info object then you can begin loading it by calling function MPI_Info_set. For example:
MPI_Info_set(info, "hpss_cos", "3");
MPI_Info_set(info, "hpss_sclasstype", "DISK");
MPI_Info_set(info, "access_style", "write_once");
Once you no longer need an info object you can free it by passing its address to
MPI_Info_free(&info);
There are many other things you can do with an info object. You can enquire about the number of key-value pairs. You can enquire about the keys and then ask about the values. You can duplicate an info object and you can modify any value in a selected key-value pair. There are functions for all that. But, since none of that stuff is currently supported on GPFS, we won't dwell on it any more.

MPI_File_open is a blocking call. The function does not return until the file has been opened and the file handle returned. This may take a very long time, especially on a parallel file system, and an even longer time with HPSS, where numerous transactions have to be exchanged with Encina in order to open a new file. It would be good if you could send a file open request to the system and then go away and do other things, occasionally checking if the file is ready, but MPI provides no such nonblocking open.

Once you have opened a file you will probably write on it or read from it, but this is a complex affair in MPI, so we'll postpone the discussion of this until later.

To close an MPI file simply say:

int MPI_File_close (MPI_File *fh);
in C and in Fortran
MPI_FILE_CLOSE (FH, IERROR)
INTEGER FH, IERROR
You must ensure that all requests associated with fh have completed before you call MPI_File_close. This is a collective operation, so every process that opened the file must call it.

To delete an MPI file call function

int MPI_File_delete (char *filename, MPI_Info info);
in C and
MPI_FILE_DELETE(FILENAME, INFO, IERROR)
CHARACTER(*) FILENAME
INTEGER INFO, IERROR
As with open you can pass various hints to the file system you work with by loading them into info. GPFS, as I said before, doesn't support this feature at this stage, so here you have to use MPI_INFO_NULL. And HPSS doesn't have any special hints for deleting either, so here again, you just use MPI_INFO_NULL.

MPI files can be statically sized. This is important because if multiple processes write on the same file in parallel, you must have a static file map, so that the processes don't write on each other's blocks. You set the size of an MPI file with

int MPI_File_set_size (MPI_File fh, MPI_Offset size);
in C and
MPI_FILE_SET_SIZE (FH, SIZE, IERROR)
INTEGER FH
INTEGER(KIND=MPI_OFFSET_KIND) SIZE
INTEGER IERROR
in Fortran. Observe that size is of type MPI_Offset. This is going to be a 64 bit integer on GPFS and HPSS (long long in C and INTEGER(KIND=8) in Fortran). There must be no pending I/O operations when you call this function. You can use this function both to increase and to decrease the size of the file.
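One simple static file map, sketched here under the assumption that every process owns a single contiguous block of equal length, gives process number rank the bytes starting at rank times the block length. The helper names block_start and mapped_file_size are hypothetical. The sketch uses long long, which is what MPI_Offset amounts to on the systems discussed here; in an MPI program you would declare these values as MPI_Offset.

```c
/* Hypothetical static file map: process `rank` owns the bytes in
 * [block_start(rank), block_start(rank + 1)) of the shared file. */
long long block_start(int rank, long long block_bytes)
{
    /* Widen rank before multiplying so that large files do not
     * overflow a 32-bit int. */
    return (long long)rank * block_bytes;
}

/* Total size to request so that every process's block fits. */
long long mapped_file_size(int nprocs, long long block_bytes)
{
    return block_start(nprocs, block_bytes);
}
```

The value returned by mapped_file_size is what you would pass to MPI_File_set_size, and block_start gives the offset at which each process begins to write.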

If you have a file which is already opened and sized, and you don't remember what the size was, use

int MPI_File_get_size (MPI_File fh, MPI_Offset *size);
in C and
MPI_FILE_GET_SIZE (FH, SIZE, IERROR)
INTEGER FH
INTEGER(KIND=MPI_OFFSET_KIND) SIZE
INTEGER IERROR
in Fortran.

You can also alter an info record on a file by calling MPI_File_set_info and you can read the info record associated with an open file by calling MPI_File_get_info.


Zdzislaw Meglicki
2001-02-26