
Sequential Example

The program discussed in this section writes the following array to an HDF5 dataset:

\begin{displaymath}\begin{array}{cccccc}
0 & 1 & 2 & 3 & 4 & 5 \\
1 & 2 & 3 & 4 & 5 & 6 \\
2 & 3 & 4 & 5 & 6 & 7 \\
3 & 4 & 5 & 6 & 7 & 8 \\
4 & 5 & 6 & 7 & 8 & 9
\end{array}\end{displaymath}

The dataset and the file are then closed and reopened for reading. This time, though, we create a $7\times7\times3$ memory dataspace, backed by a buffer that is initially filled with zeros. The following shows one of the 2-dimensional slices of this buffer, the front slice:

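\begin{displaymath}\begin{array}{ccccccc}
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}\end{displaymath}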

We then read only a portion of the file dataspace into this 3-dimensional brick of zeros. Such a portion is called a hyperslab. To make things more interesting, we place the data at a specific location in the memory dataspace, so that the front slice eventually looks as follows:

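\begin{displaymath}\begin{array}{ccccccc}
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
3 & 4 & 5 & 6 & 0 & 0 & 0 \\
4 & 5 & 6 & 7 & 0 & 0 & 0 \\
5 & 6 & 7 & 8 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}\end{displaymath}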

You can now appreciate how this is going to be useful in dividing the file dataspace amongst MPI processes, both for writing and for reading.
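
To make the remark about MPI processes a little more concrete, here is a minimal sketch, assuming a simple row-wise block decomposition. The helper split_rows is hypothetical (it is not part of HDF5 or MPI); the first row and the row count it returns are the values a process would use as the first offset and the first count of its own hyperslab.

```c
#include <assert.h>

/* Hypothetical helper (not part of HDF5 or MPI): split `nrows` rows of a
 * dataset as evenly as possible among `nprocs` processes and return the
 * first row and the number of rows owned by process `rank`. */
static void split_rows(int nrows, int nprocs, int rank,
                       int *first_row, int *num_rows)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);

    int base  = nrows / nprocs;   /* rows that every process gets        */
    int extra = nrows % nprocs;   /* leftover rows, one per low rank     */

    *num_rows  = base + (rank < extra ? 1 : 0);
    *first_row = rank * base + (rank < extra ? rank : extra);
}
```

For the 5-row dataset of this section and 2 processes, this sketch assigns rows 0 through 2 to process 0 and rows 3 and 4 to process 1.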

Here is the example program taken from the NCSA HDF5 Tutorial:

/************************************************************
  
  This example shows how to write and read a hyperslab.  It 
  is derived from the h5_read.c and h5_write.c examples in 
  the "Introduction to HDF5".

 ************************************************************/
 
#include <stdio.h>                     /* printf */
#include "hdf5.h"

#define FILE        "sds.h5"
#define DATASETNAME "IntArray" 
#define NX_SUB  3                      /* hyperslab dimensions */ 
#define NY_SUB  4 
#define NX 7                           /* output buffer dimensions */ 
#define NY 7 
#define NZ  3 
#define RANK         2
#define RANK_OUT     3

#define X     5                        /* dataset dimensions */
#define Y     6

int
main (void)
{
    hsize_t     dimsf[2];              /* dataset dimensions */
    int         data[X][Y];            /* data to write */

    /* 
     * Handles, dataspace descriptors, and the output buffer. 
     */
    hid_t       file, dataset;         /* handles */
    hid_t       dataspace;   
    hid_t       memspace; 
    hsize_t     dimsm[3];              /* memory space dimensions */
    hsize_t     dims_out[2];           /* dataset dimensions */      
    herr_t      status;                             

    int         data_out[NX][NY][NZ ]; /* output buffer */
   
    hsize_t     count[2];              /* size of the hyperslab in the file */
    hssize_t    offset[2];             /* hyperslab offset in the file */
    hsize_t     count_out[3];          /* size of the hyperslab in memory */
    hssize_t    offset_out[3];         /* hyperslab offset in memory */
    int         i, j, k, status_n, rank;



/*********************************************************  
   This writes data to the HDF5 file.  
 *********************************************************/  
 
    /* 
     * Data  and output buffer initialization. 
     */
    for (j = 0; j < X; j++) {
	for (i = 0; i < Y; i++)
	    data[j][i] = i + j;
    }     
    /*
     * 0 1 2 3 4 5 
     * 1 2 3 4 5 6
     * 2 3 4 5 6 7
     * 3 4 5 6 7 8
     * 4 5 6 7 8 9
     */

    /*
     * Create a new file using H5F_ACC_TRUNC access,
     * the default file creation properties, and the default file
     * access properties.
     */
    file = H5Fcreate (FILE, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

    /*
     * Describe the size of the array and create the data space for fixed
     * size dataset. 
     */
    dimsf[0] = X;
    dimsf[1] = Y;
    dataspace = H5Screate_simple (RANK, dimsf, NULL); 

    /*
     * Create a new dataset within the file using defined dataspace and
     * default dataset creation properties.
     */
    dataset = H5Dcreate (file, DATASETNAME, H5T_STD_I32BE, dataspace,
                         H5P_DEFAULT);

    /*
     * Write the data to the dataset using default transfer properties.
     */
    status = H5Dwrite (dataset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL,
                      H5P_DEFAULT, data);

    /*
     * Close/release resources.
     */
    H5Sclose (dataspace);
    H5Dclose (dataset);
    H5Fclose (file);
 

/*************************************************************  

  This reads the hyperslab from the sds.h5 file just 
  created, into a 2-dimensional plane of the 3-dimensional 
  array.

 ************************************************************/  

    for (j = 0; j < NX; j++) {
	for (i = 0; i < NY; i++) {
	    for (k = 0; k < NZ ; k++)
		data_out[j][i][k] = 0;
	}
    } 
 
    /*
     * Open the file and the dataset.
     */
    file = H5Fopen (FILE, H5F_ACC_RDONLY, H5P_DEFAULT);
    dataset = H5Dopen (file, DATASETNAME);

    dataspace = H5Dget_space (dataset);    /* dataspace handle */
    rank      = H5Sget_simple_extent_ndims (dataspace);
    status_n  = H5Sget_simple_extent_dims (dataspace, dims_out, NULL);
    printf("\nRank: %d\nDimensions: %lu x %lu \n", rank,
	   (unsigned long)(dims_out[0]), (unsigned long)(dims_out[1]));

    /* 
     * Define hyperslab in the dataset. 
     */
    offset[0] = 1;
    offset[1] = 2;
    count[0]  = NX_SUB;
    count[1]  = NY_SUB;
    status = H5Sselect_hyperslab (dataspace, H5S_SELECT_SET, offset, NULL, 
                                  count, NULL);

    /*
     * Define the memory dataspace.
     */
    dimsm[0] = NX;
    dimsm[1] = NY;
    dimsm[2] = NZ;
    memspace = H5Screate_simple (RANK_OUT, dimsm, NULL);   

    /* 
     * Define memory hyperslab. 
     */
    offset_out[0] = 3;
    offset_out[1] = 0;
    offset_out[2] = 0;
    count_out[0]  = NX_SUB;
    count_out[1]  = NY_SUB;
    count_out[2]  = 1;
    status = H5Sselect_hyperslab (memspace, H5S_SELECT_SET, offset_out, NULL, 
                                  count_out, NULL);

    /*
     * Read data from hyperslab in the file into the hyperslab in 
     * memory and display.
     */
    status = H5Dread (dataset, H5T_NATIVE_INT, memspace, dataspace,
                      H5P_DEFAULT, data_out);
    printf ("Data:\n ");
    for (j = 0; j < NX; j++) {
	for (i = 0; i < NY; i++) printf("%d ", data_out[j][i][0]);
	printf("\n ");
    }
	printf("\n");
    /*
     * 0 0 0 0 0 0 0
     * 0 0 0 0 0 0 0
     * 0 0 0 0 0 0 0
     * 3 4 5 6 0 0 0  
     * 4 5 6 7 0 0 0
     * 5 6 7 8 0 0 0
     * 0 0 0 0 0 0 0
     */

    /*
     * Close and release resources.
     */
    H5Dclose (dataset);
    H5Sclose (dataspace);
    H5Sclose (memspace);
    H5Fclose (file);

    return 0;
}
Here is how to compile, link and run the program:
gustav@bh1 $ h5cc -o h5_hyperslab h5_hyperslab.c
gustav@bh1 $ ./h5_hyperslab

Rank: 2
Dimensions: 5 x 6 
Data:
 0 0 0 0 0 0 0 
 0 0 0 0 0 0 0 
 0 0 0 0 0 0 0 
 3 4 5 6 0 0 0 
 4 5 6 7 0 0 0 
 5 6 7 8 0 0 0 
 0 0 0 0 0 0 0 
 
gustav@bh1 $

Now let us discuss the program in detail.

The program begins by initializing the data that is going to be written to the file:

#define X     5                        /* dataset dimensions */
#define Y     6
...
    int         data[X][Y]; 
...
    for (j = 0; j < X; j++) {
        for (i = 0; i < Y; i++)
            data[j][i] = i + j;
    }
which generates a $5\times6$ array of integers:
0 1 2 ...
1 2 3 ...
2 3 4 ...
...
The HDF5 file is then created. We create a simple dataspace, create a dataset associated with that dataspace, and write all the data to the dataset, filling it entirely. Then we close the dataspace, the dataset and the file:
    file = H5Fcreate (FILE, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    dimsf[0] = X; dimsf[1] = Y;
    dataspace = H5Screate_simple (RANK, dimsf, NULL); 
    dataset = H5Dcreate (file, DATASETNAME, H5T_STD_I32BE, dataspace,
                         H5P_DEFAULT);
    status = H5Dwrite (dataset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL,
                      H5P_DEFAULT, data);
    H5Sclose (dataspace);
    H5Dclose (dataset);
    H5Fclose (file);
So far there has been nothing new here. H5Dwrite transfers the whole memory buffer to the whole dataset: both the memory dataspace and the file dataspace arguments are H5S_ALL.

The fancy stuff begins in the data reading part that follows.

We begin this part of the program by creating and initializing with zeros a new 3-dimensional array:

#define NX 7                           /* output buffer dimensions */ 
#define NY 7 
#define NZ 3 
...
    int         data_out[NX][NY][NZ ];
...

    for (j = 0; j < NX; j++) {
        for (i = 0; i < NY; i++) {
            for (k = 0; k < NZ ; k++)
                data_out[j][i][k] = 0;
        }
    }
Then we open the file again, but this time for reading only, and open the dataset:
#define FILE        "sds.h5"
#define DATASETNAME "IntArray" 
...
    file = H5Fopen (FILE, H5F_ACC_RDONLY, H5P_DEFAULT);
    dataset = H5Dopen (file, DATASETNAME);
The following three calls extract information about the dataspace associated with the dataset:
    dataspace = H5Dget_space (dataset);    /* dataspace handle */
    rank      = H5Sget_simple_extent_ndims (dataspace);
    status_n  = H5Sget_simple_extent_dims (dataspace, dims_out, NULL);
    printf("\nRank: %d\nDimensions: %lu x %lu \n", rank,
           (unsigned long)(dims_out[0]), (unsigned long)(dims_out[1]));
and, once we have its dimensions, we print this information on standard output. We have seen this operation several times before. The resulting output is:
Rank: 2
Dimensions: 5 x 6
Now we are going to narrow this dataspace by selecting a hyperslab, i.e., a $3\times4$ submatrix.
#define NX_SUB  3                      /* hyperslab dimensions */ 
#define NY_SUB  4 
...
    offset[0] = 1;
    offset[1] = 2;
    count[0]  = NX_SUB;
    count[1]  = NY_SUB;
    status = H5Sselect_hyperslab (dataspace, H5S_SELECT_SET, offset, NULL, 
                                  count, NULL);
Function H5Sselect_hyperslab narrows the dataspace identified by its first argument to count[0] rows and count[1] columns, starting from the corner whose row and column coordinates are offset[0] and offset[1]. Here this selects rows 1 through 3 and columns 2 through 5 of the dataset.
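
To see which elements such a selection covers, here is a minimal plain-C sketch that emulates a contiguous 2-dimensional selection without calling HDF5 at all. The function select_block is hypothetical and serves only as an illustration; the macros X and Y repeat the dataset dimensions of the example program.

```c
#include <assert.h>

#define X 5   /* dataset rows, as in the example program    */
#define Y 6   /* dataset columns, as in the example program */

/* Plain-C emulation (no HDF5 calls) of a contiguous 2-dimensional
 * selection: copy the count[0] x count[1] block whose upper-left corner
 * is (offset[0], offset[1]) from `src` into the flat array `dst`, in
 * row-major order. Returns the number of elements copied. */
static int select_block(int src[X][Y], const int offset[2],
                        const int count[2], int *dst)
{
    assert(offset[0] >= 0 && offset[0] + count[0] <= X);
    assert(offset[1] >= 0 && offset[1] + count[1] <= Y);

    int n = 0;
    for (int r = 0; r < count[0]; r++)        /* rows    */
        for (int c = 0; c < count[1]; c++)    /* columns */
            dst[n++] = src[offset[0] + r][offset[1] + c];
    return n;
}
```

With the offset (1, 2) and the count (3, 4) used in the program, and with the array filled as data[j][i] = i + j, the sketch picks 12 elements, the first row of which is 3 4 5 6.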

The second argument is the selection operator. The following selection operators can be used here:

H5S_SELECT_SET
Replaces the existing selection with the parameters from this call. So this is what I meant when I said that we are going to narrow our original dataspace.
H5S_SELECT_OR
Adds the new selection to the existing selection. This is the union of set theory.
H5S_SELECT_AND
Retains only the overlapping portions of the new selection and the existing selection. This is the intersection of set theory.
H5S_SELECT_XOR
Retains only the elements that are members of the new selection or the existing selection, excluding elements that are members of both selections. This is the symmetric difference (exclusive OR) of set theory.
H5S_SELECT_NOTB
Retains only elements of the existing selection that are not in the new selection. This is the set difference: the existing selection minus the new one.
H5S_SELECT_NOTA
Retains only elements of the new selection that are not in the existing selection. This is the opposite set difference: the new selection minus the existing one.
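
The effect of these operators can be illustrated with a minimal plain-C sketch in which a selection is just an array of flags over a small 1-dimensional dataspace. The type sel_op, its SEL_ constants, the function combine and the length N are all hypothetical stand-ins for illustration. They are not HDF5 names and no HDF5 calls are made.

```c
#include <assert.h>
#include <stdbool.h>

#define N 8   /* length of the illustrative 1-dimensional dataspace */

/* Hypothetical stand-ins for the HDF5 selection operators. */
typedef enum { SEL_SET, SEL_OR, SEL_AND, SEL_XOR, SEL_NOTB, SEL_NOTA } sel_op;

/* Combine the existing selection `a` with the new selection `b`
 * element by element and store the result back into `a`. */
static void combine(bool a[N], const bool b[N], sel_op op)
{
    assert(op >= SEL_SET && op <= SEL_NOTA);

    for (int i = 0; i < N; i++) {
        switch (op) {
        case SEL_SET:  a[i] = b[i];          break; /* replace              */
        case SEL_OR:   a[i] = a[i] || b[i];  break; /* union                */
        case SEL_AND:  a[i] = a[i] && b[i];  break; /* intersection         */
        case SEL_XOR:  a[i] = a[i] != b[i];  break; /* symmetric difference */
        case SEL_NOTB: a[i] = a[i] && !b[i]; break; /* existing minus new   */
        case SEL_NOTA: a[i] = !a[i] && b[i]; break; /* new minus existing   */
        }
    }
}
```

If the existing selection covers elements 0 through 3 and the new one covers elements 2 through 5, then SEL_NOTB leaves elements 0 and 1, SEL_NOTA leaves elements 4 and 5, and SEL_XOR leaves elements 0, 1, 4 and 5.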
The two NULL parameters in this call correspond to the stride and the block size. If the stride is NULL, it defaults to 1 in every dimension, so a contiguous hyperslab is selected. If the block size is NULL, each block defaults to a single element.
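
The interplay of offset, stride, count and block is easiest to see in one dimension. The following is a minimal plain-C sketch, not an HDF5 call: the hypothetical function hyperslab_indices lists the indices that such a selection would cover, treating a NULL stride or block pointer as 1. The sketch assumes that blocks do not overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Plain-C emulation of a 1-dimensional hyperslab: write the indices
 * selected by (start, stride, count, block) into `out` and return how
 * many there are. A NULL `stride` or `block` pointer is treated as 1. */
static int hyperslab_indices(int start, const int *stride, int count,
                             const int *block, int *out)
{
    int s = stride ? *stride : 1;
    int b = block  ? *block  : 1;

    /* This sketch assumes that consecutive blocks do not overlap. */
    assert(s >= 1 && b >= 1 && (count <= 1 || b <= s));

    int n = 0;
    for (int k = 0; k < count; k++)      /* k-th block               */
        for (int e = 0; e < b; e++)      /* element within the block */
            out[n++] = start + k * s + e;
    return n;
}
```

With a start of 1, a count of 3 and both pointers NULL, the sketch yields the contiguous indices 1, 2, 3. With a start of 0, a stride of 3, a count of 2 and a block of 2, it yields 0, 1, 3, 4.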

So this is how we have constructed the hyperslab of the dataset on the file. This is what we are going to read. Now, where are we going to put it? To answer this question we define first the memory dataspace and then we take a hyperslab out of it:

#define NX 7 
#define NY 7 
#define NZ 3 
#define NX_SUB  3 
#define NY_SUB  4 
#define RANK_OUT 3
...
    dimsm[0] = NX;
    dimsm[1] = NY;
    dimsm[2] = NZ;
    memspace = H5Screate_simple (RANK_OUT, dimsm, NULL);   

    offset_out[0] = 3;
    offset_out[1] = 0;
    offset_out[2] = 0;
    count_out[0]  = NX_SUB;
    count_out[1]  = NY_SUB;
    count_out[2]  = 1;
    status = H5Sselect_hyperslab (memspace, H5S_SELECT_SET, offset_out, NULL, 
                                  count_out, NULL);
The region into which the data will be placed begins at row 3 (remember that in C rows are counted from 0), column 0 and plane 0, so we are going to write on the front plane of the brick. The extents are 3 rows and 4 columns, and we do not go beyond the front plane.
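
Note that the selected elements are not contiguous in the buffer. Because C stores arrays in row-major order, the plane index varies fastest, so neighbouring columns of the front plane are NZ integers apart. A minimal sketch of this index arithmetic follows; the function flat_index is hypothetical and only illustrates the layout of data_out.

```c
#include <assert.h>
#include <stddef.h>

#define NX 7   /* output buffer dimensions, as in the example program */
#define NY 7
#define NZ 3

/* Position of data_out[j][i][k], counted in ints from the start of the
 * buffer, using C row-major ordering (the last index varies fastest). */
static size_t flat_index(size_t j, size_t i, size_t k)
{
    assert(j < NX && i < NY && k < NZ);
    return (j * NY + i) * NZ + k;
}
```

The corner of the memory hyperslab, data_out[3][0][0], therefore sits at position 63, and its right-hand neighbour in the front plane, data_out[3][1][0], at position 66.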

Now we are ready to read the data:

    status = H5Dread (dataset, H5T_NATIVE_INT, memspace, dataspace,
                      H5P_DEFAULT, data_out);
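The data is read from the already narrowed dataspace on the file into the just narrowed memory dataspace, here called memspace, which describes the layout of the buffer data_out. Note that in the H5Dread argument list the memory dataspace precedes the file dataspace.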
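
The two selections must contain the same number of elements, here $3\times4 = 3\times4\times1 = 12$, and the elements are paired up in row-major order. The following minimal plain-C sketch emulates this transfer without calling HDF5. The function emulate_read is hypothetical, and, like the real read, it leaves the unselected elements of the buffer untouched.

```c
#include <assert.h>

#define X 5        /* dataset dimensions, as in the example program */
#define Y 6
#define NX 7       /* output buffer dimensions */
#define NY 7
#define NZ 3
#define NX_SUB 3   /* hyperslab dimensions */
#define NY_SUB 4

/* Plain-C emulation (no HDF5 calls) of the read in the example: copy the
 * NX_SUB x NY_SUB block at `offset` of `file_data` into one plane of
 * `mem`, starting at `offset_out`. Unselected elements are not touched. */
static void emulate_read(int file_data[X][Y], const int offset[2],
                         int mem[NX][NY][NZ], const int offset_out[3])
{
    assert(offset[0] >= 0 && offset[0] + NX_SUB <= X);
    assert(offset[1] >= 0 && offset[1] + NY_SUB <= Y);
    assert(offset_out[0] >= 0 && offset_out[0] + NX_SUB <= NX);
    assert(offset_out[1] >= 0 && offset_out[1] + NY_SUB <= NY);
    assert(offset_out[2] >= 0 && offset_out[2] < NZ);

    for (int r = 0; r < NX_SUB; r++)        /* rows of the hyperslab    */
        for (int c = 0; c < NY_SUB; c++)    /* columns of the hyperslab */
            mem[offset_out[0] + r][offset_out[1] + c][offset_out[2]] =
                file_data[offset[0] + r][offset[1] + c];
}
```

Called with the file offset (1, 2) and the memory offset (3, 0, 0) on a zeroed buffer, the sketch fills rows 3 through 5 and columns 0 through 3 of the front plane and leaves everything else at zero.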

The following print statements show the final effect of this operation:

    printf ("Data:\n ");
    for (j = 0; j < NX; j++) {
        for (i = 0; i < NY; i++) printf("%d ", data_out[j][i][0]);
        printf("\n ");
    }
    printf("\n");

The job done, we close the dataset, the dataspace, the memory dataspace, and finally the file:

    H5Dclose (dataset);
    H5Sclose (dataspace);
    H5Sclose (memspace);
    H5Fclose (file);

Now, at long last we can attempt an MPI-IO data transfer.


Zdzislaw Meglicki
2004-04-29