Skip to content

H2 Matrix File Storage Scheme

Edmond Chow edited this page Mar 25, 2021 · 18 revisions

H2Pack can read an H2 matrix from a pair of files, and similarly write an H2 matrix to a pair of files. This functionality can be used to exchange H2 matrices between different H2 matrix packages. Below, the file storage format is described. This is a proposed format and is subject to change.

For the application interface and examples, see Using H2 Matrix File Storage.

Overview

An H2 matrix is stored in two files: the structure of the H2 matrix is stored in a JavaScript Object Notation (JSON) file, and the H2 matrix components are stored in a binary file. The JSON file contains the metadata of the H2 matrix and is text-based and human-readable, while the binary file stores the floating point values of the H2 matrix components.

Some H2 matrix packages may have special functionality and a third file could be used to support this functionality. For example, a third JSON file is used by H2Pack to store the coordinates of the points associated with a kernel matrix, as well as other quantities.

The format described below is for general, nonsymmetric H2 matrices. Some simplifications, also described below, are used for symmetric H2 matrices.

Data types

  • Indices are 0-based
  • 4-byte signed integer is used for integer data
  • 8-byte floating point is used for floating point data

Notation

An H2 matrix is composed of the following.

  1. For H2 matrices specified by two sets of points (necessarily the case if the matrix is rectangular rather than square), then two partition trees are needed, one to partition the rows, and one to partition the columns of the matrix. For matrices specified by a single set of points (necessarily the case if the matrix is symmetric), then a single partition tree for partitioning the rows is needed. The same partition tree is used for partitioning the columns in this case.

  2. Basis matrices for matrix rows, specifically, Ui for each leaf node i and Ri for each nonleaf node i. The nested basis matrices Ui for a nonleaf node i with children i1, i2, ..., is are not explicitly formed and are defined as

    Ui = diag(Ui1, Ui2, ..., Uis) Ri

  3. Basis matrices for matrix columns, specifically, Vj for each leaf node j and Sj for each nonleaf node j. The nested basis matrices Vj for a nonleaf node j with children j1, j2, ..., js are not explicitly formed and are defined as

    Vj = diag(Vj1, Vj2, ..., Vjs) Sj

    For symmetric matrices, only the basis matrices for the rows need to be stored. The basis matrices for the columns are assumed to be the same.

  4. Intermediate blocks Bij used in the approximations Ui Bij VjT to admissible blocks

  5. Inadmissible blocks Dij

Example

The figure below shows a symmetric H2 matrix with a 4 level binary partition tree.

On the left is the partition tree for the rows (and columns) of the matrix. On the right is the H2 matrix. All the red blocks are inadmissible blocks Dij (e.g., D8,9 in the figure). All the green blocks are admissible blocks and are compressed in low-rank form Aij = Ui Bij UjT (e.g., A3,5 in the figure).

In this example, all the stored components are:

  1. basis matrices U10, U11, U12, U13, U14, and R3, R4, R5, R6
  2. intermediate blocks B7,9, B7,10, B8,10, B9,11, B9,12, B10,12, B11,13, B11,14, B12,14 and B3,5, B3,6, B4,6
  3. inadmissible blocks D7,7, D8,8, ..., D14,14, and D7,8, D8,9, ..., D13,14

Note that blocks Dij and Bij that are made redundant by symmetry are not stored. Either one of a pair of blocks that are symmetric with each other can be stored.

Partially admissible blocks

Recall that an admissible block Aij is approximated in low-rank form as Ui Bij VjT. The file format described here also allows the storage of partially admissible blocks Aij approximated as Bij VjT or Ui Bij where Bij is computed in a different way than Bij for admissible blocks.

Partially admissible blocks are a special subset of low-rank blocks in H2 matrix representations. Such blocks could appear when applying a nonsymmetric admissibility condition to specify which blocks are treated as low-rank. For example, when using the admissibility condition in FMM, there can be a block Aij where box i is admissible to box j but not the other way around. In this case, Aij is approximated as Bij VjT sharing the basis matrix Vj with node j but not that with node i. (This approximation corresponds to the S2M->M2M->M2L operation in FMM).

If partially admissible blocks are not used, then all low-rank blocks in the H2 matrix are treated as typical admissible blocks. For more details, please refer to: H. Huang, X. Xing, and E. Chow, H2Pack: High-performance H2 matrix package for kernel matrices using the proxy point method, ACM Transactions on Mathematical Software, 47(1), Article 3 (2020).

Remark: H2Pack constructs H2 matrices with partially admissible blocks, depending on the geometry of the points. If you need H2Pack to output a file that does not contain partially admissible blocks, please contact the authors of H2Pack. (H2Pack can naturally read and use H2 matrices without partially admissible blocks.)

A. Metadata JSON file

The metadata JSON file describes the structure of an H2 matrix. It has 18 JSON name/value pairs. In the following list, each bold word is the name of a name/value pair in the JSON file, and is followed by the description of its value(s) and requirements.

1. nrow_matrix

Value: Number of rows in the H2 matrix.

Requirement: Positive integer.

2. ncol_matrix

Value: Number of columns in the H2 matrix.

Requirement: Positive integer.

3. is_symmetric

Value: Whether the H2 matrix is symmetric.

Requirement: 0 (nonsymmetric) or 1 (symmetric).

Note: If the H2 matrix is symmetric, only one partition tree and one set of basis matrices Ui and Ri are stored. Due to the symmetry Bji = BijT and Dji = DjiT, only one of the two blocks that are symmetric with each other is stored. There is no specific restriction on which one of the two will be stored.

4. num_node_row

Value: Number of nodes in the partition tree for matrix rows.

Requirement: Positive integer.

5. num_node_col

Value: Number of nodes in the partition tree for matrix columns.

Requirement: Positive integer.

6. root_node_row

Value: Index of the root node in the partition tree for matrix rows.

Requirement: Non-negative integer, 0 <= root_node_row < num_node_row.

7. root_node_col

Value: Index of the root node in the partition tree for matrix columns.

Requirement: Non-negative integer, 0 <= root_node_col < num_node_col.

8. num_level_row

Value: Number of levels in the partition tree for matrix rows.

Requirement: Non-negative integer.

Note: The root node is on level 0, the children of the root is on level 1, and so on. The bottom level is indexed bynum_level_row-1.

9. num_level_col

Value: Number of levels in the partition tree for matrix columns.

Requirement: Non-negative integer.

10. nodes_row

Value: JSON object array of size num_node_row. Each object has 6 name/value pairs describing a node in the partition tree for matrix rows. Each node is associated with a contiguous cluster of rows of the stored matrix.

  1. index

    Value: Index of the node.

    Requirement: Non-negative integer, 0 <= index < num_node.

  2. level

    Value: Level of the node.

    Requirement: Non-negative integer, 0 <= level < num_level_row.

  3. cluster_head

    Value: Index of the first row in the row subset associated with the node.

    Requirement: Non-negative integer, 0 <= cluster_head <= cluster_tail < nrow_matrix.

  4. cluster_tail

    Value: Index of the last row in the row subset associated with the node.

    Requirement: Non-negative integer, 0 <= cluster_head <= cluster_tail <= nrow_matrix-1.

  5. num_children

    Value: Number of children of the node.

    Requirement: Non-negative integer, 0 <= num_children.

  6. children

    Value: JSON integer array of size num_children, each integer is the index of a child node.

    Requirement: Non-negative integer, 0 <= index < num_node_row.

11. nodes_col

Value: JSON object array of size num_node_row. Each object has the same 6 name/value pairs as nodes_row and specifies the partition tree for matrix columns. Each node is associated with a contiguous cluster of columns in the stored matrix.

Note: Not used if the H2 matrix is symmetric.

12. basis_matrices_row

Value: JSON object array of size num_node_row. Each object has 3 name/value pairs describing the size information of the basis matrices Ui or Ri associated with nodes in the partition tree for matrix rows.

  1. node

    Value: Index of the node associated with this basis matrix.

    Requirement: Non-negative integer, 0 <= node < num_node_row.

  2. num_row

    Value: Number of rows of the basis matrix.

    Requirement: Non-negative integer.

  3. num_col

    Value: Number of columns of the basis matrix.

    Requirement: Non-negative integer. num_col <= num_row.

13. basis_matrices_col

Value: JSON object array of size num_node_col. Each object has 3 name/value pairs describing the size information of the basis matrices Vi or Si associated with nodes in the partition tree for matrix columns.

Note: Not used if the H2 matrix is symmetric.

14. num_admissible_blocks

Value: Number of admissible blocks Bij stored.

Requirement: Non-negative integer.

Note: Blocks made redundant by symmetry are not counted.

15. num_inadmissible_blocks

Value: Number of inadmissible blocks Dij stored.

Requirement: Non-negative integer.

Note: Blocks made redundant by symmetry are not counted.

16. has_partial_adm_blocks

Value: Whether or not the stored matrix defines partially admissible blocks.

Requirement: 0 or 1.

17. B_matrices

Value: JSON object array of size num_admissible_blocks. Each object has 5 name/value pairs describing the size information of the intermediate blocks Bij with all admissible node pairs.

  1. node_row

    Value: Index of the row tree node that is associated with the rows of the Bij matrix.

    Requirement: Non-negative integer, 0 <= node_row < num_node_row.

  2. node_col

    Value: Index of the column tree node that is associated with the columns of the Bij matrix.

    Requirement: Non-negative integer, 0 <= node_col < num_node_col.

  3. num_row

    Value: Number of rows of the Bij matrix.

    Requirement: Non-negative integer.

  4. num_col

    Value: Number of columns of the Bij matrix.

    Requirement: Non-negative integer.

  5. is_part_adm

    Value: Whether this admissible node pair is defined to be partially admissible.

    Requirement: 0 or 1.

18. D_matrices

Value: JSON object array of size num_inadmissible_blocks. Each object has 4 name/value pairs describing the information of dense inadmissible blocks Dij with all inadmissible node pairs.

Note: For a symmetric H2 matrix, we require that the JSON file first stores the information of diagonal blocks Dii, and then stores the information of off-diagonal blocks Dij.

  1. node_row

    Value: Index i of the row tree node that is associated with the rows of the Dij matrix.

    Requirement: Non-negative integer, 0 <= node_row < num_node_row.

  2. node_col

    Value: Index $j$ of the row column node that is associated with the columns of the Dij matrix.

    Requirement: Non-negative integer, 0 <= node_col < num_node_col.

  3. num_row

    Value: Number of rows of the Dij matrix.

    Requirement: Non-negative integer.

  4. num_col

    Value: Number of columns of the Dij matrix.

    Requirement: Non-negative integer.

B. Binary data file

The binary data file stores the numerical values in the basis matrices, the Bij matrices, and the Dij matrices. Little-endian is used. All matrix entries are stored in row-major order.

1. Basis matrices for matrix rows

The first section of the binary file stores the basis matrices Ui and Ri for matrix rows. These basis matrices must be stored in the same order as stored in the JSON file (see A.12 basis_matrices_row above).

2. Basis matrices for matrix columns

The next section of the binary file stores the basis matrices Vi and Si for matrix columns. These basis matrices must be stored in the same order as stored in the JSON file (see A.13 basis_matrices_col above).

Note: If the H2 matrix is symmetric, these basis matrices for matrix columns are not stored.

3. B matrices

The next section of the binary file stores the Bij matrices. The Bij matrices must be stored in the same order as stored in the JSON file (see A.17 above).

4. D matrices

The last section of the binary file stores the Dij matrices. The Dij matrices must be stored in the same order as stored in the JSON file (see A.18 above).

C. Optional auxiliary JSON file

An H2 matrix library can store additional information it provides, uses, or needs in an optional auxiliary JSON file. The following defines the auxiliary JSON file supported by H2Pack for symmetric kernel functions. The use of this file is optional and is only required to support additional functionality in H2Pack.

  1. dim_point

    Value: Dimension of points.

    Requirement: Non-negative integer.

  2. dim_kernel

    Value: Dimension of kernel function.

    Requirement: Non-negative integer, and nrow_matrix=dim_point*dim_kernel.

    Examples: The Gaussian kernel has dimension 1 and the Stokes kernel has dimension 3.

  3. num_point

    Value: Number of points.

    Requirement: Non-negative integer.

  4. point_coordinate

    Value: JSON object array of size num_point x dim_point containing the coordinates of all the points.

    Requirement: Floating point numbers.

    Note: The first dim_point values are the coordinate of the first point, the second dim_point values are the coordinate of the second point, and so on.

  5. permutation_array

    Value: JSON object array of size num_point containing the permutation of point_coordinate to match the ordering of the rows and columns of the stored H2 matrix.

    Requirement: A permutation vector containing 0, 1, ..., num_point-1.

    Note: Input points are often permuted before constructing the H2 matrix, and the values in cluster_head (see A.10.3) and cluster_tail (see A.10.4) are based on the permuted coordinates. The i-th value in the point permutation array is the original point index of the i-th permuted point.

  6. skeleton_points

    Value: JSON object array of size num_node. Each object has 3 name/value pairs below that contains the skeleton point set information associated with one node.

    • node

      Value: Index of the node for the skeleton points being specified.

      Requirement: Non-negative integer, 0 <= node < num_node.

    • num_skeleton_point

      Value: Number of skeleton points.

      Requirement: Non-negative integer.

    • skeleton_point_indices

      Value: JSON integer array of size num_skeleton_point. Each integer is the index of a skeleton point.

      Requirement: these indices must be a subset of the point indices associated with the given node.

Clone this wiki locally