H2 Matrix File Storage Scheme

H2Pack can read an H2 matrix from a pair of files, and similarly write an H2 matrix to a pair of files. This functionality can be used to exchange H2 matrices between different H2 matrix packages. Below, the file storage format is described. This is a proposed format and is subject to change.

For the application interface and examples, see Using H2 Matrix File Storage.

Overview

An H2 matrix is stored in two files: the structure of the H2 matrix is stored in a JavaScript Object Notation (JSON) file, and the H2 matrix components are stored in a binary file. The JSON file contains the metadata of the H2 matrix and is text-based and human-readable, while the binary file stores the floating point values of the H2 matrix components.

Some H2 matrix packages may have special functionality and a third file could be used to support this functionality. For example, a third JSON file is used by H2Pack to store the coordinates of the points associated with a kernel matrix, as well as other quantities.

The format described below is for general, nonsymmetric H2 matrices. Some simplifications, also described below, are used for symmetric H2 matrices.

Data types

Indices are 0-based
4-byte signed integer is used for integer data
8-byte floating point is used for floating point data

Notation

An H2 matrix is composed of the following.

For H2 matrices specified by two sets of points (necessarily the case if the matrix is rectangular rather than square), then two partition trees are needed, one to partition the rows, and one to partition the columns of the matrix. For matrices specified by a single set of points (necessarily the case if the matrix is symmetric), then a single partition tree for partitioning the rows is needed. The same partition tree is used for partitioning the columns in this case.
Basis matrices for matrix rows, specifically, U_i for each leaf node i and R_i for each nonleaf node i. The nested basis matrices U_i for a nonleaf node i with children i₁, i₂, ..., i_s are not explicitly formed and are defined as

U_i = diag(U_i₁, U_i₂, ..., U_{i_s}) R_i
Basis matrices for matrix columns, specifically, V_j for each leaf node j and S_j for each nonleaf node j. The nested basis matrices V_j for a nonleaf node j with children j₁, j₂, ..., j_s are not explicitly formed and are defined as

V_j = diag(V_j₁, V_j₂, ..., V_{j_s}) S_j

For symmetric matrices, only the basis matrices for the rows need to be stored. The basis matrices for the columns are assumed to be the same.
Intermediate blocks B_ij used in the approximations U_i B_ij V_j^T to admissible blocks
Inadmissible blocks D_ij

Example

The figure below shows a symmetric H2 matrix with a 4 level binary partition tree.

On the left is the partition tree for the rows (and columns) of the matrix. On the right is the H2 matrix. All the red blocks are inadmissible blocks D_ij (e.g., D_8,9 in the figure). All the green blocks are admissible blocks and are compressed in low-rank form A_ij = U_i B_ij U_j^T (e.g., A_3,5 in the figure).

In this example, all the stored components are:

basis matrices U₁₀, U₁₁, U₁₂, U₁₃, U₁₄, and R₃, R₄, R₅, R₆
intermediate blocks B_7,9, B_7,10, B_8,10, B_9,11, B_9,12, B_10,12, B_11,13, B_11,14, B_12,14 and B_3,5, B_3,6, B_4,6
inadmissible blocks D_7,7, D_8,8, ..., D_14,14, and D_7,8, D_8,9, ..., D_13,14

Note that blocks D_ij and B_ij that are made redundant by symmetry are not stored. Either one of a pair of blocks that are symmetric with each other can be stored.

Partially admissible blocks

Recall that an admissible block A_ij is approximated in low-rank form as U_i B_ij V_j^T. The file format described here also allows the storage of partially admissible blocks A_ij approximated as B_ij V_j^T or U_i B_ij where B_ij is computed in a different way than B_ij for admissible blocks.

Partially admissible blocks are a special subset of low-rank blocks in H2 matrix representations. Such blocks could appear when applying a nonsymmetric admissibility condition to specify which blocks are treated as low-rank. For example, when using the admissibility condition in FMM, there can be a block A_ij where box i is admissible to box j but not the other way around. In this case, A_ij is approximated as B_ij V_j^T sharing the basis matrix V_j with node j but not that with node i. (This approximation corresponds to the S2M->M2M->M2L operation in FMM).

If partially admissible blocks are not used, then all low-rank blocks in the H2 matrix are treated as typical admissible blocks. For more details, please refer to: H. Huang, X. Xing, and E. Chow, H2Pack: High-performance H2 matrix package for kernel matrices using the proxy point method, ACM Transactions on Mathematical Software, 47(1), Article 3 (2020).

Remark: H2Pack constructs H2 matrices with partially admissible blocks, depending on the geometry of the points. If you need H2Pack to output a file that does not contain partially admissible blocks, please contact the authors of H2Pack. (H2Pack can naturally read and use H2 matrices without partially admissible blocks.)

A. Metadata JSON file

The metadata JSON file describes the structure of an H2 matrix. It has 18 JSON name/value pairs. In the following list, each bold word is the name of a name/value pair in the JSON file, and is followed by the description of its value(s) and requirements.

1. nrow_matrix

Value: Number of rows in the H2 matrix.

Requirement: Positive integer.

2. ncol_matrix

Value: Number of columns in the H2 matrix.

Requirement: Positive integer.

3. is_symmetric

Value: Whether the H2 matrix is symmetric.

Requirement: 0 (nonsymmetric) or 1 (symmetric).

Note: If the H2 matrix is symmetric, only one partition tree and one set of basis matrices U_i and R_i are stored. Due to the symmetry B_ji = B_ij^T and D_ji = D_ji^T, only one of the two blocks that are symmetric with each other is stored. There is no specific restriction on which one of the two will be stored.

4. num_node_row

Value: Number of nodes in the partition tree for matrix rows.

Requirement: Positive integer.

5. num_node_col

Value: Number of nodes in the partition tree for matrix columns.

Requirement: Positive integer.

6. root_node_row

Value: Index of the root node in the partition tree for matrix rows.

Requirement: Non-negative integer, 0 <= root_node_row < num_node_row.

7. root_node_col

Value: Index of the root node in the partition tree for matrix columns.

Requirement: Non-negative integer, 0 <= root_node_col < num_node_col.

8. num_level_row

Value: Number of levels in the partition tree for matrix rows.

Requirement: Non-negative integer.

Note: The root node is on level 0, the children of the root is on level 1, and so on. The bottom level is indexed bynum_level_row-1.

9. num_level_col

Value: Number of levels in the partition tree for matrix columns.

Requirement: Non-negative integer.

10. nodes_row

Value: JSON object array of size num_node_row. Each object has 6 name/value pairs describing a node in the partition tree for matrix rows. Each node is associated with a contiguous cluster of rows of the stored matrix.

index

Value: Index of the node.

Requirement: Non-negative integer, 0 <= index < num_node.
level

Value: Level of the node.

Requirement: Non-negative integer, 0 <= level < num_level_row.
cluster_head

Value: Index of the first row in the row subset associated with the node.

Requirement: Non-negative integer, 0 <= cluster_head <= cluster_tail < nrow_matrix.
cluster_tail

Value: Index of the last row in the row subset associated with the node.

Requirement: Non-negative integer, 0 <= cluster_head <= cluster_tail <= nrow_matrix-1.
num_children

Value: Number of children of the node.

Requirement: Non-negative integer, 0 <= num_children.
children

Value: JSON integer array of size num_children, each integer is the index of a child node.

Requirement: Non-negative integer, 0 <= index < num_node_row.

11. nodes_col

Value: JSON object array of size num_node_row. Each object has the same 6 name/value pairs as nodes_row and specifies the partition tree for matrix columns. Each node is associated with a contiguous cluster of columns in the stored matrix.

Note: Not used if the H2 matrix is symmetric.

12. basis_matrices_row

Value: JSON object array of size num_node_row. Each object has 3 name/value pairs describing the size information of the basis matrices U_i or R_i associated with nodes in the partition tree for matrix rows.

node

Value: Index of the node associated with this basis matrix.

Requirement: Non-negative integer, 0 <= node < num_node_row.
num_row

Value: Number of rows of the basis matrix.

Requirement: Non-negative integer.
num_col

Value: Number of columns of the basis matrix.

Requirement: Non-negative integer. num_col <= num_row.

13. basis_matrices_col

Value: JSON object array of size num_node_col. Each object has 3 name/value pairs describing the size information of the basis matrices V_i or S_i associated with nodes in the partition tree for matrix columns.

Note: Not used if the H2 matrix is symmetric.

14. num_admissible_blocks

Value: Number of admissible blocks B_ij stored.

Requirement: Non-negative integer.

Note: Blocks made redundant by symmetry are not counted.

15. num_inadmissible_blocks

Value: Number of inadmissible blocks D_ij stored.

Requirement: Non-negative integer.

Note: Blocks made redundant by symmetry are not counted.

16. has_partial_adm_blocks

Value: Whether or not the stored matrix defines partially admissible blocks.

Requirement: 0 or 1.

17. B_matrices

Value: JSON object array of size num_admissible_blocks. Each object has 5 name/value pairs describing the size information of the intermediate blocks B_ij with all admissible node pairs.

node_row

Value: Index of the row tree node that is associated with the rows of the B_ij matrix.

Requirement: Non-negative integer, 0 <= node_row < num_node_row.
node_col

Value: Index of the column tree node that is associated with the columns of the B_ij matrix.

Requirement: Non-negative integer, 0 <= node_col < num_node_col.
num_row

Value: Number of rows of the B_ij matrix.

Requirement: Non-negative integer.
num_col

Value: Number of columns of the B_ij matrix.

Requirement: Non-negative integer.
is_part_adm

Value: Whether this admissible node pair is defined to be partially admissible.

Requirement: 0 or 1.

18. D_matrices

Value: JSON object array of size num_inadmissible_blocks. Each object has 4 name/value pairs describing the information of dense inadmissible blocks D_ij with all inadmissible node pairs.

Note: For a symmetric H2 matrix, we require that the JSON file first stores the information of diagonal blocks D_ii, and then stores the information of off-diagonal blocks D_ij.

node_row

Value: Index i of the row tree node that is associated with the rows of the D_ij matrix.

Requirement: Non-negative integer, 0 <= node_row < num_node_row.
node_col

Value: Index $j$ of the row column node that is associated with the columns of the D_ij matrix.

Requirement: Non-negative integer, 0 <= node_col < num_node_col.
num_row

Value: Number of rows of the D_ij matrix.

Requirement: Non-negative integer.
num_col

Value: Number of columns of the D_ij matrix.

Requirement: Non-negative integer.

B. Binary data file

The binary data file stores the numerical values in the basis matrices, the B_ij matrices, and the D_ij matrices. Little-endian is used. All matrix entries are stored in row-major order.

1. Basis matrices for matrix rows

The first section of the binary file stores the basis matrices U_i and R_i for matrix rows. These basis matrices must be stored in the same order as stored in the JSON file (see A.12 basis_matrices_row above).

2. Basis matrices for matrix columns

The next section of the binary file stores the basis matrices V_i and S_i for matrix columns. These basis matrices must be stored in the same order as stored in the JSON file (see A.13 basis_matrices_col above).

Note: If the H2 matrix is symmetric, these basis matrices for matrix columns are not stored.

3. B matrices

The next section of the binary file stores the B_ij matrices. The B_ij matrices must be stored in the same order as stored in the JSON file (see A.17 above).

4. D matrices

The last section of the binary file stores the D_ij matrices. The D_ij matrices must be stored in the same order as stored in the JSON file (see A.18 above).

C. Optional auxiliary JSON file

An H2 matrix library can store additional information it provides, uses, or needs in an optional auxiliary JSON file. The following defines the auxiliary JSON file supported by H2Pack for symmetric kernel functions. The use of this file is optional and is only required to support additional functionality in H2Pack.

dim_point

Value: Dimension of points.

Requirement: Non-negative integer.
dim_kernel

Value: Dimension of kernel function.

Requirement: Non-negative integer, and nrow_matrix=dim_point*dim_kernel.

Examples: The Gaussian kernel has dimension 1 and the Stokes kernel has dimension 3.
num_point

Value: Number of points.

Requirement: Non-negative integer.
point_coordinate

Value: JSON object array of size num_point x dim_point containing the coordinates of all the points.

Requirement: Floating point numbers.

Note: The first dim_point values are the coordinate of the first point, the second dim_point values are the coordinate of the second point, and so on.
permutation_array

Value: JSON object array of size num_point containing the permutation of point_coordinate to match the ordering of the rows and columns of the stored H2 matrix.

Requirement: A permutation vector containing 0, 1, ..., num_point-1.

Note: Input points are often permuted before constructing the H2 matrix, and the values in cluster_head (see A.10.3) and cluster_tail (see A.10.4) are based on the permuted coordinates. The i-th value in the point permutation array is the original point index of the i-th permuted point.
skeleton_points

Value: JSON object array of size num_node. Each object has 3 name/value pairs below that contains the skeleton point set information associated with one node.
- node
  
  Value: Index of the node for the skeleton points being specified.
  
  Requirement: Non-negative integer, 0 <= node < num_node.
- num_skeleton_point
  
  Value: Number of skeleton points.
  
  Requirement: Non-negative integer.
- skeleton_point_indices
  
  Value: JSON integer array of size num_skeleton_point. Each integer is the index of a skeleton point.
  
  Requirement: these indices must be a subset of the point indices associated with the given node.

H2 Matrix File Storage Scheme

Overview

Data types

Notation

Example

Partially admissible blocks

A. Metadata JSON file

1. nrow_matrix

2. ncol_matrix

3. is_symmetric

4. num_node_row

5. num_node_col

6. root_node_row

7. root_node_col

8. num_level_row

9. num_level_col

10. nodes_row

11. nodes_col

12. basis_matrices_row

13. basis_matrices_col

14. num_admissible_blocks

15. num_inadmissible_blocks

16. has_partial_adm_blocks

17. B_matrices

18. D_matrices

B. Binary data file

1. Basis matrices for matrix rows

2. Basis matrices for matrix columns

3. B matrices

4. D matrices

C. Optional auxiliary JSON file

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Navigation

Getting Started

Advanced Configurations and Tools

Numerical Tests

Last But Not Least

Clone this wiki locally