-
Notifications
You must be signed in to change notification settings - Fork 2.9k
Description
Iceberg Materialized View Spec
Materialized View Spec google doc:
https://docs.google.com/document/d/1UnhldHhe3Grz8JBngwXPA6ZZord1xMedY5ukEhZYF-A/edit?usp=sharing
Background and Motivation
A materialized view is a database entity that stores a query definition as a logical table. The query is precomputed and the resulting data is served when the materialized view is queried. The cost of query execution is pushed to the precomputation step and is amortized over the query executions.
Goal
A common metadata format for materialized views enabling materialized views to be created, read and updated by different query engines.
Overview
The metadata of a MV (materialized view) consists of a query definition, a pointer to the precomputed data and lineage information to check if the MV is outdated. Iceberg materialized views are realized as a combination of an iceberg view with an underlying iceberg table that stores the precomputed data. The iceberg view is referred to as the “common view”, while the table is labeled “storage table”. The query definition of the materialized view is stored in the common view. The precomputed data is stored in the storage table. The storage table can have the states “fresh”, “outdated” or “invalid”. The lineage information is composed of data about the so-called “source tables”, which are the tables referenced in the query definition of the materialized view.
classDiagram
View --|> Table
class View{
+query definition
+storage table pointer
}
class Table{
+precomputed data
+lineage information
}
Operations on Materialized Views
Creation of materialized views
When a materialized view is created, an iceberg view with the query definition and an iceberg table to store the precomputed data is created. The storage table pointer of the view is set to the storage table.
Determining the storage table state
To determine the storage table state the current-version-id of the view is compared to the refresh-version-id in the lineage struct of the current storage table snapshot. If the versions don’t match, the storage table state is invalid.
If the versions match, the snapshot ids of the last refresh operation, which are stored in the lineage information, are compared to the current snapshot ids of the source tables. If the snapshot_id’s match for all source tables, the storage table state is fresh, otherwise it is outdated.
Refresh of materialized views
The exact mechanism for triggering a refresh is left up to individual systems. It might be triggered in a variety of ways. Some examples:
- Event driven if a change is detected to a source table
- At query time if a the precomputed data is determined to be stale
- On a regular schedule specified by a user.
- By a manual operation invoked by a user
The mechanism to perform the refresh operation uses the storage table state, which is determined by the procedure described above, to decide which action to take.
- If the storage table state is fresh, no action is required
- If the storage table state is invalid, a full refresh is required
- If the storage table state is outdated, it is left to the query engine to decide whether to perform a full or incremental refresh. The query engines specifies which refresh strategy was used in the snapshot summary of the storage table.
Query of materialized views
As a first step the storage table state is determined by following the procedure described above.
- If the storage table state is fresh, the precomputed data in the storage table is returned.
- If the storage table state is invalid, the precomputed data must not be used
- If the materialized view is outdated, the view property materialization.data.allow-stale decides whether a refresh operation has to be performed before the precomputed data is returned.
Commit Procedure
The catalog provides the basis for making atomic changes to the view metadata. Readers use the version of the view that was current when they loaded the view metadata and are not affected by changes until they refresh and pick up the new metadata.
Writers distinguish between changing the view or the storage table state.
Writers changing the view state, perform an “update view” operation using the catalog, optimistically assuming the view hasn’t changed since the metadata was loaded.
Writers changing the storage table state perform the commit in two steps. First, the writer creates a new table metadata file optimistically, assuming that the storage-table-pointer of the view will not be changed before the writer’s commit. It then updates the storage-table-pointer to the new location of table metadata. Second, the new view metadata gets committed to the catalog, requiring that the current storage-table-pointer of the catalog is equal to the previous storage-table-pointer. The commit is only successful when the second step succeeds.
Specification (Draft)
Terms
- Common view - Iceberg view that stores the query definition and a pointer to the precomputed data. It represents the main entity for the materialized view.
- Storage table - Iceberg table that stores the precomputed data of the materialized view.
- Storage table pointer - Reference to the metadata.json file of the storage table. It is stored as part of the common view and is used to obtain the metadata of the storage table.
- Source table - A table reference that occurs in the query definition of the materialized view. The materialized view depends on the data from the source table.
Metadata
The metadata for a materialized view can be considered an extension to the view metadata. To account for possible precomputed data a materialization field is added to the view spec. The materialization field stores the “storage table pointer” which references the location of the metadata.json file of the storage table. The following table shows the view metadata according to the v1 view spec compared to this proposal.
| v1 | materialized view | Field name | Description |
|---|---|---|---|
| required | required | view-uuid |
A UUID that identifies the view, generated when the view is created. Implementations must throw an exception if a view's UUID does not match the expected UUID after refreshing metadata |
| required | required | format-version |
An integer version number for the view format; must be 1 |
| required | required | location |
The view's base location; used to create metadata file locations |
| required | required | schemas |
A list of known schemas |
| required | required | current-version-id |
ID of the current version of the view (version-id) |
| required | required | versions |
A list of known versions of the view |
| required | required | version-log |
A list of version log entries with the timestamp and version-id for every change to current-version-id |
| optional | required | properties |
A string to string map of view properties |
| optional | materialization |
The storage table pointer[1]; used to retrieve the precomputed data from the storage table. If the value is null the entity is a common view, otherwise it is a materialized view |
Notes:
- Storage table pointer: The metadata of the storage table is stored in a metadata.json file and must not be stored in a catalog. The storage table pointer contains the location of the metadata.json file.
Storage Table Metadata
The lineage information, that is required to determine whether the storage table contains fresh data, is stored in the form of a lineage record in each storage table snapshot. The following table shows the snapshot metadata according to the v2 table spec compared to this proposal.
| v2 | materialized view | Field | Description |
|---|---|---|---|
| required | required | snapshot-id |
A unique long ID |
| optional | optional | parent-snapshot-id |
The snapshot ID of the snapshot's parent. Omitted for any snapshot with no parent |
| required | required | sequence-number |
A monotonically increasing long that tracks the order of changes to a table |
| required | required | timestamp-ms |
A timestamp when the snapshot was created, used for garbage collection and table inspection |
| required | required | manifest-list |
The location of a manifest list for this snapshot that tracks manifest files with additional metadata |
| required | required | summary |
A string map that summarizes the snapshot changes, including operation (see below) |
| optional | optional | schema-id |
ID of the table's current schema when the snapshot was created |
| optional | lineage |
A lineage record containing freshness information for materialized views. Optional field that only applies if the table is a storage table for a materialized view. |
Lineage record
The lineage record has the following fields:
| v1 | Field Name | Description |
|---|---|---|
| required | refresh-version-id |
Version id of the materialized view when the refresh operation was performed. |
| required | source-tables |
A List of source-table records. |
Source table record
The source table record has the following fields:
| v1 | Field Name | Description |
|---|---|---|
| required | uuid |
Uuid of the source table. |
| required | identifier |
A full identifier record containing catalog, namespace and table-name fields. |
| required | snapshot-id |
Snapshot id of the source table when the refresh operation was performed. |
Full Identifier record
The source table record has the following fields:
| v1 | Field Name | Description |
|---|---|---|
| required | catalog |
Catalog of the source table |
| required | namespace |
Namespace of the source table |
| required | table-name |
Name of the source table |
View Properties
Additional view properties that allow the configuration of the materialized view:
| Property | Default | Description |
|---|---|---|
materialization.data.allow-stale |
false |
Boolean that defines the query engine's behavior in case the source tables indicate the precomputed data isn't fresh. If set to false, a refresh operation has to be performed before the query results are returned. If set to true the data in the storage table gets returned without performing a refresh operation. |
Query engine metadata
To enable query engines to write metadata for logging and debugging, the query engine can write the following fields to the snapshot summary of the storage table:
| Property | Description |
|---|---|
materialization-refresh-strategy |
Which refresh strategy was used to perform the refresh operation. Can either be “FULL” or “INCREMENTAL”. |