Data pipeline doc miscellaneous  #10

@echeran

These are comments on the Data Pipeline doc, filed here because the doc has already been merged. I would like to hear feedback first to determine whether they merit changes.

  • Definition of "data version" - The doc describes the data version as something that is "abstracted away from" the format and schema versions. I would think that the data version actually depends on the schema version while remaining independent of the format version. Since the schema is the structure of the data, I would expect a schema version change to force a data version change.

  • Data version - To the extent that this matters or makes sense, the keys would be more readable if they delimited "key segments" differently from the parts of a multi-word segment. With a single delimiter, CLDR_37_alpha1 and FOO_1_1 have to be parsed differently, whereas CLDR-37-alpha1 and FOO-1_1 would be unambiguous.

  • Schema version / Data version - If we allow the data provider to choose which versions' worth of data to hold, then a user can request data for a key+version that is not supported (maybe the version is too old or too new, or the key has changed due to a schema change). Do we have a description of how we handle that? We could keep it simple and return null or throw an error. I suppose a data provider could also be configured to fetch from an authoritative service with all versions of all data (depicted in the diagram?), which would make this a data provider decision/configuration.
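
To make the last two points concrete, here is a minimal sketch, not the doc's actual design. It assumes "-" separates key segments and "_" is only used inside a segment, and it shows both simple options for an unsupported key+version (return null vs. throw). The names `parse_version_key`, `InMemoryProvider`, `UnsupportedVersionError`, and the data key "decimal/symbols" are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


def parse_version_key(key: str) -> list[str]:
    """Split a version key into segments on '-' only; '_' stays inside a segment."""
    return key.split("-")


class UnsupportedVersionError(LookupError):
    """Raised when the provider holds no data for the requested key + version."""


@dataclass
class InMemoryProvider:
    # Maps (data key, data version) to a payload. Hypothetical storage layout.
    store: dict[tuple[str, str], dict] = field(default_factory=dict)

    def get(self, key: str, version: str) -> Optional[dict]:
        """Lenient lookup: return None when the key + version is not held."""
        return self.store.get((key, version))

    def load(self, key: str, version: str) -> dict:
        """Strict lookup: raise when the key + version is not held."""
        payload = self.get(key, version)
        if payload is None:
            raise UnsupportedVersionError(
                f"no data for {key!r} at version {version!r}"
            )
        return payload


# A provider that chose to hold only one version's worth of data.
provider = InMemoryProvider(
    {("decimal/symbols", "CLDR-37-alpha1"): {"decimal": "."}}
)
```

With underscores as the only delimiter, `"FOO_1_1".split("_")` cannot tell whether `1_1` is one segment or two, which is the ambiguity described above. A provider backed by an authoritative service could override the lenient lookup to fetch on a miss instead of returning None; that remains a provider configuration choice.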

Metadata

Labels

  • A-design (Area: Architecture or design)
  • C-data-infra (Component: provider, datagen, fallback, adapters)
  • T-docs-tests (Type: Code change outside core library)
