diff --git a/safety-related-profile/documents/ONNX_format_comments_v0.1.docx b/safety-related-profile/documents/ONNX_format_comments_v0.1.docx
new file mode 100644
index 00000000..489005ed
Binary files /dev/null and b/safety-related-profile/documents/ONNX_format_comments_v0.1.docx differ
diff --git a/safety-related-profile/documents/README.md b/safety-related-profile/documents/README.md
new file mode 100644
index 00000000..d562e662
--- /dev/null
+++ b/safety-related-profile/documents/README.md
@@ -0,0 +1,4 @@
+This folder contains the working documents produced by the workgroup:
+- Our [Statement of Work](sow.md)
+- Some first [elements of analysis of ONNX](ONNX_format_comments_v0.1.docx)
+- An example of [operator specification (CONV)](./conv_specification_example/README.md) (this is a work in progress)
diff --git a/safety-related-profile/documents/WG_objectives_and_organisation.docx b/safety-related-profile/documents/WG_objectives_and_organisation.docx
new file mode 100644
index 00000000..67196ce0
Binary files /dev/null and b/safety-related-profile/documents/WG_objectives_and_organisation.docx differ
diff --git a/safety-related-profile/documents/conv_specification_example/README.md b/safety-related-profile/documents/conv_specification_example/README.md
new file mode 100644
index 00000000..944cc248
--- /dev/null
+++ b/safety-related-profile/documents/conv_specification_example/README.md
@@ -0,0 +1,927 @@

# Preamble

(This section is for information only. It is *not* part of the
specification.)

### Specification principles

The specification is written with the following *principles* in mind.
Note that these principles will eventually be replaced by the industrial
needs that will be captured in phase 1 of our work.

1. We are only concerned with the syntactic and semantic elements
   concerning the **inference**. All ONNX constructs dealing with
   training are out of the scope of the Safety-Related Standard.

2. 
We distinguish an *informal* and a *formal* specification. The
   *informal* part is aimed at facilitating the understanding of the
   ONNX construct (e.g., operator). It may be incomplete, as long as
   this is clearly indicated. Conversely, the *formal* specification
   shall be absolutely complete and unambiguous.

3. Mathematical formalism shall be avoided when it is not required.
   Since we are essentially[^1] targeting data and computer scientists,
   a very cryptic – yet perfectly well-defined and mathematically
   grounded – notation may prove error-prone and, consequently,
   counter-productive.

4. The specification can rely on a formal language, as long as this
   language does not violate rule (2) above.

### Open questions

1. Couldn't we use the ONNX reference implementation, completing and
   clarifying it, rather than writing something completely new? Indeed,
   wouldn't it be simpler, and possibly more efficient and less
   error-prone, to provide a well-documented (and reviewed) Python
   implementation rather than re-implementing things in a specific
   (possibly formal) language? In particular, the semantics must be
   defined with respect to an actual ONNX model (i.e., a protobuf file).

2. What do we plan to do with the formal specification?

3. Would it be wise to define a textual representation of an ONNX model
   in order to simplify specifying the formal semantics?

### References

- About Why3, used to write the formal specification, see
  <https://www.why3.org/>.

- About the ONNX semantics, see the “[Open Neural Network Exchange
  Intermediate Representation
  Specification](https://onnx.ai/onnx/repo-docs/IR.html)”. This document
  does not really explain in detail how the graph is executed, but the
  execution semantics is basically to (i) sort the operators
  topologically and (ii) execute each operator in that order.

# Conventions

- Notation $X.H$ (resp. $X.W$) denotes the *height* (resp. *width*) of
  tensor $X$.

- “height” (resp. 
“width”) denotes the first (resp. second) spatial
  axis.

- Inputs, outputs and attributes are represented using a monospace font
  (e.g., `pads`).

# `conv` operator

### Restrictions

The following limitations apply to the `conv` operator of the
safety-related profile:

- the number of spatial axes of the tensors is equal to 2.

- there is no grouping (i.e., $\mbox{\texttt{group}}=1$), so the `conv`
  operator computes a standard convolution.

In the rest of the document, restrictions with respect to the ONNX
standard `conv` operator are indicated in the text with the tag.

### Signature

`Y = conv(X,W,[B])`

where

- `Y` is the output tensor of the convolution,

- `X` is the input tensor to be convolved with the convolution kernel
  `W`,

- `W` is the convolution kernel,

- `B` is the optional bias to be added to the result of the convolution.

### Specification for data types: `Y` : real, `X`: real, `W`: real, `B`: real

#### Informal specification

The `conv` operator computes the convolution of the input tensor `X`
with the “filter”, or “kernel”, `W` and adds the bias `B` to the result.

A simplified mathematical definition of the operator is given hereafter
for the 2D case, without padding and with $\mbox{\texttt{group}}=1$.
The formal specification is given in
Section 3.5. When considering padding, the
same formula applies, in which `X` represents the padded version of
the actual input `X`. 
$$\begin{gathered}
  \mbox{\texttt{Y}}[b, c, m, n] = \sum_{c_i=0}^{W.C_{in}-1} \sum_{k_h=0}^{W.H-1} \sum_{k_w=0}^{W.W-1} \\ (\mbox{\texttt{X}}[b,c_i,m \cdot \mbox{\texttt{strides[0]}}+k_h\cdot \mbox{\texttt{dilations[0]}}, n \cdot \mbox{\texttt{strides[1]}}+k_w\cdot \mbox{\texttt{dilations[1]}}] \times \mbox{\texttt{W}}[c, c_i, k_h, k_w]) \\ + \mbox{\texttt{B}}[c]
\end{gathered}$$

where

- $b$ is the batch index, $b \in [0,Y.B-1]$, and $Y.B$ is the batch size
  of the output `Y`

- $c$ is the data channel, $c \in [0,Y.C-1]$, and $Y.C$ is the number of
  data channels of the output `Y`

- $m \in [0,Y.H-1]$ (resp. $n \in [0,Y.W-1]$) is the index of the
  first (resp. second) spatial axis of the output `Y`

- $W.C_{in}$ is the number of input channels of kernel `W`

- $W.H$ and $W.W$ are the sizes of the two spatial axes of tensor `W`

- attributes `strides` and `dilations` are described later in this
  section.

The effect of the operator is depicted in the following figure.
![](https://github.com/ericjenn/working-groups/blob/ericjenn-srpwg-wg1/safety-related-profile/documents/conv_specification_example/imgs/conv.png)

#### Inputs and outputs

##### `X`

`X` is the input tensor on which the convolution with kernel `W` is
computed.

The shape of tensor `X` is $(X.B \times X.C \times X.H \times X.W)$,
where

- $X.B$ is the batch size

- $X.C$ is the number of data channels

- $X.H$ and $X.W$ are the sizes of the tensor for the two spatial axes.

###### Constraints

1. Axes of tensor `X`

   - Statement: The number of spatial axes of tensor `X` is 2.

   - Rationale: We only consider 2 spatial axes.

2. Consistency between the number of channels of `X`, `W`, `B`, and
   `Y`.

   - Statement: $X.C=Y.C=W.C_{in}=B.C$

   - Rationale: This is a particular case of the more general relation
     $X.C=Y.C=W.C_{in}\cdot\mbox{\texttt{groups}}$ when
     $\mbox{\texttt{groups}}=1$.

3. 
Consistency between the shape of tensors `X`, `W`, `Y` and
   attributes `pads`, `dilations` and `strides`

   - Statement: If parameter `pads` is not empty,

     * $$\left\lfloor \frac{L.H-(\mbox{\texttt{dilations[0]}} \times (W.H-1)+1)}{\mbox{\texttt{strides[0]}}} \right\rfloor + 1 = \mbox{\texttt{Y.H}} \mbox{ with } L.H=X.H+\mbox{\texttt{pads[0]}}+\mbox{\texttt{pads[2]}}$$

     and

     * $$\left\lfloor \frac{L.W-(\mbox{\texttt{dilations[1]}} \times (W.W-1)+1)}{\mbox{\texttt{strides[1]}}} \right\rfloor + 1 = \mbox{\texttt{Y.W}} \mbox{ with } L.W=X.W+\mbox{\texttt{pads[1]}}+\mbox{\texttt{pads[3]}}$$

   - Rationale: The size of the output is determined by the number of
     times the kernel can be applied along a given spatial axis.

4. Axis denotations

   - Statement: If axis denotation is in effect, the operation expects
     the input data tensor to arrive with the axis denotation
     \[`DATA_BATCH`, `DATA_CHANNEL`, `DATA_FEATURE`, `DATA_FEATURE`\].

   - Rationale: Denotation convention.

##### `W`

Tensor `W` is the convolution kernel. The shape of the kernel is
$(W.C_{out} \times W.C_{in} \times W.H \times W.W)$, where

- $W.C_{out}$ is the number of output channels (i.e., the number of
  feature maps),

- $W.C_{in}$ is the number of input channels,

- $W.H$ and $W.W$ are the sizes of the kernel for the two spatial axes.

###### Constraints

1. Consistency between the number of channels of `X`, `W`, `B`, and
   `Y`, [see constraint (2) of X](#channel_consist)

2. Consistency between the shape of tensors `X`, `W`, `Y` and
   attributes `pads`, `dilations` and `strides`. [See constraint (3) of X](#shape_consist)

3. Axis denotations

   - Statement: If axis denotation is in effect, the operation expects
     the weight tensor to arrive with the axis denotation
     \[`FILTER_OUT_CHANNEL`, `FILTER_IN_CHANNEL`, `FILTER_SPATIAL`,
     `FILTER_SPATIAL`\].

   - Rationale: Denotation convention.

##### `B`, optional

Tensor `B` is the optional bias. 
The bias tensor is
one-dimensional; its shape is $(B.C)$.

###### Constraints

1. Consistency between the number of channels of `X`, `W`, `B`, and
   `Y`, [see constraint (2) of X](#channel_consist)

#### Attributes

##### `strides`: list of ints (optional, default value is 1 along each spatial axis)

The `strides` attribute determines how the kernel is moved over the
input tensor `X` during the convolution.

For instance, with $\mbox{\texttt{strides}}[0]=1$ and
$\mbox{\texttt{strides}}[1]=2$, the kernel is moved by 1 unit along the
first spatial axis and by 2 units along the second spatial axis at each
step of the convolution.

This effect is illustrated in the following figure:

![](https://github.com/ericjenn/working-groups/blob/ericjenn-srpwg-wg1/safety-related-profile/documents/conv_specification_example/imgs/conv_stride.png)

###### Constraints

1. Size of `strides`

   - Statement: If non-empty, the number of elements of the `strides`
     attribute shall be equal to 2.

   - Rationale: Striding is done along each spatial axis.

2. Consistency between the shape of tensors `X`, `W`, `Y` and
   attributes `pads`, `dilations` and `strides`. [See constraint (3) of X](#shape_consist)

##### `auto_pad`: string (default is `NOTSET`)

The `auto_pad` attribute determines if and how automatic padding is
done on the input tensor `X`.

If it is not set, or set to `NOTSET`, padding is determined by the
`pads` attribute (see below). Otherwise, padding is done according to
the `auto_pad` value, as follows:

- if auto\_pad = VALID: no padding is done.

- if auto\_pad = NOTSET: padding is done according to the `pads`
  attribute. If attribute `pads` is not set, it takes its default
  value, i.e., $(0,0,0,0)$. In that case, the result is identical to
  the one that would be obtained with auto\_pad = VALID (i.e., no
  padding is done).

- if auto\_pad = SAME\_UPPER: for each axis, padding must be added so
  that [constraint (3) of X](#shape_consist) holds. 
  * $$\left\lfloor \frac{L.H-W.H}{\mbox{\texttt{strides[0]}}} \right\rfloor + 1 = \mbox{\texttt{Y.H}} \mbox{ with } L.H=X.H+pad_h$$

  and

  * $$\left\lfloor \frac{L.W-W.W}{\mbox{\texttt{strides[1]}}} \right\rfloor + 1 = \mbox{\texttt{Y.W}} \mbox{ with } L.W=X.W+pad_w$$

  where $Y.H = \lceil X.H / \mbox{\texttt{strides[0]}} \rceil$ (resp.
  $Y.W = \lceil X.W / \mbox{\texttt{strides[1]}} \rceil$), which
  determines the total padding $pad_h$ (resp. $pad_w$).

  If the total padding $pad_h$ (resp. $pad_w$) is even, then the
  padding shall be

  - $pad_h/2$ (resp. $pad_w/2$) at the beginning

  and

  - $pad_h/2$ (resp. $pad_w/2$) at the end;

  otherwise, the padding shall be

  - $\lfloor{pad_h/2} \rfloor$ (resp. $\lfloor{pad_w/2} \rfloor$) at
    the beginning

  and

  - $\lfloor{pad_h/2}\rfloor+1$ (resp. $\lfloor{pad_w/2}\rfloor+1$) at
    the end.

- if auto\_pad = SAME\_LOWER: for each axis, padding must be added so
  that [constraint (3) of X](#shape_consist) holds.

  * $$\left\lfloor \frac{L.H-W.H}{\mbox{\texttt{strides[0]}}} \right\rfloor + 1 = \mbox{\texttt{Y.H}} \mbox{ with } L.H=X.H+pad_h$$

  and

  * $$\left\lfloor \frac{L.W-W.W}{\mbox{\texttt{strides[1]}}} \right\rfloor + 1 = \mbox{\texttt{Y.W}} \mbox{ with } L.W=X.W+pad_w$$

  If the total padding $pad_h$ (resp. $pad_w$) is even, then the
  padding shall be

  - $pad_h/2$ (resp. $pad_w/2$) at the beginning

  and

  - $pad_h/2$ (resp. $pad_w/2$) at the end;

  otherwise, the padding shall be

  - $\lfloor{pad_h/2}\rfloor+1$ (resp. $\lfloor{pad_w/2}\rfloor+1$) at
    the beginning

  and

  - $\lfloor{pad_h/2} \rfloor$ (resp. $\lfloor{pad_w/2} \rfloor$) at
    the end.

The effect of the `auto_pad` attribute is illustrated in the following
figure:

![](https://github.com/ericjenn/working-groups/blob/ericjenn-srpwg-wg1/safety-related-profile/documents/conv_specification_example/imgs/autopad.png)

###### Constraints

1. Value domain

   - Statement: The value of attribute `auto_pad` shall be one of
     `"NOTSET"`, `"SAME_UPPER"`, `"SAME_LOWER"`, `"VALID"`.

2. Consistency between `pads` and `auto_pad`

   - Statement: If attribute `pads` is not empty, attribute `auto_pad`
     shall be either empty or set to `NOTSET`. 
   - Rationale: Padding is either defined by attribute `pads` or
     computed automatically, in an exclusive manner.

##### `pads`: list of ints (optional, default value is 0 along the start and end of each spatial axis)

The `pads` attribute determines the padding at the beginning and at the
end of each spatial axis of the input tensor `X`.

Each value represents the number of elements added to the beginning or
to the end of the corresponding axis. `pads` is a list of the form
(`x1_begin`, `x2_begin`,..., `x1_end`, `x2_end`,...), where `xi_begin`
is the number of elements added at the beginning of axis $i$ and
`xi_end` is the number of elements added at the end of axis $i$. If not
present, the padding defaults to 0 at the beginning and end of each
spatial axis.

The value of the elements added by the padding is 0.

The effect of padding is illustrated in the following figure:

![](https://github.com/ericjenn/working-groups/blob/ericjenn-srpwg-wg1/safety-related-profile/documents/conv_specification_example/imgs/pad.png)

###### Constraints

1. Consistency between `pads` and `auto_pad`, see
   [constraint (2)](#pads_autopad_consist)

2. Value domain

   - Statement: Each element of the `pads` list shall be greater than
     or equal to 0.

   - Rationale: A padding value gives a number of elements to be added
     along a spatial axis; this number is non-negative[^2].

3. Consistency between the shape of `X` and the length of `pads`

   - Statement: The length of the `pads` list shall be equal to 2 times
     the number of spatial axes.

   - Rationale: If padding is given, it shall be given for all spatial
     axes.

4. Length of `pads`

   - Statement: The length of the `pads` list shall be a multiple of 2.

   - Rationale: For each axis, two values must be given: one for the
     beginning and one for the end.

5. Consistency between the shape of tensors `X`, `W`, `Y` and
   attributes `pads`, `dilations` and `strides`. 
[See constraint (3) of X](#shape_consist)

##### `dilations`: list of ints (default is 1 along each spatial axis)

The `dilations` attribute specifies the spacing between the kernel
elements along each spatial axis of the filter `W`. It is a list of
strictly positive integer values, where value $i$ gives the dilation
factor for spatial axis $i$. If the dilation factor is greater than 1
for an axis $i$, then the kernel points are spaced out by the dilation
factor along this axis.

The effect of the `dilations` attribute for a tensor with two spatial
axes is depicted in the following figure:

![](https://github.com/ericjenn/working-groups/blob/ericjenn-srpwg-wg1/safety-related-profile/documents/conv_specification_example/imgs/dilation.png)

###### Constraints

1. Domain of `dilations` values

   - Statement: All values of the `dilations` attribute shall be
     integers strictly greater than 0.

   - Rationale: ONNX itself accepts dilation values equal to 0 or
     negative; such values are meaningless for a convolution and are
     excluded from the profile.

2. Relation between `dilations` and `W`

   - Statement: If the `dilations` attribute is not empty, its length
     shall be equal to the number of spatial axes of `W`.

   - Rationale: Dilations shall be defined for all spatial axes of `W`.

3. Consistency between the shape of tensors `X`, `W`, `Y` and
   attributes `pads`, `dilations` and `strides`. [See constraint (3) of X](#shape_consist)

##### `group`: int (default value is 1)

This parameter specifies the number of groups the input channels and
output channels are divided into. When `group` is set to 1, the
operator performs a standard convolution.

For information, when `group` is greater than 1, the convolution is
computed for each group separately, with a specific set of filters.

[NOTE:] The following text is kept because we may eventually allow
`group` to be greater than 1. The current limitation has been chosen to
simplify this first specification work; it may be relaxed in the actual
Safety-Related Profile. 
The effect of the `group` attribute for a tensor with two spatial axes
is depicted in the following figure:

![](https://github.com/ericjenn/working-groups/blob/ericjenn-srpwg-wg1/safety-related-profile/documents/conv_specification_example/imgs/grouped_convolution.png)

(Taken from https://eli.thegreenplace.net/2018/depthwise-separable-convolutions-for-machine-learning)

For example, with `group` set to 2 and an input `X` and an output `Y`
with 6 channels, the input and output channels will be divided into 2
groups of 3 channels.

###### Constraints

1. Value domain

   - Statement: The number of groups shall be equal to 1.

   - Rationale: Only standard convolutions are considered in the
     Safety-Related Profile.

##### `kernel_shape`: list of ints (optional, default value is the list of sizes of the spatial axes of `W`)

This parameter specifies the shape of the convolution kernel `W`.

###### Constraints

1. Value domain

   - Statement: All elements of `kernel_shape` shall be integers
     greater than or equal to 1.

   - Rationale: A size is always positive.

2. Consistency between `W` and `kernel_shape`

   - Statement: If set, the value of `kernel_shape` for a given axis
     shall be equal to the size of `W` for that axis.

   - Rationale: `kernel_shape` represents the shape of `W`.

#### Outputs

##### `Y`

The shape of the output `Y` is
$(Y.B \times Y.C \times Y.H \times Y.W)$, where

- $Y.B$ is the number of batches,

- $Y.C$ is the number of channels,

- $Y.H$ and $Y.W$ are the sizes of the output for the two spatial axes.

###### Constraints

1. Consistency between the shape of tensors `X`, `W`, `Y` and
   attributes `pads` and `strides`, [see constraint (3) of X](#shape_consist)

#### Formal specification

The following code specifies the `conv` operator using the Why3
language[^3].

###### Nota: the specification does not cover all attribute values. 
Currently, there is no padding (`pads` is not set and
`auto_pad = NOTSET`) and `dilations` is not set.

``` ocaml
module Conv
  use int.Int
  use real.RealInfix
  use array.Array
  use int.ComputerDivision
  use real.Truncate
  use real.FromInt

  type input_tensor = {
    x: array real;
    x_h: int;
    x_w: int;
    x_b: int;
    x_c: int;
  }

  type convolution_kernel = {
    w: array real;
    w_h: int;
    w_w: int;
    w_c_in: int;
    w_c_out: int;
  }

  type bias_tensor = {
    b: array real;
    b_c: int;
  }

  type attributes = {
    stride: array int;
    pads: array int;
    auto_pad: int;
    dilations: array int;
  }

  type output_tensor = {
    y_b: int;
    y_c: int;
    y_h: int;
    y_w: int;
  }

  function conv_size (out: output_tensor) : int =
    out.y_b * out.y_c * out.y_h * out.y_w

  predicate conv_result
    (inp: input_tensor)
    (kernel: convolution_kernel)
    (bias: bias_tensor)
    (attr: attributes)
    (out: output_tensor)
    (res: array real)
    (bi ci hi wi: int)
    (ci_in ki_h ki_w: int) =
    let y_idx = bi * (out.y_c * out.y_h * out.y_w) + ci * (out.y_h * out.y_w) + hi * out.y_w + wi in
    let x_h_idx = hi * attr.stride[0] + ki_h * attr.dilations[0] - attr.pads[0] in
    let x_w_idx = wi * attr.stride[1] + ki_w * attr.dilations[1] - attr.pads[1] in
    (0 <= x_h_idx < inp.x_h /\ 0 <= x_w_idx < inp.x_w) ->
    let x_idx = bi * (inp.x_c * inp.x_h * inp.x_w) + ci_in * (inp.x_h * inp.x_w) + x_h_idx * inp.x_w + x_w_idx in
    let w_idx = ci * (kernel.w_c_in * kernel.w_h * kernel.w_w) + ci_in * (kernel.w_h * kernel.w_w) + ki_h * kernel.w_w + ki_w in
    res.elts (y_idx) = bias.b[ci] +. (inp.x[x_idx] *. 
kernel.w[w_idx])

  val conv (inp: input_tensor) (kernel: convolution_kernel) (bias: bias_tensor) (attr: attributes) (out: output_tensor): array real
    (* chained equalities are split into conjunctions *)
    requires { inp.x_c = out.y_c /\ out.y_c = kernel.w_c_in /\ kernel.w_c_in = bias.b_c }
    requires { out.y_h = (div (inp.x_h + attr.pads[0] + attr.pads[2] - (attr.dilations[0] * kernel.w_h)) attr.stride[0]) + 1 }
    requires { out.y_w = (div (inp.x_w + attr.pads[1] + attr.pads[3] - (attr.dilations[1] * kernel.w_w)) attr.stride[1]) + 1 }
    requires { inp.x_h > 0 /\ inp.x_w > 0 /\ inp.x_c > 0 /\ inp.x_b > 0 }
    requires { kernel.w_h > 0 /\ kernel.w_w > 0 /\ kernel.w_c_in > 0 /\ kernel.w_c_out > 0 }
    requires { out.y_b > 0 /\ out.y_c > 0 /\ out.y_h > 0 /\ out.y_w > 0 }
    requires { length inp.x = inp.x_h * inp.x_w * inp.x_c * inp.x_b }
    requires { length kernel.w = kernel.w_h * kernel.w_w * kernel.w_c_in * kernel.w_c_out }
    requires { inp.x_h >= kernel.w_h }
    requires { inp.x_w >= kernel.w_w }
    requires { length bias.b = bias.b_c }
    requires { length attr.stride = 2 }
    requires { length attr.dilations = 2 }
    requires { length attr.pads = 4 }
    requires { forall i. 0 <= i < length attr.pads -> attr.pads[i] = 0 }
    requires { forall j. 0 <= j < length attr.dilations -> attr.dilations[j] >= 1 }
    requires { forall k. 0 <= k < length attr.stride -> attr.stride[k] >= 1 }
    ensures { length result = conv_size out }
    ensures { forall bi ci hi wi ci_in ki_h ki_w: int.
      0 <= bi < out.y_b ->
      0 <= ci < out.y_c ->
      0 <= hi < out.y_h ->
      0 <= wi < out.y_w ->
      0 <= ci_in < kernel.w_c_in ->
      0 <= ki_h < kernel.w_h ->
      0 <= ki_w < kernel.w_w -> conv_result inp kernel bias attr out result bi ci hi wi ci_in ki_h ki_w }

end


module Test_conv
  use int.Int
  use real.RealInfix
  use array.Array
  use int.ComputerDivision
  use real.Truncate
  use real.FromInt
  use Conv

  let test_conv () =
    let inp_x = Array.make 9 1.0 in
    let inp = { x = inp_x; x_h = 3; x_w = 3; x_b = 1; x_c = 1 } in

    let kernel_w = Array.make 4 0.0 in
    let kernel = { w = kernel_w; w_h = 2; w_w = 2; w_c_in = 1; w_c_out = 1 } in

    let bias_b = Array.make 1 0.5 in
    let bias = { b = bias_b; b_c = 1 } in
    let stride = Array.make 2 1 in    (* stride of 1 *)
    let pads = Array.make 4 0 in      (* no padding *)
    let dilations = Array.make 2 1 in (* dilation of 1 *)
    let attr = { stride = stride; pads = pads; auto_pad = 0; dilations = dilations } in
    let out_h = (div (inp.x_h + pads[0] + pads[2] - (dilations[0] * kernel.w_h)) stride[0]) + 1 in
    let out_w = (div (inp.x_w + pads[1] + pads[3] - (dilations[1] * kernel.w_w)) stride[1]) + 1 in
    let out = { y_b = 1; y_c = 1; y_h = out_h; y_w = out_w } in
    (* call the conv function *)
    let result = conv inp kernel bias attr out in
    let actual_result = result.elts 0 in
    assert { conv_result inp kernel bias attr out result 0 0 0 0 0 0 0 };
    assert { actual_result = 0.5 };
    ()

end
```

Another formal specification of the `conv` operator, written in the
ACSL annotation language of Frama-C[^4], is presented below. 
``` c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Data Structures */
typedef struct {
    float *x;
    int x_h, x_w, x_b, x_c;
} input_tensor;

typedef struct {
    float *w;
    int w_h, w_w, w_c_in, w_c_out;
} convolution_kernel;

typedef struct {
    float *b;
    int b_c;
} bias_tensor;

typedef struct {
    int *stride;
    int *pads;
    int *dilations;
} attributes;

typedef struct {
    float *y;
    int y_b, y_c, y_h, y_w;
} output_tensor;

/*@
  requires \valid_read(pads + (0 .. 3));
  requires \valid_read(stride + (0 .. 1));
  requires \valid(result + (0 .. 3));
  requires x_h > 0 && x_w > 0 && w_h > 0 && w_w > 0 && y_h > 0 && y_w > 0;
  assigns result[0 .. 3];

  behavior empty_or_notset:
    assumes (auto_pad == "") || (auto_pad == "NOTSET");
    ensures \forall integer i; 0 <= i < 4 ==> result[i] == pads[i];

  behavior valid:
    assumes (auto_pad == "VALID");
    ensures \forall integer i; 0 <= i < 4 ==> result[i] == 0;

  behavior same_upper:
    assumes (auto_pad == "SAME_UPPER");
    ensures \let pad_h = (y_h - 1) * stride[0] + w_h - x_h;
            \let pad_w = (y_w - 1) * stride[1] + w_w - x_w;
            (((pad_h % 2 == 0) && (pad_w % 2 == 0)) ==>
              (result[0] == (pad_w / 2) && result[1] == (pad_h / 2) && result[2] == (pad_w / 2) && result[3] == (pad_h / 2))) &&
            (((pad_h % 2 != 0) && (pad_w % 2 != 0)) ==>
              (result[0] == (pad_w / 2) && result[1] == (pad_h / 2) && result[2] == ((pad_w / 2) + 1) && result[3] == ((pad_h / 2) + 1)));

  behavior same_lower:
    assumes (auto_pad == "SAME_LOWER");
    ensures \let pad_h = (y_h - 1) * stride[0] + w_h - x_h;
            \let pad_w = (y_w - 1) * stride[1] + w_w - x_w;
            (((pad_h % 2 == 0) && (pad_w % 2 == 0)) ==>
              (result[0] == (pad_w / 2) && result[1] == (pad_h / 2) && result[2] == (pad_w / 2) && result[3] == (pad_h / 2))) &&
            (((pad_h % 2 != 0) && (pad_w % 2 != 0)) ==>
              (result[0] == ((pad_w / 2) + 1) && result[1] == ((pad_h / 2) + 1) && result[2] == (pad_w / 2) && result[3] == (pad_h / 2)));

  complete behaviors empty_or_notset, valid, 
same_upper, same_lower;
  disjoint behaviors empty_or_notset, valid, same_upper, same_lower;
*/
void compute_pad(const char* auto_pad, int pads[4], int stride[2], int x_h, int x_w, int w_h, int w_w, int y_h, int y_w, int result[4]) {
    int pad_h, pad_w;

    /* strcmp is used here: comparing a `char *` with `==` would compare
       pointers, not string contents. */
    if ((strcmp(auto_pad, "") == 0) || (strcmp(auto_pad, "NOTSET") == 0)) {
        for (int i = 0; i < 4; i++) {
            result[i] = pads[i];
        }
    } else if (strcmp(auto_pad, "VALID") == 0) {
        for (int i = 0; i < 4; i++) {
            result[i] = 0;
        }
    } else if (strcmp(auto_pad, "SAME_UPPER") == 0) {
        pad_h = (y_h - 1) * stride[0] + w_h - x_h;
        pad_w = (y_w - 1) * stride[1] + w_w - x_w;

        if ((pad_h % 2 == 0) && (pad_w % 2 == 0)) {
            result[0] = result[2] = pad_w / 2;
            result[1] = result[3] = pad_h / 2;
        } else if ((pad_h % 2 != 0) && (pad_w % 2 != 0)) {
            result[0] = pad_w / 2;
            result[1] = pad_h / 2;
            result[2] = (pad_w / 2) + 1;
            result[3] = (pad_h / 2) + 1;
        }
    } else if (strcmp(auto_pad, "SAME_LOWER") == 0) {
        pad_h = (y_h - 1) * stride[0] + w_h - x_h;
        pad_w = (y_w - 1) * stride[1] + w_w - x_w;

        if ((pad_h % 2 == 0) && (pad_w % 2 == 0)) {
            result[0] = result[2] = pad_w / 2;
            result[1] = result[3] = pad_h / 2;
        } else if ((pad_h % 2 != 0) && (pad_w % 2 != 0)) {
            result[0] = (pad_w / 2) + 1;
            result[1] = (pad_h / 2) + 1;
            result[2] = pad_w / 2;
            result[3] = pad_h / 2;
        }
    }
}

//void compute_pad(int auto_pad, int *pads, int *stride, int x_h, int x_w, int w_h, int w_w, int y_h, int y_w, int *result);

/* Function to compute the convolution */
/*@
  requires \valid_read(inp.x + (0..(inp.x_h*inp.x_w*inp.x_b*inp.x_c)-1));
  requires \valid_read(kernel.w + (0..(kernel.w_h*kernel.w_w*kernel.w_c_in*kernel.w_c_out)-1));
  requires \valid_read(bias.b + (0..bias.b_c-1));
  requires \valid_read(attr.stride+(0..1));
  requires \valid_read(attr.pads+(0..3));
  requires \valid_read(attr.dilations+(0..1));
  requires inp.x_c == out.y_c;
  requires out.y_c == kernel.w_c_in;
  requires kernel.w_c_in == bias.b_c;
  requires out.y_h == ((inp.x_h + attr.pads[0] + 
attr.pads[2] - (attr.dilations[0] * kernel.w_h)) / attr.stride[0]) + 1;
  requires out.y_w == ((inp.x_w + attr.pads[1] + attr.pads[3] - (attr.dilations[1] * kernel.w_w)) / attr.stride[1]) + 1;
  requires inp.x_h > 0 && inp.x_w > 0 && inp.x_c > 0 && inp.x_b > 0;
  requires kernel.w_h > 0 && kernel.w_w > 0 && kernel.w_c_in > 0 && kernel.w_c_out > 0;
  requires bias.b_c > 0;
  requires out.y_h > 0 && out.y_w > 0 && out.y_c > 0 && out.y_b > 0;
  requires inp.x_h >= kernel.w_h;
  requires inp.x_w >= kernel.w_w;
  requires \forall integer i; 0 <= i < 4 ==> attr.pads[i] >= 0;
  requires \forall integer i; 0 <= i < 2 ==> attr.dilations[i] >= 1;
  requires \forall integer i; 0 <= i < 2 ==> attr.stride[i] >= 1;

  assigns out.y[0..(out.y_b * out.y_c * out.y_h * out.y_w)-1];

  ensures \forall integer bi, ci, hi, wi, ci_in, ki_h, ki_w;
    0 <= bi < out.y_b ==>
    0 <= ci < out.y_c ==>
    0 <= hi < out.y_h ==>
    0 <= wi < out.y_w ==>
    0 <= ci_in < kernel.w_c_in ==>
    0 <= ki_h < kernel.w_h ==>
    0 <= ki_w < kernel.w_w ==>
    (0 <= hi * attr.stride[0] + ki_h * attr.dilations[0] - attr.pads[0]) &&
    (hi * attr.stride[0] + ki_h * attr.dilations[0] - attr.pads[0] < inp.x_h) &&
    (0 <= wi * attr.stride[1] + ki_w * attr.dilations[1] - attr.pads[1]) &&
    (wi * attr.stride[1] + ki_w * attr.dilations[1] - attr.pads[1] < inp.x_w) ==>
    out.y[bi * (out.y_c * out.y_h * out.y_w) + ci * (out.y_h * out.y_w) + hi * out.y_w + wi] == inp.x[bi * (inp.x_c * inp.x_h * inp.x_w) + ci_in * (inp.x_h * inp.x_w) + (hi * attr.stride[0] + ki_h * attr.dilations[0] - attr.pads[0]) * inp.x_w + (wi * attr.stride[1] + ki_w * attr.dilations[1] - attr.pads[1])] * kernel.w[ci * (kernel.w_c_in * kernel.w_h * kernel.w_w) + ci_in * (kernel.w_h * kernel.w_w) + ki_h * kernel.w_w + ki_w] + bias.b[ci];
*/
float* conv(input_tensor inp, convolution_kernel kernel, bias_tensor bias, attributes attr, output_tensor out) {

    out.y_h = ((inp.x_h + attr.pads[0] + attr.pads[2] - (attr.dilations[0] * kernel.w_h)) / 
attr.stride[0]) + 1;
    out.y_w = ((inp.x_w + attr.pads[1] + attr.pads[3] - (attr.dilations[1] * kernel.w_w)) / attr.stride[1]) + 1;
    int y_size = out.y_b * out.y_c * out.y_h * out.y_w;
    out.y = (float *)malloc(y_size * sizeof(float));

    if (out.y == NULL) {
        fprintf(stderr, "Memory allocation failed\n");
        exit(EXIT_FAILURE);
    }

    // Compute padding
    // compute_pad(attr.auto_pad, attr.pads, attr.stride, inp.x_h, inp.x_w, kernel.w_h, kernel.w_w, out.y_h, out.y_w, attr.pads);

    // Initialize the result tensor to the bias values
    for (int bi = 0; bi < out.y_b; ++bi) {
        for (int ci = 0; ci < out.y_c; ++ci) {
            for (int hi = 0; hi < out.y_h; ++hi) {
                for (int wi = 0; wi < out.y_w; ++wi) {
                    int y_idx = bi * (out.y_c * out.y_h * out.y_w) + ci * (out.y_h * out.y_w) + hi * out.y_w + wi;
                    out.y[y_idx] = bias.b[ci];
                }
            }
        }
    }

    // Convolution computation
    for (int bi = 0; bi < out.y_b; ++bi) {
        for (int ci = 0; ci < out.y_c; ++ci) {
            for (int hi = 0; hi < out.y_h; ++hi) {
                for (int wi = 0; wi < out.y_w; ++wi) {
                    int y_idx = bi * (out.y_c * out.y_h * out.y_w) + ci * (out.y_h * out.y_w) + hi * out.y_w + wi;

                    for (int ci_in = 0; ci_in < kernel.w_c_in; ++ci_in) {
                        for (int ki_h = 0; ki_h < kernel.w_h; ++ki_h) {
                            for (int ki_w = 0; ki_w < kernel.w_w; ++ki_w) {
                                int x_h_idx = hi * attr.stride[0] + ki_h * attr.dilations[0] - attr.pads[0];
                                int x_w_idx = wi * attr.stride[1] + ki_w * attr.dilations[1] - attr.pads[1];

                                if (x_h_idx >= 0 && x_h_idx < inp.x_h && x_w_idx >= 0 && x_w_idx < inp.x_w) {
                                    int x_idx = bi * (inp.x_c * inp.x_h * inp.x_w) + ci_in * (inp.x_h * inp.x_w) + x_h_idx * inp.x_w + x_w_idx;
                                    int w_idx = ci * (kernel.w_c_in * kernel.w_h * kernel.w_w) + ci_in * (kernel.w_h * kernel.w_w) + ki_h * kernel.w_w + ki_w;

                                    if (x_idx < inp.x_h * inp.x_w * inp.x_c * inp.x_b && w_idx < kernel.w_h * kernel.w_w * kernel.w_c_in * kernel.w_c_out) {
                                        out.y[y_idx] += inp.x[x_idx] * kernel.w[w_idx];
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }

    return out.y;
}
```

# Graph execution semantics

[NOTE:] Elements of the execution semantics are given on the [IR
(Intermediate Representation)
page](https://onnx.ai/onnx/repo-docs/IR.html) of the ONNX web site. In
addition, a Python “reference implementation” is also provided (see ).
The source code of this implementation can be found at .

Very informally, the semantics is pretty simple: each operator (or
function) is called according to its position in the topological
sorting of the operators. The topological order is a partial order that
ensures that an operator is executed only when its inputs are
available. Since it is only a partial order, several valid total orders
may exist for a given graph. Normally (?) each order should generate
the same result, even in the presence of floating-point operations.

The Python code that executes a graph is given in class
[`ReferenceEvaluator`](https://github.com/onnx/onnx/blob/main/onnx/reference/reference_evaluator.py).
After having processed the inputs and the initializers (i.e., fed the
`results` dictionary with these data), the nodes are executed in
sequence. For each operator, the interpreter checks that its inputs are
in the `results` dictionary. If they are not, an error is raised (if
the operators are listed in topological order, this situation should
not occur). Otherwise, the operator is simply executed (method `run`),
with or without a context (composed of the current results) depending
on the type of operator. (Check that this does not create a dependency
on the total order of operators.)

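The execution loop described above can be sketched in plain Python. This is only a minimal illustration of the semantics, not the actual `ReferenceEvaluator`: the `execute_graph` helper, the `(op, inputs, outputs)` node tuples and the operator callables are simplifications introduced for the example.

``` python
def execute_graph(nodes, ops, inputs, initializers):
    """Run `nodes` (assumed topologically sorted) over a results dictionary.

    `nodes` is a list of (op_name, input_names, output_names) tuples and
    `ops` maps operator names to plain Python callables.
    """
    results = dict(inputs)
    results.update(initializers)  # feed inputs and initializers first
    for op_name, in_names, out_names in nodes:
        missing = [n for n in in_names if n not in results]
        if missing:  # cannot happen if the nodes are topologically sorted
            raise RuntimeError(f"inputs not yet computed: {missing}")
        outs = ops[op_name](*(results[n] for n in in_names))
        if len(out_names) == 1:
            outs = (outs,)
        results.update(zip(out_names, outs))
    return results

# Tiny example: y = (a + b) * c
ops = {"Add": lambda u, v: u + v, "Mul": lambda u, v: u * v}
nodes = [("Add", ["a", "b"], ["t"]), ("Mul", ["t", "c"], ["y"])]
res = execute_graph(nodes, ops, {"a": 1.0, "b": 2.0, "c": 4.0}, {})
```

Note that, in this sketch, any valid topological order of `nodes` yields the same `results` dictionary; whether this still holds for arbitrary floating-point graphs is exactly the question raised above.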
+ +### Informal specification + +
+ +[NOTE:] The semantics of an ONNX model is given in Section "Model Semantics" of +the [Intermediate +Representation](https://github.com/onnx/onnx/blob/main/docs/IR.md) page. +Basically, an inference-model is a stateless function (except possibly +for some specific nodes such as a random-generation node) represented by +an acyclic `graph` of nodes. The `graph` is mainly represented by a set +of inputs and outputs and a topologically sorted list of nodes. Each +node represents a call to an operator or a function. A `function` is +itself a graph. + +Note that the types of inputs and outputs are not systematically +required because they can be inferred. In our case, we will likely +forbid shape inference and rely on static tensor shapes (or, at least, +shape inference can be done before serializing the model). The process +of shape inference is described in Section [ONNX Shape +Inference](https://onnx.ai/onnx/repo-docs/ShapeInference.html).
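As an illustration of what relying on static shapes amounts to: once shapes are explicit, the output dimensions of a node such as `Conv` can be computed directly from the attributes, without executing the model. A sketch along one spatial axis (the function name and defaults are ours; the formula follows the ONNX `Conv` operator documentation for the default `auto_pad=NOTSET` case):

```python
def conv_output_dim(in_dim, pad_begin, pad_end, kernel, stride=1, dilation=1):
    # Effective kernel extent once dilation is taken into account.
    effective_kernel = dilation * (kernel - 1) + 1
    # Floor division, per the ONNX output-shape formula.
    return (in_dim + pad_begin + pad_end - effective_kernel) // stride + 1

# A 3x3 kernel with padding 1 preserves a 5x5 input ("same" convolution):
print(conv_output_dim(5, 1, 1, kernel=3))            # 5
# Stride 2 roughly halves the spatial size:
print(conv_output_dim(5, 1, 1, kernel=3, stride=2))  # 3
```

With static shapes in the model, such a computation becomes a consistency check instead of an inference step.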
+ +### Formal specification + +*To be completed.* + +[^1]: At least in a first phase... + +[^2]: Note: in the ONNX runtime implementation, the padding value may be + negative, which corresponds to reducing the size of the tensor. + +[^3]: See [Why3 documentation](https://www.why3.org/) + +[^4]: See [Frama-C + documentation](https://www.frama-c.com/html/documentation.html) diff --git a/safety-related-profile/documents/conv_specification_example/imgs/README.md b/safety-related-profile/documents/conv_specification_example/imgs/README.md new file mode 100644 index 00000000..4a389a0a --- /dev/null +++ b/safety-related-profile/documents/conv_specification_example/imgs/README.md @@ -0,0 +1 @@ +Images used in the CONV specification example diff --git a/safety-related-profile/documents/conv_specification_example/imgs/autopad.png b/safety-related-profile/documents/conv_specification_example/imgs/autopad.png new file mode 100644 index 00000000..614f6942 Binary files /dev/null and b/safety-related-profile/documents/conv_specification_example/imgs/autopad.png differ diff --git a/safety-related-profile/documents/conv_specification_example/imgs/conv.png b/safety-related-profile/documents/conv_specification_example/imgs/conv.png new file mode 100644 index 00000000..c9a13486 Binary files /dev/null and b/safety-related-profile/documents/conv_specification_example/imgs/conv.png differ diff --git a/safety-related-profile/documents/conv_specification_example/imgs/conv_stride.png b/safety-related-profile/documents/conv_specification_example/imgs/conv_stride.png new file mode 100644 index 00000000..b8172b2c Binary files /dev/null and b/safety-related-profile/documents/conv_specification_example/imgs/conv_stride.png differ diff --git a/safety-related-profile/documents/conv_specification_example/imgs/dilation.png b/safety-related-profile/documents/conv_specification_example/imgs/dilation.png new file mode 100644 index 00000000..69a332c8 Binary files /dev/null and 
b/safety-related-profile/documents/conv_specification_example/imgs/dilation.png differ diff --git a/safety-related-profile/documents/conv_specification_example/imgs/grouped_convolution.png b/safety-related-profile/documents/conv_specification_example/imgs/grouped_convolution.png new file mode 100644 index 00000000..dfdf868c Binary files /dev/null and b/safety-related-profile/documents/conv_specification_example/imgs/grouped_convolution.png differ diff --git a/safety-related-profile/documents/conv_specification_example/imgs/pad.png b/safety-related-profile/documents/conv_specification_example/imgs/pad.png new file mode 100644 index 00000000..6abaccb4 Binary files /dev/null and b/safety-related-profile/documents/conv_specification_example/imgs/pad.png differ diff --git a/safety-related-profile/documents/conv_specification_example/imgs/why3_workflow.png b/safety-related-profile/documents/conv_specification_example/imgs/why3_workflow.png new file mode 100644 index 00000000..97ceb89e Binary files /dev/null and b/safety-related-profile/documents/conv_specification_example/imgs/why3_workflow.png differ diff --git a/safety-related-profile/documents/issues.md b/safety-related-profile/documents/issues.md new file mode 100644 index 00000000..efd381f6 --- /dev/null +++ b/safety-related-profile/documents/issues.md @@ -0,0 +1,232 @@ +# ONNX analysis + +## Issue #TEMPLATE: Title of the issue + +- CAT: category in {__OPERATOR__ , __GRAPH__ , __FORMAT__} +- CRIT: criticality in { __LOW__ , __HIGH__ } +- REQ: _Identification of the SONNX requirement that can't be satisfied due to this issue_ +- LOC: __Identification of the exact location in the standard, possibly using a hyperlink__ + +### Issue +__Description of the issue (in a synthetic way)__ + +### Consequence +__Brief description of the effect of the issue on the development activities.__ + +### Proposal +__Proposal to solve the issue or mitigate the consequences__ + +### Remarks (opt) +__Additional remarks__ + + +## Issue #1:
incomplete specification of CONV operator +- CAT: Operator +- CRI: HIGH +- REQ: (TBC) +- LOC: [CONV operator](https://onnx.ai/onnx/operators/onnx__Conv.html), but this issue appears in other operators + +### Issue +The description of the CONV operator is very abstract: "The convolution operator consumes an input tensor and a filter, and computes the output.". + +The value of the padding is not defined (it is actually 0). + +The presentation of attributes makes it difficult to check whether all dependencies are expressed. + +### Consequence +The implementer needs to check the reference implementation (or other documentation) to understand what needs to be implemented. Different implementations may lead to different results. + +### Proposal +See [example](https://github.com/ericjenn/working-groups/tree/ericjenn-srpwg-wg1/safety-related-profile/documents/conv_specification_example) + +## Issue #2: Execution order of graph nodes +- CAT: GRAPH +- CRI: LOW +- REQ: (TBC) +- LOC: [Open Neural Network Exchange Intermediate Representation (ONNX IR) Specification](https://github.com/onnx/onnx/blob/main/docs/IR.md) + +### Issue +The ONNX specification states that "Each computation dataflow graph is structured as a topologically sorted list of nodes that form a graph, which MUST be free of cycles. [...] ". The topological order is a partial order. + +### Consequence +Different implementations may execute nodes in different orders, leading to different results when computations are done using floating point numbers. + +### Proposal +The SONNX standard should provide a means to impose a total ordering on the nodes. + +### Remarks +This constraint will prevent some optimisations. +Note that nothing prevents a model from being ill-formed. Compliance with the syntax and semantics of the ONNX standard must be checked (it is certainly checked, but nothing is said about what is checked or not and whether these checkers are complete / correct or not).
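The order sensitivity underlying this issue can be exhibited with plain IEEE 754 arithmetic; two valid topological orders that accumulate the same values in a different grouping may produce different results:

```python
# Floating-point addition is not associative: the grouping (and hence the
# execution order of the nodes producing these partial sums) changes the
# result.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)
print(left_to_right == right_to_left)  # False
print(left_to_right, right_to_left)    # 0.6000000000000001 0.6
```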
+ +Other constraints are given in the [onnx-ml.proto3](https://github.com/onnx/onnx/blob/main/onnx/onnx-ml.proto3). E.g.: + + // One FunctionProto can reference other FunctionProto in the model, however, recursive reference + // is not allowed. + +## Issue #3: Overloading + +- CAT: (to be completed) +- CRIT: (to be completed) +- REQ: (to be completed) +- LOC: ONNX file format definition ([onnx-ml.proto3](https://github.com/onnx/onnx/blob/main/onnx/onnx-ml.proto3)) + +### Issue +A [function in ONNX](https://onnx.ai/onnx/intro/concepts.html) is a way to reuse the same combination of operators at different locations in a model. A function may refer to operators that are in a different opset than the model itself. In that case, the standard leaves the runtimes the freedom to choose whether the local function or the standard operator set takes precedence: + + // The (domain, name, overload) tuple must be unique across the function protos in this list. + // In case of any conflicts the behavior (whether the model local functions are given higher priority, + // or standard operator sets are given higher priotity or this is treated as error) is defined by + // the runtimes. + +### Consequence +(TBC) + +### Proposal +(TBC) + +### Remarks (opt) +(TBC) + + +## Issue #4: opset resolution + +- CAT: (to be completed) +- CRIT: (to be completed) +- REQ: (to be completed) +- LOC: ONNX file format definition ([onnx-ml.proto3](https://github.com/onnx/onnx/blob/main/onnx/onnx-ml.proto3)) + +### Issue +A [function in ONNX](https://onnx.ai/onnx/intro/concepts.html) is a way to reuse the same combination of operators at different locations in a model. A function may refer to operators that are in a different opset than the model itself. In that case, the standard leaves the runtimes the freedom to choose whether the local function or the standard operator set takes precedence: + + // The (domain, name, overload) tuple must be unique across the function protos in this list.
+ // In case of any conflicts the behavior (whether the model local functions are given higher priority, + // or standard operator sets are given higher priotity or this is treated as error) is defined by + // the runtimes. + +### Consequence +(TBC) + +### Proposal +(TBC) + +### Remarks +As per the ONNX IR documentation: "each function contained in a Model (also referred to as a model-local function) serves as a default or fallback implementation of the corresponding operator. A runtime, however, may choose to use an alternative implementation of the operator (usually as an optimized kernel). As such, the unique name of a function is significant as it is implicitly associated with a semantic specification." + +From IR version 10 onwards, a function is uniquely identified by the triple (domain, name, overload). + + +## Issue #4: Function polymorphism + +- CAT: (TBC) +- CRI: (TBC) +- REQ: (TBC) +- LOC: [Shape inference in ONNX](https://onnx.ai/onnx/api/shape_inference.html), this also covers type inference. + +### Issue +Function tensor input and output data type and shape are not specified by the ONNX format. Actual types are obtained using type inference. + +### Consequence +The actual types and shapes are not explicit in the model: they can only be known once the input tensors are known. + +### Proposal +Enforce that all tensor data types and shapes are explicit in the model. + +### Remarks (opt) + +## Issue #5: Shape broadcasting + +- CAT: (TBC) +- CRI: (TBC) +- REQ: (TBC) +- LOC: [Broadcasting in ONNX](https://github.com/onnx/onnx/blob/main/docs/Broadcasting.md) + +### Issue +The broadcasting logic is insufficiently specified in the ONNX documentation.
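For reference, ONNX multidirectional broadcasting follows the NumPy rules: shapes are aligned on their trailing dimensions, and two dimensions are compatible when they are equal or one of them is 1. A sketch of the rule (our own paraphrase of the documentation, not normative text):

```python
from itertools import zip_longest

def broadcast_shape(shape_a, shape_b):
    # Align shapes from the trailing dimensions, padding the shorter
    # shape with 1s; each dimension pair must be equal or contain a 1.
    result = []
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a == b or a == 1 or b == 1:
            result.append(max(a, b))
        else:
            raise ValueError(f"incompatible dimensions: {a} vs {b}")
    return tuple(reversed(result))

print(broadcast_shape((2, 3, 4, 5), (5,)))    # (2, 3, 4, 5)
print(broadcast_shape((4, 5), (2, 1, 1, 5)))  # (2, 1, 4, 5)
```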
+ +### Consequence +(TBC) + +### Proposal +(TBC) + +### Remarks +About broadcasting, see the [dedicated section of the NumPy manual](https://numpy.org/doc/stable/user/basics.broadcasting.html#general-broadcasting-rules) + +## Issue #6: Overloading + +- CAT: (TBC) +- CRI: (TBC) +- REQ: (TBC) +- LOC: See section on Functions in [Open Neural Network Exchange Intermediate Representation (ONNX IR) Specification](https://github.com/onnx/onnx/blob/main/docs/IR.md) + +### Issue +IR version 10 introduces overloading, i.e. the capability to have several definitions for the same function, and select them using a new ‘overload’ field. + + +### Consequence +(TBC) + +### Proposal +(TBC) + + +## Issue #7: Variadic inputs + +- CAT: (TBC) +- CRI: (TBC) +- REQ: (TBC) +- LOC: + +### Issue +Variadic operators can accept any number of inputs. + + +### Consequence +(TBC) + +### Proposal +Introduce a maximal number of parameters. + +### Remarks + +## Issue #8: Dynamic (Node input) vs static (Node attribute) + +- CAT: (TBC) +- CRI: (TBC) +- REQ: (TBC) +- LOC: + +### Issue +The semantics does not make clear that a Node input is dynamic whereas a Node attribute is static. +Since attributes can take Tensor values, these values might appear to come from other Nodes (constant or not). + + + +### Consequence +(TBC) + +### Proposal +Introduce a maximal number of parameters. + +### Remarks + + +## Issue #9: Data storage + +- CAT: (TBC) +- CRI: LOW +- REQ: (TBC) +- LOC: [onnx-ml.proto3](https://github.com/onnx/onnx/blob/main/onnx/onnx-ml.proto3#L611) + +### Issue + +### Consequence +(TBC) + +### Proposal + +### Remarks + +ONNX supports __typed__ and __raw__ serialization of data. When __raw_data__ is used, the standard specifies that data must be stored in "fixed-width, little endian order" with other specific constraints for float64, complex64, etc. data. + +Do we need to specify our own encoding format?
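The fixed-width, little-endian layout mentioned above can be illustrated with Python's `struct` module (an illustration of the byte order only; the actual protobuf encapsulation is not shown):

```python
import struct

# Two float32 values serialized as fixed-width little-endian bytes
# ('<' selects little-endian, 'f' a 4-byte IEEE 754 float).
raw = struct.pack("<2f", 1.0, 2.5)
print(raw.hex())  # 0000803f00002040

# Deserialization recovers the values regardless of the host byte order.
print(struct.unpack("<2f", raw))  # (1.0, 2.5)
```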
diff --git a/safety-related-profile/documents/sow.md b/safety-related-profile/documents/sow.md new file mode 100644 index 00000000..2085ad0f --- /dev/null +++ b/safety-related-profile/documents/sow.md @@ -0,0 +1,396 @@ +# Introduction + +## Terms + +The following terms are used in the document: + +- **Trained ML model**: the conceptual structure of graphs and + operators that maps a set of input tensors to a set of output + tensors. The Trained ML Model Description (TMD) is represented using + the *Trained Model Description Language* (TMDL). + +- **Trained ML Model Description** (TMD): a concrete representation of + the conceptual trained ML Model that can be interpreted by a human + or a program. + +- **Trained ML Model Description** **Language** (TMDL): The language + used to represent a *Trained ML Model* (TMD). + +## Contents + +This document gives a first definition of the activities to be carried +out by the ONNX safety-related profile workgroup. This "workplan" is +aimed at being presented *and discussed* during the Workgroup kick-off +meeting planned for the end of September. + +## Main objectives + +Provide a definition of the formalism used to represent a trained ML +model\... + +- *\... that has an understandable and non ambiguous syntax and + semantics.* + + The description of the ML model expressed using this formalism must + be a Low level Requirement for the implementation phase in the sense + that its interpretation and implementation shall not require any + further information than the one given by the description of the ML + model. + +- *\... that allows multiple levels of accuracy and precision for the + description of a given model.* + + The language used to describe the model (i.e., its syntax and + semantics) must be non ambiguous, but a model may be ambiguous if + this ambiguity is acceptable or even necessary to leave some freedom + to the implementer (e.g., for optimization).
The objective is to + identify, control, and possibly remove this ambiguity by an + appropriate refinement of the ML model description. + +# Workgroup Activities + +This section presents the different activities of the workgroup. Their +dependencies are expressed via their inputs/outputs. + +### A1 - Elicitation of industrial needs and requirements + +#### Objectives + +- Elicit end-users needs related to the ONNX format, i.e., What are + the activities using ONNX models?, What are the evidences required + by certification authorities that involve the ONNX model[^1]?, How + does the ONNX model impact these activities? + +- Elicit requirements applicable to the ONNX standard to satisfy the + end-users needs. Those requirements shall cover all aspects of the + standard, including documentation, graphs and operators semantics, + file format, reference implementation, etc. + +#### Rationales {#rationales .unnumbered} + +Clarify the expectations of end-users. Ensure that the requirements for +ONNX are traceable to one or several end-users' needs. + +#### Inputs + +- Certification standards (e.g., ARP6983, ISO/DPAS 8800, + ECSS-E-HB-40-02A DIR1, etc.) + +- Company-specific requirements + +#### Outputs + +- D1.a.\: End users needs and requirements for domain \. + +#### Detailed activities + +The activities defined below are per domain.
+ +##### End-Users Needs Elicitation + +- UNAct1: Definition of the *Trained Model Description (TMD)* artefact + (e.g., the Machine Learning Model Description (MLMD) in ARP6983) + +- UNAct2: Description (overview) of the machine learning development + process + +- UNAct3: Description of the development process objectives and + activities that: + - Produce the TMD + - Take the TMD as input + +- UNAct4: Description of the development process verification + objectives and activities that apply to the TMD + +- UNAct5: Constraints on the TMD, that come from: + - the Development and verification activities + - the Industrial context + +- UNAct6: Expression of the needs + +##### *ONNX Requirements Expression* + +The activities below take the end-user needs as inputs. + +- ORAct1: Definition of the list of the aspects to which the + requirements for a safety-related ONNX profile will apply, e.g., + - Semantics of the operators + - Semantics of the graph + - Data types + - Metamodel + - Concrete syntax (format) + - Documentation + - Traceability + - Versioning + - etc. + +- ORAct2: For each aspect of the list, definition of the requirements + *Examples of requirements that may be expressed:* + - The semantics of the Trained Model Description Language (TMDL) used + to describe the TMD shall be defined both informally (for + documentation purposes) and formally using a mathematically-grounded + language. This covers all that is needed for tooled and/or human + interpretation of any valid TMD described using the TMDL (including, + e.g., operators and graphs). + - The formal definition of the TMDL shall define precisely and + accurately the expected results of the interpretation of any valid + TMDL model. The level of precision and accuracy may be a parameter + of the description of the semantics. + - A reference implementation shall be provided for each operator.
The + reference implementation shall be accompanied by all the necessary + information describing the execution environment used to validate + compliance with the formal specification. + - In the TMD, it should be possible to indicate the meaning of each + dimension of the tensors. + +### A2: Consolidate Requirements for the ONNX profile + +#### Objectives + +- Consolidate, filter, and prioritize the requirements identified for + the different industrial domains in D1.a.\. + +- Discriminate requirements aimed at the preservation of the model + semantics from requirements aimed at facilitating / supporting other + development assurance activities. + +#### Rationale + +The ONNX Safety-related profile must be unique, whereas the needs come +from different industrial domains, referring to different certification +standards. This activity is aimed at defining a consensual and +consistent set of requirements. + +#### Inputs + +- D1.a.\: End users needs and requirements for domain \. + +#### Outputs + +- (D2.a) ONNX safety-related Profile Requirements + +#### Detailed activities + +##### Consolidation of requirements + +- CRAct1: Consolidation and fusion of semantically equivalent + requirements + +- CRAct2: Grouping and prioritization of requirements + +### A3: Definition of the Scope of the ONNX Safety related profile + +#### Objectives + +- Selection of the set of operators and constructs to be considered in + the Safety-related profile. + +#### Rationale + +In order to keep the effort reasonable and maximize our chances to produce +useful results within a reasonable time frame, we propose to work on a +restricted set of operators and constructs. This set will be defined +according to the actual needs of the end-users (i.e., the models they +want to implement).
+ +#### Inputs + +- End user use cases + +#### Outputs + +- (D3.a) Safety-related Profile Scope Definition + +#### Detailed activities + +##### Definition of the Safety-related Standard Scope + +- DSCAct1: Identification/definition of the Safety-related industrial + use case reference models + +- DSCAct2: Extraction of the operators and constructs from the + Safety-related industrial use cases reference models + +- DSCAct3: Consolidation of the TMDL operators and constructs for the + Safety-related profile, from the reference models possibly augmented + with necessary additional operators and constructs. + +### A4: Analysis of the ONNX standard + +#### Objectives + +- Identify the parts of the standard that need to be updated / + clarified / modified in order to comply with the Safety-related + Profile Requirements defined in D2.a, for the subset of the standard + identified in D3.a. + +#### Rationales + +Once the requirements for the format are defined, the work consists in +finding what needs to be described, improved, fixed,\... in the existing +ONNX standard. In particular, all elements that are unclear or whose +interpretation is left to the implementer shall be spotted, analysed, +discussed, and a clarification/correction may be proposed +if required. These proposals may be applicable to a whole part of the +standard (e.g., a recommendation may concern the documentation of all +operators). + +#### Inputs + +- (D2.a) Requirements applicable to the ONNX profile + +- (D3.a) Safety-related Profile Scope Definition + +#### Outputs + +- (D4.a) ONNX Analysis and Recommendations for the Safety-related + Profile + +#### Detailed activities + +##### Analysis of the ONNX standard + +- AnaAct1: Analysis of the compliance of the ONNX standard with + respect to each of the requirements and identification of + non-compliances. + +- AnaAct2: Provision of recommendations, solutions, guidance to modify + the ONNX standard.
+ +### A5: Elaboration of the specification guidelines + +#### Objectives + +State of the Art and proposal of guidelines for the specification of the +graph and operators to comply with D4.a. + +#### Rationales + +Various approaches and notations may be used to specify the graph and +operators in a formal way. This activity is aimed at proposing a +solution acceptable with respect to the end-users needs and +requirements. + +#### Inputs + +- (D2.a) Requirements applicable to the ONNX profile + +#### Outputs + +- (D5.a) Specification Guidelines + +#### Detailed activities + +##### Prototype guidelines + +- ProAct1: Elaborate a first set of (informal + formal) specification + guidelines and apply them on a few operators (e.g., conv) and + constructs in order to be discussed and reviewed by the workgroup + +##### Elaborate guidelines + +- ElaAct1: Elaborate the final set of guidelines (including + notation, presentation of the specification, etc., to ensure a + consistent presentation of the specification) + +### A6: Development of the ONNX Safety-related profile - semantics + +#### Objectives + +Development of the ONNX Safety-related profile semantics to address +issues identified in (D4.a), using the formalism and approach defined in +D5.a. + +#### Inputs + +- (D4.a) ONNX Analysis and Recommendations for the Safety-related + Profile + +- (D5.a) Formal Specification Guidelines + +#### Outputs + +- (D6.a) ONNX Safety-related profile (graph execution part) + +- (D6.b) ONNX Safety-related profile (operators part) + +- (D6.c) ONNX Safety-related profile reference implementation + +Detailed activities *To be completed.* + +### A7: Development of the ONNX Safety-related profile - format + +#### Objectives + +Development of the ONNX Safety-related exchange format in compliance +with the recommendations given in (D4.a).
+ +#### Inputs + +- (D4.a) ONNX Analysis for the Safety-related Profile + +- ONNX standard + +#### Outputs + +- (D7.a) ONNX Safety-related profile format + +#### Detailed activities + +*To be completed.* + +### A8: Verification of the ONNX Safety-related profile + +#### Objectives + +Verification of the ONNX Safety-related profile vs the requirements +expressed in (D2.a), the recommendations identified in (D4.a) and the +guidelines defined in (D5.a) + +#### Inputs + +- (D4.a) ONNX Analysis and Recommendations for the Safety-related + Profile + +- (D5.a) Formal Specification Guidelines + +- (D6.a) ONNX Safety-related profile (graph execution part) + +- (D6.b) ONNX Safety-related profile (operators part) + +- (D6.c) ONNX Safety-related profile reference implementation + +- (D7.a) ONNX Safety-related profile format + +#### Outputs + +- (D8.a) ONNX Safety-related profile verification report + +Detailed activities *To be completed.* + +### A9: Validation of the ONNX Safety-related profile + +#### Objectives + +Validation of the ONNX Safety-related profile vs the End-users needs +expressed in the various per domain D1.a.\ documents. + +#### Inputs + +- D1.a.\: End users needs and requirements for domain \. + +- (D6.a) ONNX Safety-related profile (graph execution part) + +- (D6.b) ONNX Safety-related profile (operators part) + +- (D6.c) ONNX Safety-related profile reference implementation + +- (D7.a) ONNX Safety-related profile format + +#### Outputs + +- (D9.a) ONNX Safety-related profile validation report + +Detailed activities *To be completed.* + +[^1]: E.g., those concerning the MLMD in the ARP6983/ED-324. 
diff --git a/safety-related-profile/meetings/2024-08-21.md b/safety-related-profile/meetings/2024-08-21.md new file mode 100644 index 00000000..24e42700 --- /dev/null +++ b/safety-related-profile/meetings/2024-08-21.md @@ -0,0 +1,68 @@ +# SONNX meeting (21/08) + +## Object + +- ONNX Safety-related workgroup meeting + +## Attendees + +- Mariem, Nicolas, Augustin, Eric + +## Minutes + +- Status of actions (see below) +- Brief recap of the work done on the CONV operator (Mariem) + - The Why3 specification is now complete thanks to Loïc. A Frama-C implementation is also available. + + + - [ ] <21/08> (Mariem) Convert the CONV operator specification to markdown. + - After discussion with Loïc, the appropriate solution would be first to “axiomatize” tensors (in the same way it has already been done for matrices). + + + - [ ] <21/08> (Mariem, eric, Loïc) Propose a first axiomatization of tensors (3D, 4D) and use it to specify the CONV operator. +- Discussion of Nicolas’ comments on the SOW + - Four points were discussed + - We have introduced new terms (TMD, TMDL) that are neither used in the ONNX community (where the term “ONNX IR” is used) nor in the ARP (where the term “MLMD” is used). We have either to stick to one of those — possibly ONNX’ — or at least explain the correspondence between them… + - Nicolas mentioned that we have to “address the semantic of the model storage (ONNX IR for safety critical) […]”. + - Yes, this should be covered by the activities concerning the semantics and the syntax of the model description language. Nicolas also raised the question about discriminating the model used during the design and the model used as the starting point of the implementation.
*But it is not clear whether we really need to discriminate them: we are only concerned by the semantics of the model, whatever its use.* However, we will not consider operators used during training, so our profile will only be applicable during inference… + - Nicolas was not happy with the remark about “ambiguous” models (sentence “[…] ambiguous if this ambiguity is acceptable or even necessary to leave some freedom to the implementer […]): *I do not see any reason for that. The model shall be unambiguous. If the implementer chooses to implement a surrogate model (optimized), then this model is the MLMD to be specified using ONNX. ARP6983 specifies that replication criteria shall be specified*. + - This is an important point. The model shall actually be non ambiguous. The user may introduce derived requirements about implementation (e.g., the order in which operators have to be executed). This contingency is what was meant by the term “ambiguity”. We have to be more explicit in the SoW: the specification will not be ambiguous, but there will be some “flexibility” about the set of derived requirements applicable to a given model. + + - [ ] <21/08> (Eric+Jean) In the SoW, avoid the use of the term “ambiguity”. Be more explicit about the different types of requirements. Introduce the concept of “replication criteria” and clarify its relation to requirements. + - Nicolas mentioned that “All the activities consist in paper work. Is there some room for POC for formal verification, extension of the definition of the format, tools to build, review, verify a SONNX MLMD ? Who are the users/recipients of the outputs ? (I would guess that some output are for us, some others for ONNX community...).” + - Only the first part of the remark was discussed: we actually need some “practical” work in order to validate our proposal.
We should introduce a “running example” (e.g., MobileNet) that will be specified using SONNX, implemented, compared with a reference implementation, etc. Those “practical” implementation activities have to be described in the SoW too. + + + - [ ] <21/08> (Eric+Jean) Add to the SoW activities related to the validation of the WG results through its application to a practical use case +- Discussion about polymorphic operators: *we have to provide a specific specification for every type supported by an operator,* i.e., there will be a specification of the CONV operator for int8, float16, etc. This is in particular necessary for integer types, for which saturation is required. We should do the exercise on the CONV operator. + + + - [ ] <21/08> (Mariem, Loïc, Eric, Augustin) Specify the CONV operator for int8 and, e.g., float16. +- We have to specify the broadcasting rules (see [https://github.com/onnx/onnx/blob/main/docs/Broadcasting.md](https://github.com/onnx/onnx/blob/main/docs/Broadcasting.md)). +- Logistics + - All documents should be in the same place. At the moment, some documents are on the IRT’s sharepoint while some others are in the ONNX github. Office documents were placed in the Sharepoint because they were directly editable by some users. For others (e.g., Airbus), Sharepoint documents need to be downloaded first, which makes Sharepoint pointless. + - [X] <21/08> (Eric) Convert the SOW to markdown, move all documents to github + - Documents have moved [here](https://github.com/ericjenn/working-groups/tree/ericjenn-srpwg-wg1/safety-related-profile/documents). + - The Sharepoint has moved in order to match the new project name, which is “SONNX”.
The URL is now: [https://extranet.irt-saintexupery.com/Extranet/Projets/SONNX/Pages/default.aspx](https://extranet.irt-saintexupery.com/Extranet/Projets/SONNX/Pages/default.aspx) + - The github repository is accessible at [https://github.com/ericjenn/working-groups/tree/ericjenn-srpwg-wg1/safety-related-profile](https://github.com/ericjenn/working-groups/tree/ericjenn-srpwg-wg1/safety-related-profile). + + - [ ] <21/08> (eric+Jean) Clarify the use of github… + + +## Actions + +- [x] (eric) Put the examples of mails (in French and English) to be used for dissemination purposes. + +- [x] (eric) Propose a poll for the project’s name, (all) answer the poll + - (Updated 21/08) SONNX + +- [ ] (eric+jean) Propose an agenda for the KOM + +- [ ] (all) Read the SOW and provide your feedback +- (Updated 21/08) Comments by Nicolas, see doc. + +- [x] (Nicolas) Collect the analysis done so far and put it in one document (on sharepoint or on the ONNX git, it’s up to you) + - Done, the document is on the sharepoint ([https://extranet.irt-saintexupery.com/Extranet/Projets/SONNX/Work/_layouts/15/WopiFrame.aspx?sourcedoc=/Extranet/Projets/SONNX/Work/Documents partages/SoW/ONNX format comments v0.1.docx&action=default](https://extranet.irt-saintexupery.com/Extranet/Projets/SONNX/Work/_layouts/15/WopiFrame.aspx?sourcedoc=/Extranet/Projets/SONNX/Work/Documents%20partages/SoW/ONNX%20format%20comments%20v0.1.docx&action=default)) + +- [ ] <10/07>(eric+jean) Organize some meetings as “advisory boards” +- [x] <21/08> (Eric) Convert the SOW to markdown diff --git a/safety-related-profile/scratchpad/onnx_arp.md b/safety-related-profile/scratchpad/onnx_arp.md new file mode 100644 index 00000000..0b1d73dd --- /dev/null +++ b/safety-related-profile/scratchpad/onnx_arp.md @@ -0,0 +1,96 @@ + +## Glossary: +- MLMD: Machine Learning Model Description +- MLMID: Machine Learning model Implementation Description + +## Definitions: +- ML MODEL: Model characterized by
parameters determined by a data-driven training process to perform one or more
+specific functions.
+- ML constituent: An AI/ML Constituent is a collection of hardware and/or software items (including the necessary pre- and post-processing elements), and at least one specialized hardware or software item containing one (or several) ML Model(s). (see ARP, p.7).
+
+## Elements:
+- The MLDL is the input of the ML items implementation process.
+  - The (MLMD, ML constituent requirements, test dataset) are the inputs of the implementation phase (see ARP, Figure 20).
+  - The MLMD only covers processing expressed using the ML graph. (Pre- and post-processing are done by non-ML constituents.)
+
+_There is no clear definition of the MLMD besides the implicit one for which the MLMD is "the description of the ML model that is necessary to implement the inference process". Note that in the glossary, the entry for MLMD is "TBD"._
+
+- Verification activities of the MLMD are listed in §6.8.1.5:
+  - ML Model Description is complete, correct, and consistent.
+  - ML Model Description is traceable to ML Model. <>
+  - ML Model additional information is complete, correct, and consistent. <>
+  - ML Model Description, ML Data processing description, and ML Model additional information are consistent.
+
+_There are no constraints about the performance "achieved by" the MLMD (i.e., the performance that would be achieved by some perfect implementation of the MLMD). The only verification objective relating the ML model and the MLMD concerns "traceability" and "consistency"._
+
+_Nothing is said about the evaluation of the performance of the ML model. This evaluation may depend on the execution target (a desktop PC, a desktop PC with a GPU, a computer in the cloud) and framework.
If the actual (measured) performance of the ML model (during the design phase) is a requirement, the measurement process should be defined precisely._
+
+- The SONNX standard defines the language in which the MLMD is expressed.
+  - SONNX is an "extension" of the ONNX standard.
+  - ONNX has **not** been targeted towards safety-critical applications.
+  - SONNX is aimed at completing the ONNX standard to address the concerns of safety-related systems.
+  - SONNX shall define the constructs (operators, graph) and the semantics of those constructs.
+  - A SONNX MLMD shall give an unambiguous functional specification of the constructs.
+  - This definition is necessary both to ensure that developers will be able to implement the model correctly, and to
+  - It may also capture derived requirements such as the explicit ordering of operators when the ordering imposed by the dataflow semantics is not unique.
+  - The specification of operators is purely functional.
+  - For a given operator, it describes what the operator shall do.
+  - In some cases, the specification may be operational and describe how to calculate the result. But this shall normally not preclude other implementations as long as the computed results are identical.
+  - "Identity" of results may be achievable when considering mathematical abstractions such as "exact" operations performed on real numbers. But in most cases, this cannot be achieved, even if the operational description of the algorithm is followed scrupulously.
+  - This means that an actual implementation may produce results that are "slightly" different from those that would be produced by a mathematically strictly exact implementation.
+  - The correctness criterion must be expressed by a maximal acceptable error.
+  - This could be part of the MLMD specification (i.e., the implementer is allowed to produce an implementation that is epsilon-off from the mathematically exact value).
+  - In some cases, it may be impossible to formally estimate the actual value of epsilon. In that case, epsilon may be estimated by testing, comparing the result of the implementation with that of the reference implementation.
+  - But the error of the reference implementation with respect to the exact implementation must itself be estimated...
+  - In some cases, the implementation may be requested to strictly follow the operational specification (not in terms of results, but in terms of operations).
+  - This is a derived requirement that shall be captured explicitly.
+
+  - SONNX shall facilitate development and verification activities (possibly by introducing specific annotations).
+
+- "The performance of the MLMD is verified against the test dataset on the target environment" (ARP)
+  - To do this verification, we need to interpret (or "execute") the model for a given set of inputs.
+  - We have to provide everything that is necessary to do this interpretation.
+  - This includes:
+    - an **executable specification** of the operators and of the graph,
+    - a specification or a clear designation of the interpretation infrastructure (including the compiler, the libraries, if any, and the target machine).
+  - This executable specification is what we call a "reference implementation".
+  - There is an existing ONNX reference implementation, but (as indicated by ONNX) it is incomplete.
+  - The reference implementation shall be
+    - simple enough to facilitate its implementation / verification,
+    - demonstrated to comply with the SONNX semantics.
+
+- Several properties (stability, robustness, ...) shall be verified during the MLDL part of the process (see §6 of the ARP).
+  - If we want to take credit for these verifications (do we?), we have to demonstrate that those properties are preserved throughout the implementation process.
+  - Are the elements determining those properties captured by the MLMD?
(If not, this means that the MLMD cannot carry the constraints about those properties; in that case, what should be added?)
+
+- An MLMD is transformed into one or several MLMIDs.
+
+- The MLMID shall be demonstrated to comply with the MLMD.
+  - This can be achieved
+    - by testing the MLMID on the test dataset and showing that the performances are the same as for the MLMD,
+    - by demonstrating that the MLMID correctly implements the MLMD (by testing).
+
+- We have to demonstrate that the MLMID is traceable to the MLMD. This may be achieved by providing an MLMD-to-MLMID mapping.
+  - If optimizations (e.g., fusion, ...) are performed during the transformation process, those transformations shall be described so as to restore traceability (i.e., to prove the preservation of the semantics).
+  - In the simplest case, each operator in the implementation can be traced directly to an operator in the MLMD. In the ARP, the parts of the model to be mapped to a specific target are called "ML model (logical) elements". Several "logical elements" may be deployed on the same HW target. Therefore, one MLMID may concern several logical elements.
+
+
+- We have to demonstrate that the "combination of MLMIDs semantics reproduces the semantics of the MLMD without introducing new or unexpected behaviours".
+  - Verification will (usually) be done using testing, which means that the MLMD must be interpretable (or "executable").
+
+- Ensuring this property can be done by construction, for instance by introducing derived requirements about, e.g., the execution sequencing and verifying that these requirements are actually applied, and/or by verification.
+
+- The derived requirements for one MLMID are captured in the MLPADR.
+
+- One MLMID specifies one SW item.
+
+- Derived requirements may be added during the refinement process. We may want those requirements to be captured by the SONNX profile. For instance, the mapping of parts of the MLMD to targets may be captured thanks to dedicated annotations.
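The epsilon-based acceptance criterion discussed above can be illustrated with a small sketch. The example below is purely illustrative and not part of any standard: the function names and the epsilon value are invented for the example. It compares a "candidate" implementation against a "reference" one computing the same mathematical function with a different floating-point accumulation order, and checks that the deviation stays within the stated tolerance.

```python
# Illustrative sketch (not part of SONNX): checking a candidate
# implementation against a reference implementation with an epsilon
# tolerance. All names and the epsilon value are hypothetical.
import random

def reference_sum(values):
    """Reference: left-to-right accumulation, in a fully specified order."""
    acc = 0.0
    for v in values:
        acc += v
    return acc

def candidate_sum(values):
    """Candidate: pairwise accumulation, as an optimized library might do."""
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return candidate_sum(values[:mid]) + candidate_sum(values[mid:])

random.seed(0)
inputs = [random.uniform(-1.0, 1.0) for _ in range(1000)]

# Floating-point addition is not associative, so the two results may
# differ slightly even though both implementations are "correct".
deviation = abs(reference_sum(inputs) - candidate_sum(inputs))

# Conformance criterion: the deviation must not exceed the epsilon
# stated in the MLMD (the value below is purely illustrative).
EPSILON = 1e-10
assert deviation <= EPSILON
```

This also illustrates the caveat raised above: the comparison only bounds the deviation with respect to the reference implementation, whose own error with respect to the mathematically exact result would still have to be estimated separately.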
+ + + + + + +