
Native Image Build Bundles #5473


Draft: Native Image Build Bundles

Motivation

  1. The deployment of a native image is just one step in the lifecycle of an application or service. Real-world
    applications run for years and sometimes need to be updated or patched long after deployment (e.g. security fixes). It would be great if there were an easy way to redo an image build at some point in the future, as accurately as possible.

  2. Another angle is provided by the development phase. If image building fails or a malfunctioning image is created (i.e. the same application runs fine when executed on the JVM), we would like to get bug reports that allow us to reproduce the problem locally without spending hours replicating the user's setup. We want a way to bundle up what the user built (or tried to build) into a single package that allows us to instantly reproduce the problem on our side.

  3. Debugging an image created a long time ago is also sometimes needed. It would be great if there were a single bundle that contains everything needed to perform this task.

Build Bundles

A set of options should be added to the native-image command that allows users to create so-called "build bundles" that
help with the problems described above. There shall be

native-image --bundle-create=mybundle.nib ...other native-image arguments...

This will instruct native-image to create build bundle mybundle.nib alongside the image.

For example, after running:

native-image --bundle-create=alaunch.nib -Dlaunchermode=2 -EBUILD_ENVVAR=env42 \
 -p somewhere/on/my/drive/app.launcher.jar:/mnt/nfs/server0/mycorp.base.jar \
 -cp $HOME/ourclasses:somewhere/logging.jar:/tmp/other.jar:aux.jar \
 -m app.launcher/paw.AppLauncher alaunch

the user sees the following build results:

~/foo$ ls
alaunch.output alaunch.nib somewhere aux.jar

As we can see, an alaunch.nib file and an alaunch.output directory were created alongside the build. The .nib file is the native image build bundle for the image that got built, and alaunch.output is a directory holding the actual image and any additional files created as part of image building. At any time later, if the same version of GraalVM is used, the image can be rebuilt
with:

native-image --bundle-apply=.../path/to/alaunch.nib

This will rebuild the alaunch image with the same image arguments, environment variables, system-property
settings, class-path and module-path options as in the initial build.

To support the use case of image-building-as-a-service, there should also be a way to create a bundle without
performing the initial build locally. This allows users to offload image building to a cloud service specialized in
image building and in retaining build bundles. The command line for that should be:

native-image --bundle-create=mybundle.nib --dry-run ...other native-image arguments...

Build Bundles File Format

A <imagename>.nib file is a regular jar-file that contains all information needed to reproduce a previous build.
For example, the alaunch.nib build bundle has the following inner structure:

alaunch.nib
├── input
│   ├── auxiliary <- Contains auxiliary files passed to native-image via arguments
│   │                (e.g. external `config-*.json` files or PGO `*.iprof`-files)
│   ├── classes <- Contains all class-path and module-path entries passed to the builder
│   │   ├── cp
│   │   │   ├── aux.jar
│   │   │   ├── logging.jar
│   │   │   ├── other.jar
│   │   │   └── ourclasses
│   │   └── p
│   │       ├── app.launcher.jar
│   │       └── mycorp.base.jar
│   └── stage
│       ├── all.env <- All environment variables used in the image build
│       ├── all.properties  <- All system properties passed to the builder
│       ├── build.cmd <- Full native-image command line (minus --bundle-create option)
│       ├── run.cmd <- Arguments to run the application on java (for the launcher, see below)
│       └── container <- For containerized builds this subdirectory holds all related info
│           ├── Dockerfile <- Container image that was used to perform the build
│           ├── run.cmd <- Arguments passed to docker/podman to run the container
│           └── setup.json <- Info about the docker/podman setup that was used
│                             * Linux kernel version
│                             * Docker/podman version
│                             * CGroup v2 or CGroup v1
├── output
│   ├── debug
│   │   ├── alaunch.debug <- Native debuginfo for the built image.
│   │   └── sources <- Reachable sources needed for native debugging.
│   └── build
│       └── report <- Contains information about the build process.
│           │         When rebuilding, these will be compared against. 
│           ├── analysis_results.json
│           ├── build_artifacts.json
│           ├── build.log
│           ├── build_output.json
│           ├── jni_access_details.json
│           └── reflection_details.json
├── META-INF
│   ├── MANIFEST.MF <- Specifies nibundle/Launcher as main class
│   └── nibundle.properties <- Contains build bundle version info:
│                     * build bundle format version
│                     * Platform the bundle was created on (e.g. linux-amd64) 
│                     * GraalVM / Native-image version used for build
└── nibundle
    └── Launcher.class <- Launcher for running of application with `java`
                          (uses files from input directory)

As we can see, there are several components in a build bundle that we need to describe in more detail.

META-INF:

Since the bundle is also a regular jar-file, we have a META-INF subdirectory with the familiar MANIFEST.MF. The
bundle can be used like a regular jar-launcher (by running java -jar <imagename>.nib) so that the
application we build an image from is executed on the JVM instead. For that purpose the MANIFEST.MF specifies
nibundle/Launcher as the main class. This is particularly useful if you want to run the application on the JVM with the native-image agent to collect configuration data that you then integrate into the bundle as a second step.
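
For example, configuration data might be collected by running the bundle on the JVM with the native-image tracing agent (the config-output-dir path here is just a placeholder):

java -agentlib:native-image-agent=config-output-dir=/path/to/agent-config -jar alaunch.nib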

Here we also find nibundle.properties. This file is specific to build bundles; its existence makes clear that this is no
ordinary jar-file but a native image build bundle. The file contains version information about the native image build
bundle format itself and also records which GraalVM version was used to create the bundle. This can later be used to report a
warning if a bundle is rebuilt with a GraalVM version different from the one used to create it.
The file also contains information about the platform the bundle was created on (e.g. linux-amd64 or
darwin-aarch64).
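
As an illustration only (the property names below are placeholders, not a defined format), nibundle.properties could look like:

BundleFileVersionMajor=0
BundleFileVersionMinor=9
Platform=linux-amd64
NativeImageVersion=<GraalVM version used to create the bundle>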

input:

This directory contains all the information needed to redo the previous image build. The original
class-path and module-path entries are copied into the input/classes/cp folder (original -cp/--class-path entries) and the
input/classes/p folder (original -p/--module-path entries), either as jar-files or as subdirectories (for
directory-based class/module-path entries). The input/stage folder contains all information
needed to replicate the previous build context.

input/stage:

Here we have build.cmd, which contains all native-image command line options used in the previous build. Note that
even the initial build that created the bundle already uses a class- and/or module-path that refers to the contents
of the input/classes folder. This way we can guarantee that a bundle rebuild sees exactly the same relocated
class/module-path entries as the initial build. The use of run.cmd is explained later.

The file all.env contains the environment variables that we allowed the builder to see during the initial build, and
all.properties contains the respective system properties.
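
For the alaunch example above, these stage files might look as follows (an illustrative sketch only; the exact file layout is not fixed by this draft):

all.env:        BUILD_ENVVAR=env42
all.properties: launchermode=2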

input/stage/container:

If the image builder runs in a container environment, this subdirectory holds all information necessary to redo the
image build later in an equivalent container environment. It contains the Dockerfile that was used to specify the
container image that executed the image builder. Next, run.cmd contains all the arguments that were passed to
docker/podman; it does not contain the arguments passed to the builder. In setup.json we save all information about
the container environment that was used (Linux kernel version, CGroup v1 or v2, Docker/podman version). For more info
see below.
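
As a sketch of the kind of data setup.json could capture (the key names and values are hypothetical, not a defined schema):

{
  "containerEngine": "podman",
  "containerEngineVersion": "4.3.1",
  "linuxKernelVersion": "5.15.0",
  "cgroupVersion": 2
}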

output:

This folder contains all the output that was generated by the image build process (if the image was built as part of bundle creation). This includes the debuginfo needed in case we have to debug the image at some point in the future.

output/build:

This folder documents the build process that led to the image that was created alongside the bundle.
The report sub-folder holds build.log, which is equivalent to what would have been created if the user had appended
|& tee build.log to the original native-image command line. Additionally, we have several json-files:

  • analysis_results.json: Contains the results of the static analysis. A rerun should compare the new
    analysis_results.json file with this one and report deviations in a user-friendly way.
  • build_artifacts.json: Contains a list of the artifacts that got created during the initial build. As before,
    changes should be reported to the user.
  • build_output.json: Similar information as build.log but more structured and detailed.
  • jni_access_details.json: Overview of which methods/classes/fields have been made JNI-accessible at image run time.
  • reflection_details.json: The same kind of information for reflection access at image run time.

As already mentioned, a rebuild should compare its newly generated set of json-files against the ones in the bundle and
report deviations from the originals in a user-friendly way.

nibundle:

Contains the Launcher.class that is used when the bundle is run as a regular java launcher. The class-file is not
specific to a particular bundle. Instead, the Launcher class extracts the contents of the input directory into a temporary
subdirectory in $TEMP and uses the files from input/stage/all.* and input/stage/run.cmd to invoke
$JAVA_HOME/bin/java with the environment variables and the arguments (e.g. system properties) needed to run the
application on the JVM.
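
A minimal sketch of what such a Launcher could look like, assuming the bundle contents were already extracted to a temporary directory, that run.cmd holds one java argument per line, and that all.env holds KEY=VALUE pairs, one per line (all three are assumptions of this sketch, not part of the proposed bundle format):

import java.io.IOException;
import java.nio.file.*;
import java.util.*;

public class Launcher {
    public static void main(String[] args) throws IOException, InterruptedException {
        // Hypothetical location the bundle contents were extracted to (extraction code omitted).
        Path stage = Paths.get(System.getProperty("java.io.tmpdir"), "nibundle", "input", "stage");

        // Assemble $JAVA_HOME/bin/java plus the arguments recorded in run.cmd.
        String javaHome = Objects.requireNonNull(System.getenv("JAVA_HOME"), "JAVA_HOME must be set");
        List<String> command = new ArrayList<>();
        command.add(Paths.get(javaHome, "bin", "java").toString());
        command.addAll(Files.readAllLines(stage.resolve("run.cmd")));
        command.addAll(Arrays.asList(args)); // pass program arguments through

        ProcessBuilder pb = new ProcessBuilder(command).inheritIO();

        // Start from an empty environment and only expose the variables recorded in all.env.
        pb.environment().clear();
        for (String line : Files.readAllLines(stage.resolve("all.env"))) {
            int eq = line.indexOf('=');
            if (eq > 0) {
                pb.environment().put(line.substring(0, eq), line.substring(eq + 1));
            }
        }

        System.exit(pb.start().waitFor());
    }
}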

Enforced sanitized image building

Containerized image building on supported platforms

If available, docker/podman should be used to run the image builder inside a well-defined container image. This allows
us to prevent the builder from using the network during the image build, thus guaranteeing that the image build result does
not depend on some unknown (and therefore unreproducible) network state. Another advantage is that we can mount
input/classes and $GRAALVM_HOME read-only into the container and only allow read-write access to the mounted out
and build directories. This prevents application code that runs at image build time from messing with anything
other than those directories. All information about containerized building is recorded in the bundle subdirectory
input/stage/container.
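
As an illustration of the kind of invocation that could be recorded in input/stage/container/run.cmd (the paths, image name, and builder arguments are placeholders), a sanitized build might be launched roughly like this:

podman run --rm --network=none \
 -v /path/to/bundle/input/classes:/bundle/input/classes:ro \
 -v $GRAALVM_HOME:/graalvm:ro \
 -v /path/to/bundle/output:/bundle/output \
 nib-builder-image <builder arguments>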

Fallback for systems without container support

If containerized builder execution is not possible, we can still at least have the builder run with a sanitized
set of environment variables and make sure that only those environment variables are visible that were explicitly
specified with -E<env_var_name>=<env_var_value> or -E<env_var_name> (the latter to allow passing a variable through from the
surrounding environment).
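
For example, a bundle-creating build that exposes one explicit value and passes one variable through from the surrounding environment could look like this (the passed-through variable name is chosen purely for illustration):

native-image --bundle-create=alaunch.nib -EBUILD_ENVVAR=env42 -EHTTP_PROXY ...other native-image arguments...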

Handling of Image build errors

To ensure build bundles are useful for the second use case described above, we have to make sure a
bundle gets successfully created even if the image build fails. Most likely in this case the out folder will be
missing in the bundle, but as usual build/report/build.log will contain all the command line output that was shown
during the image build. This also includes any error messages that led to the build failure.
