
Native Image Build Bundles #5473


Draft: Native Image Build Bundles

Motivation

  1. The deployment of a native image is just one step in the lifecycle of an application or service. Real-world
    applications run for years and sometimes need to be updated or patched long after deployment (e.g. security fixes). It would be great if there were an easy way to redo an image build at some point in the future, as accurately as possible.

  2. Another angle is provided by the development phase. If image building fails or a malfunctioning image is created (i.e. the same application runs fine when executed on the JVM), we would like to get bug reports that allow us to reproduce the problem locally without spending hours replicating the user's setup. We want a way to bundle up what the user built (or tried to build) into a single package that allows us to instantly reproduce the problem on our side.

  3. Debugging an image created a long time ago is also sometimes needed. It would be great if there were a single bundle that contains everything needed to perform this task.

Build Bundles

A set of options should be added to the native-image command that allows users to create so-called "build bundles" that
help with the problems described above. There shall be

native-image --bundle-create=mybundle.nib ...other native-image arguments...

This will instruct native-image to create build bundle mybundle.nib alongside the image.

For example, after running:

native-image --bundle-create=alaunch.nib -Dlaunchermode=2 -EBUILD_ENVVAR=env42 \
 -p somewhere/on/my/drive/app.launcher.jar:/mnt/nfs/server0/mycorp.base.jar \
 -cp $HOME/ourclasses:somewhere/logging.jar:/tmp/other.jar:aux.jar \
 -m app.launcher/paw.AppLauncher alaunch

the user sees the following build results:

~/foo$ ls
alaunch.output alaunch.nib somewhere aux.jar

As we can see, an alaunch.nib file and an alaunch.output directory were created alongside the build. The .nib file is the native image build bundle for the image that got built, and alaunch.output is a directory holding the actual image and any additional files created as part of image building. At any time later, if the same version of GraalVM is used, the image can be rebuilt
with:

native-image --bundle-apply=.../path/to/alaunch.nib

This will rebuild the alaunch image with the same image arguments, environment variables, system-property
settings, class-path and module-path options as in the initial build.

To support the use case of image-building-as-a-service, there should also be a way to create a bundle without
performing the initial build locally. This allows users to offload image building to a cloud service specialized in
image building and in retaining build bundles. The command line for that should be:

native-image --bundle-create=mybundle.nib --dry-run ...other native-image arguments...

Build Bundles File Format

A <imagename>.nib file is a regular jar-file that contains all information needed to reproduce a previous build.
For example, the alaunch.nib build bundle has the following inner structure:

alaunch.nib
├── input
│   ├── auxiliary <- Contains auxiliary files passed to native-image via arguments
│   │                (e.g. external `config-*.json` files or PGO `*.iprof`-files)
│   ├── classes <- Contains all class-path and module-path entries passed to the builder
│   │   ├── cp
│   │   │   ├── aux.jar
│   │   │   ├── logging.jar
│   │   │   ├── other.jar
│   │   │   └── ourclasses
│   │   └── p
│   │       ├── app.launcher.jar
│   │       └── mycorp.base.jar
│   └── stage
│       ├── all.env <- All environment variables used in the image build
│       ├── all.properties  <- All system properties passed to the builder
│       ├── build.cmd <- Full native-image command line (minus --bundle-create option)
│       ├── run.cmd <- Arguments to run the application on java (for the launcher, see below)
│       └── container <- For containerized builds this subdirectory holds all related info
│           ├── Dockerfile <- Container image that was used to perform the build
│           ├── run.cmd <- Arguments passed to docker/podman to run the container
│           └── setup.json <- Info about the docker/podman setup that was used
│                             * Linux kernel version
│                             * Docker/podman version
│                             * CGroup v2 or CGroup v1
├── output
│   ├── debug
│   │   ├── alaunch.debug <- Native debuginfo for the built image.
│   │   └── sources <- Reachable sources needed for native debugging.
│   └── build
│       └── report <- Contains information about the build process.
│           │         When rebuilding, these will be compared against. 
│           ├── analysis_results.json
│           ├── build_artifacts.json
│           ├── build.log
│           ├── build_output.json
│           ├── jni_access_details.json
│           └── reflection_details.json
├── META-INF
│   ├── MANIFEST.MF <- Specifies nibundle/Launcher as main class
│   └── nibundle.properties <- Contains build bundle version info:
│                     * build bundle format version
│                     * Platform the bundle was created on (e.g. linux-amd64) 
│                     * GraalVM / Native-image version used for build
└── nibundle
    └── Launcher.class <- Launcher for running of application with `java`
                          (uses files from input directory)

As we can see, there are several components in a build bundle that we need to describe in more detail.

META-INF:

Since the bundle is also a regular jar-file, we have a META-INF subdirectory with the familiar MANIFEST.MF. The
bundle can be used like a regular jar-launcher (by running java -jar <imagename>.nib) so that the
application we build an image from is executed on the JVM instead. For that purpose the MANIFEST.MF specifies
nibundle/Launcher as the main class. This is particularly useful if you want to run the application on the JVM with the native-image agent to collect configuration data that you then integrate into the bundle as a second step.
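
For example, configuration data might be collected by running the bundle on the JVM with the native-image tracing agent (the config-output-dir path here is just a placeholder):

java -agentlib:native-image-agent=config-output-dir=/path/to/agent-config -jar alaunch.nib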

Here we also find nibundle.properties. This file is specific to build bundles; its existence makes clear that this is no
ordinary jar-file but a native image build bundle. The file contains version information about the native image build
bundle format itself and also records which GraalVM version was used to create the bundle. This can later be used to report a
warning if a bundle is rebuilt with a GraalVM version different from the one used to create it.
The file also contains information about the platform the bundle was created on (e.g. linux-amd64 or
darwin-aarch64).
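
As an illustration only (the property names below are placeholders, not a defined format), nibundle.properties could look like:

BundleFileVersionMajor=0
BundleFileVersionMinor=9
Platform=linux-amd64
NativeImageVersion=<GraalVM version used to create the bundle>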

input:

This directory contains all the information needed to redo the previous image build. The original
class-path and module-path entries are copied into the input/classes/cp folder (original -cp/--class-path entries) and the
input/classes/p folder (original -p/--module-path entries), either as jar-files or as subdirectories (for
directory-based class/module-path entries). The input/stage folder contains all information
needed to replicate the previous build context.

input/stage:

Here we have build.cmd, which contains all native-image command line options used in the previous build. Note that
even the initial build that created the bundle already uses a class- and/or module-path that refers to the contents
of the input/classes folder. This way we can guarantee that a bundle rebuild sees exactly the same relocated
class/module-path entries as the initial build. The use of run.cmd is explained later.

The file all.env contains the environment variables that we allowed the builder to see during the initial build, and
all.properties contains the respective system properties.
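
For the alaunch example above, these stage files might look as follows (an illustrative sketch only; the exact file layout is not fixed by this draft):

all.env:        BUILD_ENVVAR=env42
all.properties: launchermode=2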

input/stage/container:

If the image builder runs in a container environment, this subdirectory holds all information necessary to redo the
image build later in an equivalent container environment. It contains the Dockerfile that was used to specify the
container image that executed the image builder. Next, run.cmd contains all the arguments that were passed to
docker/podman; it does not contain the arguments passed to the builder. In setup.json we save all information about
the container environment that was used (Linux kernel version, CGroup v1 or v2, Docker/podman version). For more info
see below.
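
As a sketch of the kind of data setup.json could capture (the key names and values are hypothetical, not a defined schema):

{
  "containerEngine": "podman",
  "containerEngineVersion": "4.3.1",
  "linuxKernelVersion": "5.15.0",
  "cgroupVersion": 2
}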

output:

This folder contains all the output that was generated by the image build process (if the image was built as part of bundle creation). This includes the debuginfo needed in case we have to debug the image at some point in the future.

output/build:

This folder documents the build process that led to the image that was created alongside the bundle.
The report sub-folder holds build.log, which is equivalent to what would have been created if the user had appended
|& tee build.log to the original native-image command line. Additionally, we have several json-files:

  • analysis_results.json: Contains the results of the static analysis. A rerun should compare the new
    analysis_results.json file with this one and report deviations in a user-friendly way.
  • build_artifacts.json: Contains a list of the artifacts that got created during the initial build. As before,
    changes should be reported to the user.
  • build_output.json: Similar information as build.log but more structured and detailed.
  • jni_access_details.json: Overview of which methods/classes/fields have been made JNI-accessible at image run time.
  • reflection_details.json: The same kind of information for reflection access at image run time.

As already mentioned, a rebuild should compare its newly generated set of json-files against the ones in the bundle and
report deviations from the originals in a user-friendly way.

nibundle:

Contains the Launcher.class that is used when the bundle is run as a regular java launcher. The class-file is not
specific to a particular bundle. Instead, the Launcher class extracts the contents of the input directory into a temporary
subdirectory in $TEMP and uses the files from input/stage/all.* and input/stage/run.cmd to invoke
$JAVA_HOME/bin/java with the environment variables and the arguments (e.g. system properties) needed to run the
application on the JVM.
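
A minimal sketch of what such a Launcher could look like, assuming the bundle contents were already extracted to a temporary directory, that run.cmd holds one java argument per line, and that all.env holds KEY=VALUE pairs, one per line (all three are assumptions of this sketch, not part of the proposed bundle format):

import java.io.IOException;
import java.nio.file.*;
import java.util.*;

public class Launcher {
    public static void main(String[] args) throws IOException, InterruptedException {
        // Hypothetical location the bundle contents were extracted to (extraction code omitted).
        Path stage = Paths.get(System.getProperty("java.io.tmpdir"), "nibundle", "input", "stage");

        // Assemble $JAVA_HOME/bin/java plus the arguments recorded in run.cmd.
        String javaHome = Objects.requireNonNull(System.getenv("JAVA_HOME"), "JAVA_HOME must be set");
        List<String> command = new ArrayList<>();
        command.add(Paths.get(javaHome, "bin", "java").toString());
        command.addAll(Files.readAllLines(stage.resolve("run.cmd")));
        command.addAll(Arrays.asList(args)); // pass program arguments through

        ProcessBuilder pb = new ProcessBuilder(command).inheritIO();

        // Start from an empty environment and only expose the variables recorded in all.env.
        pb.environment().clear();
        for (String line : Files.readAllLines(stage.resolve("all.env"))) {
            int eq = line.indexOf('=');
            if (eq > 0) {
                pb.environment().put(line.substring(0, eq), line.substring(eq + 1));
            }
        }

        System.exit(pb.start().waitFor());
    }
}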

Enforced sanitized image building

Containerized image building on supported platforms

If available, docker/podman should be used to run the image builder inside a well-defined container image. This allows
us to prevent the builder from using the network during the image build, thus guaranteeing that the image build result does
not depend on some unknown (and therefore unreproducible) network state. Another advantage is that we can mount
input/classes and $GRAALVM_HOME read-only into the container and only allow read-write access to the mounted out
and build directories. This prevents application code that runs at image build time from messing with anything
other than those directories. All information about containerized building is recorded in the bundle subdirectory
input/stage/container.
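
As an illustration of the kind of invocation that could be recorded in input/stage/container/run.cmd (the paths, image name, and builder arguments are placeholders), a sanitized build might be launched roughly like this:

podman run --rm --network=none \
 -v /path/to/bundle/input/classes:/bundle/input/classes:ro \
 -v $GRAALVM_HOME:/graalvm:ro \
 -v /path/to/bundle/output:/bundle/output \
 nib-builder-image <builder arguments>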

Fallback for systems without container support

If containerized builder execution is not possible, we can still at least have the builder run with a sanitized
set of environment variables and make sure that only those environment variables are visible that were explicitly
specified with -E<env_var_name>=<env_var_value> or -E<env_var_name> (the latter to allow passing a variable through from the
surrounding environment).
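
For example, a bundle-creating build that exposes one explicit value and passes one variable through from the surrounding environment could look like this (the passed-through variable name is chosen purely for illustration):

native-image --bundle-create=alaunch.nib -EBUILD_ENVVAR=env42 -EHTTP_PROXY ...other native-image arguments...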

Handling of Image build errors

To ensure build bundles are useful for the second use case described above, we have to make sure a
bundle gets successfully created even if the image build fails. Most likely in this case the out folder will be
missing in the bundle, but as usual build/report/build.log will contain all the command line output that was shown
during the image build. This also includes any error messages that led to the build failure.
