
Commit ec96d34

rvesse authored and Marcelo Vanzin committed
[SPARK-25745][K8S] Improve docker-image-tool.sh script
## What changes were proposed in this pull request?

Adds error checking and handling to `docker` invocations, ensuring the script terminates early in the event of any errors. This avoids subtle failure modes: for example, if the base image fails to build, the Python/R images could previously end up being built from an outdated base image. It also makes it more explicit to the user that something went wrong.

Additionally, the provided `Dockerfiles` assume that Spark was first built locally or that this is a runnable distribution, but the script did not previously enforce this. The script now checks the JARs folder to ensure that Spark JARs actually exist and, if not, aborts early, reminding the user that they need to build Spark locally first.

## How was this patch tested?

- Tested with a `mvn clean` working copy and verified that the script now terminates early
- Tested with bad `Dockerfiles` that fail to build to see that early termination occurred

Closes #22748 from rvesse/SPARK-25745.

Authored-by: Rob Vesse <[email protected]>
Signed-off-by: Marcelo Vanzin <[email protected]>
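For context, the core of the change is a fail-fast pattern: verify that built Spark JARs exist before building anything, and stop immediately if a `docker build` exits non-zero. The snippet below is a condensed, standalone sketch of that pattern under assumed paths; the Scala version, Dockerfile location, and the `error` stand-in are illustrative, and the full script in the diff below derives these values itself.

```bash
#!/usr/bin/env bash
# Condensed, standalone sketch of the fail-fast checks this patch adds.
# The real script's error helper, JAR path derivation, and docker arguments
# differ; the values below are illustrative stand-ins.

error() {
  echo "$@" 1>&2
  exit 1
}

# Illustrative locations; the script derives these from the build layout.
JARS=assembly/target/scala-2.12/jars
BASEDOCKERFILE=kubernetes/dockerfiles/spark/Dockerfile

# Abort early if Spark has not been built / this is not a runnable distribution.
TOTAL_JARS=$(ls "$JARS"/spark-* 2>/dev/null | wc -l)
if [ "${TOTAL_JARS}" -eq 0 ]; then
  error "Cannot find Spark JARs. Build Spark locally first or use a runnable distribution."
fi

# Abort as soon as the docker build fails, so dependent images are never built
# from a stale or missing base image.
docker build -t spark:latest -f "$BASEDOCKERFILE" .
if [ $? -ne 0 ]; then
  error "Failed to build Spark JVM Docker image, please refer to Docker build output for details."
fi
```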
1 parent 43717de commit ec96d34

File tree: 1 file changed (+31, -10 lines)


bin/docker-image-tool.sh

Lines changed: 31 additions & 10 deletions
@@ -44,6 +44,7 @@ function image_ref {
 function build {
   local BUILD_ARGS
   local IMG_PATH
+  local JARS
 
   if [ ! -f "$SPARK_HOME/RELEASE" ]; then
     # Set image build arguments accordingly if this is a source repo and not a distribution archive.
@@ -53,26 +54,38 @@ function build {
     # the examples directory is cleaned up before generating the distribution tarball, so this
     # issue does not occur.
     IMG_PATH=resource-managers/kubernetes/docker/src/main/dockerfiles
+    JARS=assembly/target/scala-$SPARK_SCALA_VERSION/jars
     BUILD_ARGS=(
       ${BUILD_PARAMS}
       --build-arg
       img_path=$IMG_PATH
       --build-arg
-      spark_jars=assembly/target/scala-$SPARK_SCALA_VERSION/jars
+      spark_jars=$JARS
       --build-arg
       example_jars=examples/target/scala-$SPARK_SCALA_VERSION/jars
       --build-arg
       k8s_tests=resource-managers/kubernetes/integration-tests/tests
     )
   else
-    # Not passed as an argument to docker, but used to validate the Spark directory.
+    # Not passed as arguments to docker, but used to validate the Spark directory.
     IMG_PATH="kubernetes/dockerfiles"
+    JARS=jars
     BUILD_ARGS=(${BUILD_PARAMS})
   fi
 
+  # Verify that the Docker image content directory is present
   if [ ! -d "$IMG_PATH" ]; then
     error "Cannot find docker image. This script must be run from a runnable distribution of Apache Spark."
   fi
+
+  # Verify that Spark has actually been built/is a runnable distribution
+  # i.e. the Spark JARs that the Docker files will place into the image are present
+  local TOTAL_JARS=$(ls $JARS/spark-* | wc -l)
+  TOTAL_JARS=$(( $TOTAL_JARS ))
+  if [ "${TOTAL_JARS}" -eq 0 ]; then
+    error "Cannot find Spark JARs. This script assumes that Apache Spark has first been built locally or this is a runnable distribution."
+  fi
+
   local BINDING_BUILD_ARGS=(
     ${BUILD_PARAMS}
     --build-arg
@@ -85,29 +98,37 @@ function build {
   docker build $NOCACHEARG "${BUILD_ARGS[@]}" \
     -t $(image_ref spark) \
     -f "$BASEDOCKERFILE" .
-  if [[ $? != 0 ]]; then
-    error "Failed to build Spark docker image."
+  if [ $? -ne 0 ]; then
+    error "Failed to build Spark JVM Docker image, please refer to Docker build output for details."
   fi
 
   docker build $NOCACHEARG "${BINDING_BUILD_ARGS[@]}" \
     -t $(image_ref spark-py) \
     -f "$PYDOCKERFILE" .
-  if [[ $? != 0 ]]; then
-    error "Failed to build PySpark docker image."
-  fi
-
+  if [ $? -ne 0 ]; then
+    error "Failed to build PySpark Docker image, please refer to Docker build output for details."
+  fi
   docker build $NOCACHEARG "${BINDING_BUILD_ARGS[@]}" \
     -t $(image_ref spark-r) \
     -f "$RDOCKERFILE" .
-  if [[ $? != 0 ]]; then
-    error "Failed to build SparkR docker image."
+  if [ $? -ne 0 ]; then
+    error "Failed to build SparkR Docker image, please refer to Docker build output for details."
   fi
 }
 
 function push {
   docker push "$(image_ref spark)"
+  if [ $? -ne 0 ]; then
+    error "Failed to push Spark JVM Docker image."
+  fi
   docker push "$(image_ref spark-py)"
+  if [ $? -ne 0 ]; then
+    error "Failed to push PySpark Docker image."
+  fi
   docker push "$(image_ref spark-r)"
+  if [ $? -ne 0 ]; then
+    error "Failed to push SparkR Docker image."
+  fi
 }
 
 function usage {
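As a rough usage sketch: with these checks in place, invoking the tool against a freshly cleaned source tree fails fast instead of producing images that are missing Spark JARs. The flags and commands below come from the script's usage output and Spark's Kubernetes documentation rather than from this diff, and the registry and tag names are placeholders.

```bash
# Build Spark first in a source checkout (or unpack a runnable distribution);
# a cleaned tree now causes docker-image-tool.sh to abort with a clear error.
./build/mvn -Pkubernetes -DskipTests clean package

# Then build and push the JVM, PySpark and SparkR images.
# -r sets the image repository, -t the tag (both placeholders below).
./bin/docker-image-tool.sh -r my-registry.example.com/spark -t my-tag build
./bin/docker-image-tool.sh -r my-registry.example.com/spark -t my-tag push
```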
