Skip to content

Conversation

@Yikun
Copy link
Member

@Yikun Yikun commented Oct 18, 2022

What changes were proposed in this pull request?

Remove unused apt lists cache

Why are the changes needed?

Clean cache to reduce docker image size.

This is also recommanded by docker community:

$ docker run --user 0:0 -ti apache/spark bash
root@5d1ca347279e:/opt/spark/work-dir# ls /var/lib/apt/lists/
auxfiles								     lock
deb.debian.org_debian_dists_bullseye-updates_InRelease			     partial
deb.debian.org_debian_dists_bullseye-updates_main_binary-arm64_Packages.lz4  security.debian.org_debian-security_dists_bullseye-security_InRelease
deb.debian.org_debian_dists_bullseye_InRelease				     security.debian.org_debian-security_dists_bullseye-security_main_binary-arm64_Packages.lz4
deb.debian.org_debian_dists_bullseye_main_binary-arm64_Packages.lz4
root@5d1ca347279e:/opt/spark/work-dir# du --max-depth=1 -h /var/lib/apt/lists/
4.0K	/var/lib/apt/lists/partial
4.0K	/var/lib/apt/lists/auxfiles
17M	/var/lib/apt/lists/

Does this PR introduce any user-facing change?

Yes in some level, image size is reduced.

How was this patch tested?

K8s CI passed

@Yikun Yikun marked this pull request as ready for review October 19, 2022 03:57
@Yikun
Copy link
Member Author

Yikun commented Oct 19, 2022

cc @dongjoon-hyun

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Thank you, @Yikun .
Merged to master for Apache Spark 3.4.0.

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Oct 19, 2022

Ur, @Yikun . Why do you use a different style per repo?

- rm -rf /var/cache/apt/*
+ rm -rf /var/cache/apt/* && \
+ rm -rf /var/lib/apt/lists/*
  • This PR
- rm -rf /var/cache/apt/*
+ rm -rf /var/cache/apt/* && rm -rf /var/lib/apt/lists/*

@Yikun
Copy link
Member Author

Yikun commented Oct 19, 2022

@dongjoon-hyun After spark-docker PR merged, I found the Python docker file are using one line style, so keep the same style in spark repo.

I'd also like to sync this style to spark-docker. : ), from functions view it's same, put similar function cmd (rm in here) in one line is better.

@Yikun
Copy link
Member Author

Yikun commented Oct 19, 2022

BTW, what do you think about the dockerfile being in sync between spark-docker and spark?

spark-docker contains some changes for non-k8s to meet the requirements of DOI. such as, remove print line in entrypoint, add users, expose port. Do we want to also sync them in spark repo?

@dongjoon-hyun
Copy link
Member

We don't need to sync~ If we really want to sync, we had better make spark-docker git repo as a submodule of spark repo.

@Yikun
Copy link
Member Author

Yikun commented Oct 19, 2022

@dongjoon-hyun OK, thanks, let's just keep different for now! If a submodule is really needed, we can do it in future.

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Oct 19, 2022

Ya, feel free to propose any changes if it helps you and don't be blocked by me. You are now a committer with more power, @Yikun . :)

SandishKumarHN pushed a commit to SandishKumarHN/spark that referenced this pull request Dec 12, 2022
### What changes were proposed in this pull request?
Remove unused apt lists cache

### Why are the changes needed?
Clean cache to reduce docker image size.

This is also [recommanded](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#run) by docker community:

```
$ docker run --user 0:0 -ti apache/spark bash
root5d1ca347279e:/opt/spark/work-dir# ls /var/lib/apt/lists/
auxfiles								     lock
deb.debian.org_debian_dists_bullseye-updates_InRelease			     partial
deb.debian.org_debian_dists_bullseye-updates_main_binary-arm64_Packages.lz4  security.debian.org_debian-security_dists_bullseye-security_InRelease
deb.debian.org_debian_dists_bullseye_InRelease				     security.debian.org_debian-security_dists_bullseye-security_main_binary-arm64_Packages.lz4
deb.debian.org_debian_dists_bullseye_main_binary-arm64_Packages.lz4
root5d1ca347279e:/opt/spark/work-dir# du --max-depth=1 -h /var/lib/apt/lists/
4.0K	/var/lib/apt/lists/partial
4.0K	/var/lib/apt/lists/auxfiles
17M	/var/lib/apt/lists/
```

### Does this PR introduce _any_ user-facing change?
Yes in some level, image size is reduced.

### How was this patch tested?
K8s CI passed

Closes apache#38298 from Yikun/SPARK-40513.

Authored-by: Yikun Jiang <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants