ci: For Linux python wheel generation, save ccache #4913
Conversation
Signed-off-by: Larry Gritz <[email protected]>
@zachlewis You set up the wheel building. Do you have any insight here? I'll also note that there are a few places in the wheel.yml workflow where you set environment variables, but as far as I understand now, they will NOT be seen inside the container unless they are one of the CIBW ones. I believe (though I'm learning this the hard way while doing this work) that any arbitrary env variables you need to get all the way to the build have to be listed in the CIBW_ENVIRONMENT_PASS variables, or be set directly in the pyproject.toml file.
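Something like this is the shape of what I mean -- purely illustrative, not the actual wheel.yml contents, and the variable name is made up:

```yaml
# Hypothetical sketch, not the real wheel.yml. "MY_BUILD_FLAG" is a made-up name.
- name: Build wheels
  uses: pypa/cibuildwheel@<version>   # placeholder for whatever version gets pinned
  env:
    MY_BUILD_FLAG: "1"
    # Without this line, MY_BUILD_FLAG is not visible inside the Linux build container:
    CIBW_ENVIRONMENT_PASS_LINUX: MY_BUILD_FLAG
```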
See this log for the messages about how it can't find ccache.
Hey Larry! I apologize, I've been very distracted these past several weeks. I'm really not too certain about what's going wrong with the ccache stuff -- I can look into it. That said, there might be an easier way. CIBW can incrementally build wheels for each CPython (and PyPy) interpreter available for the platform on a single runner / in a single job -- in fact, that's CIBW's default behavior. I've found that the trick to getting CIBW to use previously "autobuilt" dependencies, instead of re-auto-building dependencies not found at the system level, is setting the env var. Two things to note:
Hmm... that's interesting. Maybe environment variables prefixed with
Oh, that's fantastic news. I had no idea! It's not just the auto-built dependencies that can be amortized. The vast majority of OIIO itself has nothing to do with which Python version is used, and will be virtually free to compile (after the first variant) if ccache is operating. It's really only the few cpp files that constitute the Python bindings that differ between the wheels of a given platform. Ideally, we would have only one "job" for each platform, which within the job would separately and sequentially build the variants for each Python version. As long as ccache is installed, they should absolutely FLY.
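Roughly the shape of what I'm picturing for the job setup (the version list is just an illustration, not a proposal for exactly which Pythons we ship):

```yaml
# Illustrative only: one Linux job that builds every CPython variant in sequence,
# so every variant after the first can hit the shared ccache.
env:
  CIBW_BUILD: "cp39-* cp310-* cp311-* cp312-*"
  CIBW_BUILD_VERBOSITY: 1
```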
Not at all! For any of the scenarios we're discussing, ccache is going to be responsible for the majority of the savings. Making huge improvements to wheel building speed with very little work is currently held up only by one question: why can't I get ccache installed in the build container in the first place?
I might download and build ccache from source, if I have to, just to prove out how much savings we would get. But I had really hoped to simply install a pre-built version via the package manager.
Understood! I'll see what I can dig up.
Tried that already.
I suspect the problem is something like an unusually limited set of remote repos being known to yum as set up in the container, with ccache not being in any of them and some other repo needing to be activated before ccache can be found. That's the kind of thing I'm expecting it to turn out to be.
Hmm. Yeah, that's very strange.
I will try that!
OK, I tried both downloading ccache binaries and building it from source. I can make either of those "work" in the sense of building ccache and then using it for the compilation.

BUT it's deceptively difficult to make this work in practice! For our usual CI runs, the entire CI job, including each individual Actions step, runs in the container image you've chosen (for example, one of the aswf-docker images). In other words, you're doing the compilation (using the ccache cache) in the same container as the steps that save the ccache or restore it from GHA's caches.

But in the wheel workflow, the GHA actions happen on the bare Ubuntu runner, while the build itself happens within a container set up by the one "cibuildwheel" action. So the long and short of it is, I haven't identified any directory where I can put the ccache cache files so that they exist "outside the build container" and are visible to the subsequent cache-save step!

Still plugging away at it...
FWIW, I think cibuildwheel ultimately copies anything under the container's
Ah, ok, maybe there is a way...
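If the project tree really is bind-mounted into the container, something along these lines might be the trick (completely untested; the paths, cache key, and mount point are my guesses):

```yaml
# Untested sketch: keep the ccache inside the bind-mounted project tree so the
# host-side cache step can see it after cibuildwheel finishes.
- name: Build wheels
  uses: pypa/cibuildwheel@<version>   # placeholder
  env:
    # /project is (I believe) where cibuildwheel mounts the source tree in the Linux container
    CIBW_ENVIRONMENT_LINUX: CCACHE_DIR=/project/.ccache-wheel

- name: Save ccache for the next run
  uses: actions/cache/save@v4
  with:
    path: .ccache-wheel
    key: linux-wheel-ccache-${{ github.sha }}
```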
The regular CI workflow uses ccache when compiling, plus the GHA cache action to save and then reseed the .ccache directory from run to run, which dramatically speeds up compilation for repeated submissions along a branch or for subsequent pushes of a PR in progress.
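For reference, that pattern looks roughly like this (step name and cache key are illustrative, not copied from the actual CI workflow):

```yaml
# Illustrative restore step; a matching save (or actions/cache's implicit
# post-job save) reseeds the cache for the next run.
- name: Restore ccache
  uses: actions/cache@v4
  with:
    path: ${{ github.workspace }}/.ccache
    key: ccache-${{ runner.os }}-${{ github.sha }}
    restore-keys: ccache-${{ runner.os }}-
```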
The wheel workflow doesn't do this, and is annoyingly slow, but it looks like it should benefit greatly from caching like this. Note that even within a single run, each platform does several builds that differ only in which Python version they use, which means 95% of compilation units are unaffected by the change of Python.
This PR tries to prototype adding GHA caching (just to the Intel Linux jobs, to test it), but dammit, for the life of me, I cannot seem to get ccache itself installed in the container where the wheel is built.
I'm hoping that by submitting this as a draft for everyone to see, somebody will be able to tell me what to do to fix it.
The key is the setting of the `CIBW_BEFORE_BUILD` env variable, which gives commands that run inside the container before the build. The bottom line is that the `yum install -y ccache` there is saying that no ccache package is available, and I can't imagine why. Does anybody know?
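For anyone who wants to experiment, this is roughly the failing shape, plus an untested guess at activating an extra repo first (in case ccache lives in something like EPEL rather than the container's default repos):

```yaml
env:
  # Roughly what this PR tries, with an untested epel-release guess prepended.
  # If ccache is only carried by an extra repo, enabling it first might help.
  CIBW_BEFORE_BUILD_LINUX: yum install -y epel-release && yum install -y ccache
```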