@BenTheElder
Member

BenTheElder commented Jul 29, 2024

What type of PR is this?

/kind failing-test

What this PR does / why we need it:

Enables using a pull-through cache for images in kube-up.sh

Part of resolving 5k node scale test issues #126366

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot added labels (Jul 29, 2024):
  `do-not-merge/work-in-progress`: Indicates that a PR should not merge because it is a work in progress.
  `kind/failing-test`: Categorizes issue or PR as related to a consistently or frequently failing test.
  `size/S`: Denotes a PR that changes 10-29 lines, ignoring generated files.
  `do-not-merge/release-note-label-needed`: Indicates that a PR should not merge because it's missing one of the release note labels.
  `cncf-cla: yes`: Indicates the PR's author has signed the CNCF CLA.
  `do-not-merge/needs-sig`: Indicates an issue or PR lacks a `sig/foo` label and requires one.
  `needs-triage`: Indicates an issue or PR lacks a `triage/foo` label and requires one.
  `needs-priority`: Indicates a PR lacks a `priority/foo` label and requires one.
@k8s-ci-robot changed labels (Jul 29, 2024):
  added `release-note-none`: Denotes a PR that doesn't merit a release note.
  removed `do-not-merge/release-note-label-needed`: Indicates that a PR should not merge because it's missing one of the release note labels.
@BenTheElder
Member Author

/sig testing scalability

@k8s-ci-robot changed labels (Jul 29, 2024):
  added `sig/testing`: Categorizes an issue or PR as relevant to SIG Testing.
  added `sig/scalability`: Categorizes an issue or PR as relevant to SIG Scalability.
  removed `do-not-merge/needs-sig`: Indicates an issue or PR lacks a `sig/foo` label and requires one.
@BenTheElder changed the title from "[WIP] kube-up.sh: drop unnecessary legacy mirror config, enable injecting r…" to "[WIP] kube-up.sh: drop unnecessary legacy mirror config, enable injecting registry mirror" (Jul 29, 2024)
@k8s-ci-robot added labels (Jul 29, 2024):
  `area/provider/gcp`: Issues or PRs related to gcp provider
  `approved`: Indicates a PR has been approved by an approver from all required OWNERS files.
  `sig/cloud-provider`: Categorizes an issue or PR as relevant to SIG Cloud Provider.
@BenTheElder force-pushed the 5k-mirror branch 3 times, most recently from 511e332 to 9c9c0d6 (July 29, 2024 21:11)
EOF

# DO NOT MERGE -- Testing
KUBERNETES_REGISTRY_PULL_THROUGH_HOST='https://us-central1-docker.pkg.dev/v2/k8s-infra-e2e-scale-5k-project/k8s-5k-scale-cache/'
Member Author

I wiped the images in this cache just before the last run, but it doesn't look like we populated any, so something is not quite right here (or the timing was off, but I don't think so).

Need to investigate this further before proceeding.

Member Author

@BenTheElder BenTheElder Jul 29, 2024

After fixing #126448 (comment), I see many images being pulled through the cache.

@k8s-ci-robot changed labels (Jul 29, 2024):
  added `size/M`: Denotes a PR that changes 30-99 lines, ignoring generated files.
  removed `size/S`: Denotes a PR that changes 10-29 lines, ignoring generated files.
# NOTE: we need literal double quotes around some of these values
echo 'server="'"${KUBERNETES_REGISTRY_PULL_THROUGH_HOST}"'"'
echo ''
echo '[hosts."'"${KUBERNETES_REGISTRY_PULL_THROUGH_HOST}"'"]'
Member Author

This should be `host`, not `hosts`.
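
For reference, the quoting pattern in this hunk can be sketched as below, assuming the generated file follows containerd's `hosts.toml` layout with a singular `[host."<url>"]` table. The function name `render_hosts_toml`, the example URL, and the `capabilities` line are illustrative assumptions and not the code in this PR.

```shell
# Sketch only: render_hosts_toml is a hypothetical helper, not part of kube-up.sh.
# It prints a hosts.toml-style fragment that points at a pull-through mirror.
# $1 is the mirror URL.
render_hosts_toml() {
  # TOML strings need literal double quotes, so each value is built by
  # switching between single-quoted and double-quoted shell strings.
  echo 'server = "'"$1"'"'
  echo ''
  # The table name is singular: [host."<url>"], not [hosts."<url>"].
  echo '[host."'"$1"'"]'
  # Assumed capability set for a read-only mirror.
  echo '  capabilities = ["pull", "resolve"]'
}
```

Calling `render_hosts_toml 'https://mirror.example.invalid/v2/'` prints the fragment to stdout, so it can be redirected into whatever file the node's containerd configuration expects.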

@BenTheElder
Member Author

kubernetes/test-infra#33158 will allow vetting this fully by hand.

@BenTheElder
Member Author

/test pull-kubernetes-e2e-gce-pull-through-cache

@k8s-ci-robot

This comment was marked as resolved.

@BenTheElder
Member Author

/test pull-kubernetes-e2e-gce-pull-through-cache

@BenTheElder
Member Author

/test pull-kubernetes-e2e-gce-pull-through-cache

@BenTheElder
Member Author

@k8s-ci-robot changed labels (Jul 30, 2024):
  added `size/M`: Denotes a PR that changes 30-99 lines, ignoring generated files.
  removed `size/S`: Denotes a PR that changes 10-29 lines, ignoring generated files.
@BenTheElder
Member Author

/test pull-kubernetes-e2e-gce-pull-through-cache

@BenTheElder
Member Author

/test pull-kubernetes-e2e-gce-pull-through-cache

@BenTheElder
Member Author

/test pull-kubernetes-e2e-gce-pull-through-cache

@BenTheElder
Member Author

BenTheElder commented Jul 31, 2024

This is working as intended now. I can see the images being pulled through the cache in the test job and peeked into the e2e cluster nodes to confirm the generated config is as intended.

run: https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/126448/pull-kubernetes-e2e-gce-pull-through-cache/1818444296806731776

@BenTheElder
Member Author

/hold cancel

@k8s-ci-robot removed the `do-not-merge/hold` label (Jul 31, 2024): Indicates that a PR should not merge because someone has issued a /hold command.
EOF
if [[ -n "${KUBERNETES_REGISTRY_PULL_THROUGH_BASIC_AUTH_TOKEN_PATH:-}" ]]; then
cat >>"$file" <<EOF
KUBERNETES_REGISTRY_PULL_THROUGH_BASIC_AUTH_TOKEN: $(yaml-quote "$(cat "${KUBERNETES_REGISTRY_PULL_THROUGH_BASIC_AUTH_TOKEN_PATH}")")
Member Author

We have to plumb this through to the nodes by placing it in the big env file we copy up here.

It does not appear to be stored in the uploaded ARTIFACTS or written to the logs.
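
The plumbing described above can be sketched roughly as follows. This assumes the env file is YAML with one `KEY: value` entry per line and that the token is a single line. `yaml_quote_sketch` and `append_token_entry` are hypothetical stand-ins written for illustration; kube-up.sh has its own `yaml-quote` helper, whose exact behavior is not reproduced here.

```shell
# Sketch only: both helpers below are hypothetical, written for illustration.

# Wrap a value in YAML single quotes, doubling any embedded single quote.
yaml_quote_sketch() {
  printf "'%s'" "$(printf '%s' "$1" | sed "s/'/''/g")"
}

# Append the token entry to the env file "$1", but only when a token path is set.
# The token is read from the file at append time.
append_token_entry() {
  if [ -n "${KUBERNETES_REGISTRY_PULL_THROUGH_BASIC_AUTH_TOKEN_PATH:-}" ]; then
    printf 'KUBERNETES_REGISTRY_PULL_THROUGH_BASIC_AUTH_TOKEN: %s\n' \
      "$(yaml_quote_sketch "$(cat "${KUBERNETES_REGISTRY_PULL_THROUGH_BASIC_AUTH_TOKEN_PATH}")")" >>"$1"
  fi
}
```

Because the entry is only written when the path variable is set, runs without a token leave the env file unchanged. Anything that later prints or uploads that file would expose the token, so it is worth checking the artifact and log paths as noted above.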

@BenTheElder
Member Author

And now https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/126448/pull-kubernetes-e2e-gce-pull-through-cache/1818444296806731776 has passed with these enabled, and I can see many images populated into the cache:

Screenshot 2024-07-30 at 6 22 00 PM

This should let us isolate the high image-pull demand from scale testing in #126366.

We would want end users to do something similar, and we have docs for this at https://registry.k8s.io

@dims
Member

dims commented Jul 31, 2024

/approve
/lgtm

thanks @BenTheElder

@k8s-ci-robot added the `lgtm` label (Jul 31, 2024): "Looks good to me", indicates that a PR is ready to be merged.
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 3c2f39546b5a42d40b15120f84e403aff49f0e12

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: BenTheElder, dims


Labels

  `approved`: Indicates a PR has been approved by an approver from all required OWNERS files.
  `area/provider/gcp`: Issues or PRs related to gcp provider
  `cncf-cla: yes`: Indicates the PR's author has signed the CNCF CLA.
  `kind/failing-test`: Categorizes issue or PR as related to a consistently or frequently failing test.
  `lgtm`: "Looks good to me", indicates that a PR is ready to be merged.
  `priority/critical-urgent`: Highest priority. Must be actively worked on as someone's top priority right now.
  `release-note-none`: Denotes a PR that doesn't merit a release note.
  `sig/cloud-provider`: Categorizes an issue or PR as relevant to SIG Cloud Provider.
  `sig/scalability`: Categorizes an issue or PR as relevant to SIG Scalability.
  `sig/testing`: Categorizes an issue or PR as relevant to SIG Testing.
  `size/M`: Denotes a PR that changes 30-99 lines, ignoring generated files.
  `triage/accepted`: Indicates an issue or PR is ready to be actively worked on.

Projects

Archived in project

3 participants