Revert "W/A NFS server becoming unreachable mid run" #14362

Merged
merged 1 commit into from
Mar 30, 2025

Conversation

jean-edouard
Contributor

What this PR does

This reverts the workaround added by #13443, as I believe it is now hurting us more than helping.

Fixes #

Why we need it and why it was done in this way

The following tradeoffs were made:

The following alternatives were considered:

Links to places where the discussion took place:

Special notes for your reviewer

Checklist

This checklist is not enforcing, but it's a reminder of items that could be relevant to every PR.
Approvers are expected to review this list.

Release note

NONE

@kubevirt-bot kubevirt-bot added release-note-none Denotes a PR that doesn't merit a release note. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. sig/buildsystem Denotes an issue or PR that relates to changes in the build system. size/S labels Mar 27, 2025
@kubevirt-bot kubevirt-bot requested review from enp0s3 and xpivarc March 27, 2025 15:26
@jean-edouard
Contributor Author

/cc @akalenyu
/cc @fossedihelm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Mar 27, 2025
@fossedihelm
Contributor

/lgtm
/cc @brianmcarey

@enp0s3
Contributor

enp0s3 commented Mar 27, 2025

/approve

@kubevirt-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: enp0s3

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 27, 2025
@kubevirt-commenter-bot

Required labels detected, running phase 2 presubmits:
/test pull-kubevirt-e2e-windows2016
/test pull-kubevirt-e2e-kind-1.30-vgpu
/test pull-kubevirt-e2e-kind-sriov
/test pull-kubevirt-e2e-k8s-1.32-ipv6-sig-network
/test pull-kubevirt-e2e-k8s-1.30-sig-network
/test pull-kubevirt-e2e-k8s-1.30-sig-storage
/test pull-kubevirt-e2e-k8s-1.30-sig-compute
/test pull-kubevirt-e2e-k8s-1.30-sig-operator
/test pull-kubevirt-e2e-k8s-1.31-sig-network
/test pull-kubevirt-e2e-k8s-1.31-sig-storage
/test pull-kubevirt-e2e-k8s-1.31-sig-compute
/test pull-kubevirt-e2e-k8s-1.31-sig-operator

Member

@brianmcarey brianmcarey left a comment

Looks good to me - no issues with it. Can you just expand on why you think it is hurting more than helping?

It's a little too soon to say, but it looks like we found a way of getting stable compute-migrations lanes again - #14354

I would prefer to take in #14354 before merging this, just to limit the change set and to see if it helps.

@brianmcarey
Member

/hold

@kubevirt-bot kubevirt-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 27, 2025
@jean-edouard
Contributor Author

> Looks good to me - no issues with it. Can you just expand on why you think it is hurting more than helping?

Just in case the NFS server pod ever restarts

> It's a little too soon to say, but it looks like we found a way of getting stable compute-migrations lanes again - #14354

That's not a bad idea, many tests there are quite resource-heavy, but I doubt that will solve the NFS issues we're seeing...

> I would prefer to take in #14354 before merging this, just to limit the change set and to see if it helps.

Absolutely, 1 test at a time is the way to go

@brianmcarey
Member

brianmcarey commented Mar 27, 2025

> > Looks good to me - no issues with it. Can you just expand on why you think it is hurting more than helping?
>
> Just in case the NFS server pod ever restarts

+1 - that would be a problem

> > It's a little too soon to say, but it looks like we found a way of getting stable compute-migrations lanes again - #14354
>
> That's not a bad idea, many tests there are quite resource-heavy, but I doubt that will solve the NFS issues we're seeing...

I have 10 runs so far without hitting the NFS timeouts. At the current failure rate on main I would expect to hit it at least once or twice in 10, but you're right - proof is in the pudding - we will have to see how it goes on main.

> > I would prefer to take in #14354 before merging this, just to limit the change set and to see if it helps.
>
> Absolutely, 1 test at a time is the way to go

Cheers - thanks.

But yeah, we should look at removing this WA as soon as we can.
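
As a rough check on the "10 runs without a timeout" reasoning above, here is a minimal sketch of how likely ten clean runs would be by chance alone. The failure rates (10% to 20%) are an assumption inferred from "once or twice in 10", not measured values from the CI lanes, and the helper name `prob_all_pass` is hypothetical. Runs are assumed to be independent.

```python
def prob_all_pass(failure_rate: float, runs: int) -> float:
    """Probability that `runs` independent runs all pass,
    given a per-run failure probability `failure_rate`."""
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return (1.0 - failure_rate) ** runs


# Assumed per-run failure rates; not taken from real lane statistics.
for rate in (0.10, 0.15, 0.20):
    print(f"failure rate {rate:.0%}: "
          f"chance of 10 clean runs = {prob_all_pass(rate, 10):.1%}")
```

Under these assumptions, ten clean runs would still occur by chance roughly 11% to 35% of the time, which is consistent with waiting to see how the change behaves on main before drawing conclusions.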

@jean-edouard
Contributor Author

See also kubevirt/kubevirtci#1407

Member

@brianmcarey brianmcarey left a comment

/hold cancel

This workaround did not improve the compute-migrations stability as we had hoped and there is a risk that this can break if pod restarts occur.

@kubevirt-bot kubevirt-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 28, 2025
This reverts commit 24872e9.

Signed-off-by: Jed Lejosne <jed@redhat.com>
@kubevirt-bot kubevirt-bot removed the lgtm Indicates that a PR is ready to be merged. label Mar 28, 2025
@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Mar 30, 2025
@kubevirt-commenter-bot

Required labels detected, running phase 2 presubmits:
/test pull-kubevirt-e2e-windows2016
/test pull-kubevirt-e2e-kind-1.30-vgpu
/test pull-kubevirt-e2e-kind-sriov
/test pull-kubevirt-e2e-k8s-1.32-ipv6-sig-network
/test pull-kubevirt-e2e-k8s-1.30-sig-network
/test pull-kubevirt-e2e-k8s-1.30-sig-storage
/test pull-kubevirt-e2e-k8s-1.30-sig-compute
/test pull-kubevirt-e2e-k8s-1.30-sig-operator
/test pull-kubevirt-e2e-k8s-1.31-sig-network
/test pull-kubevirt-e2e-k8s-1.31-sig-storage
/test pull-kubevirt-e2e-k8s-1.31-sig-compute
/test pull-kubevirt-e2e-k8s-1.31-sig-operator

@kubevirt-bot kubevirt-bot merged commit 60d8d14 into kubevirt:main Mar 30, 2025
38 checks passed
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. lgtm Indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. sig/buildsystem Denotes an issue or PR that relates to changes in the build system. size/S
7 participants