Upgrade OCP4 Cluster
Prerequisites
- In case an upgrade fails, it is wise to first take an etcd backup. To do so follow the SOP [2] (a rough sketch of the backup command is shown after this list).
- Ensure that all installed Operators are at the latest versions for their channel [5] (a quick CLI check is sketched below).
- Ensure that the latest oc client rpm is available at /srv/web/infra/bigfiles/openshiftboot/oc-client/ on the batcave01 server. Retrieve the RPM from [6], choosing the Openshift Clients Binary rpm, and rename it to oc-client.rpm (see the staging example below).
- Ensure that the sudo rbac-playbook manual/ocp4-sysadmin-openshift.yml -t "upgrade-rpm" playbook is run to install this updated oc client rpm.
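The SOP [2] has the authoritative backup steps. As a minimal sketch, assuming a logged-in cluster-admin session and one control plane node (the node name below is a placeholder), a backup can be taken with the cluster-backup.sh script shipped on the control plane hosts:

    # List the control plane nodes and pick one to run the backup on.
    oc get nodes -l node-role.kubernetes.io/master

    # Run the backup script on that node; it writes the etcd snapshot and
    # static resource archive under /home/core/assets/backup on the node.
    oc debug node/<control-plane-node> -- chroot /host /usr/local/bin/cluster-backup.sh /home/core/assets/backup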
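A quick way to check operator health and versions from the CLI, again assuming cluster-admin access, is to review the cluster operators and any OLM-managed operators; this is only a spot check, the console's OperatorHub/Installed Operators view remains the reference:

    # Cluster operators should all be Available and not Degraded.
    oc get clusteroperators

    # OLM-managed operators: installed versions and their subscription channels.
    oc get csv --all-namespaces
    oc get subscriptions --all-namespaces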
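Staging the client RPM on batcave01 amounts to downloading the Openshift Clients Binary RPM and saving it under the expected name; the URL below is only a placeholder, use the actual link from [6]:

    # On batcave01: download the client RPM (placeholder URL, see [6]) and
    # stage it where the playbook expects to find it.
    curl -Lo /srv/web/infra/bigfiles/openshiftboot/oc-client/oc-client.rpm \
        "<openshift-clients-rpm-url-from-reference-6>"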
Upgrade OCP
At the time of writing, the version installed on the cluster is 4.8.11 and the upgrade channel is set to stable-4.8. It is easiest to update the cluster via the web console. Go to:
- Administration
- Cluster Settings
- To upgrade between z or patch versions (x.y.z), when one is available, click the update button.
- When moving between y or minor versions, you must first switch the upgrade channel, to fast-4.9 as an example. You should also be on the very latest z/patch version before upgrading.
- When the upgrade has finished, switch the upgrade channel back to stable. The same steps can also be driven from the oc CLI; see the sketch after this list.
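As a sketch of the CLI alternative, assuming a logged-in cluster-admin session and using the fast-4.9 channel from the example above:

    # Show the current version, channel and any available updates.
    oc adm upgrade

    # For a minor (y) upgrade, switch the channel first, e.g. to fast-4.9.
    oc patch clusterversion version --type merge -p '{"spec":{"channel":"fast-4.9"}}'

    # Start the upgrade to the latest version available in the channel.
    oc adm upgrade --to-latest=true

    # Watch progress.
    oc get clusterversion -w

When the upgrade has completed, the channel can be switched back to the stable channel with the same patch command.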
Upgrade failures
In the worst case scenario we may have to restore etcd from the backups taken at the start [4], or reinstall a node entirely.
Troubleshooting
There are many ways an upgrade can fail midway through.
- Check the monitoring alerts currently firing; these can often hint at the problem.
- Often individual nodes fail to apply the new MachineConfig changes; this will show up when examining the MachineConfigPool status (see the commands sketched after this list).
  - This might require a manual reboot of that particular node.
  - This might require killing pods on that particular node.
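A few commands that help narrow down where an upgrade is stuck, sketched under the assumption of cluster-admin access; node, pod and namespace names are placeholders, and drain flags may vary slightly between oc versions:

    # Overall upgrade and operator health.
    oc get clusterversion
    oc get clusteroperators

    # Check whether the MachineConfigPools have finished rolling out and
    # which nodes are degraded or still updating.
    oc get machineconfigpool
    oc describe machineconfigpool worker
    oc get nodes

    # If a node is stuck, drain it, reboot it, then let it schedule again.
    oc adm drain <node> --ignore-daemonsets --delete-emptydir-data
    oc debug node/<node> -- chroot /host systemctl reboot
    oc adm uncordon <node>

    # Or delete a stuck pod so it is recreated.
    oc delete pod <pod> -n <namespace>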