This post covers another of the frustrating problems I hit during my initial foray into Open Liberty InstantOn.
The issue came down to my knowledge, or lack thereof, of how to properly configure the OpenShift deployment yaml file when deploying Open Liberty InstantOn images.
Being new to OpenShift, I had little appreciation of what Service Accounts, Security Contexts and Security Context Constraints were, let alone how they worked or where the settings needed to be placed in the deployment yaml file.
Reading time around 10 minutes.
Errors covered in this post
Error | Solution |
Error creating: pods “liberty-to-openshift-instanton-85fbc954dd-” is forbidden: unable to validate against any security context constraint: … Invalid value: “CHECKPOINT_RESTORE”: capability may not be added, provider restricted-v2: .containers[0].capabilities.add: Invalid value: “SETPCAP”: | The deployment yaml is missing the serviceAccountName or securityContext |
serviceaccount “liberty-to-openshift-instanton” not found | The Service Account does not exist |
/opt/ol/wlp/bin/server: line 1373: /opt/criu/criu: Operation not permitted CWWKE0961I: Restoring the checkpoint server process failed. | In this example, the securityContext is in the wrong location in the deployment yaml. |
If you’d like to test these errors
If you’d like to test these errors, complete How to deploy an Open Liberty InstantOn app, up to the Security Context Constraint section.
Alternatively, to view the files referenced in this post, you can download them from this URL.
Ensure your command window is in the correct directory.
cd /d c:\ocp\LibertyToOpenShiftInstantOn
Not adding the serviceAccount and securityContext sections
When deploying an Open Liberty InstantOn app, you need to specify the serviceAccount and securityContext values.
This section shows what happens when these values are missing.
First, ensure you are on the correct project.
oc project liberty-to-openshift-instanton
Deploy the app using a deployment yaml file that doesn’t contain the required values.
oc apply -f deploy_missing_sa_and_sc.yaml
Command output
c:\ocp\LibertyToOpenShiftInstantOn>oc apply -f deploy_missing_sa_and_sc.yaml
Warning: would violate PodSecurity "restricted:v1.24": allowPrivilegeEscalation != false (container "liberty-to-openshift-instanton" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "liberty-to-openshift-instanton" must not include "CHECKPOINT_RESTORE", "SETPCAP" in securityContext.capabilities.add), seccompProfile (pod or container "liberty-to-openshift-instanton" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
deployment.apps/liberty-to-openshift-instanton created
service/liberty-to-openshift-instanton created
route.route.openshift.io/liberty-to-openshift-instanton created
Check for any running pods.
oc get po
The output shows that the “liberty-to-openshift-instanton” pod is not running.
c:\ocp\LibertyToOpenShiftInstantOn>oc get po
No resources found in liberty-to-openshift-instanton namespace.
Now check the events for any errors.
oc get events
Check for the following error.
c:\ocp\LibertyToOpenShiftInstantOn>oc get events
LAST SEEN TYPE REASON OBJECT MESSAGE
1s Warning FailedCreate replicaset/liberty-to-openshift-instanton-85fbc954dd Error creating: pods "liberty-to-openshift-instanton-85fbc954dd-" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, provider restricted-v2: .containers[0].capabilities.add: Invalid value: "CHECKPOINT_RESTORE": capability may not be added, provider restricted-v2: .containers[0].capabilities.add: Invalid value: "SETPCAP": capability may not be added, provider restricted-v2: .containers[0].allowPrivilegeEscalation: Invalid value: true: Allowing privilege escalation for containers is not allowed, provider "restricted": Forbidden: not usable by user or serviceaccount, provider "nonroot-v2": Forbidden: not usable by user or serviceaccount, provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork-v2": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "hostpath-provisioner": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount]
83s Normal ScalingReplicaSet deployment/liberty-to-openshift-instanton Scaled up replica set liberty-to-openshift-instanton-85fbc954dd to 1
I find this error message difficult to read and understand, especially as someone still learning OpenShift.
This error means there is something wrong with the Service Account (SA), Security Context (SC) or Security Context Constraint (SCC).
The message lists every SCC that was tried: most are “not usable by user or serviceaccount”, while restricted-v2 is usable but rejects the CHECKPOINT_RESTORE and SETPCAP capabilities and the allowPrivilegeEscalation: true setting that the container requests.
Your OpenShift admin can assist here, but to try to debug this error yourself:
- Check whether you created the Service Account:
  - oc create serviceaccount liberty-to-openshift-instanton-sa
- Check whether your OpenShift cluster admin created the Security Context Constraint:
  - oc apply -f scc-cap-cr-minmal.yaml
- Check that you added the serviceAccountName value in the deploy.yaml file. It is a pod-level field in the Kubernetes API, so it belongs in the template “spec” section, alongside “containers”:
  - serviceAccountName: liberty-to-openshift-instanton-sa
- Check that you added the securityContext value to the container entry in the “containers” section of the deploy.yaml file
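The first two checks in the list above can be sketched as a small helper. This is a minimal sketch, not a definitive script: it assumes a POSIX shell (for example Git Bash or WSL on Windows), the function name check_instanton_prereqs is hypothetical, and the SA and SCC names are simply the ones used in this post.

```shell
# Sketch only: names are the ones used in this post; adjust to your project.
SA=liberty-to-openshift-instanton-sa
SCC=liberty-to-openshift-instanton-scc

# Print one line per prerequisite, saying whether oc can find it.
check_instanton_prereqs() {
  if oc get serviceaccount "$SA" >/dev/null 2>&1; then
    echo "service account $SA: found"
  else
    echo "service account $SA: NOT found"
  fi
  # Reading an SCC may need cluster-admin rights (kubeadmin in this post).
  if oc get scc "$SCC" >/dev/null 2>&1; then
    echo "scc $SCC: found"
  else
    echo "scc $SCC: NOT found (or not visible to this user)"
  fi
}
```

Call check_instanton_prereqs while connected to the project. The individual oc get commands can also be run on their own from a Windows command prompt.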
Delete the failed deployment.
oc delete -f deploy_missing_sa_and_sc.yaml
Command output.
c:\ocp\LibertyToOpenShiftInstantOn>oc delete -f deploy_missing_sa_and_sc.yaml
deployment.apps "liberty-to-openshift-instanton" deleted
service "liberty-to-openshift-instanton" deleted
route.route.openshift.io "liberty-to-openshift-instanton" deleted
Service Account (SA) does not exist
If the Service Account is not created, you’ll receive an error message when deploying the app.
For this example, ensure you haven’t created the Service Account.
Deploy the app using the following file.
oc apply -f deploy_incorrectly_placed_in_spec.yaml
Command output.
c:\ocp\LibertyToOpenShiftInstantOn>oc apply -f deploy_incorrectly_placed_in_spec.yaml
deployment.apps/liberty-to-openshift-instanton created
service/liberty-to-openshift-instanton created
route.route.openshift.io/liberty-to-openshift-instanton created
Check for any running pods.
oc get po
The output shows that the “liberty-to-openshift-instanton” pod is not running.
c:\ocp\LibertyToOpenShiftInstantOn>oc get po
No resources found in liberty-to-openshift-instanton namespace.
Now check the events for any errors.
oc get events
Check for the following error.
c:\ocp\LibertyToOpenShiftInstantOn>oc get events
LAST SEEN TYPE REASON OBJECT MESSAGE
12s Warning FailedCreate replicaset/liberty-to-openshift-instanton-9fc5d9dc5 Error creating: pods "liberty-to-openshift-instanton-9fc5d9dc5-" is forbidden: error looking up service account liberty-to-openshift-instanton/liberty-to-openshift-instanton: serviceaccount "liberty-to-openshift-instanton" not found
32s Normal ScalingReplicaSet deployment/liberty-to-openshift-instanton Scaled up replica set liberty-to-openshift-instanton-9fc5d9dc5 to 1
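In this example, you need to create the service account that the deployment references (the event above names it as liberty-to-openshift-instanton), or change serviceAccountName to a service account that already exists.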
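To see which service account a deployment is actually asking for, you can read the field back from the cluster. A minimal sketch, assuming a POSIX shell and the deployment name used in this post; check_deployment_sa is a hypothetical helper name.

```shell
# Sketch only: prints the service account the pod template asks for,
# then checks whether that service account exists in the current project.
check_deployment_sa() {
  deployment=${1:-liberty-to-openshift-instanton}
  sa=$(oc get deployment "$deployment" -o jsonpath="{.spec.template.spec.serviceAccountName}")
  if [ -z "$sa" ]; then
    # With no serviceAccountName, Kubernetes runs the pod as "default".
    echo "no serviceAccountName set; the pod would run as the 'default' service account"
    return 0
  fi
  echo "deployment asks for service account: $sa"
  oc get serviceaccount "$sa" >/dev/null 2>&1 || echo "service account $sa does not exist in this project"
}
```

If the name printed here differs from the one you created with oc create serviceaccount, that mismatch is the likely cause of the "not found" event.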
Delete the failed deployment.
oc delete -f deploy_incorrectly_placed_in_spec.yaml
Command output.
c:\ocp\LibertyToOpenShiftInstantOn>oc delete -f deploy_incorrectly_placed_in_spec.yaml
deployment.apps "liberty-to-openshift-instanton" deleted
service "liberty-to-openshift-instanton" deleted
route.route.openshift.io "liberty-to-openshift-instanton" deleted
Adding to the “spec” section rather than the “containers” section
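Initially I placed the serviceAccount and securityContext in the “spec” section of the deployment yaml file rather than the “containers” section of the yaml file.
The securityContext settings used here (capabilities, privileged and allowPrivilegeEscalation) are container-level fields in the Kubernetes API. The pod-level securityContext does not define them, which would explain why the container never received the capabilities it asked for. The serviceAccountName, by contrast, is a pod-level field.
In this case, I copied an example I found on GitHub. That example used the OpenLibertyApplication deployment type (Open Liberty Operator). I’m not sure whether this has any bearing on the error; I have yet to try it out.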
The following deployment yaml file highlights what I did.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: liberty-to-openshift-instanton
  labels:
    app: liberty-to-openshift-instanton
spec:
  selector:
    matchLabels:
      app: liberty-to-openshift-instanton
  template:
    metadata:
      labels:
        app: liberty-to-openshift-instanton
    spec:
      # Add InstantOn - the securityContext is in the wrong spot, it should be under container
      serviceAccountName: liberty-to-openshift-instanton
      securityContext:
        runAsNonRoot: true
        privileged: false
        allowPrivilegeEscalation: true
        capabilities:
          add:
          - CHECKPOINT_RESTORE
          - SETPCAP
          drop:
          - ALL
      # Add InstantOn
      containers:
      - name: libertytoopenshiftinstanton-container
        image: image-registry.openshift-image-registry.svc:5000/liberty-to-openshift-instanton/liberty-to-openshift-instanton:olp-java17-1.0
        ports:
        - containerPort: 9080
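If you are unsure which level a setting belongs to, oc explain prints the fields the API accepts at each path. A minimal sketch, assuming a POSIX shell; compare_security_context_levels is a hypothetical helper name. In the Kubernetes API, capabilities, privileged and allowPrivilegeEscalation are defined only on the container-level securityContext, while runAsNonRoot exists at both levels.

```shell
# Sketch only: lists the fields the API accepts at each securityContext level.
compare_security_context_levels() {
  echo "== pod level: spec.template.spec.securityContext =="
  oc explain deployment.spec.template.spec.securityContext
  echo "== container level: spec.template.spec.containers.securityContext =="
  oc explain deployment.spec.template.spec.containers.securityContext
}
```

Comparing the two FIELDS lists shows which settings need to sit on the container. Each oc explain command can also be run on its own from a Windows command prompt.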
Create the Service Account if it does not already exist.
oc create serviceaccount liberty-to-openshift-instanton-sa
Command output.
c:\ocp\LibertyToOpenShiftInstantOn>oc create serviceaccount liberty-to-openshift-instanton-sa
serviceaccount/liberty-to-openshift-instanton-sa created
This example assumes the Security Context Constraint also exists with the correct role binding and that the only problem is with the deployment yaml file.
Switch to the kubeadmin user.
oc login -u kubeadmin https://api.crc.testing:6443
Command output
c:\ocp\LibertyToOpenShiftInstantOn>oc login -u kubeadmin https://api.crc.testing:6443
Logged into "https://api.crc.testing:6443" as "kubeadmin" using existing credentials.
You have access to 71 projects, the list has been suppressed. You can list all projects with 'oc projects'
Using project "liberty-to-openshift-instanton".
Ensure you are connected to the correct project.
oc project liberty-to-openshift-instanton
Command output
C:\ocp\LibertyToOpenShiftInstantOn>oc project liberty-to-openshift-instanton
Already on project "liberty-to-openshift-instanton" on server "https://api.crc.testing:6443".
Now create the SCC.
oc apply -f scc-cap-cr-minmal.yaml
Command output
c:\ocp\LibertyToOpenShiftInstantOn>oc apply -f scc-cap-cr-minmal.yaml
securitycontextconstraints.security.openshift.io/liberty-to-openshift-instanton created
Verify the SCC was created. As with the Service Account (SA), I have added a ‘-scc’ suffix to help distinguish it from the SA.
oc get scc liberty-to-openshift-instanton-scc
Command output
c:\ocp\LibertyToOpenShiftInstantOn>oc get scc liberty-to-openshift-instanton-scc
NAME PRIV CAPS SELINUX RUNASUSER FSGROUP SUPGROUP PRIORITY READONLYROOTFS VOLUMES
liberty-to-openshift-instanton-scc false <no value> MustRunAs MustRunAsNonRoot RunAsAny RunAsAny <no value> false ["awsElasticBlockStore","azureDisk","azureFile","cephFS","cinder","configMap","csi","downwardAPI","emptyDir","ephemeral","fc","flexVolume","flocker","gcePersistentDisk","gitRepo","glusterfs","iscsi","nfs","persistentVolumeClaim","photonPersistentDisk","portworxVolume","projected","quobyte","rbd","scaleIO","secret","storageOS","vsphere"]
Create the role-based mapping between the SCC and the service account.
oc adm policy add-scc-to-user liberty-to-openshift-instanton-scc -z liberty-to-openshift-instanton-sa
Command output.
c:\ocp\LibertyToOpenShiftInstantOn>oc adm policy add-scc-to-user liberty-to-openshift-instanton-scc -z liberty-to-openshift-instanton-sa
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:liberty-to-openshift-instanton-scc added: "liberty-to-openshift-instanton-sa"
Switch back to the developer account. All remaining steps should be run under this account.
oc login -u developer https://api.crc.testing:6443
All the pre-requisites should now be created:
- Service Account (SA)
- Security Context Constraint (SCC)
- Role binding between the SA and SCC
Deploy the app using the following file.
oc apply -f deploy_incorrectly_placed_in_spec.yaml
Command output.
c:\ocp\LibertyToOpenShiftInstantOn>oc apply -f deploy_incorrectly_placed_in_spec.yaml
deployment.apps/liberty-to-openshift-instanton created
service/liberty-to-openshift-instanton created
route.route.openshift.io/liberty-to-openshift-instanton created
Check for any running pods.
oc get po
The output shows that the “liberty-to-openshift-instanton” pod is running. All looks good so far.
Command output.
c:\ocp\LibertyToOpenShiftInstantOn>oc get po
NAME READY STATUS RESTARTS AGE
liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd 1/1 Running 0 9s
Now check the events for any errors. All still looks ok.
oc get events
Command output.
c:\ocp\LibertyToOpenShiftInstantOn>oc get events
LAST SEEN TYPE REASON OBJECT MESSAGE
17s Normal Scheduled pod/liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd Successfully assigned liberty-to-openshift-instanton/liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd to crc-vlf7c-master-0
14s Normal AddedInterface pod/liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd Add eth0 [10.217.0.70/23] from openshift-sdn
14s Normal Pulled pod/liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd Container image "image-registry.openshift-image-registry.svc:5000/liberty-to-openshift-instanton/liberty-to-openshift-instanton:olp-java17-1.0" already present on machine
14s Normal Created pod/liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd Created container libertytoopenshiftinstanton-container
14s Normal Started pod/liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd Started container libertytoopenshiftinstanton-container
17s Normal SuccessfulCreate replicaset/liberty-to-openshift-instanton-7bb5b9ccd9 Created pod: liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd
17s Normal ScalingReplicaSet deployment/liberty-to-openshift-instanton Scaled up replica set liberty-to-openshift-instanton-7bb5b9ccd9 to 1
The final test is to check the logs of the running pod. Now you’ll see the error.
Issue the command to check the logs.
oc logs liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd
The following shows the error you’ll receive. All I can say is good luck working this error out.
It took me a lot of effort to fix, and I needed help from an Open Liberty InstantOn expert.
c:\ocp\LibertyToOpenShiftInstantOn>oc logs liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd
/opt/ol/wlp/bin/server: line 1373: /opt/criu/criu: Operation not permitted
CWWKE0961I: Restoring the checkpoint server process failed. Check the /logs/checkpoint/restore.log log to determine why the checkpoint process was not restored. Launching the server without using the checkpoint image.
Launching defaultServer (Open Liberty 24.0.0.8/wlp-1.0.92.cl240820240729-1903) on Eclipse OpenJ9 VM, version 17.0.12+7 (en_US)
[AUDIT ] CWWKE0001I: The server defaultServer has been launched.
[AUDIT ] CWWKG0093A: Processing configuration drop-ins resource: /opt/ol/wlp/usr/servers/defaultServer/configDropins/defaults/keystore.xml
[AUDIT ] CWWKG0093A: Processing configuration drop-ins resource: /opt/ol/wlp/usr/servers/defaultServer/configDropins/defaults/open-default-port.xml
[AUDIT ] CWWKZ0058I: Monitoring dropins for applications.
[AUDIT ] CWWKT0016I: Web application available (default_host): http://liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd:9080/
[AUDIT ] CWWKZ0001I: Application LibertyToOpenShiftInstantOn started in 0.240 seconds.
[AUDIT ] CWWKF0012I: The server installed the following features: [el-3.0, jsp-2.3, localConnector-1.0, servlet-4.0].
[AUDIT ] CWWKF0011I: The defaultServer server is ready to run a smarter planet. The defaultServer server started in 1.793 seconds.
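The CWWKE0961I message points at /logs/checkpoint/restore.log inside the container. A minimal sketch for reading it, assuming a POSIX shell and that the pod is still running; show_restore_log is a hypothetical helper name, and the log path is taken from the message above.

```shell
# Sketch only: pass the pod name reported by "oc get po".
show_restore_log() {
  pod=$1
  if [ -z "$pod" ]; then
    echo "usage: show_restore_log <pod-name>" >&2
    return 1
  fi
  # Print the CRIU restore log that the CWWKE0961I message refers to.
  oc exec "$pod" -- cat /logs/checkpoint/restore.log
}
```

For the pod in this example the call would be show_restore_log liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd. The oc exec line can also be run directly from a Windows command prompt with the pod name filled in.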
In this particular case, check your deployment yaml file and place the securityContext in the correct location.
As can be seen below, I have moved the securityContext under the “containers” section. The serviceAccountName is a pod-level field in the Kubernetes API, so it belongs in the template “spec” section, next to “containers”.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: liberty-to-openshift-instanton
  labels:
    app: liberty-to-openshift-instanton
spec:
  selector:
    matchLabels:
      app: liberty-to-openshift-instanton
  template:
    metadata:
      labels:
        app: liberty-to-openshift-instanton
    spec:
      # Add InstantOn - pod-level field
      serviceAccountName: liberty-to-openshift-instanton-sa
      # Add InstantOn
      containers:
      - name: liberty-to-openshift-instanton
        image: image-registry.openshift-image-registry.svc:5000/liberty-to-openshift-instanton/liberty-to-openshift-instanton:olp-java17-1.0
        # Add InstantOn - container-level field
        securityContext:
          runAsNonRoot: true
          privileged: false
          allowPrivilegeEscalation: true
          capabilities:
            add:
            - CHECKPOINT_RESTORE
            - SETPCAP
            drop:
            - ALL
        # Add InstantOn
        ports:
        - containerPort: 9080
To test this, run the following command. This deployment yaml file has these fields in the correct location.
oc apply -f deploy.yaml
If you check the output, you’ll notice the following change.
c:\ocp\LibertyToOpenShiftInstantOn>oc apply -f deploy.yaml
Warning: would violate PodSecurity "baseline:v1.24": non-default capabilities (container "liberty-to-openshift-instanton" must not include "CHECKPOINT_RESTORE" in securityContext.capabilities.add)
deployment.apps/liberty-to-openshift-instanton configured
service/liberty-to-openshift-instanton unchanged
route.route.openshift.io/liberty-to-openshift-instanton unchanged
If you check the running pods
oc get po
And the events
oc get events
All looks the same as before.
But when you check the logs, you’ll see the checkpoint restore was successful. Also note the significantly faster startup time.
oc logs liberty-to-openshift-instanton-6d4764cdcd-5mdbn
c:\ocp\LibertyToOpenShiftInstantOn>oc logs liberty-to-openshift-instanton-6d4764cdcd-5mdbn
JVMJITM043W AOT load and compilation disabled post restore.
[AUDIT ] Launching defaultServer (Open Liberty 24.0.0.8/wlp-1.0.92.cl240820240729-1903) on Eclipse OpenJ9 VM, version 17.0.12+7 (en_US)
[AUDIT ] CWWKT0016I: Web application available (default_host): http://liberty-to-openshift-instanton-6d4764cdcd-5mdbn:9080/
[AUDIT ] CWWKC0452I: The Liberty server process resumed operation from a checkpoint in 0.263 seconds.
[AUDIT ] CWWKZ0001I: Application LibertyToOpenShiftInstantOn started in 0.267 seconds.
[AUDIT ] CWWKF0012I: The server installed the following features: [el-3.0, jsp-2.3, localConnector-1.0, servlet-4.0].
[AUDIT ] CWWKF0011I: The defaultServer server is ready to run a smarter planet. The defaultServer server started in 0.422 seconds.
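Another way to confirm the fix is to check which SCC admitted the pod. OpenShift records this in the openshift.io/scc annotation on the pod. A minimal sketch, assuming a POSIX shell; pod_scc is a hypothetical helper name. I would expect the custom SCC from this post to be reported for the corrected deployment, but treat that as an assumption to verify on your own cluster.

```shell
# Sketch only: prints the name of the SCC that admitted the given pod.
pod_scc() {
  pod=$1
  # The dot in the annotation key is escaped so jsonpath reads it as one key.
  oc get pod "$pod" -o jsonpath="{.metadata.annotations.openshift\.io/scc}"
  echo
}
```

For the pod above the call would be pod_scc liberty-to-openshift-instanton-6d4764cdcd-5mdbn.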
You have now successfully simulated the error that occurs when the InstantOn security settings are placed in the wrong section of the deployment yaml file, along with the solution to the problem.
I hope this saves you the many hours this simple mistake cost me.
Delete the deployment.
oc delete -f deploy.yaml
Command output.
c:\ocp\LibertyToOpenShiftInstantOn>oc delete -f deploy_incorrectly_placed_in_spec.yaml
deployment.apps "liberty-to-openshift-instanton" deleted
service "liberty-to-openshift-instanton" deleted
route.route.openshift.io "liberty-to-openshift-instanton" deleted
You should also clean up the environment by:
- Deleting the SCC to SA role binding
- Deleting the SCC
- Deleting the SA
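The three clean-up steps above can be sketched as follows. This is a minimal sketch, assuming a POSIX shell and the SA and SCC names used in this post; cleanup_instanton_security is a hypothetical helper name. The first two commands mirror the kubeadmin steps earlier in the post, so they most likely need the kubeadmin login.

```shell
# Sketch only: names are the ones used in this post; adjust to your project.
SA=liberty-to-openshift-instanton-sa
SCC=liberty-to-openshift-instanton-scc

cleanup_instanton_security() {
  # Undo "oc adm policy add-scc-to-user" for the service account.
  oc adm policy remove-scc-from-user "$SCC" -z "$SA"
  # Remove the SCC itself (a cluster-scoped resource).
  oc delete scc "$SCC"
  # Remove the service account from the current project.
  oc delete serviceaccount "$SA"
}
```

The three oc commands can also be run one at a time from a Windows command prompt with the names written out.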
Tips
Delete the events while testing
If you are testing these errors using OpenShift Local or in an OpenShift test environment, you can delete the events before each test. This makes identifying the actual error a little easier.
Issue the following command to delete all events for your connected project.
oc delete events --all
Command output.
c:\ocp\LibertyToOpenShiftInstantOn>oc delete events --all
event "liberty-to-openshift-instanton-9fc5d9dc5.17f443c823ea665d" deleted
event "liberty-to-openshift-instanton-9fc5d9dc5.17f49810b2cfcd8e" deleted
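If you would rather not delete anything, filtering the event list is an alternative. A minimal sketch, assuming a POSIX shell; warning_events is a hypothetical helper name. It shows only Warning events for the connected project, sorted by last-seen time.

```shell
# Sketch only: list Warning events for the current project, ordered by last-seen time.
warning_events() {
  oc get events --field-selector type=Warning --sort-by=.lastTimestamp
}
```

The oc get events line works unchanged from a Windows command prompt.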