
How to correctly add the serviceAccountName/securityContext

Sean Boyd Avatar


This post covers another one of the frustrating problems I had during my initial foray into Open Liberty InstantOn.

Clearly this issue was down to my limited knowledge of how to properly configure the OpenShift deployment yaml file when building Open Liberty InstantOn images.

Being new to OpenShift I had little appreciation of what Service Accounts, Security Contexts and Security Context Constraints were, let alone how they worked, and where the settings need to be placed in the deployment yaml file.

Reading time around 10 minutes.

Errors covered in this post

Error: Error creating: pods “liberty-to-openshift-instanton-85fbc954dd-” is forbidden: unable to validate against any security context constraint: … Invalid value: “CHECKPOINT_RESTORE”: capability may not be added, provider restricted-v2: .containers[0].capabilities.add: Invalid value: “SETPCAP”
Solution: The deployment yaml is missing the serviceAccountName or securityContext.

Error: serviceaccount “liberty-to-openshift-instanton” not found
Solution: The Service Account does not exist.

Error: /opt/ol/wlp/bin/server: line 1373: /opt/criu/criu: Operation not permitted CWWKE0961I: Restoring the checkpoint server process failed.
Solution: In this example, the securityContext is in the wrong location in the deployment yaml.

If you’d like to test these errors

If you’d like to test these errors, complete How to deploy an Open Liberty InstantOn app, up to the Security Context Constraint section.

Alternatively, to view the files referenced in this post, you can download them from this URL.

Ensure your command window is in the correct directory.

cd /d c:\ocp\LibertyToOpenShiftInstantOn

Not adding the serviceAccount and securityContext sections

When deploying an Open Liberty InstantOn app, you need to specify the serviceAccount and securityContext values.

This section shows what happens when these values are missing.

First, ensure you are on the correct project.

oc project liberty-to-openshift-instanton

Deploy the app using a deployment yaml file that doesn’t contain the required values.

oc apply -f deploy_missing_sa_and_sc.yaml
Command output
c:\ocp\LibertyToOpenShiftInstantOn>oc apply -f deploy_missing_sa_and_sc.yaml
Warning: would violate PodSecurity "restricted:v1.24": allowPrivilegeEscalation != false (container "liberty-to-openshift-instanton" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "liberty-to-openshift-instanton" must not include "CHECKPOINT_RESTORE", "SETPCAP" in securityContext.capabilities.add), seccompProfile (pod or container "liberty-to-openshift-instanton" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
deployment.apps/liberty-to-openshift-instanton created
service/liberty-to-openshift-instanton created
route.route.openshift.io/liberty-to-openshift-instanton created

Check for any running pods.

oc get po

The output shows that the “liberty-to-openshift-instanton” pod is not running.

c:\ocp\LibertyToOpenShiftInstantOn>oc get po
No resources found in liberty-to-openshift-instanton namespace.

Now check the events for any errors.

oc get events

Check for the following error.

c:\ocp\LibertyToOpenShiftInstantOn>oc get events
LAST SEEN   TYPE      REASON              OBJECT                                                 MESSAGE
1s          Warning   FailedCreate        replicaset/liberty-to-openshift-instanton-85fbc954dd   Error creating: pods "liberty-to-openshift-instanton-85fbc954dd-" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, provider restricted-v2: .containers[0].capabilities.add: Invalid value: "CHECKPOINT_RESTORE": capability may not be added, provider restricted-v2: .containers[0].capabilities.add: Invalid value: "SETPCAP": capability may not be added, provider restricted-v2: .containers[0].allowPrivilegeEscalation: Invalid value: true: Allowing privilege escalation for containers is not allowed, provider "restricted": Forbidden: not usable by user or serviceaccount, provider "nonroot-v2": Forbidden: not usable by user or serviceaccount, provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork-v2": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "hostpath-provisioner": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount]
83s         Normal    ScalingReplicaSet   deployment/liberty-to-openshift-instanton              Scaled up replica set liberty-to-openshift-instanton-85fbc954dd to 1

I personally find this error message difficult to read and understand, especially for people like me who are still learning.

This error means there is something wrong with the Service Account (SA), Security Context (SC) or Security Context Constraint (SCC).
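One way to make the message easier to read is to split it into one line per SCC provider. The following is a minimal sketch, assuming a POSIX shell (for example Git Bash or WSL on Windows) with sed and awk available. The function name split_scc_error is my own and is not part of oc.

```shell
# Hypothetical helper: print one SCC provider per line.
# Reads a single event message on stdin.
split_scc_error() {
  # Drop everything up to the opening "[" and the trailing "]",
  # then start a new line at every ", provider ".
  sed -e 's/^[^[]*\[//' -e 's/\]$//' |
    awk '{ gsub(/, provider /, "\nprovider "); print }'
}
```

For example, oc get events | grep "security context constraint" | split_scc_error prints the list one provider at a time. Lines ending in "not usable by user or serviceaccount" are SCCs the pod's Service Account has no access to. The remaining lines (restricted-v2 in the output above) explain which settings were rejected.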

Your OpenShift admin can assist here, but to try to debug this error yourself:

  • Check whether you created the Service Account:
    • oc create serviceaccount liberty-to-openshift-instanton-sa
  • Check whether your OpenShift cluster admin created the Security Context Constraint:
    • oc apply -f scc-cap-cr-minmal.yaml
  • Check that you added the serviceAccountName value in the pod “spec” section (template.spec) in the deploy.yaml file:
    • serviceAccountName: liberty-to-openshift-instanton-sa
  • Check that you added the securityContext value in the “containers” section in the deploy.yaml file
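The first two checks in the list above can be scripted. This is only a sketch, assuming a POSIX shell and the SA and SCC names used in this post. The function name check_prereqs is hypothetical, and the who-can query may need more permissions than the developer account has.

```shell
# Names used in this post; adjust them to your own project.
SA=liberty-to-openshift-instanton-sa
SCC=liberty-to-openshift-instanton-scc

# Hypothetical helper: report which prerequisites are missing.
check_prereqs() {
  oc get serviceaccount "$SA" || echo "Service Account $SA is missing"
  oc get scc "$SCC" || echo "SCC $SCC is missing or not visible to this user"
  # List the users and service accounts allowed to use the SCC.
  oc adm policy who-can use scc "$SCC"
}
```

Run check_prereqs while connected to the project that holds the deployment.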

Delete the failed deployment.

oc delete -f deploy_missing_sa_and_sc.yaml
Command output.
c:\ocp\LibertyToOpenShiftInstantOn>oc delete -f deploy_missing_sa_and_sc.yaml
deployment.apps "liberty-to-openshift-instanton" deleted
service "liberty-to-openshift-instanton" deleted
route.route.openshift.io "liberty-to-openshift-instanton" deleted

Service Account (SA) does not exist

If the Service Account is not created, you’ll receive an error message when deploying the app.

For this example, ensure you haven’t created the Service Account.

Deploy the app using the following file.

oc apply -f deploy_incorrectly_placed_in_spec.yaml
Command output.
c:\ocp\LibertyToOpenShiftInstantOn>oc apply -f deploy_incorrectly_placed_in_spec.yaml
deployment.apps/liberty-to-openshift-instanton created
service/liberty-to-openshift-instanton created
route.route.openshift.io/liberty-to-openshift-instanton created

Check for any running pods.

oc get po

The output shows that the “liberty-to-openshift-instanton” pod is not running.

c:\ocp\LibertyToOpenShiftInstantOn>oc get po
No resources found in liberty-to-openshift-instanton namespace.

Now check the events for any errors.

oc get events

Check for the following error.

c:\ocp\LibertyToOpenShiftInstantOn>oc get events
LAST SEEN   TYPE      REASON              OBJECT                                                 MESSAGE
12s         Warning   FailedCreate        replicaset/liberty-to-openshift-instanton-9fc5d9dc5    Error creating: pods "liberty-to-openshift-instanton-9fc5d9dc5-" is forbidden: error looking up service account liberty-to-openshift-instanton/liberty-to-openshift-instanton: serviceaccount "liberty-to-openshift-instanton" not found
32s         Normal    ScalingReplicaSet   deployment/liberty-to-openshift-instanton              Scaled up replica set liberty-to-openshift-instanton-9fc5d9dc5 to 1

In this example, you need to create the service account.
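Before deleting the failed deployment, you can confirm which Service Account name it actually references. The following is a rough sketch, assuming a POSIX shell. check_deployment_sa is a name I made up, and the jsonpath expression reads the pod-level serviceAccountName from the deployment template.

```shell
# Hypothetical helper: check that the Service Account referenced by a
# deployment exists in the current project.
check_deployment_sa() {
  sa=$(oc get deployment "$1" -o jsonpath='{.spec.template.spec.serviceAccountName}')
  # If the field is not set, pods run under the "default" Service Account.
  sa=${sa:-default}
  if oc get serviceaccount "$sa" >/dev/null 2>&1; then
    echo "Service Account '$sa' exists"
  else
    echo "Service Account '$sa' not found; create it with: oc create serviceaccount $sa"
  fi
}
```

For this post the call would be check_deployment_sa liberty-to-openshift-instanton.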

Delete the failed deployment.

oc delete -f deploy_incorrectly_placed_in_spec.yaml
Command output.
c:\ocp\LibertyToOpenShiftInstantOn>oc delete -f deploy_incorrectly_placed_in_spec.yaml
deployment.apps "liberty-to-openshift-instanton" deleted
service "liberty-to-openshift-instanton" deleted
route.route.openshift.io "liberty-to-openshift-instanton" deleted

Adding to the “spec” section rather than the “containers” section

Initially I placed the serviceAccountName and securityContext in the pod “spec” section of the deployment yaml file rather than the “containers” section. Note that capabilities, privileged and allowPrivilegeEscalation are only valid in a container-level securityContext, whereas serviceAccountName is a pod-level field.

In this case, I copied an example I found on GitHub. This GitHub example used the OpenLibertyApplication deployment type (Open Liberty Operator). I’m not sure whether this has any bearing on this error, as I have yet to test it.

The following deployment yaml file highlights what I did.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: liberty-to-openshift-instanton
  labels:
    app: liberty-to-openshift-instanton
spec:
  selector:
    matchLabels:
      app: liberty-to-openshift-instanton
  template:
    metadata:
      labels:
        app: liberty-to-openshift-instanton
    spec:
      # Add InstantOn - the securityContext is in the wrong spot, it should be under the container
      serviceAccountName: liberty-to-openshift-instanton
      securityContext:
        runAsNonRoot: true
        privileged: false
        allowPrivilegeEscalation: true
        capabilities:
          add:
            - CHECKPOINT_RESTORE
            - SETPCAP
          drop:
            - ALL
      # Add InstantOn
      containers:
      - name: libertytoopenshiftinstanton-container
        image: image-registry.openshift-image-registry.svc:5000/liberty-to-openshift-instanton/liberty-to-openshift-instanton:olp-java17-1.0
        ports:
        - containerPort: 9080

Create the Service Account if it does not already exist.

oc create serviceaccount liberty-to-openshift-instanton-sa
Command output.
c:\ocp\LibertyToOpenShiftInstantOn>oc create serviceaccount liberty-to-openshift-instanton-sa
serviceaccount/liberty-to-openshift-instanton-sa created

This example assumes the Security Context Constraint also exists with the correct role binding and that the only problem is with the deployment yaml file.

Switch to the kubeadmin user.

oc login -u kubeadmin https://api.crc.testing:6443
Command output
c:\ocp\LibertyToOpenShiftInstantOn>oc login -u kubeadmin https://api.crc.testing:6443
Logged into "https://api.crc.testing:6443" as "kubeadmin" using existing credentials.

You have access to 71 projects, the list has been suppressed. You can list all projects with 'oc projects'

Using project "liberty-to-openshift-instanton".

Ensure you are connected to the correct project.

oc project liberty-to-openshift-instanton
Command output
C:\ocp\LibertyToOpenShiftInstantOn>oc project liberty-to-openshift-instanton
Already on project "liberty-to-openshift-instanton" on server "https://api.crc.testing:6443".

Now create the SCC.

oc apply -f scc-cap-cr-minmal.yaml
Command output
c:\ocp\LibertyToOpenShiftInstantOn>oc apply -f scc-cap-cr-minmal.yaml
securitycontextconstraints.security.openshift.io/liberty-to-openshift-instanton created

Verify the SCC was created. As with the Service Account (SA), I have added a ‘-scc’ suffix to help distinguish it from the SA.

oc get scc liberty-to-openshift-instanton-scc
Command output
c:\ocp\LibertyToOpenShiftInstantOn>oc get scc liberty-to-openshift-instanton-scc
NAME                                 PRIV    CAPS         SELINUX     RUNASUSER          FSGROUP    SUPGROUP   PRIORITY     READONLYROOTFS   VOLUMES
liberty-to-openshift-instanton-scc   false   <no value>   MustRunAs   MustRunAsNonRoot   RunAsAny   RunAsAny   <no value>   false            ["awsElasticBlockStore","azureDisk","azureFile","cephFS","cinder","configMap","csi","downwardAPI","emptyDir","ephemeral","fc","flexVolume","flocker","gcePersistentDisk","gitRepo","glusterfs","iscsi","nfs","persistentVolumeClaim","photonPersistentDisk","portworxVolume","projected","quobyte","rbd","scaleIO","secret","storageOS","vsphere"]

Create the role-based mapping between the SCC and the service account.

oc adm policy add-scc-to-user liberty-to-openshift-instanton-scc -z liberty-to-openshift-instanton-sa
Command output.
c:\ocp\LibertyToOpenShiftInstantOn>oc adm policy add-scc-to-user liberty-to-openshift-instanton-scc -z liberty-to-openshift-instanton-sa
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:liberty-to-openshift-instanton-scc added: "liberty-to-openshift-instanton-sa"

Switch back to the developer account. All remaining steps should be run under this account.

oc login -u developer https://api.crc.testing:6443

All the pre-requisites should now be created:

  • Service Account (SA)
  • Security Context Constraint (SCC)
  • Role binding between the SA and SCC

Deploy the app using the following file.

oc apply -f deploy_incorrectly_placed_in_spec.yaml
Command output.
c:\ocp\LibertyToOpenShiftInstantOn>oc apply -f deploy_incorrectly_placed_in_spec.yaml
deployment.apps/liberty-to-openshift-instanton created
service/liberty-to-openshift-instanton created
route.route.openshift.io/liberty-to-openshift-instanton created

Check for any running pods.

oc get po

The output shows that the “liberty-to-openshift-instanton” pod is running. All looks good so far.

Command output.
c:\ocp\LibertyToOpenShiftInstantOn>oc get po
NAME                                              READY   STATUS    RESTARTS   AGE
liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd   1/1     Running   0          9s

Now check the events for any errors. All still looks ok.

oc get events
Command output.
c:\ocp\LibertyToOpenShiftInstantOn>oc get events
LAST SEEN   TYPE     REASON              OBJECT                                                 MESSAGE
17s         Normal   Scheduled           pod/liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd    Successfully assigned liberty-to-openshift-instanton/liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd to crc-vlf7c-master-0
14s         Normal   AddedInterface      pod/liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd    Add eth0 [10.217.0.70/23] from openshift-sdn
14s         Normal   Pulled              pod/liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd    Container image "image-registry.openshift-image-registry.svc:5000/liberty-to-openshift-instanton/liberty-to-openshift-instanton:olp-java17-1.0" already present on machine
14s         Normal   Created             pod/liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd    Created container libertytoopenshiftinstanton-container
14s         Normal   Started             pod/liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd    Started container libertytoopenshiftinstanton-container
17s         Normal   SuccessfulCreate    replicaset/liberty-to-openshift-instanton-7bb5b9ccd9   Created pod: liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd
17s         Normal   ScalingReplicaSet   deployment/liberty-to-openshift-instanton              Scaled up replica set liberty-to-openshift-instanton-7bb5b9ccd9 to 1

The final test is to check the logs of the running pod. Now you’ll see the error.

Issue the command to check the logs.

oc logs liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd

The following shows the error you’ll receive. All I can say is good luck working this error out.

This took me much effort to fix and involved me seeking assistance from an expert in using Open Liberty InstantOn.

c:\ocp\LibertyToOpenShiftInstantOn>oc logs liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd

/opt/ol/wlp/bin/server: line 1373: /opt/criu/criu: Operation not permitted
CWWKE0961I: Restoring the checkpoint server process failed. Check the /logs/checkpoint/restore.log log to determine why the checkpoint process was not restored. Launching the server without using the checkpoint image.
Launching defaultServer (Open Liberty 24.0.0.8/wlp-1.0.92.cl240820240729-1903) on Eclipse OpenJ9 VM, version 17.0.12+7 (en_US)
[AUDIT   ] CWWKE0001I: The server defaultServer has been launched.
[AUDIT   ] CWWKG0093A: Processing configuration drop-ins resource: /opt/ol/wlp/usr/servers/defaultServer/configDropins/defaults/keystore.xml
[AUDIT   ] CWWKG0093A: Processing configuration drop-ins resource: /opt/ol/wlp/usr/servers/defaultServer/configDropins/defaults/open-default-port.xml
[AUDIT   ] CWWKZ0058I: Monitoring dropins for applications.
[AUDIT   ] CWWKT0016I: Web application available (default_host): http://liberty-to-openshift-instanton-7bb5b9ccd9-kt2vd:9080/
[AUDIT   ] CWWKZ0001I: Application LibertyToOpenShiftInstantOn started in 0.240 seconds.
[AUDIT   ] CWWKF0012I: The server installed the following features: [el-3.0, jsp-2.3, localConnector-1.0, servlet-4.0].
[AUDIT   ] CWWKF0011I: The defaultServer server is ready to run a smarter planet. The defaultServer server started in 1.793 seconds.

In this particular case, check your deployment yaml file and place the securityContext in the correct location. The capabilities, privileged and allowPrivilegeEscalation settings are only valid in a container's securityContext, so when they sit under the pod “spec” they never reach the container and criu is not permitted to restore the checkpoint.

As can be seen below, I have moved the securityContext under the “containers” section. The serviceAccountName is a pod-level field, so it stays directly under the template “spec”, above “containers”.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: liberty-to-openshift-instanton
  labels:
    app: liberty-to-openshift-instanton
spec:
  selector:
    matchLabels:
      app: liberty-to-openshift-instanton
  template:
    metadata:
      labels:
        app: liberty-to-openshift-instanton
    spec:
      # Add InstantOn - pod-level setting
      serviceAccountName: liberty-to-openshift-instanton-sa
      containers:
      - name: liberty-to-openshift-instanton
        image: image-registry.openshift-image-registry.svc:5000/liberty-to-openshift-instanton/liberty-to-openshift-instanton:olp-java17-1.0
        # Add InstantOn - container-level settings
        securityContext:
          runAsNonRoot: true
          privileged: false
          allowPrivilegeEscalation: true
          capabilities:
            add:
              - CHECKPOINT_RESTORE
              - SETPCAP
            drop:
              - ALL
        ports:
        - containerPort: 9080
To test this, run the following command. This deployment yaml file has these fields in the correct location.

oc apply -f deploy.yaml

If you check the output, you’ll notice the following change.

c:\ocp\LibertyToOpenShiftInstantOn>oc apply -f deploy.yaml
Warning: would violate PodSecurity "baseline:v1.24": non-default capabilities (container "liberty-to-openshift-instanton" must not include "CHECKPOINT_RESTORE" in securityContext.capabilities.add)
deployment.apps/liberty-to-openshift-instanton configured
service/liberty-to-openshift-instanton unchanged
route.route.openshift.io/liberty-to-openshift-instanton unchanged

If you check the running pods

oc get po

And the events

oc get events

All looks the same as before.

But when you check the logs, you’ll see the checkpoint restore was successful. Also note the significantly faster startup time.

oc logs liberty-to-openshift-instanton-6d4764cdcd-5mdbn
c:\ocp\LibertyToOpenShiftInstantOn>oc logs liberty-to-openshift-instanton-6d4764cdcd-5mdbn

JVMJITM043W AOT load and compilation disabled post restore.
[AUDIT   ] Launching defaultServer (Open Liberty 24.0.0.8/wlp-1.0.92.cl240820240729-1903) on Eclipse OpenJ9 VM, version 17.0.12+7 (en_US)
[AUDIT   ] CWWKT0016I: Web application available (default_host): http://liberty-to-openshift-instanton-6d4764cdcd-5mdbn:9080/
[AUDIT   ] CWWKC0452I: The Liberty server process resumed operation from a checkpoint in 0.263 seconds.
[AUDIT   ] CWWKZ0001I: Application LibertyToOpenShiftInstantOn started in 0.267 seconds.
[AUDIT   ] CWWKF0012I: The server installed the following features: [el-3.0, jsp-2.3, localConnector-1.0, servlet-4.0].
[AUDIT   ] CWWKF0011I: The defaultServer server is ready to run a smarter planet. The defaultServer server started in 0.422 seconds.

You have now successfully simulated the error when the serviceAccount and securityContext are placed in the wrong section in the deployment yaml file, plus the solution to the problem.
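Since the pod reports Running in both cases, the log messages are the only reliable signal. The following is a minimal sketch, assuming a POSIX shell, that classifies the log by the two message IDs shown in this post (CWWKC0452I and CWWKE0961I). The function name restore_status is my own.

```shell
# Hypothetical helper: read Liberty logs on stdin and report whether
# the server resumed from the checkpoint.
restore_status() {
  logs=$(cat)
  case "$logs" in
    *CWWKC0452I*) echo "restored from checkpoint" ;;
    *CWWKE0961I*) echo "checkpoint restore failed" ;;
    *) echo "no checkpoint message found" ;;
  esac
}
```

A typical call would be oc logs deployment/liberty-to-openshift-instanton | restore_status.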

I hope this saves you the many hours this simple mistake cost me.

Delete the deployment.

oc delete -f deploy.yaml
Command output.
c:\ocp\LibertyToOpenShiftInstantOn>oc delete -f deploy.yaml
deployment.apps "liberty-to-openshift-instanton" deleted
service "liberty-to-openshift-instanton" deleted
route.route.openshift.io "liberty-to-openshift-instanton" deleted

You should also clean up the environment by:

  • Deleting the SCC to SA role binding
  • Deleting the SCC
  • Deleting the SA
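The three clean-up steps above could look like the following sketch. It assumes a POSIX shell, the names used in this post, and a login with enough rights to manage SCCs (kubeadmin in my case). The function name cleanup_instanton is hypothetical.

```shell
# Names used in this post; adjust them to your own project.
SA=liberty-to-openshift-instanton-sa
SCC=liberty-to-openshift-instanton-scc

# Hypothetical helper: remove the binding first, then the SCC and the SA.
cleanup_instanton() {
  oc adm policy remove-scc-from-user "$SCC" -z "$SA"
  oc delete scc "$SCC"
  oc delete serviceaccount "$SA"
}
```

The binding is removed first so that nothing still references the SCC or the SA when they are deleted.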

Tips

Delete the events while testing

If you are testing these errors using OpenShift Local or in an OpenShift test environment, you can delete the events before each test. This makes identifying the actual error a little easier.

Issue the following command to delete all events for your connected project.

oc delete events --all
Command output.
c:\ocp\LibertyToOpenShiftInstantOn>oc delete events --all
event "liberty-to-openshift-instanton-9fc5d9dc5.17f443c823ea665d" deleted
event "liberty-to-openshift-instanton-9fc5d9dc5.17f49810b2cfcd8e" deleted
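If you would rather keep the event history, filtering is an alternative to deleting. This is a small sketch, assuming a POSIX shell. recent_warnings is a name I made up, and it only wraps the standard field-selector and sort-by options of oc get.

```shell
# Hypothetical helper: show only Warning events, oldest first.
recent_warnings() {
  oc get events --field-selector type=Warning --sort-by=.lastTimestamp
}
```

With this, the FailedCreate warnings from the earlier sections appear at the bottom of the list without the Normal events around them.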
