Troubleshoot OpenShift deployment issues
Learn how to troubleshoot issues that can occur when you deploy MOSTLY AI to an OpenShift cluster. Each issue below includes a description of the problem and a solution that shows how to resolve or work around it.
Cannot list or patch resource
Problem
During the deployment of MOSTLY AI to an OpenShift cluster, you might see one of the errors listed below.
Error from server (Forbidden): nodes is forbidden: User "hello@mostly.ai"
cannot list resource "nodes" in API group "" at the cluster scope
Error from server (Forbidden): namespaces "mostly-ai" is forbidden:
User "hello@mostly.ai" cannot patch resource "namespaces" in
API group "" in the namespace "mostly-ai"
The errors appear when you run the commands listed below from the OpenShift deployment guide.
oc get nodes
oc annotate --overwrite namespace mostly-ai openshift.io/sa.scc.uid-range="1000700000/10000"
Solution
Request elevated permissions for your user from your OpenShift administrator.
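Before you contact your administrator, it can help to confirm exactly which permissions are missing. The sketch below is one possible way to do that with oc auth can-i, which prints yes or no for a given verb and resource. The function name check_permissions is hypothetical and not part of the deployment guide, and the check assumes that you are already logged in with oc.

```shell
#!/usr/bin/env bash
# Report which of the two actions from the errors above the current user lacks.
# check_permissions is a hypothetical helper name, not part of MOSTLY AI tooling.
check_permissions() {
  local namespace="${1:-mostly-ai}"
  local missing=0

  # Corresponds to `oc get nodes` (cluster-scoped list).
  if [ "$(oc auth can-i list nodes 2>/dev/null)" != "yes" ]; then
    echo "missing: list nodes (cluster scope)"
    missing=1
  fi

  # Corresponds to `oc annotate --overwrite namespace ...` (patch on the namespace).
  if [ "$(oc auth can-i patch "namespace/$namespace" 2>/dev/null)" != "yes" ]; then
    echo "missing: patch namespace $namespace"
    missing=1
  fi

  # Exit status 0 means both permissions are present.
  return "$missing"
}

# Usage (requires a logged-in oc session):
#   check_permissions mostly-ai
```

The lines that the function prints can be included in the request to your OpenShift administrator.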
Violate PodSecurity
The installation error "would violate PodSecurity allowPrivilegeEscalation" might indicate misconfigurations in the values.yaml file.
Problem
When you run the helm command to install MOSTLY AI on an OpenShift cluster, you might see an installation error similar to the one below.
W0714 14:04:39.656905 47432 warnings.go:70] would violate PodSecurity "restricted:v1.24": allowPrivilegeEscalation != false (container "mostly-*" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "mostly-*" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "mostly-*" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "mostly-*" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
W0714 14:04:39.711306 47432 warnings.go:70] would violate PodSecurity "restricted:v1.24": allowPrivilegeEscalation != false (container "mostly-psql" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "mostly-psql" must set securityContext.capabilities.drop=["ALL"])
Solution
- Verify that you have the correct storage class defined for the CORDINATOR: pvc: storageClassName key. To check your OpenShift storage class names, open your OpenShift console and navigate to Storage > StorageClasses. The name defined for CORDINATOR: pvc: storageClassName must match one of the entries.
- Verify that the platform key is set to ocp and not k8s.
  values.yaml
  platform: k8s # incorrect
  platform: ocp # correct
- Alternatively, keep in mind that this warning reflects the security constraints in your cluster. To adapt to them, set the platform key to other. The security contexts suggested by the Helm charts are then ignored, and the deployment follows the ones from your cluster.
  values.yaml
  platform: other # alternative
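The platform check described above can also be done from the command line. The following is a minimal sketch, assuming that platform is a top-level key on its own line in your values file, as in the snippets above. The helper names get_platform and check_platform are hypothetical.

```shell
#!/usr/bin/env bash
# Print the value of the top-level `platform` key in a values file.
# Assumes the key starts at column 0; trailing comments and quotes are dropped.
get_platform() {
  sed -n 's/^platform:[[:space:]]*\([^[:space:]#]*\).*/\1/p' "$1" \
    | head -n 1 | tr -d "\"'"
}

# Report whether the platform value suits an OpenShift deployment.
check_platform() {
  local file="${1:-values.yaml}"
  local platform
  platform="$(get_platform "$file")"
  case "$platform" in
    ocp)   echo "platform is ocp" ;;
    other) echo "platform is other: security contexts come from the cluster" ;;
    "")    echo "platform key not found in $file"; return 1 ;;
    *)     echo "platform is '$platform': expected ocp (or other)"; return 1 ;;
  esac
}

# Usage:
#   check_platform values.yaml
```

As a command-line alternative to the console, oc get storageclass lists the storage class names available in the cluster.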
Failed to create resource
The error UPGRADE FAILED: failed to create resource: Secret in version "v1" cannot be handled as a Secret might indicate an incorrect or missing value for the docker_secret key in the values.yaml file.
Problem
When you run the helm command to deploy MOSTLY AI in an OpenShift cluster, you might see a deployment error similar to the one below.
UPGRADE FAILED: failed to create resource: Secret in version "v1" cannot be handled as a Secret: illegal base64 data at input byte 6
Make sure that you have set the Docker image pull secret in docker_secret.
Solution
- Double-check that you have the correct secret key in docker_secret.
- Make sure that the secret key value is not wrapped in quotation marks.
If the suggestions above do not help to resolve the issue, contact your Customer Experience Engineer.
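Because the error reports illegal base64 data, a quick local check of the value can narrow down the cause. The sketch below assumes that docker_secret is expected to hold a base64-encoded value, which the error message suggests but this guide does not state explicitly. It also assumes a base64 command that accepts -d, such as the one in GNU coreutils. The function name is_valid_base64 is hypothetical.

```shell
#!/usr/bin/env bash
# Return 0 if the argument is non-empty and decodes cleanly as base64.
# Assumes GNU coreutils `base64`, which rejects characters such as quotes.
is_valid_base64() {
  [ -n "$1" ] && printf '%s' "$1" | base64 -d >/dev/null 2>&1
}

# Usage: paste the value exactly as it appears in values.yaml.
#   if is_valid_base64 'PASTE_VALUE_HERE'; then
#     echo "value decodes as base64"
#   else
#     echo "value is empty or not valid base64 (check for stray quotes)"
#   fi
```

A value that still carries its surrounding quotation marks fails this check, which matches the second suggestion above.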
Cannot patch "mostly-data" with PersistentVolumeClaim
The error UPGRADE FAILED: cannot patch "mostly-data" with kind PersistentVolumeClaim might indicate intermittent installation issues.
Problem
When you run the helm command to deploy MOSTLY AI in an OpenShift cluster, you might see an installation error similar to the one below.
Error: UPGRADE FAILED: cannot patch "mostly-data" with kind PersistentVolumeClaim: PersistentVolumeClaim "mostly-data" is invalid: spec: Forbidden: spec is immutable after creation except resources.requests for bound claims
core.PersistentVolumeClaimSpec{
... // 2 identical fields
Resources: {Requests: {s"storage": {i: {...}, s: "50Gi", Format: "BinarySI"}}},
VolumeName: "",
- StorageClassName: &"efs-sc",
+ StorageClassName: &"efs",
VolumeMode: &"Filesystem",
DataSource: nil,
DataSourceRef: nil,
}
Solution
You can work around the issue above by removing the mostly-ai project from your OpenShift cluster, recreating it, and running the installation again.
- Delete the mostly-ai project from your OpenShift cluster.
  oc delete namespace mostly-ai
- Create the mostly-ai project again.
  oc new-project mostly-ai
- Add the following annotations so that the MOSTLY AI pods can be scheduled in this namespace:
  oc annotate --overwrite namespace mostly-ai openshift.io/sa.scc.supplemental-groups="1000700000/10000"
  oc annotate --overwrite namespace mostly-ai openshift.io/sa.scc.uid-range="1000700000/10000"
- Run the deployment again with the helm command.
  helm upgrade --install mostly-ai ./mostly-combined --values ocp-values.yaml --namespace mostly-ai --create-namespace
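The diff in the error output shows the StorageClassName of the existing claim (efs-sc) differing from the one in the new release (efs). Before you delete the project, it may be worth confirming whether the same mismatch applies in your cluster. The sketch below reads the storage class of the live mostly-data claim and compares it with the value you expect from your values file. The helper names are hypothetical, and the expected value is something you supply yourself.

```shell
#!/usr/bin/env bash
# Print the storage class of an existing PVC (empty output if it is absent).
pvc_storage_class() {
  local pvc="$1" namespace="$2"
  oc get pvc "$pvc" -n "$namespace" \
    -o jsonpath='{.spec.storageClassName}' 2>/dev/null
}

# Compare the live mostly-data claim with the storage class you expect.
compare_storage_class() {
  local expected="$1" actual
  actual="$(pvc_storage_class mostly-data mostly-ai)" || actual=""
  if [ "$actual" = "$expected" ]; then
    echo "match: $actual"
  else
    echo "mismatch: cluster has '$actual', values file expects '$expected'"
    return 1
  fi
}

# Usage (requires a logged-in oc session):
#   compare_storage_class efs
```

A mismatch suggests that the storage class in the values file changed after the claim was first created, which the cluster rejects because the claim spec is immutable.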
Keycloak connection error
Problem
After you deploy to an OpenShift cluster, you might see Keycloak connection errors in the console, such as the ones listed below.
Keycloak connection error #1 - retrying (error: jakarta.ws.rs.ForbiddenException: HTTP 403 Forbidden)
Keycloak connection error #2 - retrying (error: jakarta.ws.rs.ForbiddenException: HTTP 403 Forbidden)
Keycloak connection error #3 - retrying (error: jakarta.ws.rs.ForbiddenException: HTTP 403 Forbidden)
Solution
- Go to Keycloak at https://{FULLY_QUALIFIED_DOMAIN_NAME}/auth/.
- Log in with the admin credentials as documented in Manage groups, step 2.
- Select the master realm.
- Select Realm settings.
- For Require SSL, select None.
- Click Save for the changes to take effect.
The pod then initializes and creates the mostly-generate realm, after which the application becomes reachable.
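To confirm that the retries have stopped after the change, you can count the error lines in recent pod logs. This is a minimal sketch: the function name count_keycloak_errors is hypothetical, and the pod name in the usage comment is a placeholder that you need to replace with the pod that reports the errors in your deployment.

```shell
#!/usr/bin/env bash
# Count "Keycloak connection error" lines read from standard input.
# grep -c exits non-zero when the count is 0, so `|| true` keeps the status clean.
count_keycloak_errors() {
  grep -c 'Keycloak connection error' || true
}

# Usage (POD_NAME is a placeholder; requires a logged-in oc session):
#   oc logs POD_NAME -n mostly-ai --since=5m | count_keycloak_errors
```

A count of 0 over a recent time window suggests that the pod no longer receives HTTP 403 responses from Keycloak.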