mTLS in the Mesh with Red Hat OpenShift Service Mesh (OSSM) - On or Off?

Overview

Introduction

One of the benefits of a service mesh is security by design. In fact, it is one of the main use cases driving adoption of the mesh in cloud-based environments that run disparate types of microservice workloads.

Although this is very powerful, there may be cases where a workload cannot participate in mTLS handshakes using the default mesh certificates, either because the workload has to handle its own TLS termination or because it simply should not receive TLS traffic (for performance or policy reasons). So what are the possible security settings, and when do they result in mTLS being on or off? As a recent customer of mine found, this can be a little confusing.

In this article we will explore how encryption with Mutual TLS (mTLS) is configured on or off in Red Hat OpenShift Service Mesh (OSSM) and how you can verify if the traffic is encrypted or not.

Setting up the Service Mesh controlplane

In order to setup the Red Hat Service Mesh you would need either a

The next step is to deploy in the cluster the operators that OSSM requires for the deployment and configuration of the mesh. We have prepared a script for this, add-operators-subscriptions-sm.sh: log in to your cluster and execute it (on a Linux-based system), or alternatively copy its contents and execute them with the oc binary.

Finally, the OSSM operator will require a ServiceMeshControlPlane (SMCP) resource to define the characteristics of the Service Mesh (both for the controlplane and dataplane). We will create it with the following commands.

oc new-project istio-system # (1)
echo "apiVersion: maistra.io/v2 # (2)
kind: ServiceMeshControlPlane
metadata:
  name: basic
  namespace: istio-system
spec:
  security:
    dataPlane:
      automtls: false
      mtls: false
  addons:
    grafana:
      enabled: true
    jaeger:
      install:
        storage:
          type: Memory
    kiali:
      enabled: true
    prometheus:
      enabled: true
  policy:
    type: Istiod
  profiles:
    - default
  telemetry:
    type: Istiod
  tracing:
    sampling: 10000
    type: Jaeger
  version: v2.1"| oc apply -n istio-system -f -

oc get smcp -n istio-system
NAME    READY   STATUS           PROFILES      VERSION   AGE (3)
basic   7/10    PausingInstall   ["default"]             18s

oc wait --for condition=Ready -n istio-system smcp/basic --timeout 300s # (4)
servicemeshcontrolplane.maistra.io/basic condition met
  1. Creates the OCP namespace (project) in which the controlplane components will be deployed.
  2. The SMCP, which defines that dataplane mTLS (mutual TLS encryption between workloads included in the mesh) is false. It is key to understand the effect of this setting, and we will explain it below.
  3. The OSSM operator begins to configure the controlplane. Once all components are ready (STATUS: ComponentsReady) it can be used.
  4. Waits (for up to 300 seconds) until the SMCP reports the Ready condition, at which point the controlplane is ready to be used.

After applying the above we have a Service Mesh controlplane which manages the observability stack as well as the policies and configurations applied to the dataplane. We now need to define the dataplane.

Setting up the dataplane

Next apply the commands in the following link to create a namespace and deploy the bookinfo application. Once that is in place you should have all PODs in the bookinfo namespace started.

oc get pods -n bookinfo
NAME                              READY   STATUS    RESTARTS   AGE
details-v1-6cd699df8c-nnbtx       2/2     Running   0          14s
productpage-v1-5ddcb4b84f-6jlf8   2/2     Running   0          9s
ratings-v1-bdbcc68bc-7vjtj        2/2     Running   0          13s
reviews-v1-754ddd7b6f-d4xxj       2/2     Running   0          12s
reviews-v2-675679877f-b78qm       2/2     Running   0          12s
reviews-v3-79d7549c7-xdsjj        2/2     Running   0          11s

Add the following annotation to each Deployment in the bookinfo namespace so that istio-proxy records statistics on TLS handshakes:

oc patch deployment productpage-v1 -p '{"spec":{"template":{"metadata":{"annotations":{"sidecar.istio.io/statsInclusionPrefixes": "tls_inspector,listener,cluster"}}}}}' -n bookinfo
oc patch deployment details-v1 -p '{"spec":{"template":{"metadata":{"annotations":{"sidecar.istio.io/statsInclusionPrefixes": "tls_inspector,listener,cluster"}}}}}' -n bookinfo
oc patch deployment ratings-v1 -p '{"spec":{"template":{"metadata":{"annotations":{"sidecar.istio.io/statsInclusionPrefixes": "tls_inspector,listener,cluster"}}}}}' -n bookinfo
oc patch deployment reviews-v1 -p '{"spec":{"template":{"metadata":{"annotations":{"sidecar.istio.io/statsInclusionPrefixes": "tls_inspector,listener,cluster"}}}}}' -n bookinfo
oc patch deployment reviews-v2 -p '{"spec":{"template":{"metadata":{"annotations":{"sidecar.istio.io/statsInclusionPrefixes": "tls_inspector,listener,cluster"}}}}}' -n bookinfo
oc patch deployment reviews-v3 -p '{"spec":{"template":{"metadata":{"annotations":{"sidecar.istio.io/statsInclusionPrefixes": "tls_inspector,listener,cluster"}}}}}' -n bookinfo
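
The six patch commands above differ only in the deployment name, so they could be collapsed into a loop. The following is a minimal sketch rather than part of the original setup: the function name patch_stats_annotation and the DRY_RUN variable are assumptions introduced here, and by default the function only prints the commands instead of running them.

```shell
#!/bin/sh
# The annotation patch is identical for every deployment.
PATCH='{"spec":{"template":{"metadata":{"annotations":{"sidecar.istio.io/statsInclusionPrefixes":"tls_inspector,listener,cluster"}}}}}'

# Patch all six bookinfo deployments.
# DRY_RUN is an assumption of this sketch: it defaults to 1, which only
# prints the commands; set DRY_RUN=0 to run them against the cluster.
patch_stats_annotation() {
  for d in productpage-v1 details-v1 ratings-v1 reviews-v1 reviews-v2 reviews-v3; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "oc patch deployment $d -n bookinfo -p '$PATCH'"
    else
      oc patch deployment "$d" -n bookinfo -p "$PATCH"
    fi
  done
}

patch_stats_annotation
```

Running the function with DRY_RUN=0 while logged in to the cluster applies the same annotation as the individual commands.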

As soon as the re-deployment of the PODs completes the dataplane is ready and the following command should verify this.

curl -s "http://$(oc get route istio-ingressgateway -o jsonpath='{.spec.host}' -n istio-system)/productpage" | grep -o "<title>.*</title>"
<title>Simple Bookstore App</title>

Disabling mTLS for all applications

As mentioned above, we need a clear understanding of the SMCP mTLS settings and their effect. With mtls: false as applied above, OSSM creates a PeerAuthentication resource in the controlplane namespace with mTLS mode set to PERMISSIVE. In this mode a workload accepts both plain text and mTLS traffic, so if the 2 workloads participating in a communication in the mesh can both perform an mTLS handshake, the mesh will use mTLS.

oc get peerauthentication -n istio-system
NAME          MODE         AGE
default       PERMISSIVE   5d18h

This is then inherited by all dataplane namespaces and this came as a surprise to one of our customers who expected that mtls: false meant no mTLS handshakes.

You can verify this whilst repeatedly calling the URL:

watch "curl -s http://$(oc get route istio-ingressgateway -o jsonpath='{.spec.host}' -n istio-system)/productpage | grep -o '<title>.*</title>'"

TLS handshakes will take place between the components and the istio-proxy counters will register them (run the test-ssl-handshakes.sh script against each POD and watch the numbers in the stats increase).

./test-ssl-handshakes.sh productpage-v1-556db7cbb5-x5n55  # <-- HANDSHAKES TAKE PLACE
./test-ssl-handshakes.sh details-v1-68cbd47bc5-xwf2x      # <-- HANDSHAKES TAKE PLACE
./test-ssl-handshakes.sh reviews-v1-75755d569f-z6jwf      # <-- HANDSHAKES TAKE PLACE
./test-ssl-handshakes.sh reviews-v2-86c76b84c5-xzq56      # <-- HANDSHAKES TAKE PLACE
./test-ssl-handshakes.sh reviews-v3-56cbff6b99-cfwj4      # <-- HANDSHAKES TAKE PLACE
./test-ssl-handshakes.sh ratings-v1-5f867c4bb7-7fdv8      # <-- HANDSHAKES TAKE PLACE
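
The contents of test-ssl-handshakes.sh are not reproduced in this article. As a rough illustration of the idea only, and not the script itself, the sketch below sums the Envoy ssl.handshake counters from a stats dump read on standard input. The function name sum_handshakes and the sample stats are made up for this example.

```shell
#!/bin/sh
# Sum every Envoy "ssl.handshake" counter found on stdin.
# Stats lines have the form "<stat name>: <value>".
sum_handshakes() {
  awk -F': ' '/ssl\.handshake: /{ total += $NF } END { print total + 0 }'
}

# Made-up sample in the format Envoy prints; real names and values will differ.
sum_handshakes <<'EOF'
listener.0.0.0.0_15006.ssl.handshake: 12
listener.0.0.0.0_15006.ssl.connection_error: 0
cluster.outbound|9080||details.bookinfo.svc.cluster.local.ssl.handshake: 5
EOF
```

With the sample input the function prints 17. Against a live POD the stats could be piped in, for example with oc exec POD_NAME -c istio-proxy -n bookinfo -- pilot-agent request GET stats | sum_handshakes, assuming pilot-agent request is available in your proxy image. A total that grows between two runs indicates that handshakes are taking place.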

You can further verify this in the KIALI UI (to get its URL run: oc get route kiali -o jsonpath='{.spec.host}' -n istio-system). Open the App Graph and select Security in the Display drop-down. Note that a padlock icon appears on all the arrows between the bookinfo components.

So how can we completely disable mTLS for the traffic in the dataplane? The answer is to apply a DestinationRule for each service (host), which tells the clients of that host not to use mTLS when communicating with it. This is what the following commands do.

echo "apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: productpage
spec:
  host: productpage.bookinfo.svc.cluster.local
  trafficPolicy:
    tls:
      mode: DISABLE" |oc apply -n bookinfo -f -

echo "apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: productpage-2
spec:
  host: productpage
  trafficPolicy:
    tls:
      mode: DISABLE" |oc apply -n bookinfo -f -

echo "apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: details
spec:
  host: details.bookinfo.svc.cluster.local
  trafficPolicy:
    tls:
      mode: DISABLE" |oc apply -n bookinfo -f -

echo "apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: details-2
spec:
  host: details
  trafficPolicy:
    tls:
      mode: DISABLE" |oc apply -n bookinfo -f -

echo "apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: ratings
spec:
  host: ratings.bookinfo.svc.cluster.local
  trafficPolicy:
    tls:
      mode: DISABLE" |oc apply -n bookinfo -f -

echo "apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: ratings-2
spec:
  host: ratings
  trafficPolicy:
    tls:
      mode: DISABLE" |oc apply -n bookinfo -f -

echo "apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews.bookinfo.svc.cluster.local
  trafficPolicy:
    tls:
      mode: DISABLE" |oc apply -n bookinfo -f -

echo "apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews-2
spec:
  host: reviews
  trafficPolicy:
    tls:
      mode: DISABLE" |oc apply -n bookinfo -f -
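
The eight resources above follow one pattern: for each service, one DestinationRule for the fully qualified host and one for the short host name. As a sketch under that assumption, they could be generated as a single multi-document YAML stream. The function names disable_tls_rule and all_disable_tls_rules are introduced here for illustration.

```shell
#!/bin/sh
# Print one DestinationRule that tells clients not to use TLS towards a host.
# $1 = resource name, $2 = host
disable_tls_rule() {
  cat <<EOF
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: ${1}
spec:
  host: ${2}
  trafficPolicy:
    tls:
      mode: DISABLE
EOF
}

# Emit both host forms used above (fully qualified and short) per service.
all_disable_tls_rules() {
  for svc in productpage details ratings reviews; do
    disable_tls_rule "${svc}" "${svc}.bookinfo.svc.cluster.local"
    disable_tls_rule "${svc}-2" "${svc}"
  done
}

all_disable_tls_rules
```

The block only prints the manifests. To apply them, the output can be piped to the same command used above: all_disable_tls_rules | oc apply -n bookinfo -f -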

Testing against the productpage application (curl above) we should see no TLS handshakes taking place now.

./test-ssl-handshakes.sh productpage-v1-556db7cbb5-x5n55  # <-- NO HANDSHAKES TAKE PLACE
./test-ssl-handshakes.sh details-v1-68cbd47bc5-xwf2x      # <-- NO HANDSHAKES TAKE PLACE
./test-ssl-handshakes.sh reviews-v1-75755d569f-z6jwf      # <-- NO HANDSHAKES TAKE PLACE
./test-ssl-handshakes.sh reviews-v2-86c76b84c5-xzq56      # <-- NO HANDSHAKES TAKE PLACE
./test-ssl-handshakes.sh reviews-v3-56cbff6b99-cfwj4      # <-- NO HANDSHAKES TAKE PLACE
./test-ssl-handshakes.sh ratings-v1-5f867c4bb7-7fdv8      # <-- NO HANDSHAKES TAKE PLACE

KIALI shows a similar behavior (notice there is no padlock on any of the connections and, on the right-hand side, the from/to Principals are unknown).

Service Mesh KIALI UI - No mTLS Security

Enforce mTLS for all applications

Excluding a set of applications from mTLS as above may make sense in some cases. However, when the data is sensitive and policy demands strict security through encryption, the Mesh admin can force all components to communicate via mTLS by defining the following mTLS settings in the SMCP Resource.

  security:
    dataPlane:
      automtls: true
      mtls: true

You can make this change by editing the SMCP Resource YAML under namespace istio-system. Alternatively, re-apply the resource from Setting up the Service Mesh controlplane with the settings changed to mtls: true and automtls: true. The outcome is that the PeerAuthentication resource will now have mTLS set to STRICT mode.

oc get peerauthentication -n istio-system
NAME                            MODE         AGE
default                         STRICT       5d18h
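
As a third way of changing the settings, the two flags could be switched with a merge patch, and the resulting PeerAuthentication mode compared with what this article observed for each setting. The sketch below only builds the patch and states the expected mode. The function names smcp_mtls_patch and expected_mode are assumptions introduced here.

```shell
#!/bin/sh
# Build a merge patch that sets both SMCP dataplane flags to the same value.
# $1 = true | false
smcp_mtls_patch() {
  printf '{"spec":{"security":{"dataPlane":{"mtls":%s,"automtls":%s}}}}\n' "$1" "$1"
}

# Mesh-wide PeerAuthentication mode observed in this article for each setting.
expected_mode() {
  case "$1" in
    true)  echo STRICT ;;
    false) echo PERMISSIVE ;;
    *)     echo "unknown setting: $1" >&2; return 1 ;;
  esac
}

smcp_mtls_patch true
expected_mode true
```

The patch could then be applied with oc patch smcp basic -n istio-system --type merge -p "$(smcp_mtls_patch true)", and the mode read back with oc get peerauthentication default -n istio-system -o jsonpath='{.spec.mtls.mode}' for comparison with expected_mode.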

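The effect of this configuration is that any request to the productpage will now fail: the DestinationRule resources applied earlier still tell clients to send plain text, while the STRICT policy makes the sidecars accept only mTLS traffic.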

$ curl -v "http://$(oc get route istio-ingressgateway -o jsonpath='{.spec.host}' -n istio-system)/productpage" | grep -o "<title>.*</title>"
> Host: istio-ingressgateway-istio-system.apps.cluster-e8e9.e8e9.sandbox866.opentlc.com
> User-Agent: curl/7.71.1
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 503 Service Unavailable
< content-length: 95
< content-type: text/plain
< date: Wed, 23 Mar 2022 11:10:25 GMT
< server: istio-envoy
< set-cookie: 44371fc75fdb694d574e56e33b166cc7=619f273b9d2709119dd0b6b5b31cdc01; path=/; HttpOnly

mTLS disabled for specific application workloads

Our customer had specific workloads (Elasticsearch, Kafka Streams, etc.) which needed to handle their own TLS termination. In this case we wanted to maintain the STRICT mTLS policy on all traffic except the traffic to those workloads. To achieve this in the current setup, apply the following to disable mTLS ONLY for the details service:

oc delete dr productpage -n bookinfo
oc delete dr productpage-2 -n bookinfo
oc delete dr reviews -n bookinfo
oc delete dr reviews-2 -n bookinfo
oc delete dr ratings -n bookinfo
oc delete dr ratings-2 -n bookinfo
oc delete peerauthentication default-disable -n bookinfo
echo "apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: details-mtls-disable
  namespace: bookinfo
spec:
  selector:
    matchLabels:
      app: details
  mtls:
    mode: DISABLE" |oc apply -n bookinfo -f -
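
When several workloads need the same exclusion, as in our customer's case, the PeerAuthentication above could be generated per workload. This is a sketch that assumes each workload is selected by its app label, as details is here. The function name disable_mtls_for_app is hypothetical.

```shell
#!/bin/sh
# Print a workload-scoped PeerAuthentication that disables mTLS.
# $1 = value of the workload's "app" label, $2 = namespace
disable_mtls_for_app() {
  cat <<EOF
---
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: ${1}-mtls-disable
  namespace: ${2}
spec:
  selector:
    matchLabels:
      app: ${1}
  mtls:
    mode: DISABLE
EOF
}

disable_mtls_for_app details bookinfo
```

For details this prints the same resource as above, and the output can be piped to oc apply -n bookinfo -f - in the same way. Note that the DestinationRule resources for the excluded host are kept in place, as they are for details in this setup.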

Testing should show the following in the KIALI UI: a padlock appears on all connections except those to details, and the from/to Principal now shows the content of the certs used. You can also check the istio-proxy handshake stats once more to verify that TLS handshakes take place for all workloads except details.

Service Mesh KIALI UI - No mTLS for details service

Final thoughts around the behavior of mTLS on or off in the OSSM

  • If the SMCP Resource config results in PERMISSIVE mTLS, the additional PeerAuthentication for details above is not required.
  • If the SMCP Resource config results in STRICT mTLS, the additional PeerAuthentication for details above is required. If it is removed, the result will be as follows:

Service Mesh KIALI UI -  Error for STRICT mTLS when no PeerAuthentication DISABLE is defined

Conclusion

The Red Hat OpenShift Service Mesh (OSSM) will always apply mTLS encryption to the traffic; the only choice in the SMCP is whether it should be STRICT or PERMISSIVE. In the STRICT case, every workload participating in the traffic must be able to adhere to this policy or be explicitly excluded through a PeerAuthentication DISABLE policy for that service. In the PERMISSIVE case, workloads will use mTLS if both parties can, and this can be bypassed with a DestinationRule that tells all clients of a service's host not to initiate an mTLS connection.
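
The combinations exercised in this article can be summarised as a small lookup. This is only a recap of the behaviour observed above, not a model of every Istio configuration. The function name mtls_outcome and its argument values are introduced here for illustration.

```shell
#!/bin/sh
# mtls_outcome MESH_MODE WORKLOAD_PA DEST_RULE
#   MESH_MODE:   PERMISSIVE | STRICT  (mesh-wide PeerAuthentication)
#   WORKLOAD_PA: none | DISABLE       (workload-level PeerAuthentication)
#   DEST_RULE:   none | DISABLE       (DestinationRule tls mode for the host)
mtls_outcome() {
  case "$1/$2/$3" in
    PERMISSIVE/none/none)    echo "mTLS" ;;
    PERMISSIVE/none/DISABLE) echo "plain text" ;;
    STRICT/none/none)        echo "mTLS" ;;
    STRICT/none/DISABLE)     echo "request fails (503)" ;;
    STRICT/DISABLE/DISABLE)  echo "plain text" ;;
    *)                       echo "not covered in this article" ;;
  esac
}

mtls_outcome STRICT none DISABLE
```

The call shown corresponds to the failing productpage request in the Enforce mTLS for all applications section.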

You can try all of the above examples (and more) in the servicemesh-playground repository and provide feedback.