Jet on Kubernetes
Hazelcast Jet has built-in support for Kubernetes deployments. It takes only a few configuration parameters to run a Hazelcast Jet cluster in Kubernetes environments. Official Helm packages are also available to bootstrap Hazelcast Jet deployments with a single command.
Install using Helm
The easiest way to install Hazelcast Jet on Kubernetes is using Helm charts. Hazelcast Jet provides stable Helm charts for the open-source and enterprise versions, as well as for Hazelcast Jet Management Center.
Prerequisites
- Kubernetes 1.9+
- Helm CLI
Installing the Chart
You can install the latest version with default configuration values using the command below:
helm install my-cluster stable/hazelcast-jet
This will create a cluster named `my-cluster` with default configuration values. To change various configuration options you can use `--set key=value`:
helm install my-cluster --set cluster.memberCount=3 stable/hazelcast-jet
Or you can create a `values.yaml` file which contains custom configuration options. This file may also contain custom `hazelcast` and `hazelcast-jet` YAML configurations.
helm install my-cluster -f values.yaml stable/hazelcast-jet
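As an illustration, a minimal `values.yaml` might look like the sketch below. The parameter names follow the table in the Configuration section; the values themselves are just examples:

```yaml
# Illustrative values.yaml for the stable/hazelcast-jet chart.
# Parameter names are from the Configuration table; values are examples.
cluster:
  memberCount: 3            # three Jet members instead of the default two
managementcenter:
  enabled: false            # skip deploying Hazelcast Jet Management Center
```

Passing this file with `-f values.yaml` overrides only the listed parameters; everything else keeps the chart defaults.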
Uninstalling the Chart
To uninstall/delete the `my-cluster` deployment:
helm uninstall my-cluster
The command removes all the Kubernetes components associated with the chart and deletes the release.
Configuration
The following table lists some of the configurable parameters of the Hazelcast Jet chart and their default values.
Parameter | Description | Default |
---|---|---|
image.repository | Hazelcast Jet Image name | hazelcast/hazelcast-jet |
image.tag | Hazelcast Jet Image tag | {VERSION} |
cluster.memberCount | Number of Hazelcast Jet members | 2 |
jet.yaml.hazelcast-jet | Hazelcast Jet Configuration (hazelcast-jet.yaml embedded into values.yaml ) | {DEFAULT_JET_YAML} |
jet.yaml.hazelcast | Hazelcast IMDG Configuration (hazelcast.yaml embedded into values.yaml ) | {DEFAULT_HAZELCAST_YAML} |
managementcenter.enabled | Turn on and off Hazelcast Jet Management Center application | true |
See stable charts repository for more information and configuration options.
Install without Helm
Hazelcast Jet provides Kubernetes-ready Docker images. These images use the Hazelcast Kubernetes plugin to discover other Hazelcast Jet members by interacting with the Kubernetes API. See the relevant section for more details.
Role Based Access Control
To communicate with the Kubernetes API, create the Role-Based Access Control definition (`rbac.yaml`) with the following content and apply it:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: default-cluster
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: view
subjects:
- kind: ServiceAccount
name: default
namespace: default
kubectl apply -f rbac.yaml
ConfigMap
Then we need to configure Hazelcast Jet to use Kubernetes discovery to form the cluster. Create a file named `hazelcast-jet-config.yaml` with the following content and apply it. This will create a ConfigMap object.
apiVersion: v1
kind: ConfigMap
metadata:
name: hazelcast-jet-configuration
data:
hazelcast.yaml: |-
hazelcast:
network:
join:
multicast:
enabled: false
kubernetes:
enabled: true
namespace: default
service-name: hazelcast-jet-service
rest-api:
enabled: true
endpoint-groups:
HEALTH_CHECK:
enabled: true
kubectl apply -f hazelcast-jet-config.yaml
StatefulSet and Service
Now we need to create a StatefulSet and a Service which define the container spec. You can configure the environment options and the cluster size here. Create a file named `hazelcast-jet.yaml` with the following content and apply it.
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: hazelcast-jet
labels:
app: hazelcast-jet
spec:
replicas: 2
serviceName: hazelcast-jet-service
selector:
matchLabels:
app: hazelcast-jet
template:
metadata:
labels:
app: hazelcast-jet
spec:
containers:
- name: hazelcast-jet
image: hazelcast/hazelcast-jet:latest
imagePullPolicy: IfNotPresent
ports:
- name: hazelcast-jet
containerPort: 5701
livenessProbe:
httpGet:
path: /hazelcast/health/node-state
port: 5701
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 3
readinessProbe:
httpGet:
path: /hazelcast/health/node-state
port: 5701
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 1
successThreshold: 1
failureThreshold: 1
volumeMounts:
- name: hazelcast-jet-storage
mountPath: /data/hazelcast-jet
env:
- name: JAVA_OPTS
value: "-Dhazelcast.config=/data/hazelcast-jet/hazelcast.yaml"
volumes:
- name: hazelcast-jet-storage
configMap:
name: hazelcast-jet-configuration
---
apiVersion: v1
kind: Service
metadata:
name: hazelcast-jet-service
spec:
selector:
app: hazelcast-jet
ports:
- protocol: TCP
port: 5701
kubectl apply -f hazelcast-jet.yaml
After deploying it, we can check the status of pods with the following command:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
hazelcast-jet-0 1/1 Running 0 2m23s
hazelcast-jet-1 1/1 Running 0 103s
Then we can verify from the logs of the pods that they formed a cluster with the following command:
$ kubectl logs hazelcast-jet-0
...
...
...
2020-03-05 10:02:44,698 INFO [c.h.i.c.ClusterService] [main] - [172.17.0.6]:5701 [dev] [4.0]
Members {size:1, ver:1} [
Member [172.17.0.6]:5701 - 03a22d3c-d88a-40bf-81b0-8f85e16acb0f this
]
2020-03-05 10:02:44,725 INFO [c.h.c.LifecycleService] [main] - [172.17.0.6]:5701 [dev] [4.0] [172.17.0.6]:5701 is STARTED
2020-03-05 10:03:20,387 INFO [c.h.i.n.t.TcpIpConnection] [hz.distracted_bartik.IO.thread-in-2] - [172.17.0.6]:5701 [dev] [4.0] Initialized new cluster connection between /172.17.0.6:5701 and /172.17.0.7:49103
2020-03-05 10:03:27,381 INFO [c.h.i.c.ClusterService] [hz.distracted_bartik.priority-generic-operation.thread-0] - [172.17.0.6]:5701 [dev] [4.0]
Members {size:2, ver:2} [
Member [172.17.0.6]:5701 - 03a22d3c-d88a-40bf-81b0-8f85e16acb0f this
Member [172.17.0.7]:5701 - a7295b91-939b-4181-acae-208145f773e6
]
Deploy Jobs
There are two different ways to submit a job to a Hazelcast Jet cluster:
- Package the job as a Docker container, then let it submit itself.
- Submit the job as a JAR file from a shared PersistentVolume which is attached to a Pod.
For both options you need to create a ConfigMap object for the client (`hazelcast-jet-client-config.yaml`) and apply it. Make sure that the `service-name` points to the service name of the cluster we created above.
---
apiVersion: v1
kind: ConfigMap
metadata:
name: hazelcast-jet-client-configuration
data:
hazelcast-client.yaml: |-
hazelcast-client:
cluster-name: jet
network:
kubernetes:
enabled: true
namespace: default
service-name: hazelcast-jet-service
kubectl apply -f hazelcast-jet-client-config.yaml
Package the Job as a Docker Container
There are several tools to containerize your job, for example Jib. Jib builds Docker and OCI images for Java applications. It is available as a plugin for Maven and Gradle and as a Java library. You can find a sample project using Jib to containerize the Hazelcast Jet rolling-aggregate job.
After creating the image, we create a Kubernetes Job using the image
and client ConfigMap object. The client config is stored in a volume,
mounted to the container and passed as an argument to the jet submit
script along with the name of the JAR containing the Jet job.
Create a file named `rolling-aggregation-via-docker.yaml` and apply it.
---
apiVersion: batch/v1
kind: Job
metadata:
name: rolling-aggregation
spec:
template:
spec:
containers:
- name: rolling-aggregation
image: rolling-aggregation:latest
imagePullPolicy: IfNotPresent
command: ["/bin/sh"]
args: ["-c", "jet -v -f /config/hazelcast-jet/hazelcast-client.yaml submit /rolling-aggregation-jar-with-dependencies.jar"]
volumeMounts:
- mountPath: "/config/hazelcast-jet/"
name: hazelcast-jet-config-storage
volumes:
- name: hazelcast-jet-config-storage
configMap:
name: hazelcast-jet-client-configuration
items:
- key: hazelcast-client.yaml
path: hazelcast-client.yaml
restartPolicy: OnFailure
kubectl apply -f rolling-aggregation-via-docker.yaml
Submit the Job from a Shared Persistent Volume
We will need a persistent volume attached to the Pods. The persistent storage will contain job JAR files to be submitted to the cluster. There are many different ways you can define and map volumes in Kubernetes. We will create a `hostPath` persistent volume, which mounts a file or directory from the host node's filesystem into the Pod. See the official documentation for other types of volumes.
Create a file named `persistent-volume.yaml` and apply it:
---
kind: PersistentVolume
apiVersion: v1
metadata:
name: rolling-aggregation-pv
labels:
type: local
spec:
storageClassName: manual
capacity:
storage: 2Gi
accessModes:
- ReadWriteOnce
hostPath:
path: "/home/docker/jars-pv"
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: rolling-aggregation-pv-claim
spec:
storageClassName: manual
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
This will create a persistent volume which uses the `/home/docker/jars-pv` directory on the Kubernetes node as persistent storage. We will mount the volume to the Pods later, so we need to put the job JAR inside this directory.
For minikube, the commands below will create the directory and copy the job JAR into it.
ssh docker@$(minikube ip) -i $(minikube ssh-key) 'mkdir -p ~/jars-pv'
scp -i $(minikube ssh-key) rolling-aggregation-jar-with-dependencies.jar docker@$(minikube ip):~/jars-pv/
Now we can create the Kubernetes Job using the Hazelcast Jet image and the client ConfigMap object. The client config and the copied job JAR are stored in their respective volumes, mounted into the container, and passed as arguments to the jet submit script.
Create a file named `rolling-aggregation.yaml` and apply it.
---
apiVersion: batch/v1
kind: Job
metadata:
name: rolling-aggregation
spec:
template:
spec:
containers:
- name: rolling-aggregation
image: hazelcast/hazelcast-jet:latest-snapshot
imagePullPolicy: IfNotPresent
command: ["/bin/sh"]
args: ["-c", "jet.sh -v -f /data/hazelcast-jet/hazelcast-client.yaml submit /job-jars/rolling-aggregation-jar-with-dependencies.jar"]
volumeMounts:
- mountPath: "/job-jars"
name: rolling-aggregation-pv-storage
- mountPath: "/data/hazelcast-jet/"
name: hazelcast-jet-config-storage
volumes:
- name: rolling-aggregation-pv-storage
persistentVolumeClaim:
claimName: rolling-aggregation-pv-claim
- name: hazelcast-jet-config-storage
configMap:
name: hazelcast-jet-client-configuration
items:
- key: hazelcast-client.yaml
path: hazelcast-client.yaml
restartPolicy: OnFailure
backoffLimit: 4
kubectl apply -f rolling-aggregation.yaml
Inspect Jobs
After you've run the job, you can open the Kubernetes Dashboard to see its status. To open the Kubernetes Dashboard on minikube, run the following command:
minikube dashboard
This will open a browser window with the Kubernetes Dashboard. Navigate to the Jobs section in the left menu. You should be able to see your job running or completed successfully, and inspect its logs if you like.
Access From Outside Kubernetes
While it is straightforward to access the Hazelcast Jet cluster from inside Kubernetes (see Deploy Jobs for a sample client config), accessing the cluster from outside is only possible over a publicly reachable network.
Smart Routing Disabled
If you configure `smart-routing: false` for your client, you don't need any plugin. It's enough to expose your Hazelcast Jet cluster with a LoadBalancer (or NodePort) service and set its IP/port as the TCP/IP address in the Hazelcast Jet client configuration. Remember that if smart routing is disabled, all communication happens through a single Hazelcast Jet member.
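As a sketch, a client configuration with smart routing disabled could look like the following; the IP address shown is a placeholder for your LoadBalancer's external IP:

```yaml
# Illustrative client config with smart routing disabled.
# Replace 203.0.113.10 with your LoadBalancer (or NodePort) address.
hazelcast-client:
  network:
    smart-routing: false
    cluster-members:
      - 203.0.113.10:5701
```

With this setup all requests go through the single exposed address, which then forwards to one member of the cluster.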
Smart Routing Enabled
To access a Hazelcast Jet cluster with a smart client you need to perform the following steps:
- Expose each Hazelcast Jet member Pod with a separate LoadBalancer or NodePort service (the simplest way to do it is to install Metacontroller and the service-per-pod Decorator Controller)
- Configure a ServiceAccount with a ClusterRole having at least `get` and `list` permissions to the following resources: `endpoints`, `pods`, `nodes`, `services`
- Use credentials from the created ServiceAccount in the Hazelcast Jet client configuration (the API token can be fetched with `kubectl get secret <service-account-secret> -o jsonpath={.data.token} | base64 --decode` and the CA certificate with `kubectl get secret <service-account-secret> -o jsonpath={.data.ca\\.crt} | base64 --decode`)
hazelcast-client:
network:
kubernetes:
enabled: true
namespace: MY-KUBERNETES-NAMESPACE
service-name: MY-SERVICE-NAME
use-public-ip: true
kubernetes-master: https://35.226.182.228
api-token: THE-API-TOKEN
ca-certificate: |
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
Note: Hazelcast Jet Client outside Kubernetes cluster works only in the Kubernetes API mode.
Rolling Update and Scaling
A Hazelcast Jet cluster is easily scalable within Kubernetes. You can use the standard `kubectl scale` command to change the cluster size. The same applies to the rolling update procedure: you can rely on the standard Kubernetes behavior and simply update your `Deployment`/`StatefulSet` configuration to the new version.
Note however that, by default, Hazelcast Jet does not shut down gracefully. This means that if you suddenly terminate more members than your backup-count property (1 by default) allows, you may lose the cluster data. To prevent that from happening, set the following properties:
- `terminationGracePeriodSeconds`: in your StatefulSet/Deployment configuration; the value should be high enough to cover the data migration process
- `-Dhazelcast.shutdownhook.policy=GRACEFUL`: in the JVM parameters
- `-Dhazelcast.graceful.shutdown.max.wait`: in the JVM parameters; the value should be high enough to cover the data migration process
Additionally, if you use a Deployment (not a StatefulSet), you need to set your update strategy to RollingUpdate and ensure Pods are updated one by one.
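Put together, the relevant StatefulSet fields for graceful shutdown might look like the sketch below; the 600-second value is an example and should be sized to your data migration time:

```yaml
# Illustrative StatefulSet fragment for graceful shutdown; values are examples.
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 600   # must cover the data migration process
      containers:
        - name: hazelcast-jet
          image: hazelcast/hazelcast-jet:latest
          env:
            - name: JAVA_OPTS
              value: "-Dhazelcast.shutdownhook.policy=GRACEFUL -Dhazelcast.graceful.shutdown.max.wait=600"
```

The grace period and the `hazelcast.graceful.shutdown.max.wait` value should agree, so Kubernetes does not kill the Pod while Hazelcast Jet is still migrating partitions.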
All these features are already included in Hazelcast Jet Helm Charts. See Install Hazelcast Jet using Helm for more information.