Running Axon Server in Docker

I showed you how to run Axon Server locally and configure it for secure operations in my previous post. We also looked at the possibilities for configuring storage locations. This time around, we’ll look at running it in Docker, both with the public image on Docker Hub and with a locally built image, and why you might want to do that.

Note: We now have a repository live on GitHub with scripts, configuration files, and deployment descriptors. You can find it here.

Axon Server in a container

Running Axon Server in a container is actually pretty simple using the provided image, with a few predictable gotchas. Let’s start with a simple test:


    $ docker run axoniq/axonserver
    Unable to find image 'axoniq/axonserver:latest' locally
    latest: Pulling from axoniq/axonserver
    9ff2acc3204b: Pull complete
    69e2f037cdb3: Pull complete
    3e010093287c: Pull complete
    3aaf8fbd9150: Pull complete
    1a945471328b: Pull complete
    1a3fb0c2d12b: Pull complete
    cb60bf4e2607: Pull complete
    1ce42d85789e: Pull complete
    b400281f4b04: Pull complete
    Digest: sha256:514c56bb1a30d69c0c3e18f630a7d45f2dca1792ee7801b7f0b7c22acca56e17
    Status: Downloaded newer image for axoniq/axonserver:latest
         _                     ____
        / \   __  _____  _ __ / ___|  ___ _ ____   _____ _ __
       / _ \  \ \/ / _ \| '_ \\___ \ / _ \ '__\ \ / / _ \ '__|
      / ___ \  >  < (_) | | | |___) |  __/ |   \ V /  __/ |
     /_/   \_\/_/\_\___/|_| |_|____/ \___|_|    \_/ \___|_|
     Standard Edition                        Powered by AxonIQ
    
    version: 4.3
    2020-02-27 13:45:40.156  INFO 1 --- [           main] io.axoniq.axonserver.AxonServer          : Starting AxonServer on c23aa95bb8ec with PID 1 (/app/classes started by root in /)
    2020-02-27 13:45:40.162  INFO 1 --- [           main] io.axoniq.axonserver.AxonServer          : No active profile set, falling back to default profiles: default
    2020-02-27 13:45:44.523  INFO 1 --- [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat initialized with port(s): 8024 (http)
    2020-02-27 13:45:44.924  INFO 1 --- [           main] A.i.a.a.c.MessagingPlatformConfiguration : Configuration initialized with SSL DISABLED and access control DISABLED.
    2020-02-27 13:45:49.453  INFO 1 --- [           main] io.axoniq.axonserver.AxonServer          : Axon Server version 4.3
    2020-02-27 13:45:53.414  INFO 1 --- [           main] io.axoniq.axonserver.grpc.Gateway        : Axon Server Gateway started on port: 8124 - no SSL
    2020-02-27 13:45:54.070  INFO 1 --- [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat started on port(s): 8024 (http) with context path ''
    2020-02-27 13:45:54.075  INFO 1 --- [           main] io.axoniq.axonserver.AxonServer          : Started AxonServer in 15.027 seconds (JVM running for 15.942)    

When we see that last line, we open a second window and query the REST API:


    $ curl -s http://localhost:8024/v1/public/me
      curl: (7) Failed to connect to localhost port 8024: Connection refused
    $

OK, anyone who has run Docker containers before will have seen this coming: the container may have announced that ports 8024 and 8124 are to be exposed, but that was just a statement of intent. So we ^C ourselves out of here and add “-p 8024:8024 -p 8124:8124”.
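Spelled out, the restarted command then becomes:

    $ docker run -p 8024:8024 -p 8124:8124 axoniq/axonserver

On the Axon Server side, nothing looks different in the startup log, but now we can get access: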


    $ curl -s http://localhost:8024/v1/public/me | jq
    {
      "authentication": false,
      "clustered": false,
      "ssl": false,
      "adminNode": true,
      "developmentMode": false,
      "storageContextNames": [
        "default"
      ],
      "contextNames": [
        "default"
      ],
      "name": "87c201162360",
      "hostName": "87c201162360",
      "grpcPort": 8124,
      "httpPort": 8024,
      "internalHostName": null,
      "grpcInternalPort": 0
    }
    $

As discussed last time, having a node name “87c201162360” is no problem, but the hostname will be, as a client application will by default follow Axon Server’s request to switch to that hostname and then fail to resolve it. We can reconfigure Axon Server without much trouble, but let me first tell you a bit about the container’s structure. It was made for Axon Server SE, which is Open Source and can be found here. The container is built on a compact base image from Google’s “distroless” collection in the gcr.io registry, “gcr.io/distroless/java:11”. Finally, the application itself is installed in the root with a minimal properties file:


    axoniq.axonserver.event.storage=/eventdata
    axoniq.axonserver.snapshot.storage=/eventdata
    axoniq.axonserver.controldb-path=/data
    axoniq.axonserver.pid-file-location=/data
    logging.file=/data/axonserver.log
    logging.file.max-history=10
    logging.file.max-size=10MB

The “/data” and “/eventdata” directories are created as volumes, and their data will be accessible on your local filesystem somewhere in Docker’s temporary storage tree. Alternatively, you can tell Docker to use a specific directory, which allows you to put them in a more convenient location. A third directory, not marked as a volume in the image, is important for our case: if you put an “axonserver.properties” file in “/config”, it can override the settings above and add new ones:


    $ mkdir -p axonserver/data axonserver/events axonserver/config
    $ (
    > echo axoniq.axonserver.name=axonserver
    > echo axoniq.axonserver.hostname=localhost
    > ) > axonserver/config/axonserver.properties
    $ docker run -d --rm --name axonserver -p 8024:8024 -p 8124:8124 \
    > -v `pwd`/axonserver/data:/data \
    > -v `pwd`/axonserver/events:/eventdata \
    > -v `pwd`/axonserver/config:/config \
    > axoniq/axonserver
    4397334283d6185506ad27a024fbae91c5d2918e1314d19fcaf2dc20b4e400cb
    $

Now, if you query the API (either using the “docker logs” command to verify startup has finished or simply repeating the “curl” command until it responds), it will show that it is running with the name “axonserver” and hostname “localhost”. Also, if you look at the data directory, you will see the ControlDB file, PID file, and a copy of the log output, while the “events” directory will have the event and snapshot data.
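A quick check could look like this; the container was started with “--name axonserver”, as in the command above:

    $ docker logs axonserver | grep "Started AxonServer"   # repeat until startup has finished
    $ curl -s http://localhost:8024/v1/public/me | jq '.name, .hostName'
    "axonserver"
    "localhost"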

From Docker to docker-compose

Running Axon Server in a Docker container has several advantages, the most important of which is the compact distribution format: with a single command, we have installed and started Axon Server, and it will always work in the same, predictable fashion. You will most likely use this for local development and demonstration scenarios, as well as for tests of Axon Framework client applications. That said, Axon Server is mainly targeted at a distributed usage scenario, where you have several application components exchanging Events, Commands, and Queries. For this, you will more likely employ docker-compose or larger-scale infrastructure products such as Kubernetes, Cloud Foundry, and Red Hat OpenShift.

To start with docker-compose, the following file allows you to start Axon Server with “./data”, “./events”, and “./config” mounted as volumes, where the config directory is mounted read-only.

Note: This has been tested on macOS and Linux. On Windows 10, named volume mapping using the “local” driver will not work, so you need to remove the “driver” and “driver_opts” sections from the file below. The new Windows Subsystem for Linux (WSL version 2) combined with the Docker Desktop based on it will hopefully bring relief, but for the moment, you will not be able to use volumes in docker-compose and then access the files from the host on Windows.


    version: '3.3'
    services:
      axonserver:
        image: axoniq/axonserver
        hostname: axonserver
        volumes:
          - axonserver-data:/data
          - axonserver-events:/eventdata
          - axonserver-config:/config:ro
        ports:
          - '8024:8024'
          - '8124:8124'
          - '8224:8224'
        networks:
          - axon-demo

    volumes:
      axonserver-data:
        driver: local
        driver_opts:
          type: none
          device: ${PWD}/data
          o: bind
      axonserver-events:
        driver: local
        driver_opts:
          type: none
          device: ${PWD}/events
          o: bind
      axonserver-config:
        driver: local
        driver_opts:
          type: none
          device: ${PWD}/config
          o: bind

    networks:
      axon-demo:
This also sets the container’s hostname to “axonserver”, so all you need to add is an “axonserver.properties” file:


    $ echo "axoniq.axonserver.hostname=localhost" > config/axonserver.properties
    $ docker-compose up
    Creating network "docker-compose_axon-demo" with the default driver
    Creating volume "docker-compose_axonserver-data" with local driver
    Creating volume "docker-compose_axonserver-events" with local driver
    Creating volume "docker-compose_axonserver-config" with local driver
    Creating docker-compose_axonserver_1 ... done
    Attaching to docker-compose_axonserver_1
    axonserver_1  |      _                     ____
    axonserver_1  |     / \   __  _____  _ __ / ___|  ___ _ ____   _____ _ __
    axonserver_1  |    / _ \  \ \/ / _ \| '_ \\___ \ / _ \ '__\ \ / / _ \ '__|
    axonserver_1  |   / ___ \  >  < (_) | | | |___) |  __/ |   \ V /  __/ |
    axonserver_1  |  /_/   \_\/_/\_\___/|_| |_|____/ \___|_|    \_/ \___|_|
    axonserver_1  |  Standard Edition                        Powered by AxonIQ
    axonserver_1  |
    axonserver_1  | version: 4.3
    axonserver_1  | 2020-03-10 13:17:26.134  INFO 1 --- [           main] io.axoniq.axonserver.AxonServer          : Starting AxonServer on axonserver with PID 1 (/app/classes started by root in /)
    axonserver_1  | 2020-03-10 13:17:26.143  INFO 1 --- [           main] io.axoniq.axonserver.AxonServer          : No active profile set, falling back to default profiles: default
    axonserver_1  | 2020-03-10 13:17:32.383  INFO 1 --- [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat initialized with port(s): 8024 (http)
    axonserver_1  | 2020-03-10 13:17:32.874  INFO 1 --- [           main] A.i.a.a.c.MessagingPlatformConfiguration : Configuration initialized with SSL DISABLED and access control DISABLED.
    axonserver_1  | 2020-03-10 13:17:38.741  INFO 1 --- [           main] io.axoniq.axonserver.AxonServer          : Axon Server version 4.3
    axonserver_1  | 2020-03-10 13:17:43.586  INFO 1 --- [           main] io.axoniq.axonserver.grpc.Gateway        : Axon Server Gateway started on port: 8124 - no SSL
    axonserver_1  | 2020-03-10 13:17:44.341  INFO 1 --- [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat started on port(s): 8024 (http) with context path ''
    axonserver_1  | 2020-03-10 13:17:44.349  INFO 1 --- [           main] io.axoniq.axonserver.AxonServer          : Started AxonServer in 19.86 seconds (JVM running for 21.545)

Now you have it running locally, with a fresh and predictable environment and easy access to the properties file. Also, as long as you leave the “data” and “events” directories untouched, you will get the same event store over subsequent runs, while cleaning them is simply a matter of removing and recreating those directories.
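For example, a complete reset between runs could look like this, run from the directory containing the compose file (the “config” directory is kept, since it holds the properties):

    $ docker-compose down
    $ rm -rf data events
    $ mkdir data events
    $ docker-compose up -d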

Differences with Axon Server EE

To extend this docker-compose setup to Axon Server EE, we first need to build an image, as there is no public one. Using the same approach as with SE, however, building it is relatively simple, and the result will work for OpenShift, Kubernetes, Docker, and docker-compose. We can also be a bit more security-conscious and run Axon Server as a non-root user. Unfortunately, this last bit forces us to use a two-stage Dockerfile, since the Google “distroless” images do not contain a shell and we want to run a few commands:


    FROM busybox as source
    RUN addgroup -S axonserver \
        && adduser -S -h /axonserver -D axonserver \
        && mkdir -p /axonserver/config /axonserver/data \
                    /axonserver/events /axonserver/log \
        && chown -R axonserver:axonserver /axonserver
    
    FROM gcr.io/distroless/java:11
    
    COPY --from=source /etc/passwd /etc/group /etc/
    COPY --from=source --chown=axonserver /axonserver /axonserver
    
    COPY --chown=axonserver axonserver.jar axonserver.properties /axonserver/
    
    USER axonserver
    WORKDIR /axonserver
    
    VOLUME [ "/axonserver/config", "/axonserver/data", "/axonserver/events", "/axonserver/log" ]
    EXPOSE 8024/tcp 8124/tcp 8224/tcp
    
    ENTRYPOINT [ "java", "-jar", "axonserver.jar" ]

The first stage creates a user and group named “axonserver”, creates the directories that will become our volumes, and finally sets the ownership. The second stage begins by copying the account (in the form of the “passwd” and “group” files) and the home directory with its volume mount points, carefully keeping ownership set to the new user. The last steps are the “regular” ones: copying the executable jar and a common set of properties, marking the volume mount points and exposed ports, and specifying the command to start Axon Server.

For the common properties, we’ll use just enough to make it use our volume mounts and add a log file for good measure:


    axoniq.axonserver.event.storage=/axonserver/events
    axoniq.axonserver.snapshot.storage=/axonserver/events
    axoniq.axonserver.replication.log-storage-folder=/axonserver/log
    axoniq.axonserver.controldb-path=/axonserver/data
    axoniq.axonserver.pid-file-location=/axonserver/data
    
    logging.file=/axonserver/data/axonserver.log
    logging.file.max-history=10
    logging.file.max-size=10MB

You can build this image and push it to a registry of your own, or keep it local if you only want to run it on your development machine.
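Building it is a single command, run from the directory containing the Dockerfile, the executable jar, and the properties file; the “axonserver-ee:test” tag is the one used in the compose fragments below:

    $ docker build -t axonserver-ee:test .

On the docker-compose side, we can now specify three instances of this image, using separate volumes for “data”, “events”, and “log”, but we haven’t yet provided it with a license file and token. For those, we’ll use secrets: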


    [...services, volumes, and networks sections skipped…]
    secrets:
      axonserver-properties:
        file: ./axonserver.properties
      axoniq-license:
        file: ./axoniq.license
      axonserver-token:
        file: ./axonserver.token

All three files will be placed in the “config” directory using a “secrets” section in the service definition, with an environment variable added to tell Axon Server about the location of the license file. As an example, here is the resulting definition of the first node’s service:


    axonserver-1:
      image: axonserver-ee:test
      hostname: axonserver-1
      volumes:
        - axonserver-data1:/axonserver/data
        - axonserver-events1:/axonserver/events
        - axonserver-log1:/axonserver/log
      secrets:
        - source: axoniq-license
          target: /axonserver/config/axoniq.license
        - source: axonserver-properties
          target: /axonserver/config/axonserver.properties
        - source: axonserver-token
          target: /axonserver/config/axonserver.token
      environment:
        - AXONIQ_LICENSE=/axonserver/config/axoniq.license
      ports:
        - '8024:8024'
        - '8124:8124'
        - '8224:8224'
      networks:
        - axon-demo

Note that for “axonserver-2” and “axonserver-3”, you’ll have to adjust the port definitions, for example using “8025:8024” and “8026:8024” for the first port, to prevent all three from trying to claim the same host port. The properties file referred to in the secrets definition is:


    axoniq.axonserver.autocluster.first=axonserver-1
    axoniq.axonserver.autocluster.contexts=_admin,default
    axoniq.axonserver.accesscontrol.enabled=true
    axoniq.axonserver.accesscontrol.internal-token=2843a447-4da5-4b54-af27-7a8e0d857e87
    axoniq.axonserver.accesscontrol.systemtokenfile=/axonserver/config/axonserver.token

Like last time, we enable auto-clustering and access control, with a generated token for the communication between the nodes. The “axonserver-token” secret is used to allow the CLI to talk to the nodes, and a similar approach can be used to configure more secrets for certificates and so enable SSL.
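As an illustration, and assuming the command-line interface jar from the Axon Server distribution with its “-S” (server address) and “-t” (token) options, asking the first node for the cluster members could then look something like this:

    $ java -jar axonserver-cli.jar cluster -S http://localhost:8024 -t $(cat axonserver.token)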

Kubernetes and StatefulSets

Kubernetes has quickly become the “de facto” solution for running containerized applications on distributed infrastructure. Thanks to the API-first approach, it allows for flexible deployments using modern “infrastructure as code” design patterns. Due to the tight integration possible with Continuous Integration pipelines, it is also perfect for “ephemeral infrastructure” testing, with complete environments set up and torn down with minimal work. It is also the go-to platform for microservices architectures, and I think I have collected enough bonus points in the buzzword bingo with it.

All jokes aside, many of our customers deploy their applications on Kubernetes clusters, and we regularly get questions about “the best way” to run Axon Server on it. Of course, with a platform like Kubernetes, you’ll find that there are a lot of customization points. Still, they are all subject to the underlying deployment model, which is (preferably) that of a horizontally scalable and stateless (micro)service, whose lifecycle is easily automatable. For Axon Server Standard Edition, scalability is vertical, as it has no concept of a clustered deployment. What is more, a running Axon Server instance is definitely stateful due to the event store. So, for now, let’s focus on the most important aspect of a Kubernetes deployment of Axon Server: fixing the server’s identity and persistence using StatefulSets.

As stated before, an Axon Server instance has a clear and persistent identity. It saves identifying information about itself and (in the case of Axon Server EE) other nodes in the cluster in the controlDB. Also, if it is used as an event store, the context’s events will be stored on disk as well, and whereas a client application can survive restarts and version upgrades by rereading the events, Axon Server is the one providing those. In the context of Kubernetes, that means we want to bind every Axon Server deployment to its own storage volumes and predictable network identity. Kubernetes provides us with a StatefulSet deployment class that does just that, with the guarantee that it will preserve the automatically allocated volume claims even if migrated to another (k8s) node.

The welcome package downloaded in part one includes an example YAML descriptor for Axon Server, which I have included below (with just minor differences):


    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: axonserver
      labels:
        app: axonserver
    spec:
      serviceName: axonserver
      replicas: 1
      selector:
        matchLabels:
          app: axonserver
      template:
        metadata:
          labels:
            app: axonserver
        spec:
          containers:
          - name: axonserver
            image: axoniq/axonserver
            imagePullPolicy: Always
            ports:
            - name: grpc
              containerPort: 8124
              protocol: TCP
            - name: http
              containerPort: 8024
              protocol: TCP
            volumeMounts:
            - name: eventstore
              mountPath: /eventdata
            - name: data
              mountPath: /data
            readinessProbe:
              httpGet:
                port: http
                path: /actuator/info
              initialDelaySeconds: 30
              periodSeconds: 5
              timeoutSeconds: 1
            livenessProbe:
              httpGet:
                port: http
                path: /actuator/info
              initialDelaySeconds: 60
              periodSeconds: 5
              timeoutSeconds: 1
      volumeClaimTemplates:
      - metadata:
          name: eventstore
        spec:
          accessModes: [ "ReadWriteOnce" ]
          resources:
            requests:
              storage: 5Gi
      - metadata:
          name: data
        spec:
          accessModes: [ "ReadWriteOnce" ]
          resources:
            requests:
              storage: 1Gi
 

Two lines in the listing above are important for Axon Server SE: “replicas: 1” tells Kubernetes we want only a single instance, and “image: axoniq/axonserver” refers to the SE container image. Note that this is still a pretty basic descriptor: apart from the resource reservations just discussed, it “just” claims 5GiB of disk space for the Event Store, and we have not yet provided any means of adjusting the configuration. To complete it, we need to add Service definitions that expose the two ports:


    apiVersion: v1
    kind: Service
    metadata:
      name: axonserver-gui
      labels:
        app: axonserver
    spec:
      ports:
      - name: gui
        port: 8024
        targetPort: 8024
      selector:
        app: axonserver
      type: LoadBalancer
      sessionAffinity: ClientIP
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: axonserver-grpc
      labels:
        app: axonserver
    spec:
      ports:
      - name: grpc
        port: 8124
        targetPort: 8124
      clusterIP: None
      selector:
        app: axonserver
Now you’ll notice the HTTP port is exposed using a LoadBalancer, while the Service for the gRPC port has the default type “ClusterIP” with “clusterIP” set to “None”, making it (in Kubernetes terminology) a Headless Service. This is important because a StatefulSet needs at least one Headless Service to enable DNS exposure within the Kubernetes namespace. Additionally, client applications use long-lived connections to the gRPC port and are expected to connect explicitly to a specific node. Apart from that, the deployment model for the client applications is probably what brought you to Kubernetes in the first place, making an externally accessible interface less of a requirement: the client applications will be deployed in their own namespace and can connect to Axon Server using the Kubernetes-internal DNS.

The elements in the DNS name are (from left to right):

  • The name of the StatefulSet, a dash, and a sequence number (starting at 0). You’ll recognize this as the Pod’s name.
  • The name of the service.
  • The name of the namespace (“default” if unspecified).
  • “svc.cluster.local”.

If you want to deploy Axon Server in Kubernetes but run client applications outside of it, you can actually use a “LoadBalancer” type Service, since gRPC uses HTTP/2. You will, however, need to pin each such Service to a specific Pod using the “statefulset.kubernetes.io/pod-name” selector with the Pod’s name as its value, and repeat this for all nodes. As this is not recommended practice, we’ll not go into it here.

Differences with Axon Server EE

There are several ways we can deploy a cluster of Axon Server EE nodes to Kubernetes. The simplest approach, and most often the correct one, is to use a scaling factor other than 1, letting Kubernetes take care of deploying several instances. This gives us several nodes that Kubernetes can dynamically manage and migrate as needed, while keeping their names and storage fixed. As we saw with SE, a number starting at 0 is suffixed to the name, so a scaling factor of 3 gives us “axonserver-0” through “axonserver-2”. Of course, we still need a secret to add the license file, and when we try to add this to our mix, we run into a big difference with docker-compose: Kubernetes mounts Secrets and ConfigMaps as directories rather than files, so we need to split the license and the configuration into two separate locations. We’ll use a new location “/axonserver/license/axoniq.license” and adjust the environment variable to match, put the system token in “/axonserver/security/token.txt”, and for the properties file use a ConfigMap that we mount on top of the “/axonserver/config” directory. We can create them directly from their respective files:


    $ kubectl create secret generic axonserver-license --from-file=./axoniq.license
    secret/axonserver-license created
    $ kubectl create secret generic axonserver-token --from-file=./axoniq.token
    secret/axonserver-token created
    $ kubectl create configmap axonserver-properties --from-file=./axonserver.properties
    configmap/axonserver-properties created
    $ 

In the descriptor, we now have to announce the secret, add a volume for it, and mount the secret on the volume:


    volumeMounts:
        - name: eventstore
          mountPath: /eventdata
        - name: data
          mountPath: /data

Becomes:


    env:
        - name: AXONIQ_LICENSE
          value: "/axonserver/license/axoniq.license"
    volumeMounts:
        - name: data
          mountPath: /axonserver/data
        - name: events
          mountPath: /axonserver/events
        - name: log
          mountPath: /axonserver/log
        - name: config
          mountPath: /axonserver/config
          readOnly: true
        - name: system-token
          mountPath: /axonserver/security
          readOnly: true
        - name: license
          mountPath: /axonserver/license
          readOnly: true
Then a list of volumes has to be added to link the actual properties, system token, and license:


    volumes:
    - name: config
      configMap:
        name: axonserver-properties
    - name: system-token
      secret:
        secretName: axonserver-token
    - name: license
      secret:
        secretName: axonserver-license

Arguably, the properties should also be in a secret, which tightens up security on the settings in there, but I’ll leave that “as an exercise for the reader.”
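If you do take up that exercise, the creation step mirrors the other secrets, and the “config” entry in the volumes list would then use “secret” and “secretName” instead of “configMap” and “name”:

    $ kubectl create secret generic axonserver-properties --from-file=./axonserver.properties
    secret/axonserver-properties created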

Now there is only one thing left, and it has to do with the image we built for docker-compose. If we try to start the StatefulSet with 1 replica to test whether everything works, you’ll find that it fails with a so-called “CrashLoopBackOff”. If you look at the logs, you’ll find that Axon Server was unable to create the controlDB, which is odd given that it worked fine for SE. The cause is a major difference between plain Docker and Kubernetes: in Docker, volumes are mounted as owned by the owner of the mount location, while Kubernetes uses a special security context, which defaults to root. Since our EE image runs Axon Server under its own user, it has no rights on the mounted volume other than “read”. The context can be specified, but only through the user’s or group’s numeric ID, not through the names we used in the image, because those do not exist in the k8s management context. So we have to adjust the first stage to use explicit numeric IDs and then use those values in the security context:


    FROM busybox as source
    RUN addgroup -S -g 1001 axonserver \
        && adduser -S -u 1001 -h /axonserver -D axonserver \
        && mkdir -p /axonserver/config /axonserver/data \
                    /axonserver/events /axonserver/log \
        && chown -R axonserver:axonserver /axonserver

Now we have an explicit ID (1001, for both the user and the group) and can add that to the StatefulSet:


    template:
      metadata:
        labels:
          app: axonserver
      spec:
        securityContext:
          runAsUser: 1001
          fsGroup: 1001
        containers:
        - name: axonserver
          image: eu.gcr.io/axoniq-devops/axonserver-ee:running
          imagePullPolicy: Always

With this change, we can finally run Axon Server successfully and scale it up to the number of nodes we want. However, when the second node comes up and tries to register itself, another typical Kubernetes behaviour turns up: the logs of node “axonserver-enterprise-0” fill up with DNS lookup errors for “axonserver-enterprise-1”. This is caused by the way StatefulSet Pods are added to the DNS registry, which does not happen until the readiness probe succeeds. By then, Axon Server itself is already busily running the auto-cluster actions, so node 0 is known while the way back to node 1 is still unknown. In a Pod migration scenario, if, e.g., a k8s node has to be brought down for maintenance, this is exactly the behaviour we want, even though it is confusing when we see it here during cluster initialization and registration. If you want, you can avoid this by not using the auto-cluster options and doing everything by hand, but given that it really is a “cosmetic” issue with no lasting effects, you can also simply ignore the errors.

Alternative Deployment models

Using the scaling factor on our StatefulSet is pretty straightforward, but it does have a potential disadvantage: we cannot disable (shut down, if you like) a node without also disabling all higher-numbered nodes. If we decide to give the nodes different roles in the cluster, define contexts that are not available on all nodes, or bring a “middle” node down for maintenance, we run into the horizontal scaling model imposed by Kubernetes. It is possible to do individual restarts simply by killing the Pod involved, prompting Kubernetes to start a new one for it, but we cannot shut it down “until further notice”. The assumption made by StatefulSets is that each node needs its own storage and identity, but that they all provide the same service, and if you reduce the scale by one, the highest-numbered node is the one taken down. Taking the whole cluster down for maintenance is easy, but that is not what we want. An alternative model is to create a StatefulSet per role, or ultimately even a collection of single-node sets. This may feel wrong from a Kubernetes perspective, but it works perfectly for Axon Server.
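A minimal sketch of that last variant, with illustrative names: the container specification and the volumeClaimTemplates would be the same as in the descriptor shown earlier, and the extra “node” label keeps the selectors of the individual sets from overlapping.

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: axonserver-a
      labels:
        app: axonserver
    spec:
      serviceName: axonserver
      replicas: 1
      selector:
        matchLabels:
          app: axonserver
          node: axonserver-a
      template:
        metadata:
          labels:
            app: axonserver
            node: axonserver-a
        spec:
          securityContext:
            runAsUser: 1001
            fsGroup: 1001
          containers:
          - name: axonserver
            image: axonserver-ee:test   # the EE image built earlier, from wherever you pushed it
            # ports, probes, env, and volumeMounts as in the earlier descriptor
      # volumeClaimTemplates as before, so each node keeps its own claims

You would repeat this once per node (“axonserver-b”, “axonserver-c”, and so on), which lets you scale any individual node down to 0 without touching the others.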

Storage Considerations

In the first instalment, we discussed the different storage settings we can pass to Axon Server. In the context of a Docker or Kubernetes deployment, this poses a double issue. The first, fairly obvious one is that we want to ensure the volume is persistent and that we have direct access to it, so we can make backups. The second has to do with the implementation of the volume: it needs to be configurable, so we can extend it when needed. For Docker and docker-compose, it is quite possible to do this on Windows, just not with the easiest implementation of the “local” driver. Kubernetes on a laptop or desktop is a very different scenario, where practically all implementations use a local VM to implement the k8s node, and that VM cannot easily mount host directories. So while this will work for an easily created test installation, if you want a long-running setup under Windows, I would urge you to look at running Axon Server as a local installation.

In the cloud, both AWS and Google allow you to use volumes that can be extended as needed without the need for further manual adjustments. As we’ll see in the next instalment, for VMs, you will be using disks that may need formatting before they can be properly mounted. However, a newly created volume in k8s is immediately ready for use, and resizing is only a matter of using the Console or CLI to pass the change, after which the extra space will be immediately usable.
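For example, growing the event store claim of the first Pod beyond the 5GiB requested in the descriptor can be done by patching the PersistentVolumeClaim directly; a sketch, assuming the StorageClass has volume expansion enabled (the claim name follows the “template name”-“pod name” pattern):

    $ kubectl patch pvc eventstore-axonserver-0 \
    >     -p '{"spec":{"resources":{"requests":{"storage":"10Gi"}}}}'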

In Conclusion

In the next instalment, we’ll be moving to VMs and touch on a bit more OS specifics, as we build a solution that can be used in a CI/CD pipeline, with a setup that does not require manual changes for updates and version upgrades.

The example scripts and configuration files used in this blog series are available from GitHub! Please visit GitHub to get a copy.

Bert Laverman
Senior Software Architect and Developer at AxonIQ. From hands-on software development to academic research on software reusability. Bert is a strong proponent of good software design, Agile and Lean software development, and DevOps practices.