Migrating Neo4j graph schemas in Kubernetes (Video)

#neo4j #zerodowntime Tuesday, June 30, 2020

When running enterprise applications with zero downtime, we need to be able to perform database schema migrations without disrupting active users. This matters not just for relational databases but also for graph databases such as Neo4j. Although they don’t enforce a schema on write, it still makes sense to refactor your graph and to keep your graph data model in sync with your application. In the following video, I’ll explain how to migrate a graph in a managed Kubernetes environment to schema versions that are defined by Cypher scripts kept under version control.

I’m using a file-based approach with Cypher migration scripts and the helpful neo4j-migrations tool in CLI mode. The tool stores the current schema version in the graph and applies only those migrations that haven’t been executed yet, which makes repeated runs idempotent. All current migration scripts and the tooling are packaged into a Docker image, from which we migrate the graph to the latest version.
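To make the idea of version-based, idempotent migrations more concrete, here is a minimal sketch of the selection step. It is not how neo4j-migrations is implemented; `pending_migrations` is a hypothetical helper, and it assumes scripts named in the `V<number>__<Name>.cypher` style with zero-padded version numbers.

```shell
# Illustrative sketch only, NOT the implementation of neo4j-migrations.
# pending_migrations is a hypothetical helper: it prints the scripts in DIR
# whose version is newer than the version already recorded as applied.
pending_migrations() {
  dir="$1"
  applied="$2"                       # e.g. 002; use 0 for an empty graph
  # The glob expands in sorted order, so zero-padded versions come out in sequence.
  for f in "$dir"/V*__*.cypher; do
    [ -e "$f" ] || continue          # directory without matching scripts
    name=$(basename "$f")
    version=${name#V}                # strip the leading "V"
    version=${version%%__*}          # keep only the part before "__"
    # awk compares numerically, so zero-padded versions such as 008 are fine
    if awk -v v="$version" -v a="$applied" 'BEGIN { exit !(v + 0 > a + 0) }'; then
      printf '%s\n' "$name"
    fi
  done
}
```

Calling the helper again with the latest version as the second argument prints nothing, which is the property that makes it safe to run the migration on every deployment.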

The deployment of the coffee-shop application runs an init container, started from that migration Docker image, before the actual application container starts. This way, the application always runs against the expected schema version. As always with zero-downtime schema migrations, we have to consider N-1 compatibility: the previous application version must keep working against the migrated schema, which might require us to deploy multiple application versions before a migration is complete.

Try it yourself

You can find the migration samples in the playground Quarkus application, which has been extended with the resources shown in the video.

This is similar to what is running inside the container:

$> ls /cyphers/
V001__SchemaMasterData.cypher
V002__AddFlavorName.cypher
V003__RemoveFlavorDescription.cypher

$> ./neo4j-migrations --address <neo4j-address> \
  --password <pw> \
  --location file:///cyphers/ migrate
Applied migration 001 ("SchemaMasterData")
Applied migration 002 ("AddFlavorName")
Applied migration 003 ("RemoveFlavorDescription")
Database migrated to version 003.

We apply the migrations by running a Kubernetes init container before the new version of the actual application is deployed. By making sure that both the old and the new application versions are compatible with the graph schema, we can migrate without downtime.
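As a rough illustration of how such a compatible change can be split across two scripts, the following sketch writes out possible contents for the second and third migration. Only the file names come from the listing above. The `Flavor` label, the `name` and `description` properties, and the statements themselves are assumptions; the actual scripts in the playground project may look different.

```shell
# Assumption: (:Flavor) nodes with a `description` property that is being
# replaced by `name`. Label, properties and statements are hypothetical.
mkdir -p cyphers

# Step 1 (expand): add the new property next to the old one.
# The old application version keeps reading `description`,
# the new version can start reading `name`.
cat > cyphers/V002__AddFlavorName.cypher <<'EOF'
MATCH (f:Flavor)
WHERE f.name IS NULL
SET f.name = f.description;
EOF

# Step 2 (contract): shipped only once no running version reads `description`.
cat > cyphers/V003__RemoveFlavorDescription.cypher <<'EOF'
MATCH (f:Flavor)
REMOVE f.description;
EOF
```

Between the two steps, an intermediate application version would typically write both properties, so that neither the old nor the new version sees stale data.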

The init container connects to the Neo4j instance with a configuration similar to that of the application container:

# [...]
      initContainers:
      - name: schema-migration
        image: sdaschner/neo4j-coffee-shop-migration:v001
        env:
        - name: NEO4J_ADDRESS
          value: "bolt://graphdb-neo4j:7687"
        - name: NEO4J_PASSWORD
          valueFrom:
            secretKeyRef:
              name: graphdb-neo4j-secrets
              key: neo4j-password
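Inside the migration image, an entrypoint along the following lines could translate those environment variables into the CLI call shown earlier. This is a sketch, not the script from the sample project: `run_migrations`, `MIGRATIONS_BIN` and `CYPHER_LOCATION` are hypothetical names, and only the `--address`, `--password` and `--location` flags and the `migrate` command are taken from the invocation above.

```shell
#!/bin/sh
# Hypothetical entrypoint for the migration image (not taken from the sample project).
set -e

run_migrations() {
  # Abort with a clear message if the init container environment is incomplete.
  : "${NEO4J_ADDRESS:?NEO4J_ADDRESS must be set}"
  : "${NEO4J_PASSWORD:?NEO4J_PASSWORD must be set}"

  # MIGRATIONS_BIN and CYPHER_LOCATION are assumed override hooks; their
  # defaults match the manual invocation shown earlier.
  "${MIGRATIONS_BIN:-./neo4j-migrations}" \
    --address "$NEO4J_ADDRESS" \
    --password "$NEO4J_PASSWORD" \
    --location "${CYPHER_LOCATION:-file:///cyphers/}" \
    migrate
}
```

The entrypoint would end with a call to `run_migrations`. Because Kubernetes does not start the application containers until all init containers have completed successfully, a failing migration blocks the rollout of the new version instead of letting it run against an unexpected schema.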

The examples shown are rather basic, but they provide the scaffolding required for data migrations and thus for zero-downtime deployments in our pipeline.

You might also want to have a look at the available APOC migration procedures in Neo4j.

As always, it’s crucial to test the changes up front, especially with regard to the data involved, for example by deploying to a dedicated test or staging environment first and making sure the migration scripts work as expected. Making these checks part of our pipeline increases both our development velocity and quality.
