Our entire application stack is packaged using Docker. One key part of getting it up and running is the initial data population. Having streamlined this, we can have a complete environment up and running within minutes.

Prior to setting up seeding with Docker, we often had problems with outdated examples and leftover data from previous demos, and we spent too much time manually keeping everything in sync. As our customer base grew, this was not sustainable.

Using Docker, we have created a user-friendly tool that anyone in our organization (even sales) can use, both when resetting a demo laptop and when quickly getting a new on-site installation up and running.


Now, let’s dive into the details!

We maintain all our demo workspaces, tutorials, and help texts in our SaaS production environment, which makes it easier to keep everything up to date. When running offline demos or setting up on-site solutions at customers, we need a way to copy data from the production environment to seed these installations.

Prior to using Docker, this was an error-prone, ad hoc activity. After moving to Docker, we can do it as part of every build. The process of preparing and applying a data seed includes the following steps:

  • Copy production data and clean it
  • Package and distribute the seed data
  • Import the data

Step 1 – Cleaning the database dump

To filter out environment-specific metadata, we need to clean up the database dump before it is distributed. The core of the cleanup is a throwaway MongoDB Docker container. This separate throwaway-container step is a security precaution: we don't want to expose the filtered-out data in a lower Docker file system layer.

We start the container with the database dump available in a mounted volume, fire up MongoDB, run a cleanup script against it, and finally dump the cleaned database back to the mounted volume. The container is then discarded, so the only side effect of the process is the cleaned database dump.
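A minimal sketch of what that step could look like, assuming hypothetical names (dump/ for the raw dump inside the mounted volume, clean.js for the cleanup script, ardoq for the database, and cleaned/ for the output):

docker run --rm -v "$(pwd)/seed:/work" mongo:3.0.1 bash -c '
  # start a MongoDB server inside the throwaway container
  mongod --fork --logpath /var/log/mongod.log &&
  # restore the raw production dump from the mounted volume
  mongorestore /work/dump &&
  # strip environment-specific and sensitive data (hypothetical script)
  mongo ardoq /work/clean.js &&
  # write the cleaned dump back to the mounted volume
  mongodump --db ardoq --out /work/cleaned
'

The --rm flag ensures the container, and everything restored inside it, is thrown away; only the cleaned dump on the mounted volume survives.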


Step 2 – Packaging the data for distribution

For easy distribution, we package the cleaned database dump, the binary attachment files, and the MongoDB restore tools in one Docker image.

Dockerfile:

FROM mongo:3.0.1
ADD cleaned.tar.gz /work
ADD attachments.tar.gz /work
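Since ADD automatically extracts local tar.gz archives, the cleaned dump and the attachments end up unpacked under /work inside the image. Building the image is then a single command (the tag matches the image name used in the commands below):

docker build -t ardoq/demo-seed .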


Step 3 – Distribution

Once the data is packaged along with the MongoDB client, we upload the image to our private account on Docker Hub. Right now this is triggered manually; automating it is just a matter of configuration.

This allows our developers, salespeople, and customers to update their local installations simply by pulling the latest image from Docker Hub.
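The upload and download are the standard Docker Hub push/pull pair, sketched here with the image name from the Dockerfile above (a private repository also requires a docker login first):

docker push ardoq/demo-seed:latest
docker pull ardoq/demo-seed:latest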


Step 4 – Seeding

Since all data is packaged along with the MongoDB client, seeding is a breeze. It is basically a matter of running two Docker commands:

To populate the database, the seed image is started with a link to the MongoDB container, and restores the database using the bundled database dump:

docker run --rm --link ardoq_mongodb_1:mongodb ardoq/demo-seed mongorestore -h mongodb /work/demo_seed/

To populate the binary attachments, the seed image is started again, this time with the volumes from the API container mounted, and simply copies the files over and exits:

docker run --rm --volumes-from ardoq_api_1 ardoq/demo-seed:latest cp -r /work/attachments /data
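Because it really is just two commands, a small wrapper script is enough to make seeding a one-step operation for non-technical users. A sketch, assuming the container names from a Docker Compose project called ardoq, as above:

#!/bin/sh
# seed.sh - pull the latest seed image and populate the database and attachments
docker pull ardoq/demo-seed:latest
docker run --rm --link ardoq_mongodb_1:mongodb ardoq/demo-seed mongorestore -h mongodb /work/demo_seed/
docker run --rm --volumes-from ardoq_api_1 ardoq/demo-seed:latest cp -r /work/attachments /data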


Summary

Keeping on-site installations up to date with the latest data used to be time-consuming and error-prone. Using Docker, we are well on our way to automating the entire process.

Learn how you can automatically visualize your Docker stack in Ardoq.

Stay tuned for more “behind the scenes” articles!

