Sunday, November 29, 2020

Using nomad to deploy/manage containers (on a mostly IPv6 network)


This blog will show an overview of our container deployment using Nomad and IPv6-(only) where possible.

The focus for this setup was to empower developers and keeping things simple for them and us (the admins).
They can configure firewalls, storage, traefik, internet accessibility and more and have an https enabled project running in minutes.

As we run our own physical datacenters we have the luxury to use IPv6 where we want/can and do not need the mess of overlay networks and NAT.

Our developers create a nomadjob using a nomadgen.toml file which simplifies the nomad hcl plans (and extends it with features nomad itself cannot do).
You can find an example below. Most of the lines are self explanatory, i've added some comments

This nomadgen.toml gets checked into gitea, where Jenkins will pick it up and send it to nomadguard.

Nomadguard will:

  • turn the toml back into a complete hcl with all of our infrastructure parameters filled in.
  • run our nomad validator on it which checks if the user specified correct settings, uses the correct namespace and authorizations etc.
  • runs nomad modifier on it, which allows us to modify jobs to automatically scale multi-containers over our 2 datacenters, move on specific nodes, limit memory, cpu or other things we need to do without developer interaction.

This modified nomad job will then get deployed to the nomad scheduler which will send it to a nomad node that has the resources the developer asked for.

On that nomad node the nomad agent will contact vault to resolve any needed secrets and it will start the docker container. At startup the incoming and outgoing firewall for that container will be configured. As we're using IPv6 our containers are routable and direct accessible, so we need to have a firewall in place for that. We use a modified registrator to add specific ipset entries to allow access to or from the container.

At the same time nomad will also register it's IPv6 addresses in our consul DNS (so traefik can start sending traffic to it).

If the container is running a http/https service this will be automatically exposed on (where p is the first letter of the tier (production/quality/test))

All of this infrastructure around the containers is IPv6 only and the containers themselves only allow IPv6 ingress, but they do have dualstack egress as some of the services they need (on the inter- or intranet) are not dualstack yet.

At the edge we have a netscaler that talks dualstack (IPv4/IPv6) to the users and moves to IPv6 only when talking to traefik and containers.

We're using default docker from Centos 7 and every nomad node is getting a IPv6 /80 range routed which it can use to give the containers their ipv6 addresses.

Extra tooling

We have some more tooling available for developers as they need to debug their deployments. This is where we have nomadctl which allows them to ask information about their job, see logs (coming from elasticsearch) and enter containers.

$ nomadctl ps cerebro
Exec ID    |Job/Task        |Node          |Uptime     |CPU |Mem(max)         |Extra
           |p-es-cerebro OK |              |           |    |                 |
c4996e3c72 |p-es-cerebro    |p-cloud-dc1-9 |3 days ago |26  |948 MiB(1.2 GiB) |
59702e6d47 |p-es-cerebro    |p-cloud-dc2-8 |3 days ago |27  |872 MiB(1.1 GiB) |

Or exec into a container

$ nomadctl exec c4996e3c72
Welcome wim (ssh cert verified)
welcome to p-es-cerebro on p-cloud-dc1-9
# ss -an | grep 9000
tcp    LISTEN     0      100                          :::9000                                     :::*


Of course there were issues, but not that many ;-)

  • Especially in the early days of our setup we had some IPv6 issues in the hashicorp tools, but as they are opensource it's easy to fix those. (in contrast to hardware vendors where bugs are ignored or takes years to fix ..)
  • Nomad 0.8 to 0.9 was troublesome because a lot of nomad stuff internally was rewritten and caused some issues in our setup.

The main takeaway after 5 years is that the nomad/consul/vault infrastructure is really solid and needs no babysitting.
And yes, IPv6 is (mostly) ready for production!