Monday, September 28, 2015

How to create an IPv6-only consul cluster with docker


Why IPv6-only:

  • we run consul (plus registrator and our services) in docker, and IPv6 makes this easier (no NAT => better performance)
  • it's easier to maintain one stack
  • consul is known to have issues with NAT and docker
  • IPv4 is legacy and obsolete ;-)
Consul 0.5.2 has some issues running such a setup, but if you build consul from master (which includes some fixes), it will work fine.

Issues to be aware of:

  • The IPv4 version of consul listens on private address ranges by default. With IPv6 you'll be running on 'public' addresses, so be sure to firewall them from the internet.
  • If you're using consul's recursive DNS resolution, you'll also need IPv6 DNS recursors (e.g. Google's 2001:4860:4860::8888).
  • Not IPv6-related, but for extra stability, enable leave_on_terminate.
  • Also not IPv6-related, but I've noticed that the default LAN settings for consul can be a bit too strict when running on VMware hosts. This patch increases the probe timeout to 2 seconds (instead of 500 ms).
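
As a rough illustration of the firewalling point, the sketch below prints (it does not apply) ip6tables rules for a host-mode consul server. The trusted prefix is a placeholder from the documentation range, the ports are consul's defaults, and the INPUT chain is an assumption that only fits agents sharing the host's network namespace. Treat it as a starting point, not as the rule set used here.

```shell
#!/bin/sh
# Print (do not apply) ip6tables rules limiting consul's default ports to a
# trusted prefix. TRUSTED is a placeholder; replace it with your own range.
TRUSTED="${TRUSTED:-2001:db8::/64}"

consul_fw_rules() {
    # TCP: server RPC, serf LAN, serf WAN, CLI RPC, HTTP API, DNS
    echo "ip6tables -A INPUT -p tcp -s $TRUSTED -m multiport --dports 8300,8301,8302,8400,8500,8600 -j ACCEPT"
    # UDP: serf LAN, serf WAN, DNS
    echo "ip6tables -A INPUT -p udp -s $TRUSTED -m multiport --dports 8301,8302,8600 -j ACCEPT"
    # Anything else aimed at those ports is dropped.
    echo "ip6tables -A INPUT -p tcp -m multiport --dports 8300,8301,8302,8400,8500,8600 -j DROP"
    echo "ip6tables -A INPUT -p udp -m multiport --dports 8301,8302,8600 -j DROP"
}

consul_fw_rules
```

Review the printed rules first; piping them to `sh` as root would apply them.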

Extra consul configuration for server and client

The extra settings below are needed for both the consul server and client agent setup:

        "recursor": "[2001:4860:4860::8888]",
        "leave_on_terminate": true,
        "client_addr": "::",
        "addresses": { "http": "::"}
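
One way to use these settings is to drop them into a file in the agent's config directory. The sketch below does that; the directory `./consul.d`, the file name `ipv6.json` and the helper name `write_ipv6_config` are all assumptions made for illustration.

```shell
#!/bin/sh
# Write the IPv6-related settings from above into <dir>/ipv6.json.
# The directory and file name are illustrative, not prescribed by consul.
write_ipv6_config() {
    dir="$1"
    mkdir -p "$dir"
    cat > "$dir/ipv6.json" <<'EOF'
{
  "recursor": "[2001:4860:4860::8888]",
  "leave_on_terminate": true,
  "client_addr": "::",
  "addresses": { "http": "::" }
}
EOF
}

write_ipv6_config "${CONSUL_CONFIG_DIR:-./consul.d}"
```

The agent can then be pointed at that directory with `-config-dir ./consul.d`.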

Consul server setup

The consul servers run as docker containers in host mode (which means they share the same network namespace as the host).

The reason is that we need a fixed IPv6 address for the servers, because we forward our DNS requests to them. (Of course, with some extra work we could write a script that dynamically updates our DNS forwards to the dynamic IP address.)

Our server has multiple IPv6 addresses, so we have to add the -advertise and -bind flags:

consul agent -server -advertise 2001:db8::1 -bind 2001:db8::1 -bootstrap-expect 3 -retry-join [2001:db8::1]:8301 -retry-join [2001:db8::2]:8301 -retry-join [2001:db8::3]:8301
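
Since every server needs the same list of peers, the command line can be generated instead of typed. This is a minimal sketch: `server_cmd` is a hypothetical helper, the addresses are the documentation-range placeholders used above, and setting -bootstrap-expect to the number of listed servers is an assumption that happens to match the example.

```shell
#!/bin/sh
# Build the server command line from this host's address and the peer list.
SERVERS="2001:db8::1 2001:db8::2 2001:db8::3"
SELF="2001:db8::1"

server_cmd() {
    self="$1"; shift
    cmd="consul agent -server -advertise $self -bind $self -bootstrap-expect $#"
    for s in "$@"; do
        # An IPv6 literal needs brackets once a port is attached.
        cmd="$cmd -retry-join [$s]:8301"
    done
    echo "$cmd"
}

# SERVERS is intentionally unquoted so it splits into one argument per peer.
server_cmd "$SELF" $SERVERS
```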

We use consul-docker as our consul docker container (for both client and server).

Consul client setup 

You'll need to cherry-pick this PR into your local build.
The IPv6 address in the docker container will be random, and we want to bind to that IPv6 address.
The patch looks for the first 'public' IPv6 address and advertises it.

So we start the client with:

consul agent -bind :: -join consul.service.consul

Gotcha here:
-bind :: actually binds to both the IPv4 and IPv6 addresses in the container, but because we advertise the IPv6 address, the IPv4 address won't be used.
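
To make the address selection concrete, here is a shell approximation of "find the first public IPv6 address". It is not the patch itself: `first_global_v6` is a hypothetical helper, and "scope global" as reported by `ip` also includes unique local addresses, so it is only an approximation of 'public'. Passing the result to -advertise on an unpatched build is untested here.

```shell
#!/bin/sh
# Read `ip -6 -o addr show scope global` output on stdin and print the first
# address without its prefix length.
first_global_v6() {
    awk '$3 == "inet6" { sub(/\/.*/, "", $4); print $4; exit }'
}

# Only runs where the iproute2 `ip` tool is available.
if command -v ip >/dev/null 2>&1; then
    addr="$(ip -6 -o addr show scope global | first_global_v6)"
    echo "first global IPv6 address: ${addr:-none found}"
fi
```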

Other software


We also use registrator to register our services in consul. So every time a container starts or stops, registrator handles the consul service registration process.

Registrator also needs some extra fixes for IPv6 support (not yet merged).

Because we're running consul on IPv6, registrator also needs to connect to the IPv6 address:

registrator consul://server1.node.consul:8500

Registrator can then register other services that are running on the docker host, e.g. elasticsearch.
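
To check what registrator has registered, you can query consul's HTTP catalog API over IPv6. The sketch below only builds the URL; `consul_url` is a hypothetical helper, the address is a placeholder, and port 8500 is consul's default HTTP port.

```shell
#!/bin/sh
# Build a consul HTTP API URL, wrapping IPv6 literals in brackets.
consul_url() {
    host="$1"
    path="$2"
    case "$host" in
        *:*) host="[$host]" ;;   # contains a colon: treat as an IPv6 literal
    esac
    echo "http://$host:8500$path"
}

consul_url 2001:db8::1 /v1/catalog/services
```

Fetching it with `curl -g "$(consul_url 2001:db8::1 /v1/catalog/services)"` should list the registered services; `-g` stops curl from treating the brackets as a glob pattern.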


Besides the main registrator, we also run registrator-netfilter, which automatically firewalls the IPv6 services in the containers. The containers are no longer NATted but directly accessible, so they need to be firewalled.


A /64 is allocated for docker and each docker host gets a /80 from it. The docker daemon runs with the switches:

--ipv6=true --fixed-cidr-v6=2001:db8::/80
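
Carving the /80s out of the /64 is simple arithmetic: the fifth 16-bit group identifies the docker host. A minimal sketch, assuming the /64 is written out as its four leading groups and hosts are simply numbered (`host_cidr` and the numbering scheme are illustrative, not the allocation used here):

```shell
#!/bin/sh
# Print the /80 for docker host number <index> inside a /64.
# prefix64: the first four groups of the /64, e.g. 2001:db8:0:0
# index:    host number, 0..65535 (becomes the fifth group, in hex)
host_cidr() {
    prefix64="$1"
    index="$2"
    printf '%s:%x::/80\n' "$prefix64" "$index"
}

echo "--ipv6=true --fixed-cidr-v6=$(host_cidr 2001:db8:0:0 1)"
```

Host 0 yields 2001:db8:0:0:0::/80, which is the same network as the 2001:db8::/80 shown above.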


Elasticsearch (ES) also runs IPv6-only, using registrator, registrator-netfilter and consul.
The relevant docker command is below:

docker run --net bridge -e SERVICE_NAME=es -e SERVICE_9200_TAGS=http-data 
-e SERVICE_9300_TAGS=transport-data -e SERVICE_9200_IPV6=tcp -e SERVICE_9300_IPV6=tcp 


  1. Hi there,

    Can I ask you something totally unrelated to this article? (I find no other communication means so hope you won't be troubled)

    I installed cssh and tried to use it with a Nexus 6000.
    Any individual command works perfectly, BUT when I try to give it a series of commands from a file (even the example one, with some changes to match my Nexus), it just hangs and does nothing until I eventually get bored and quit.
    Doing an strace shows only "connection timed out".
    To be more explicit after "getsockname and getpeername" with strace, it shows:
    }read(3, 0xc20802ab08, 1) = -1 EAGAIN (Resource temporarily unavailable)

    Any idea how I can make it work or what is wrong in fact?


  2. In fact it seems to only have problems if a conf t is passed.
    In my case this gives the prompt:
    from where cssh seems to hang

    1. Mihai, could you open an issue on GitHub for this?
      I guess it's going to be a parsing error, which should be fixed in master, but I guess you used the latest binary?