Sunday, August 22, 2021

Managing DNS Servers with Ansible and Jenkins (Unbound, BIND)

DNS is a vital component of all computer networks. Also known as the "Internet Yellow Pages," this service is consumed by every household.

DNS services are typically deployed in several patterns to support users and systems:

  • DNS Forwarder: This deployment method is the most common. Everybody needs name resolution - caching and forwarding DNS results can save you bandwidth and improve localized performance. Most appliances can do this out of the box, and if they don't, try it out! It's really easy and will help you learn how DNS works.
    • Use case: You don't have your own domain and use computers.
  • Managed Public DNS: The significant majority of public domains are managed this way. You pay a third-party provider to manage the authoritative registration of public DNS records.
    • Use case: You have a business and own a domain, but don't have any internal resources that you need to resolve.
    • Use case: You have a business and own a domain, but don't want to manage publicly resolvable nameservers.
  • Private/Internal Nameserver: This deployment method is typically enterprise-specific, but is also required for home labs and all manner of weird experiments. Since it's not on the internet, we can violate any and all manner of Internet conventions.
    • Recursive nameserver: This is the first component, because even if you run a server for authoritative zones, you still need a server for recursive lookups.
    • Authoritative zones: For any given domain, keep a zone file to resolve against. This will include name-to-address (forward) records and address-to-name (reverse) records in separate files.
    • A method to change everything above: This has a high benefit:effort ratio.
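
To make the forward/reverse split concrete, here is a minimal Python sketch that derives both record sets from one host inventory. The `HOSTS` dictionary and both function names are hypothetical (they are not part of BIND or Unbound); the addresses mirror the example zone used later in this post.

```python
import ipaddress

# Hypothetical inventory: short host name -> IPv4 address.
HOSTS = {
    "ns": "10.0.0.1",
    "johnnyfive": "10.1.1.1",
    "duncanidaho": "10.2.2.2",
}


def forward_records(hosts):
    """Name-to-address (A) records, relative to the zone origin."""
    return [f"{name} IN A {addr}" for name, addr in hosts.items()]


def reverse_records(hosts, domain):
    """Address-to-name (PTR) records, written as fully qualified names."""
    records = []
    for name, addr in hosts.items():
        # reverse_pointer yields e.g. "1.0.0.10.in-addr.arpa" for 10.0.0.1
        owner = ipaddress.ip_address(addr).reverse_pointer
        records.append(f"{owner}. IN PTR {name}.{domain}.")
    return records


if __name__ == "__main__":
    print("\n".join(forward_records(HOSTS)))
    print("\n".join(reverse_records(HOSTS, "engyak.net")))
```

The forward lines belong in the zone file for the domain, while the PTR lines belong in a separate reverse zone file.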

For this post, we'll build the structure to have an internal nameserver managed completely from source control. This is surprisingly easy to get started with. Performing this work with abstraction is a welcome convenience, but not initially necessary, as zone files are typically very simple and the application (Bind 9 or Unbound) is only one service.

To perform this, we'll follow this procedure:

  • Install the service - in this case, we'll use CentOS for Bind9 (my old setup), and Debian 11 for Unbound (because Debian 11 is new).
  • Extract the configuration file, and then export it into source control.
  • Create zone files, and then export them into source control.
  • Automate delivery from source control to what we'll now call the "DNS Worker Node".

Bind9

dnf install bind
find / -name 'named.conf'
cat /etc/named.conf
Example named configuration file. Credit where it's due: the vast majority of this configuration has been provided by CentOS and Bind9. I set the forwarders, allow-query, listen-on, and zone directives:
options {
        listen-on { any; };
        listen-on-v6 { any; };
        directory       "/var/named";
        dump-file       "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named_stats.txt";
        memstatistics-file "/var/named/data/named_mem_stats.txt";
        secroots-file   "/var/named/data/named.secroots";
        recursing-file  "/var/named/data/named.recursing";
        allow-query { 10.0.0.0/8; 127.0.0.1; 2000::/3; };
        forwarders { 1.1.1.1; 9.9.9.9; };
        /*
         - If you are building an AUTHORITATIVE DNS server, do NOT enable recursion.
         - If you are building a RECURSIVE (caching) DNS server, you need to enable
           recursion.
         - If your recursive DNS server has a public IP address, you MUST enable access
           control to limit queries to your legitimate users. Failing to do so will
           cause your server to become part of large scale DNS amplification
           attacks. Implementing BCP38 within your network would greatly
           reduce such attack surface
        */
        recursion yes;

        dnssec-enable yes;
        dnssec-validation yes;

        managed-keys-directory "/var/named/dynamic";

        pid-file "/run/named/named.pid";
        session-keyfile "/run/named/session.key";

        /* https://fedoraproject.org/wiki/Changes/CryptoPolicy */
        include "/etc/crypto-policies/back-ends/bind.config";
        
};

zone "engyak.net" in {
        allow-transfer { any; };
        file "/etc/named/engyak.net.zone";
        type master;
};
Then, let's build a zone file in source control. Please note that there are additional conventions that should be followed when creating new DNS zone records; this is just an example file that will run!
$TTL 2d
@               SOA             ns.engyak.net. hostmaster.engyak.net. (
                                1               ; serial
                                3600            ; refresh
                                600             ; retry
                                608400          ; expiry
                                3600 )          ; negative-caching TTL
;
;
engyak.net.     IN NS           ns.engyak.net.
ns              IN A            10.0.0.1
johnnyfive      IN A            10.1.1.1
duncanidaho     IN A            10.2.2.2
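
One of the conventions hinted at above is a date-based serial in the form YYYYMMDDnn, as recommended by RFC 1912, instead of a bare `1`. The helper below is a minimal sketch of how a commit hook or CI step might compute the next serial. The function name `next_serial` is hypothetical, and nothing in this post's setup strictly depends on it because no secondaries pull the zone.

```python
from datetime import date


def next_serial(current, today=None):
    """Return the next zone serial using the YYYYMMDDnn convention.

    If the current serial predates today's first serial, jump to today's
    date with revision 00; otherwise increment by one. More than 100
    changes in a day would spill into the next date, which this sketch
    does not guard against.
    """
    today = today or date.today()
    base = int(today.strftime("%Y%m%d")) * 100
    return base if current < base else current + 1
```

For example, starting from the serial `1` in the file above, the first change on a given day moves to that day's `...00` serial and later changes that day count upward from there.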
Copy the named.conf contents into a new source code repository or your existing one, preferably in an organized fashion. Ansible playbook execution is very straightforward. I'd recommend keeping the playbook in source control as well (see the above note about potential process improvements).
---
- hosts: ns.engyak.net
  tasks:
    - name: "Update DNS Zones!"
      copy:
        src: zonefiles/engyak.net
        dest: /etc/named/engyak.net.zone
        mode: "0644"
    - name: "Update DNS Config!"
      copy:
        src: conf.d/ns.engyak.net/named.conf
        dest: /etc/named.conf
        mode: "0640"
    - name: "Restart Named!"
      service:
        name: "named"
        state: "restarted"

Any time you run this playbook, it copies a fresh configuration and zone file to the server, then restarts Bind9.

As a cherry on top, let's make this process smart. If we want to automatically deploy changes to DNS from source control, we need a CI tool like Jenkins. Start off by creating a new Freestyle project that watches SCM for changes (the "Poll SCM" build trigger) - yes, this isn't a real repository.

That's it - add entries, live long, and prosper! Since the Ansible playbook and supporting files are fetched via source control, the only setup required on a DNS worker node is to establish a relationship between it and the CI tool, e.g. SSH authentication.

Unbound

Unbound is a newer DNS server project and has quite a few interesting properties. I've been using BIND for well over a decade, and Unbound aims to change a few things.
Oddly enough, there is no features list for this software package, but pretty much everything else is impressively documented. Let's start the installation:
apt install unbound
cat /usr/share/doc/unbound/examples/unbound.conf

Unbound can use the same zonefile format as BIND, so we only need to create a new config file to migrate things over. Note: This is not a production-ready configuration; it's just enough to get me started.

As I learn more about Unbound, I'll be using source control to implement changes / implement a rollback - an important benefit when making lots of mistakes!


# The server clause sets the main parameters.
server:
        verbosity: 1
        num-threads: 2
        interface: 0.0.0.0
        interface: ::0
        port: 53
        prefer-ip4: no
        edns-buffer-size: 1232

        # Maximum UDP response size (not applied to TCP response).
        # Suggested values are 512 to 4096. Default is 4096. 65536 disables it.
        max-udp-size: 4096
        msg-buffer-size: 65552
        udp-connect: yes
        unknown-server-time-limit: 376

        do-ip4: yes
        do-ip6: yes
        do-udp: yes
        do-tcp: yes

        # control which clients are allowed to make (recursive) queries
        # to this server. Specify classless netblocks with /size and action.
        # By default everything is refused, except for localhost.
        access-control: 10.0.0.0/8 allow
        access-control: 127.0.0.0/8 allow

        private-domain: "engyak.net"
        caps-exempt: "engyak.net"
        domain-insecure: "engyak.net"

        private-address: 10.0.0.0/8

        # cipher setting for TLSv1.2
        tls-ciphers: "ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA256"
        # cipher setting for TLSv1.3
        tls-ciphersuites: "TLS_AES_128_GCM_SHA256:TLS_AES_128_CCM_8_SHA256:TLS_AES_128_CCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256"

# Python config section. To enable:
# o use --with-pythonmodule to configure before compiling.
# o list python in the module-config string (above) to enable.
#   It can be at the start, it gets validated results, or just before
#   the iterator and process before DNSSEC validation.
# o and give a python-script to run.
python:
        # Script file to load
        # python-script: "/etc/unbound/ubmodule-tst.py"

# Dynamic library config section. To enable:
# o use --with-dynlibmodule to configure before compiling.
# o list dynlib in the module-config string (above) to enable.
#   It can be placed anywhere, the dynlib module is only a very thin wrapper
#   to load modules dynamically.
# o and give a dynlib-file to run. If more than one dynlib entry is listed in
#   the module-config then you need one dynlib-file per instance.
dynlib:
        # Script file to load
        # dynlib-file: "/etc/unbound/dynlib.so"

# Remote control config section.
remote-control:
        # Enable remote control with unbound-control(8) here.
        # set up the keys and certificates with unbound-control-setup.
        control-enable: no

# Authority zones
# The data for these zones is kept locally, from a file or downloaded.
# The data can be served to downstream clients, or used instead of the
# upstream (which saves a lookup to the upstream).  The first example
# has a copy of the root for local usage.  The second serves example.org
# authoritatively.  zonefile: reads from file (and writes to it if you also
# download it), primary: fetches with AXFR and IXFR, or url to zonefile.
# With allow-notify: you can give additional (apart from primaries) sources of
# notifies.
forward-zone:
      name: "."
      forward-addr: 1.1.1.1
      forward-addr: 9.9.9.9
auth-zone:
      name: "engyak.net"
      for-downstream: yes
      for-upstream: yes
      zonefile: "engyak.net.zone"

To automate file delivery here, we'll use a (similar) playbook for Unbound. The Jenkins configuration will not need to be modified, because the playbook will automatically be re-executed.

---
- hosts: ns.engyak.net
  tasks:
    - name: "Update DNS Zones!"
      copy:
        src: zonefiles/engyak.net
        dest: /etc/unbound/engyak.net.zone
        mode: "0644"
    - name: "Update DNS Config!"
      copy:
        src: conf.d/ns.engyak.net/unbound.conf
        dest: /etc/unbound/unbound.conf
        mode: "0640"
    - name: "Restart Unbound!"
      service:
        name: "unbound"
        state: "restarted"

Some Thoughts

This method of building DNS records from a source of truth does replace the master-slave (sorry guys, BIND's terms are not my own!) relationship older name servers will typically use. Personally, I like this method of propagation.

The biggest upside here is that a DNS worker node being unavailable does not prevent an engineer from adding/modifying records as long as recursive name servers support multiple resolvers.

It is eventually consistent, as the orchestrator will update every worker node for you. This may be slower or faster, depending on TTL.

The Ansible playbook I used here will kill your DNS node if you push it into an invalid configuration, so this is probably not production-worthy without additional work.
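
One piece of that additional work could be a pre-flight check that the CI job runs before the playbook. The sketch below shells out to `named-checkconf`, `named-checkzone`, and `unbound-checkconf`, which ship with BIND and Unbound respectively. It assumes those tools are installed wherever the check runs and that any paths the configuration includes also exist there. The function names and the `BIND_CHECKS` / `UNBOUND_CHECKS` lists are hypothetical; the file paths reuse the repository layout from the playbooks above.

```python
import shutil
import subprocess

# Hypothetical check lists, using the repository layout from the playbooks.
BIND_CHECKS = [
    ["named-checkconf", "conf.d/ns.engyak.net/named.conf"],
    ["named-checkzone", "engyak.net", "zonefiles/engyak.net"],
]
UNBOUND_CHECKS = [
    ["unbound-checkconf", "conf.d/ns.engyak.net/unbound.conf"],
]


def run_checks(commands):
    """Run each command and return a list of (command, ok, output) tuples."""
    results = []
    for cmd in commands:
        if shutil.which(cmd[0]) is None:
            # A missing checker counts as a failure rather than a silent pass.
            results.append((cmd, False, f"{cmd[0]} not found on PATH"))
            continue
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results.append((cmd, proc.returncode == 0, proc.stdout + proc.stderr))
    return results


def all_ok(results):
    """True only if every check passed."""
    return all(ok for _, ok, _ in results)


if __name__ == "__main__":
    outcome = run_checks(BIND_CHECKS)
    for cmd, ok, output in outcome:
        print("PASS" if ok else "FAIL", " ".join(cmd), output.strip())
    # A non-zero exit makes the CI job stop before the playbook runs.
    raise SystemExit(0 if all_ok(outcome) else 1)
```

Failing the job before the restart means a bad commit leaves the running nameserver untouched.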

If you would rather purchase a platform instead of building this capability with F/OSS components, this is basically how Infoblox Grid works.

It'd be really neat to abstract software-specific constructs, which can be done with Python and Jinja2 (or just Ansible and Jinja2!)
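
As a rough illustration of that abstraction, the sketch below renders both the BIND and the Unbound zone stanza from one zone definition. It uses plain Python f-strings in place of Jinja2 to stay dependency-free. The `ZONE` dictionary and both function names are hypothetical, and the output simply mirrors the stanzas shown earlier in this post.

```python
# Hypothetical software-neutral zone definition.
ZONE = {"name": "engyak.net", "zonefile": "engyak.net.zone"}


def bind_stanza(zone, zone_dir="/etc/named"):
    """Render a named.conf zone stanza like the one used above."""
    return (
        f'zone "{zone["name"]}" in {{\n'
        f"        allow-transfer {{ any; }};\n"
        f'        file "{zone_dir}/{zone["zonefile"]}";\n'
        f"        type master;\n"
        f"}};"
    )


def unbound_stanza(zone):
    """Render an unbound.conf auth-zone clause like the one used above."""
    return (
        "auth-zone:\n"
        f'      name: "{zone["name"]}"\n'
        "      for-downstream: yes\n"
        "      for-upstream: yes\n"
        f'      zonefile: "{zone["zonefile"]}"'
    )


if __name__ == "__main__":
    print(bind_stanza(ZONE))
    print(unbound_stanza(ZONE))
```

With Jinja2, the two functions would become two templates fed from the same data, which is what Ansible's template module does.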
