1. 🧭 Scope Legend
Use these markers to distinguish fixed instructions from values that identify this environment and workflow.
2. 🎯 Purpose and Scope
DOCUMENT REVISION
Version: 2026-06-16.2
Purpose: Make every non-DNS Ubuntu clone use prod-dns-01 exclusively while retaining DHCP addressing.
SCOPE
Applies to Vault, certificate issuer, proxies, Terraform deployment servers,
web/app servers, Kubernetes nodes, and future non-DNS clones.
EXCLUSION
prod-dns-01 is configured by bootstrap-prod-dns-01.mdx and must not receive this client role.
Netplan continues to obtain the IP address, route, and gateway from DHCP. Only DHCP-provided DNS and search domains are rejected. Public DNS resolution remains available through BIND forwarding on 192.168.8.4.
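As a rough illustration of the split between DHCP addressing and DHCP-provided DNS, the sketch below reports which of the three relevant settings are absent from a Netplan override file. The function name `missing_override_settings` is hypothetical, and the check is a plain text match rather than a YAML parse, so treat it as a quick sanity check only.

```shell
# Hypothetical helper: print each required setting that is NOT present
# in the given Netplan override file. Prints nothing when all are found.
missing_override_settings() {
  local file="$1" setting
  for setting in 'dhcp4: true' 'use-dns: false' 'use-domains: false'; do
    # Match the setting as a whole line, ignoring indentation.
    grep -Eq "^[[:space:]]*${setting}[[:space:]]*$" "${file}" \
      || printf '%s\n' "${setting}"
  done
}

# Example (path taken from the role defaults in this document):
#   missing_override_settings /etc/netplan/99-internal-dns.yaml
```

An empty result suggests the file keeps DHCP addressing while rejecting DHCP-provided DNS and search domains. `netplan generate` remains the authoritative syntax check.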
tmplt-ub-26-min-base page.
3. 🧱 Resolver Inputs
4. 🧭 Inventory Boundary
[prod_dns]
prod-dns-01 ansible_host=192.168.8.4
[internal_dns_clients:children]
vault
certificate_issuers
proxies
terraform_deploy
web
app
k8s
# Add the real hosts under the child groups above.
# Do not add prod-dns-01 to internal_dns_clients.
The client group can contain Vault, certificate issuer, proxy, Terraform deployment, web, app, and Kubernetes hosts. The DNS server itself must remain outside this group.
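The boundary above can be checked mechanically before a rollout. The following is a minimal sketch, assuming an INI inventory like the one shown; the function name `sections_listing_host` is hypothetical. It only catches a host listed directly under the wrong section, not membership inherited through nested child groups.

```shell
# Hypothetical helper: print every INI section, other than the allowed one,
# that lists the given host on a line of its own.
# Arguments: inventory path, host name, allowed section name.
sections_listing_host() {
  local inventory="$1" host="$2" allowed="$3"
  awk -v host="${host}" -v allowed="${allowed}" '
    # Remember the current section name without its brackets.
    /^\[/ { section = $0; gsub(/^\[|\]$/, "", section); next }
    # Skip comments and blank lines.
    /^[[:space:]]*(#|;|$)/ { next }
    # Report the section when the host appears outside the allowed group.
    $1 == host && section != allowed { print section }
  ' "${inventory}"
}

# Example:
#   sections_listing_host envs/common/ansible/inventory.ini prod-dns-01 prod_dns
```

No output means prod-dns-01 appears only under `[prod_dns]`. For the resolved group tree, `ansible-inventory --graph` gives the authoritative view.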
5. 🧩 Create the Reusable Ansible Role
5.1 envs/common/ansible/roles/internal-dns-resolver/defaults/main.yml
---
internal_dns_server: "192.168.8.4"
internal_dns_interface_override: ""
internal_dns_netplan_path: "/etc/netplan/99-internal-dns.yaml"
internal_dns_internal_test_name: "harbor.aspireclan.com"
internal_dns_internal_test_expected_ipv4: "192.168.8.5"
internal_dns_public_test_name: "ubuntu.com"
internal_dns_excluded_hosts:
  - "prod-dns-01"
5.2 envs/common/ansible/roles/internal-dns-resolver/templates/99-internal-dns.yaml.j2
network:
  version: 2
  renderer: networkd
  ethernets:
    {{ internal_dns_interface }}:
      dhcp4: true
      dhcp4-overrides:
        use-dns: false
        use-domains: false
      nameservers:
        addresses:
          - {{ internal_dns_server }}
5.3 envs/common/ansible/roles/internal-dns-resolver/tasks/main.yml
---
- name: Refuse to apply the client resolver role to the DNS server
  ansible.builtin.assert:
    that:
      - inventory_hostname not in internal_dns_excluded_hosts
    fail_msg: >-
      This role is for non-DNS clones. Configure prod-dns-01 through the
      bootstrap-prod-dns-01 procedure instead.

- name: Select the primary network interface
  ansible.builtin.set_fact:
    internal_dns_interface: >-
      {{
        internal_dns_interface_override
        if internal_dns_interface_override | length > 0
        else ansible_default_ipv4.interface
      }}

- name: Validate the selected interface
  ansible.builtin.assert:
    that:
      - internal_dns_interface | length > 0
      - internal_dns_interface in ansible_interfaces
    fail_msg: "Unable to identify a valid primary interface for internal DNS."

- name: Prove prod-dns-01 resolves the required internal record before changing Netplan
  ansible.builtin.command:
    argv:
      - dig
      - "@{{ internal_dns_server }}"
      - "{{ internal_dns_internal_test_name }}"
      - A
      - +short
  register: internal_dns_direct_internal_query
  changed_when: false
  failed_when: >-
    internal_dns_internal_test_expected_ipv4 not in
    internal_dns_direct_internal_query.stdout_lines

- name: Prove prod-dns-01 forwards a public query before changing Netplan
  ansible.builtin.command:
    argv:
      - dig
      - "@{{ internal_dns_server }}"
      - "{{ internal_dns_public_test_name }}"
      - A
      - +short
  register: internal_dns_direct_public_query
  changed_when: false
  failed_when: >-
    internal_dns_direct_public_query.stdout_lines |
    select('match', '^[0-9]+(\\.[0-9]+){3}$') |
    list |
    length == 0

- name: Install the internal DNS Netplan override
  ansible.builtin.template:
    src: 99-internal-dns.yaml.j2
    dest: "{{ internal_dns_netplan_path }}"
    owner: root
    group: root
    mode: "0600"
  register: internal_dns_netplan

- name: Validate the complete Netplan configuration
  ansible.builtin.command: netplan generate
  changed_when: false

- name: Apply the internal DNS Netplan override
  ansible.builtin.command: netplan apply
  when: internal_dns_netplan.changed

- name: Wait for SSH after applying Netplan
  ansible.builtin.wait_for_connection:
    delay: 2
    sleep: 2
    timeout: 90

- name: Flush systemd-resolved caches
  ansible.builtin.command: resolvectl flush-caches
  changed_when: false

- name: Verify that the selected link uses only prod-dns-01
  ansible.builtin.shell: |
    set -euo pipefail
    actual="$(resolvectl dns '{{ internal_dns_interface }}' | sed -E 's/^Link [0-9]+ \([^)]*\):[[:space:]]*//' | xargs)"
    if [ "${actual}" != "{{ internal_dns_server }}" ]; then
      echo "ERROR: Expected only {{ internal_dns_server }}, found: ${actual}" >&2
      exit 1
    fi
    if printf '%s\n' "${actual}" | grep -Eq '(^|[[:space:]])(1\.1\.1\.1|8\.8\.8\.8)([[:space:]]|$)'; then
      echo "ERROR: A public resolver remains on the client link." >&2
      exit 1
    fi
  args:
    executable: /bin/bash
  changed_when: false

- name: Verify internal resolution through the system resolver
  ansible.builtin.command:
    argv:
      - getent
      - ahostsv4
      - "{{ internal_dns_internal_test_name }}"
  register: internal_dns_system_internal_query
  changed_when: false
  failed_when: >-
    internal_dns_internal_test_expected_ipv4 not in
    internal_dns_system_internal_query.stdout

- name: Verify public resolution through BIND forwarding
  ansible.builtin.command:
    argv:
      - getent
      - ahostsv4
      - "{{ internal_dns_public_test_name }}"
  changed_when: false
6. 📄 Create the Main Playbook
6.1 envs/common/ansible/configure-internal-dns-resolver.yml
---
- name: Configure the internal resolver on non-DNS Ubuntu clones
  hosts: internal_dns_clients
  become: true
  gather_facts: true
  serial: "25%"
  max_fail_percentage: 0
  roles:
    - role: internal-dns-resolver
The rolling serial value prevents all DNS clients from being reconfigured at once. On its own, serial only ends the play when every host in a batch fails, so max_fail_percentage: 0 is set to stop the rollout as soon as any host in a batch fails validation.
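To estimate how many clones are reconfigured per batch, the sketch below reproduces the rounding that Ansible is understood to apply to a percentage serial value: round down, with a minimum of one host. The function name `serial_batch_size` is hypothetical, and the integer arithmetic is an approximation of Ansible's own calculation, so confirm real batch sizes from the play output.

```shell
# Hypothetical helper: approximate the per-batch host count for a
# percentage serial value (round down, never below one host).
# Arguments: total host count, percentage as an integer.
serial_batch_size() {
  local host_count="$1" percent="$2" size
  size=$(( host_count * percent / 100 ))
  if [ "${size}" -lt 1 ]; then
    size=1
  fi
  printf '%s\n' "${size}"
}

# Example: ten clients at 25% are handled two at a time.
#   serial_batch_size 10 25
```

With fewer than four clients the batch size falls back to one host, so small inventories are still rolled out one clone at a time.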
7. ▶️ Apply to Non-DNS Clones
Run only after bootstrap-prod-dns-01.mdx reports a passing DNS health gate.
set -euo pipefail
ansible-playbook \
  -i "envs/common/ansible/inventory.ini" \
  "envs/common/ansible/configure-internal-dns-resolver.yml" \
  --diff
8. 🔄 Prove DHCP Renewal and Reboot Persistence
8.1 envs/common/ansible/verify-internal-dns-persistence.yml
---
- name: Prove internal DNS survives DHCP renewal and reboot on one canary clone
  hosts: internal_dns_clients
  become: true
  gather_facts: true
  serial: 1
  any_errors_fatal: true
  vars:
    internal_dns_canary_required: true
  pre_tasks:
    - name: Require an explicit one-host limit
      ansible.builtin.assert:
        that:
          - ansible_play_hosts_all | length == 1
        fail_msg: >-
          Run this persistence test with --limit and exactly one non-DNS canary host.
  tasks:
    - name: Include the internal DNS resolver role
      ansible.builtin.include_role:
        name: internal-dns-resolver

    - name: Renew the DHCP lease through systemd-networkd
      ansible.builtin.command:
        argv:
          - networkctl
          - renew
          - "{{ internal_dns_interface }}"
      changed_when: true

    - name: Wait for SSH after DHCP renewal
      ansible.builtin.wait_for_connection:
        delay: 3
        sleep: 3
        timeout: 120

    - name: Re-verify DNS after DHCP renewal
      ansible.builtin.include_role:
        name: internal-dns-resolver

    - name: Reboot the canary clone
      ansible.builtin.reboot:
        reboot_timeout: 600
        post_reboot_delay: 10

    - name: Re-verify DNS after reboot
      ansible.builtin.include_role:
        name: internal-dns-resolver
8.2 Run against exactly one canary
set -euo pipefail
CANARY_HOST="dev-app-01"
ansible-playbook \
  -i "envs/common/ansible/inventory.ini" \
  "envs/common/ansible/verify-internal-dns-persistence.yml" \
  --limit "${CANARY_HOST}" \
  --diff
After the canary passes, the same Netplan file is safe for the remaining non-DNS clones. The test shows that a DHCP renewal does not reintroduce the public resolvers handed out by the router's DHCP service and that the setting survives a reboot.
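The check repeated after each renewal and reboot comes down to parsing one line of `resolvectl dns` output. A minimal sketch of that parsing follows, using the same `sed` expression as the role. The function names and the `eth0` interface in the example are hypothetical.

```shell
# Hypothetical helper: reduce `resolvectl dns <iface>` output read from
# stdin to a space-separated list of DNS servers.
link_dns_servers() {
  sed -E 's/^Link [0-9]+ \([^)]*\):[[:space:]]*//' | xargs
}

# Hypothetical helper: succeed only when the parsed list is exactly the
# expected resolver, so any additional server counts as a failure.
link_uses_only() {
  local expected="$1" actual
  actual="$(link_dns_servers)"
  [ "${actual}" = "${expected}" ]
}

# Example:
#   resolvectl dns eth0 | link_uses_only 192.168.8.4
```

Comparing the whole list against a single address, rather than searching for known public resolvers, also rejects any unexpected server that a renewal might add.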
9. 🧪 Manual Validation
Run on any configured clone. The expected final DNS list is exactly 192.168.8.4.
set -euo pipefail
DNS_SERVER="192.168.8.4"
INTERNAL_NAME="harbor.aspireclan.com"
INTERNAL_IP="192.168.8.5"
PUBLIC_NAME="ubuntu.com"
PRIMARY_IFACE="$(ip -o route show default | awk '{print $5; exit}')"
actual="$(resolvectl dns "${PRIMARY_IFACE}" | sed -E 's/^Link [0-9]+ \([^)]*\):[[:space:]]*//' | xargs)"
printf 'Interface: %s\nDNS: %s\n' "${PRIMARY_IFACE}" "${actual}"
test "${actual}" = "${DNS_SERVER}"
# A pipeline negated with "!" never triggers "set -e", so fail explicitly.
if printf '%s\n' "${actual}" | grep -Eq '(^|[[:space:]])(1\.1\.1\.1|8\.8\.8\.8)([[:space:]]|$)'; then
  echo 'ERROR: A public resolver remains on the client link.' >&2
  exit 1
fi
getent ahostsv4 "${INTERNAL_NAME}" | grep -F "${INTERNAL_IP}"
getent ahostsv4 "${PUBLIC_NAME}" >/dev/null
dig "@${DNS_SERVER}" "${INTERNAL_NAME}" A +short | grep -Fx "${INTERNAL_IP}"
dig "@${DNS_SERVER}" "${PUBLIC_NAME}" A +short | grep -Eq '^[0-9]+(\.[0-9]+){3}$'
echo 'INTERNAL DNS CLIENT VALIDATION: PASS'
10. ↩️ Rollback
Use the Proxmox console if SSH or DNS is unavailable. This removes only the clone-side override and returns the VM to DHCP-provided DNS.
set -euo pipefail
sudo rm -f /etc/netplan/99-internal-dns.yaml
sudo netplan generate
sudo netplan apply
sudo resolvectl flush-caches
PRIMARY_IFACE="$(ip -o route show default | awk '{print $5; exit}')"
resolvectl dns "${PRIMARY_IFACE}"
11. 🏁 Finished State
ACCEPTANCE CHECKPOINT
DHCP addressing: retained
Router MAC-to-IP reservation: retained
Client DNS server: 192.168.8.4 only
Public resolvers on clients: absent
Internal record: harbor.aspireclan.com -> 192.168.8.5
Public forwarding test: ubuntu.com resolves
DHCP renewal persistence: passed on a canary
Reboot persistence: passed on a canary
DNS server excluded from role: prod-dns-01