Terraform Deploy VM Preparation (prod-terraform-deploy-01)

1. 🧭 Scope Legend

Use these markers to distinguish fixed preparation steps, per-VM identity values, and router-side DHCP work.

1. COMMON · NO CHANGE NEEDED: Follow these instructions for every Terraform deployment VM prepared from the approved base template.
2. CHANGE PER DEPLOYED VM: Use values that uniquely identify this VM, including name, VM ID, MAC address, reserved IP address, and resources.
3. CHANGE IN ROUTER / DHCP: Create or verify the permanent MAC-to-IP reservation before the VM first boots.

2. 🧭 Network Design Decision

COMMON · NO CHANGE NEEDED | CHANGE IN ROUTER / DHCP
Keep DHCP enabled inside Ubuntu and use a permanent router-side DHCP reservation. Do not configure the same address statically inside Netplan.
Selected approach: reserve 192.168.8.57 for MAC address AA:BB:CC:01:01:02. Ubuntu remains on DHCP, while the router centrally supplies the stable address, gateway, subnet, and DNS settings.
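A quick way to confirm that the guest still relies on DHCP is to inspect the merged Netplan configuration. The following is a minimal sketch, assuming the output of `netplan get` is piped in. The helper name `netplan_uses_dhcp_only` is hypothetical, and the check is a plain-text match rather than a YAML parser. It also treats any `addresses:` key, including one under `nameservers:`, as static configuration, which fits this design because the router supplies DNS.

```shell
# Hypothetical helper: reads merged Netplan YAML on stdin.
# Succeeds only when DHCPv4 is enabled and no "addresses:" key is present.
netplan_uses_dhcp_only() (
  config="$(cat)"

  # Require an explicit "dhcp4: true" (or "yes") line.
  printf '%s\n' "$config" |
    grep -Eq '^[[:space:]]*dhcp4:[[:space:]]*(true|yes)[[:space:]]*$' || exit 1

  # Reject any static address list.
  if printf '%s\n' "$config" | grep -Eq '^[[:space:]]*addresses:'; then
    exit 1
  fi

  exit 0
)
```

Possible usage on the VM: `sudo netplan get | netplan_uses_dhcp_only && echo "Netplan: DHCP only"`.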

3. 🧱 Manual Clone Inputs

CHANGE PER DEPLOYED VM | CHANGE IN ROUTER / DHCP
Review the source template, target VM, Proxmox placement, permanent network identity, and initial resources.

4. 📋 Final VM Profile

CHANGE PER DEPLOYED VM | CHANGE IN ROUTER / DHCP
Confirm the exact target identity and network mapping before creating the full clone.
Source Template:
9000 / tmplt-ub-26-min-base
Target VM:
<<CHOOSE_UNUSED_VM_ID>> / prod-terraform-deploy-01
Clone Type:
Full Clone
Permanent MAC / DHCP Reservation:
AA:BB:CC:01:01:02 → 192.168.8.57
Initial Resources:
2 cores · 4096 MiB RAM · 40 GiB disk
Terraform disk rule: the Terraform disk definition must match the template boot-disk layout. Either omit the disk block and inherit the template disk, or keep slot = "scsi0", storage local-lvm, and a size of 40G or larger.
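The disk rule can be checked against the template before writing the Terraform definition. The sketch below assumes that `qm config 9000` on the Proxmox node prints the boot disk as a line of the form `scsi0: local-lvm:<volume>,size=<N>G`. The helper name `template_disk_ok` is hypothetical, and sizes given in units other than G are treated as a failure rather than converted.

```shell
# Hypothetical helper: reads `qm config <template-id>` output on stdin.
# $1 = minimum size in GiB (defaults to 40).
# Succeeds when the boot disk is on slot scsi0, on storage local-lvm,
# and its size=<N>G value is at least the minimum.
template_disk_ok() (
  min_gib="${1:-40}"

  line="$(grep -E '^scsi0:[[:space:]]' | head -n 1)"
  [ -n "$line" ] || exit 1

  case "$line" in
    "scsi0: local-lvm:"*) ;;
    *) exit 1 ;;
  esac

  size="$(printf '%s\n' "$line" |
    sed -n 's/.*[,[:space:]]size=\([0-9][0-9]*\)G.*/\1/p')"
  [ -n "$size" ] || exit 1

  [ "$size" -ge "$min_gib" ]
)
```

Possible usage on the Proxmox node: `qm config 9000 | template_disk_ok 40 && echo "Template disk layout matches the rule"`.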

5. 🔒 Create the Router DHCP Reservation Before First Boot

CHANGE IN ROUTER / DHCP | CHANGE PER DEPLOYED VM
Reserve the permanent address before the VM sends its first DHCP request.
Name: prod-terraform-deploy-01
MAC address: AA:BB:CC:01:01:02
Reserved IPv4 address: 192.168.8.57
  • Confirm the address and MAC are unused.
  • Keep the template MAC reserved only for the template-build identity.
  • Do not start the VM before the reservation exists.
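The "unused" check in the list above can be approximated from any Linux host on the same LAN, for example the Proxmox node. This is a sketch only: the helper name `neighbor_conflicts` is hypothetical, it reads `ip neigh show` output on stdin, and an empty result is a hint rather than proof, because a powered-off device leaves no neighbour entry. The router's own lease and reservation tables remain the authoritative source.

```shell
# Hypothetical helper: reads `ip neigh show` output on stdin.
# $1 = candidate IPv4 address, $2 = candidate MAC address.
# Prints every neighbour entry that already resolves the IP to some MAC,
# or that already uses the MAC for any address.
neighbor_conflicts() {
  awk -v ip="$1" -v mac="$2" '
    BEGIN { mac = tolower(mac) }
    {
      hit = 0
      for (i = 2; i < NF; i++) {
        if ($i == "lladdr") {
          # The IP is answered by some device.
          if ($1 == ip) hit = 1
          # The MAC is already in use on the LAN.
          if (tolower($(i + 1)) == mac) hit = 1
        }
      }
      if (hit) print
    }'
}
```

Possible usage: run `ping -c 2 -W 1 192.168.8.57 || true` to populate the neighbour table, then `ip neigh show | neighbor_conflicts 192.168.8.57 AA:BB:CC:01:01:02`. No output suggests that neither value is currently active.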

6. 🧬 Create a Manual Full Clone in Proxmox

COMMON · NO CHANGE NEEDED | CHANGE PER DEPLOYED VM
Create an independent full clone from the read-only base template.
  1. Select template tmplt-ub-26-min-base, VM ID 9000.
  2. Select Clone.
  3. Use target node pve, VM ID <<CHOOSE_UNUSED_VM_ID>>, and name prod-terraform-deploy-01.
  4. Select Full Clone and storage local-lvm.
  5. Wait for the clone task to complete.
  6. Do not start the clone yet.
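The same clone can also be created from a shell on the Proxmox node. The sketch below only prints the command so it can be reviewed first. The helper name `print_clone_command` is hypothetical, the VM ID is a required argument because this page leaves that choice to the operator, and the example assumes it is run on node pve itself, so no target-node option is passed.

```shell
# Hypothetical helper: prints (does not run) a full-clone command
# for template 9000. $1 = the unused VM ID chosen by the operator.
print_clone_command() (
  new_vmid="$1"

  # Refuse empty or non-numeric IDs instead of guessing one.
  case "$new_vmid" in
    '' | *[!0-9]*)
      echo "ERROR: pass a numeric, unused VM ID" >&2
      exit 1
      ;;
  esac

  printf 'qm clone 9000 %s --name prod-terraform-deploy-01 --full 1 --storage local-lvm\n' \
    "$new_vmid"
)
```

Possible usage: set `NEW_VMID` to the chosen ID, run `print_clone_command "$NEW_VMID"`, review the printed line, and then execute it on the node. The clone stays powered off until it is started explicitly.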

7. ⚙️ Configure the Cloned VM Before First Boot

CHANGE PER DEPLOYED VM
Set the permanent MAC and review the cloned hardware while the VM is powered off.
  • Bridge: vmbr0
  • NIC model: VirtIO
  • MAC: AA:BB:CC:01:01:02
  • CPU: 2 cores
  • Memory: 4096 MiB
  • Disk: at least 40 GiB
  • QEMU Guest Agent: enabled
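The hardware values listed above can also be applied with `qm set` on the Proxmox node. This is a sketch under stated assumptions: the helper name `print_vm_config_command` is hypothetical, it only prints the command for review, and it assumes the NIC is `net0`. Disk size is not changed here. It can be inspected separately with `qm config <VM ID>`.

```shell
# Hypothetical helper: prints (does not run) the qm set command.
# $1 = VM ID of the clone, $2 = permanent MAC address.
print_vm_config_command() (
  vmid="$1"
  mac="$2"

  case "$vmid" in
    '' | *[!0-9]*)
      echo "ERROR: pass the numeric VM ID of the clone" >&2
      exit 1
      ;;
  esac

  # Require six colon-separated hexadecimal octets.
  printf '%s\n' "$mac" |
    grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$' || {
    echo "ERROR: invalid MAC address: $mac" >&2
    exit 1
  }

  printf 'qm set %s --net0 virtio=%s,bridge=vmbr0 --cores 2 --memory 4096 --agent enabled=1\n' \
    "$vmid" "$mac"
)
```

Possible usage: `print_vm_config_command "$NEW_VMID" AA:BB:CC:01:01:02`, where `NEW_VMID` is the ID chosen for the clone.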

8. 🚀 Start the VM and Verify the DHCP Reservation

COMMON · NO CHANGE NEEDED | CHANGE PER DEPLOYED VM | CHANGE IN ROUTER / DHCP
Boot the clone and verify its reserved address, route, DNS, and internet access.
ip -brief link
ip -brief address
ip route
resolvectl status
netplan status --all 2>/dev/null || true

DEFAULT_GATEWAY="$(ip route show default | awk '{print $3; exit}')"
test -n "$DEFAULT_GATEWAY"

ping -c 4 "$DEFAULT_GATEWAY"
getent hosts ubuntu.com
curl -I https://download.docker.com
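The commands above display the network state but leave the comparison with the reservation to the reader. A minimal sketch of an explicit check follows. The helper names `has_reserved_ip` and `has_mac` are hypothetical, and they assume the column layout of `ip -brief` output (interface, state, then addresses or the MAC).

```shell
# Hypothetical helper: reads `ip -4 -brief address` output on stdin.
# Succeeds when a non-loopback interface holds exactly the address in $1.
has_reserved_ip() {
  awk -v want="$1" '
    $1 != "lo" {
      for (i = 3; i <= NF; i++) {
        split($i, parts, "/")   # strip the prefix length
        if (parts[1] == want) found = 1
      }
    }
    END { exit(found ? 0 : 1) }'
}

# Hypothetical helper: reads `ip -brief link` output on stdin.
# Succeeds when some interface carries the MAC in $1 (case-insensitive).
has_mac() {
  awk -v want="$1" '
    BEGIN { want = tolower(want) }
    tolower($3) == want { found = 1 }
    END { exit(found ? 0 : 1) }'
}
```

Possible usage on the VM: `ip -brief link | has_mac AA:BB:CC:01:01:02 && ip -4 -brief address | has_reserved_ip 192.168.8.57 && echo "DHCP reservation: PASS"`.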

9. 🏷️ Set the Permanent Ubuntu Hostname

CHANGE PER DEPLOYED VM
Replace the inherited template hostname with the permanent VM hostname.
sudo hostnamectl set-hostname prod-terraform-deploy-01

if grep -qE '^127\.0\.1\.1[[:space:]]+' /etc/hosts; then
  sudo sed -i -E 's/^127\.0\.1\.1[[:space:]].*/127.0.1.1 prod-terraform-deploy-01/' /etc/hosts
else
  echo '127.0.1.1 prod-terraform-deploy-01' | sudo tee -a /etc/hosts
fi

hostnamectl --static
grep -E '^127\.0\.1\.1[[:space:]]+' /etc/hosts

10. 🪪 Verify Clone-Specific Machine Identity and SSH Host Keys

COMMON · NO CHANGE NEEDED | CHANGE PER DEPLOYED VM
Confirm the clone has a non-empty machine ID and fresh SSH host keys.
cat /etc/machine-id

for key in /etc/ssh/ssh_host_*_key.pub; do
  sudo ssh-keygen -lf "$key" -E sha256
done
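Reading /etc/machine-id by eye is easy to get wrong, so the check can be made explicit. The sketch below relies on the documented machine-id format of 32 lowercase hexadecimal characters. The helper name `machine_id_ok` is hypothetical. It does not prove that the ID differs from the template's, which still requires comparing against a recorded template value.

```shell
# Hypothetical helper: reads a machine ID on stdin.
# Succeeds only for exactly 32 lowercase hexadecimal characters,
# which rejects an empty file and the "uninitialized" placeholder.
machine_id_ok() (
  id="$(cat)"

  [ "${#id}" -eq 32 ] || exit 1

  case "$id" in
    *[!0-9a-f]*) exit 1 ;;
  esac

  exit 0
)
```

Possible usage: `machine_id_ok < /etc/machine-id && echo "machine-id format: PASS"`.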

If the host keys are unexpectedly missing, repair them with:

sudo ssh-keygen -A
sudo systemctl restart ssh
ls -l /etc/ssh/ssh_host_*

11. 🐳 Verify the Base Template Software

COMMON · NO CHANGE NEEDED
Confirm the clone retained QEMU Guest Agent, SSH, Docker Engine, Buildx, Compose, and containerd.
systemctl is-active qemu-guest-agent
systemctl is-active ssh
systemctl is-active docker
systemctl is-active containerd

docker --version
docker compose version
docker buildx version
containerd --version
docker ps

12. 🟢 Install Node.js for GitHub Actions

COMMON · NO CHANGE NEEDED
Install an active Node.js LTS release so JavaScript-based GitHub Actions can execute.
sudo apt update
sudo apt install -y ca-certificates curl gnupg

curl -fsSL https://deb.nodesource.com/setup_lts.x | sudo -E bash -
sudo apt install -y nodejs

node --version
npm --version

13. ☁️ Install AWS CLI v2 for Terraform State Automation

COMMON · NO CHANGE NEEDED
Install AWS CLI v2 for GitHub OIDC-based Terraform state workflows.
sudo apt update
sudo apt install -y ca-certificates curl unzip

set -euo pipefail

case "$(uname -m)" in
  x86_64)
    aws_cli_url="https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip"
    ;;
  aarch64|arm64)
    aws_cli_url="https://awscli.amazonaws.com/awscli-exe-linux-aarch64.zip"
    ;;
  *)
    echo "ERROR: Unsupported architecture: $(uname -m)" >&2
    exit 1
    ;;
esac

work_dir="$(mktemp -d)"
trap 'rm -rf "$work_dir"' EXIT

curl --fail --location --silent --show-error "$aws_cli_url" --output "$work_dir/awscliv2.zip"
unzip -q "$work_dir/awscliv2.zip" -d "$work_dir"

if command -v aws >/dev/null 2>&1; then
  sudo "$work_dir/aws/install" --update --install-dir /usr/local/aws-cli --bin-dir /usr/local/bin
else
  sudo "$work_dir/aws/install" --install-dir /usr/local/aws-cli --bin-dir /usr/local/bin
fi

aws --version
sudo -u acllc -H bash -lc 'command -v aws && aws --version'
Credential rule: do not run aws configure. Workflows must use GitHub OIDC and temporary role credentials.
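The credential rule can be audited from an interactive login on the VM. This is a sketch, assuming the default location that `aws configure` writes to (`~/.aws/credentials`) and the standard `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables. The helper name `static_aws_credentials_found` is hypothetical. It is not meant to run inside a workflow job, where temporary OIDC credentials are legitimately exported as environment variables.

```shell
# Hypothetical helper: $1 = home directory to inspect (defaults to $HOME).
# Exits 0 and prints a reason when long-lived credentials appear to exist;
# exits 1 when none are found.
static_aws_credentials_found() (
  home_dir="${1:-$HOME}"
  found=1

  # File written by `aws configure`.
  if [ -s "$home_dir/.aws/credentials" ]; then
    echo "static credentials file: $home_dir/.aws/credentials"
    found=0
  fi

  # Access keys exported in the current shell.
  if [ -n "${AWS_ACCESS_KEY_ID:-}" ] || [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "AWS access-key variables are set in this shell"
    found=0
  fi

  exit "$found"
)
```

Possible usage: `static_aws_credentials_found && echo "Remove static credentials before using this VM" || echo "No static AWS credentials found"`.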

14. ⚙️ Install Ansible and Create the Automation SSH Identity

COMMON · NO CHANGE NEEDED
Install Ansible and prepare the dedicated SSH identity used for post-provisioning and approved deployment workflows.
sudo apt update
sudo apt install -y software-properties-common ca-certificates curl gnupg

sudo add-apt-repository --yes --update ppa:ansible/ansible
sudo apt install -y ansible

ansible --version
mkdir -p ~/ansible/{inventories,playbooks,roles,group_vars,host_vars}

14.1 Create or retain the dedicated automation key

mkdir -p "$HOME/.ssh"
chmod 700 "$HOME/.ssh"

if [ ! -f "$HOME/.ssh/id_ed25519_ansible" ]; then
  ssh-keygen \
    -t ed25519 \
    -f "$HOME/.ssh/id_ed25519_ansible" \
    -C "prod-terraform-deploy-01 Ansible automation" \
    -N ""
else
  echo "Automation key already exists."
fi

chmod 600 "$HOME/.ssh/id_ed25519_ansible"
chmod 644 "$HOME/.ssh/id_ed25519_ansible.pub"

ssh-keygen -lf "$HOME/.ssh/id_ed25519_ansible.pub" -E sha256
cat "$HOME/.ssh/id_ed25519_ansible.pub"
Private-key boundary: the active plaintext private key belongs only on this deployment VM. Before rebuilding this VM, create the encrypted off-host recovery package in Section 15. Only the public key is placed in the base template and target users' authorized_keys.

14.2 Create an inventory placeholder and verify Ansible

cat > "$HOME/ansible/inventories/dev.ini" <<'EOF'
[web]
# dev-web-01 ansible_host=192.168.8.120 ansible_user=acllc ansible_ssh_private_key_file=~/.ssh/id_ed25519_ansible

[app]
# dev-app-01 ansible_host=192.168.8.6 ansible_user=acllc ansible_ssh_private_key_file=~/.ssh/id_ed25519_ansible

[db]

[k8s]

[runner]
EOF

ansible localhost -m ping -c local

15. 🛟 Back Up, Restore, and Republish the Automation SSH Identity

COMMON · NO CHANGE NEEDED | CHANGE PER DEPLOYED VM
Create and verify an encrypted off-host backup before destroying the old deploy VM. During recovery, reuse only the public key in the base template and restore the private key only to the rebuilt deploy VM.
Critical recovery rule: GitHub Actions secrets are write-only after creation; the existing value cannot be downloaded from GitHub. The recoverable source is /home/acllc/.ssh/id_ed25519_ansible on the current deploy VM. Back it up before deleting that VM.

15.1 Verify that the private and public keys match

set -euo pipefail

KEY="$HOME/.ssh/id_ed25519_ansible"
PUBLIC_KEY="${KEY}.pub"
DERIVED_PUBLIC_KEY="$(mktemp)"
trap 'rm -f "$DERIVED_PUBLIC_KEY"' EXIT

test -f "$KEY"
test -f "$PUBLIC_KEY"

chmod 700 "$HOME/.ssh"
chmod 600 "$KEY"
chmod 644 "$PUBLIC_KEY"

ssh-keygen -y -P '' -f "$KEY" > "$DERIVED_PUBLIC_KEY"

diff -u \
  <(awk '{print $1, $2}' "$PUBLIC_KEY") \
  <(awk '{print $1, $2}' "$DERIVED_PUBLIC_KEY")

echo "Private/public key match: PASS"
ssh-keygen -lf "$PUBLIC_KEY" -E sha256

Record the displayed SHA-256 fingerprint in the recovery record or password manager.

15.2 Create an AES-256 encrypted recovery archive

set -euo pipefail

KEY="$HOME/.ssh/id_ed25519_ansible"
PUBLIC_KEY="${KEY}.pub"
STAGING_DIR="$HOME/secure-key-backup-staging"
TIMESTAMP="$(date -u +%Y%m%dT%H%M%SZ)"
SOURCE_HOST="$(hostnamectl --static)"
ARCHIVE_NAME="${SOURCE_HOST}-id_ed25519_ansible-${TIMESTAMP}.tar.gz.gpg"
ARCHIVE_PATH="${STAGING_DIR}/${ARCHIVE_NAME}"
MANIFEST_PATH="${ARCHIVE_PATH}.manifest.txt"
PUBLIC_COPY_PATH="${STAGING_DIR}/${SOURCE_HOST}-id_ed25519_ansible.pub"

command -v gpg >/dev/null
test -f "$KEY"
test -f "$PUBLIC_KEY"

umask 077
install -d -m 700 "$STAGING_DIR"

tar \
  --create \
  --gzip \
  --directory="$HOME" \
  --file=- \
  .ssh/id_ed25519_ansible \
  .ssh/id_ed25519_ansible.pub |
gpg \
  --symmetric \
  --cipher-algo AES256 \
  --output "$ARCHIVE_PATH"

install -m 0644 "$PUBLIC_KEY" "$PUBLIC_COPY_PATH"

ARCHIVE_SHA256="$(sha256sum "$ARCHIVE_PATH" | awk '{print $1}')"
PUBLIC_FINGERPRINT="$(ssh-keygen -lf "$PUBLIC_KEY" -E sha256 | awk '{print $2}')"

{
  echo "source_hostname=${SOURCE_HOST}"
  echo "created_utc=${TIMESTAMP}"
  echo "archive_file=${ARCHIVE_NAME}"
  echo "archive_sha256=${ARCHIVE_SHA256}"
  echo "public_key_fingerprint=${PUBLIC_FINGERPRINT}"
} > "$MANIFEST_PATH"

chmod 600 "$ARCHIVE_PATH" "$MANIFEST_PATH"
chmod 644 "$PUBLIC_COPY_PATH"

echo "Encrypted archive: $ARCHIVE_PATH"
echo "Manifest:          $MANIFEST_PATH"
echo "Public-key copy:   $PUBLIC_COPY_PATH"
cat "$MANIFEST_PATH"

Use a strong GPG passphrase and store it in a password manager separate from the backup files.

15.3 Test decryption before destroying the old VM

set -euo pipefail

STAGING_DIR="$HOME/secure-key-backup-staging"
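# Select the newest archive in case several backups are staged.
ARCHIVE_PATH="$(find "$STAGING_DIR" -maxdepth 1 -type f -name '*.tar.gz.gpg' -printf '%T@ %p\n' | sort -n | tail -n 1 | cut -d' ' -f2-)"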
VERIFY_DIR="$(mktemp -d)"
trap 'rm -rf "$VERIFY_DIR"' EXIT

test -n "$ARCHIVE_PATH"

gpg --decrypt "$ARCHIVE_PATH" |
tar --extract --gzip --directory="$VERIFY_DIR" --file=-

test -f "$VERIFY_DIR/.ssh/id_ed25519_ansible"
test -f "$VERIFY_DIR/.ssh/id_ed25519_ansible.pub"

ssh-keygen \
  -y \
  -P '' \
  -f "$VERIFY_DIR/.ssh/id_ed25519_ansible" \
  > "$VERIFY_DIR/derived.pub"

diff -u \
  <(awk '{print $1, $2}' "$VERIFY_DIR/.ssh/id_ed25519_ansible.pub") \
  <(awk '{print $1, $2}' "$VERIFY_DIR/derived.pub")

echo "Encrypted archive restore test: PASS"
ssh-keygen -lf "$VERIFY_DIR/.ssh/id_ed25519_ansible.pub" -E sha256

15.4 Copy the recovery package off-host with Windows PowerShell 7

$ErrorActionPreference = "Stop"
Set-StrictMode -Version Latest

$RemoteUser = "acllc"
$DeployVmIp = "192.168.8.57"
$RemoteDirectory = "/home/acllc/secure-key-backup-staging"
$Destination = "D:\secure-backups\prod-terraform-deploy-01\ssh-identity"

New-Item -ItemType Directory -Path $Destination -Force | Out-Null

$RemoteSource = '{0}@{1}:{2}/*' -f $RemoteUser, $DeployVmIp, $RemoteDirectory
scp $RemoteSource $Destination

if ($LASTEXITCODE -ne 0) {
    throw "scp failed with exit code $LASTEXITCODE."
}

$Manifest = Get-ChildItem -LiteralPath $Destination -Filter "*.manifest.txt" |
    Sort-Object LastWriteTimeUtc -Descending |
    Select-Object -First 1

if (-not $Manifest) {
    throw "The recovery manifest was not copied."
}

$ManifestLines = Get-Content -LiteralPath $Manifest.FullName

$ArchiveName = (
    $ManifestLines |
    Where-Object { $_ -like "archive_file=*" } |
    Select-Object -First 1
).Substring("archive_file=".Length)

$ExpectedSha256 = (
    $ManifestLines |
    Where-Object { $_ -like "archive_sha256=*" } |
    Select-Object -First 1
).Substring("archive_sha256=".Length)

$ArchivePath = Join-Path $Destination $ArchiveName
$ActualSha256 = (Get-FileHash -LiteralPath $ArchivePath -Algorithm SHA256).Hash.ToLowerInvariant()

if ($ActualSha256 -ne $ExpectedSha256.ToLowerInvariant()) {
    throw "SHA-256 verification failed for $ArchivePath"
}

Write-Host "Off-host archive verification: PASS"
Write-Host "Archive: $ArchivePath"
Write-Host "SHA-256: $ActualSha256"
Get-Content -LiteralPath $Manifest.FullName

15.5 Use only the backed-up public key while rebuilding the base template

Open the copied *-id_ed25519_ansible.pub file from the trusted backup and use that complete single-line value in the base-template preparation page. Do not decrypt or copy the private key into the template.

$BackupDirectory = "D:\secure-backups\prod-terraform-deploy-01\ssh-identity"

Get-ChildItem -LiteralPath $BackupDirectory -Filter "*-id_ed25519_ansible.pub" |
    Sort-Object LastWriteTimeUtc -Descending |
    Select-Object -First 1 |
    Get-Content

15.6 Remove local staging only after the off-host copy is verified

set -euo pipefail

STAGING_DIR="$HOME/secure-key-backup-staging"
test -d "$STAGING_DIR"

find "$STAGING_DIR" -maxdepth 1 -type f -print
rm -rf -- "$STAGING_DIR"

test ! -e "$STAGING_DIR"
echo "Local staging directory removed."

15.7 Restore the key pair on the rebuilt deploy VM

Copy the encrypted archive to /home/acllc/restore on the rebuilt VM. Replace the archive filename and expected SHA-256 with values from the manifest.

set -euo pipefail

ARCHIVE="$HOME/restore/PASTE_ARCHIVE_FILENAME_HERE.tar.gz.gpg"
EXPECTED_SHA256="PASTE_ARCHIVE_SHA256_FROM_MANIFEST"
KEY="$HOME/.ssh/id_ed25519_ansible"
PUBLIC_KEY="${KEY}.pub"

test -f "$ARCHIVE"

ACTUAL_SHA256="$(sha256sum "$ARCHIVE" | awk '{print $1}')"

if [ "$ACTUAL_SHA256" != "$EXPECTED_SHA256" ]; then
  echo "ERROR: Recovery archive SHA-256 mismatch." >&2
  exit 1
fi

if [ -e "$KEY" ] || [ -e "$PUBLIC_KEY" ]; then
  echo "ERROR: Refusing to overwrite an existing automation key." >&2
  exit 1
fi

RESTORE_DIR="$(mktemp -d)"
DERIVED_PUBLIC_KEY="$(mktemp)"
trap 'rm -rf "$RESTORE_DIR"; rm -f "$DERIVED_PUBLIC_KEY"' EXIT

gpg --decrypt "$ARCHIVE" |
tar --extract --gzip --directory="$RESTORE_DIR" --file=-

install -d -m 700 "$HOME/.ssh"
install -m 600 "$RESTORE_DIR/.ssh/id_ed25519_ansible" "$KEY"
install -m 644 "$RESTORE_DIR/.ssh/id_ed25519_ansible.pub" "$PUBLIC_KEY"
chown "$USER:$USER" "$KEY" "$PUBLIC_KEY"

ssh-keygen -y -P '' -f "$KEY" > "$DERIVED_PUBLIC_KEY"

diff -u \
  <(awk '{print $1, $2}' "$PUBLIC_KEY") \
  <(awk '{print $1, $2}' "$DERIVED_PUBLIC_KEY")

echo "Automation key restore: PASS"
ssh-keygen -lf "$PUBLIC_KEY" -E sha256

The restored fingerprint must exactly match the backup manifest.
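The comparison with the manifest can be scripted instead of checked by eye. The sketch below assumes the manifest file produced in Section 15.2 has been copied next to the archive, which this page does not state explicitly. The helper names `manifest_value` and `fingerprint_matches_manifest` are hypothetical.

```shell
# Hypothetical helper: prints the value of key $2 from manifest file $1.
# The manifest is expected to contain plain key=value lines.
manifest_value() {
  sed -n "s/^$2=//p" "$1" | head -n 1
}

# Hypothetical helper: $1 = manifest file, $2 = fingerprint to verify.
# Succeeds only when the manifest records a fingerprint and it is identical.
fingerprint_matches_manifest() (
  expected="$(manifest_value "$1" public_key_fingerprint)"
  [ -n "$expected" ] && [ "$expected" = "$2" ]
)
```

Possible usage: compute the restored fingerprint with `ssh-keygen -lf "$HOME/.ssh/id_ed25519_ansible.pub" -E sha256 | awk '{print $2}'` and pass it as the second argument, with the copied manifest path as the first.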

15.8 Verify the restored key against the rebuilt targets

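Confirm each target's new SSH host-key fingerprint through the Proxmox console. The loop below uses StrictHostKeyChecking=accept-new, which records a previously unknown host key without prompting, so compare the fingerprint recorded on the deploy VM (ssh-keygen -l -F <target IP>) with the console value after the run.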

set -euo pipefail

KEY="$HOME/.ssh/id_ed25519_ansible"
SSH_USER="acllc"

TARGETS=(
  "dev-web-01|192.168.8.120"
  "dev-app-01|192.168.8.6"
)

for entry in "${TARGETS[@]}"; do
  expected_hostname="${entry%%|*}"
  target_ip="${entry##*|}"

  ssh-keygen -f "$HOME/.ssh/known_hosts" -R "$target_ip" >/dev/null 2>&1 || true

  ssh \
    -i "$KEY" \
    -o IdentitiesOnly=yes \
    -o BatchMode=yes \
    -o ConnectTimeout=10 \
    -o StrictHostKeyChecking=accept-new \
    "${SSH_USER}@${target_ip}" \
    bash -s -- "$expected_hostname" <<'REMOTE'
set -euo pipefail

expected_hostname="$1"

test "$(hostnamectl --static)" = "$expected_hostname"
sudo -n true
sudo docker info >/dev/null
REMOTE

  echo "PASS: ${expected_hostname} SSH, sudo, and Docker verified."
done

15.9 Republish the approved GitHub Environment secret

Run from a trusted host where GitHub CLI is authenticated and the restored private key exists. Repeat for each approved repository and Environment.

set -euo pipefail

KEY="$HOME/.ssh/id_ed25519_ansible"
GITHUB_REPOSITORY="fp-001-org/fp-gw-srvc-001"
GITHUB_ENVIRONMENT="dev"
SECRET_NAME="DEPLOY_SSH_PRIVATE_KEY"

command -v gh >/dev/null
gh auth status
test -f "$KEY"
chmod 600 "$KEY"

gh secret set \
  "$SECRET_NAME" \
  --repo "$GITHUB_REPOSITORY" \
  --env "$GITHUB_ENVIRONMENT" \
  < "$KEY"

gh secret list \
  --repo "$GITHUB_REPOSITORY" \
  --env "$GITHUB_ENVIRONMENT"
Recovery order: back up the old key → retain the public-key copy → rebuild the base template with that public key → create the new deploy VM → restore the private key only there → recreate target VMs → verify SSH → recreate approved DEPLOY_SSH_PRIVATE_KEY secrets.

16. 🔄 Reboot and Perform the Final Verification

COMMON · NO CHANGE NEEDED | CHANGE PER DEPLOYED VM | CHANGE IN ROUTER / DHCP
Perform one clean reboot and verify hostname, DHCP identity, core services, and automation tooling.
sudo reboot

After the VM returns:

hostnamectl --static
ip -brief address
ip link show ens18
ip route
resolvectl status

systemctl is-active qemu-guest-agent ssh docker containerd
docker ps

node --version
npm --version
aws --version
ansible --version

test -f "$HOME/.ssh/id_ed25519_ansible"
ssh-keygen -lf "$HOME/.ssh/id_ed25519_ansible.pub" -E sha256

17. Finished State

COMMON · NO CHANGE NEEDED | CHANGE PER DEPLOYED VM
The deployment VM is prepared as an independent clone with its own identity, router-managed address, automation toolchain, and recoverable SSH identity.
VM Name:
prod-terraform-deploy-01
Network Identity:
AA:BB:CC:01:01:02 → 192.168.8.57
Base Software:
Ubuntu 26.04, QEMU Guest Agent, OpenSSH, Docker Engine, Buildx, Compose, Node.js, AWS CLI v2, and Ansible
Automation Identity:
Private key active only on deploy VM; encrypted recovery copy stored off-host; public key installed in the base template

18. 🧾 Source Consistency Status

COMMON · NO CHANGE NEEDED
Record the document changes and validation performed for this revision.
Document status: MODIFIED
Authoritative previous source: attached prod-terraform-deploy-01-prep MDX
Previous numbered sections: 1 through 17
Current numbered sections: 1 through 18

New recovery section:
- Section 15 backs up id_ed25519_ansible before the deploy VM is destroyed.
- The private key is stored only in an AES-256 GPG-encrypted off-host archive.
- A checksum manifest and public-key fingerprint are generated.
- The archive must pass a decrypt-and-key-match test before teardown.
- Windows PowerShell 7 copies the archive, manifest, and public key off-host.
- Only the public key is used while rebuilding the base template.
- Restore refuses to overwrite an existing key and verifies the restored fingerprint.
- Target SSH, passwordless sudo, and Docker access are revalidated.
- Approved DEPLOY_SSH_PRIVATE_KEY GitHub Environment secrets are recreated with gh.

Numbering changes:
- Original Section 15 moved to Section 16.
- Original Section 16 moved to Section 17.
- Original Section 17 moved to Section 18.

Validation:
- Front matter checked
- JSX/TSX syntax parsed
- Top-level numbering checked as 1 through 18
- Recovery subsection numbering checked as 15.1 through 15.9
- Embedded Bash blocks passed bash -n after rendering dynamic values
- PowerShell 7 blocks inspected for balanced syntax and variable flow
- Full Docusaurus build not run because the complete documentation repository was not supplied