Terraform Deploy VM Preparation (prod-terraform-deploy-01)
1. 🧭 Scope Legend
Use these markers to distinguish fixed preparation steps, per-VM identity values, and router-side DHCP work.
2. 🧭 Network Design Decision
The router reserves 192.168.8.57 for MAC address AA:BB:CC:01:01:02. Ubuntu remains on DHCP, while the router centrally supplies the stable address, gateway, subnet, and DNS settings.
3. 🧱 Manual Clone Inputs
4. 📋 Final VM Profile
- Source template: 9000 / tmplt-ub-26-min-base
- New VM: <<CHOOSE_UNUSED_VM_ID>> / prod-terraform-deploy-01
- Clone mode: Full Clone
- Network identity: AA:BB:CC:01:01:02 → 192.168.8.57
- Resources: 2 cores · 4096 MiB RAM · 40 GiB disk
- Disk settings: slot = "scsi0", storage local-lvm, and size 40G or larger.
5. 🔒 Create the Router DHCP Reservation Before First Boot
Name: prod-terraform-deploy-01
MAC address: AA:BB:CC:01:01:02
Reserved IPv4 address: 192.168.8.57
- Confirm the address and MAC are unused.
- Keep the template MAC reserved only for the template-build identity.
- Do not start the VM before the reservation exists.
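Before the values are typed into the router, a quick format check can catch transposed or truncated characters. The following is a minimal sketch, not part of the original procedure: the helper names `is_valid_mac` and `is_valid_ipv4` are hypothetical, and the check only covers syntax. It cannot confirm that the address or MAC is unused on the network.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: sanity-check reservation inputs before entering them
# in the router's DHCP reservation form. Syntax only; no network access.

# Succeeds when $1 is six colon-separated hex octets, e.g. AA:BB:CC:01:01:02.
is_valid_mac() {
  printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'
}

# Succeeds when $1 is a dotted-quad IPv4 address with every octet in 0..255.
is_valid_ipv4() {
  printf '%s\n' "$1" | awk -F. '
    NF != 4 { exit 1 }
    {
      for (i = 1; i <= 4; i++) {
        if ($i !~ /^[0-9]+$/ || $i > 255) exit 1
      }
    }
  '
}

if is_valid_mac "AA:BB:CC:01:01:02" && is_valid_ipv4 "192.168.8.57"; then
  echo "Reservation inputs are well-formed."
fi
```

Both functions return a normal shell exit status, so they can gate any later step with `&&` or `if`.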
6. 🧬 Create a Manual Full Clone in Proxmox
- Select template tmplt-ub-26-min-base, VM ID 9000.
- Select Clone.
- Use target node pve, VM ID <<CHOOSE_UNUSED_VM_ID>>, and name prod-terraform-deploy-01.
- Select Full Clone and storage local-lvm.
- Wait for the clone task to complete.
- Do not start the clone yet.
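The GUI steps above have a command-line counterpart in Proxmox's `qm clone`. The sketch below only prints the command so it can be reviewed first; it assumes the command would later be run as root on the Proxmox node that holds the template. The function name `print_clone_command` and the example VM ID 120 are hypothetical, and the real ID must still be chosen from the unused IDs on the node.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) the qm command that mirrors the
# manual full-clone steps. Arguments: template ID, new VM ID, name, storage.
print_clone_command() (
  template_id="$1"
  new_id="$2"
  name="$3"
  storage="$4"

  # Refuse placeholders such as <<CHOOSE_UNUSED_VM_ID>> or empty values.
  case "$new_id" in
    '' | *[!0-9]*)
      echo "ERROR: VM ID must be numeric: $new_id" >&2
      exit 1
      ;;
  esac

  printf 'qm clone %s %s --name %s --full --storage %s\n' \
    "$template_id" "$new_id" "$name" "$storage"
)

# 120 is an illustrative ID only.
print_clone_command 9000 120 prod-terraform-deploy-01 local-lvm
```

Like the GUI clone, `qm clone` leaves the new VM stopped, which matches the requirement not to start the clone yet.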
7. ⚙️ Configure the Cloned VM Before First Boot
- Bridge: vmbr0
- NIC model: VirtIO
- MAC: AA:BB:CC:01:01:02
- CPU: 2 cores
- Memory: 4096 MiB
- Disk: at least 40 GiB
- QEMU Guest Agent: enabled
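The settings above can be cross-checked from the Proxmox node before first boot. This is a sketch under assumptions: it expects the `key: value` text printed by `qm config <vmid>` on standard input, the function name `check_vm_config` is hypothetical, and the disk check only understands sizes reported with a `G` suffix on `scsi0`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `qm config <vmid>` output on stdin and report
# every setting that differs from the profile in this section.
# $1 = expected MAC address. Exit status is 0 only when nothing differs.
check_vm_config() {
  awk -v mac="$1" '
    /^net0:/   { net = toupper($0) }
    /^cores:/  { cores = $2 }
    /^memory:/ { memory = $2 }
    /^agent:/  { agent = $2 }
    /^scsi0:/  {
      if (match($0, /size=[0-9]+G/))
        size_gib = substr($0, RSTART + 5, RLENGTH - 6) + 0
    }
    END {
      bad = 0
      if (index(net, "VIRTIO=" toupper(mac)) == 0) { print "net0: NIC model or MAC mismatch"; bad = 1 }
      if (index(net, "BRIDGE=VMBR0") == 0)         { print "net0: bridge is not vmbr0"; bad = 1 }
      if (cores + 0 != 2)                          { print "cores: expected 2"; bad = 1 }
      if (memory + 0 != 4096)                      { print "memory: expected 4096"; bad = 1 }
      if (agent !~ /^(1|enabled=1)/)               { print "agent: guest agent not enabled"; bad = 1 }
      if (size_gib + 0 < 40)                       { print "scsi0: disk smaller than 40G"; bad = 1 }
      exit bad
    }
  '
}

# Example on the Proxmox node (VM ID is whatever was chosen for the clone):
#   qm config <vmid> | check_vm_config "AA:BB:CC:01:01:02"
```

Each mismatch is printed on its own line, so a single run shows everything that still needs correcting.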
8. 🚀 Start the VM and Verify the DHCP Reservation
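The commands in this section print the network state for manual inspection. As a sketch of how the key check could be turned into a pass/fail result, the hypothetical helper `has_reserved_address` below reads `ip -brief address` output and succeeds only when an interface in state UP holds exactly the reserved address. It assumes the usual column layout of that command (interface, state, then addresses with prefix lengths).

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when the expected IPv4 address ($1) is assigned
# to an interface that is UP. stdin = output of `ip -brief address`.
has_reserved_address() {
  awk -v want="$1" '
    $2 == "UP" {
      for (i = 3; i <= NF; i++) {
        split($i, parts, "/")        # drop the /prefix-length suffix
        if (parts[1] == want) found = 1
      }
    }
    END { exit (found ? 0 : 1) }
  '
}

# Example on the VM:
#   ip -brief address | has_reserved_address 192.168.8.57 && echo "DHCP reservation applied"
```

The comparison is an exact string match on the address, so 192.168.8.5 does not pass when the VM holds 192.168.8.57.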
ip -brief link
ip -brief address
ip route
resolvectl status
netplan status --all 2>/dev/null || true
DEFAULT_GATEWAY="$(ip route show default | awk '{print $3; exit}')"
test -n "$DEFAULT_GATEWAY"
ping -c 4 "$DEFAULT_GATEWAY"
getent hosts ubuntu.com
curl -I https://download.docker.com
9. 🏷️ Set the Permanent Ubuntu Hostname
sudo hostnamectl set-hostname prod-terraform-deploy-01
if grep -qE '^127\.0\.1\.1[[:space:]]+' /etc/hosts; then
  sudo sed -i -E 's/^127\.0\.1\.1[[:space:]].*/127.0.1.1 prod-terraform-deploy-01/' /etc/hosts
else
  echo '127.0.1.1 prod-terraform-deploy-01' | sudo tee -a /etc/hosts
fi
hostnamectl --static
grep -E '^127\.0\.1\.1[[:space:]]+' /etc/hosts
10. 🪪 Verify Clone-Specific Machine Identity and SSH Host Keys
cat /etc/machine-id
for key in /etc/ssh/ssh_host_*_key.pub; do
  sudo ssh-keygen -lf "$key" -E sha256
done
If the host keys are unexpectedly missing, repair them with:
sudo ssh-keygen -A
sudo systemctl restart ssh
ls -l /etc/ssh/ssh_host_*
11. 🐳 Verify the Base Template Software
systemctl is-active qemu-guest-agent
systemctl is-active ssh
systemctl is-active docker
systemctl is-active containerd
docker --version
docker compose version
docker buildx version
containerd --version
docker ps
12. 🟢 Install Node.js for GitHub Actions
sudo apt update
sudo apt install -y ca-certificates curl gnupg
curl -fsSL https://deb.nodesource.com/setup_lts.x | sudo -E bash -
sudo apt install -y nodejs
node --version
npm --version
13. ☁️ Install AWS CLI v2 for Terraform State Automation
sudo apt update
sudo apt install -y ca-certificates curl unzip
set -euo pipefail
case "$(uname -m)" in
  x86_64)
    aws_cli_url="https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip"
    ;;
  aarch64|arm64)
    aws_cli_url="https://awscli.amazonaws.com/awscli-exe-linux-aarch64.zip"
    ;;
  *)
    # Report the failure on stderr so it is not mistaken for normal output.
    echo "ERROR: Unsupported architecture: $(uname -m)" >&2
    exit 1
    ;;
esac
work_dir="$(mktemp -d)"
trap 'rm -rf "$work_dir"' EXIT
curl --fail --location --silent --show-error "$aws_cli_url" --output "$work_dir/awscliv2.zip"
unzip -q "$work_dir/awscliv2.zip" -d "$work_dir"
if command -v aws >/dev/null 2>&1; then
  sudo "$work_dir/aws/install" --update --install-dir /usr/local/aws-cli --bin-dir /usr/local/bin
else
  sudo "$work_dir/aws/install" --install-dir /usr/local/aws-cli --bin-dir /usr/local/bin
fi
aws --version
sudo -u acllc -H bash -lc 'command -v aws && aws --version'
Do not store long-lived credentials with aws configure. Workflows must use GitHub OIDC and temporary role credentials.
14. ⚙️ Install Ansible and Create the Automation SSH Identity
sudo apt update
sudo apt install -y software-properties-common ca-certificates curl gnupg
sudo add-apt-repository --yes --update ppa:ansible/ansible
sudo apt install -y ansible
ansible --version
mkdir -p ~/ansible/{inventories,playbooks,roles,group_vars,host_vars}
14.1 Create or retain the dedicated automation key
mkdir -p "$HOME/.ssh"
chmod 700 "$HOME/.ssh"
if [ ! -f "$HOME/.ssh/id_ed25519_ansible" ]; then
  ssh-keygen -t ed25519 -f "$HOME/.ssh/id_ed25519_ansible" -C "prod-terraform-deploy-01 Ansible automation" -N ""
else
  echo "Automation key already exists."
fi
chmod 600 "$HOME/.ssh/id_ed25519_ansible"
chmod 644 "$HOME/.ssh/id_ed25519_ansible.pub"
ssh-keygen -lf "$HOME/.ssh/id_ed25519_ansible.pub" -E sha256
cat "$HOME/.ssh/id_ed25519_ansible.pub"
Add the displayed public key to each target's authorized_keys.
14.2 Create an inventory placeholder and verify Ansible
cat > "$HOME/ansible/inventories/dev.ini" <<'EOF'
[web]
# dev-web-01 ansible_host=192.168.8.120 ansible_user=acllc ansible_ssh_private_key_file=~/.ssh/id_ed25519_ansible
[app]
# dev-app-01 ansible_host=192.168.8.6 ansible_user=acllc ansible_ssh_private_key_file=~/.ssh/id_ed25519_ansible
[db]
[k8s]
[runner]
EOF
ansible localhost -m ping -c local
15. 🛟 Back Up, Restore, and Republish the Automation SSH Identity
The automation private key is stored at /home/acllc/.ssh/id_ed25519_ansible on the current deploy VM. Back it up before deleting that VM.
15.1 Verify that the private and public keys match
set -euo pipefail
KEY="$HOME/.ssh/id_ed25519_ansible"
PUBLIC_KEY="${KEY}.pub"
DERIVED_PUBLIC_KEY="$(mktemp)"
trap 'rm -f "$DERIVED_PUBLIC_KEY"' EXIT
test -f "$KEY"
test -f "$PUBLIC_KEY"
chmod 700 "$HOME/.ssh"
chmod 600 "$KEY"
chmod 644 "$PUBLIC_KEY"
ssh-keygen -y -P '' -f "$KEY" > "$DERIVED_PUBLIC_KEY"
diff -u <(awk '{print $1, $2}' "$PUBLIC_KEY") <(awk '{print $1, $2}' "$DERIVED_PUBLIC_KEY")
echo "Private/public key match: PASS"
ssh-keygen -lf "$PUBLIC_KEY" -E sha256
Record the displayed SHA-256 fingerprint in the recovery record or password manager.
15.2 Create an AES-256 encrypted recovery archive
set -euo pipefail
KEY="$HOME/.ssh/id_ed25519_ansible"
PUBLIC_KEY="${KEY}.pub"
STAGING_DIR="$HOME/secure-key-backup-staging"
TIMESTAMP="$(date -u +%Y%m%dT%H%M%SZ)"
SOURCE_HOST="$(hostnamectl --static)"
ARCHIVE_NAME="${SOURCE_HOST}-id_ed25519_ansible-${TIMESTAMP}.tar.gz.gpg"
ARCHIVE_PATH="${STAGING_DIR}/${ARCHIVE_NAME}"
MANIFEST_PATH="${ARCHIVE_PATH}.manifest.txt"
PUBLIC_COPY_PATH="${STAGING_DIR}/${SOURCE_HOST}-id_ed25519_ansible.pub"
command -v gpg >/dev/null
test -f "$KEY"
test -f "$PUBLIC_KEY"
umask 077
install -d -m 700 "$STAGING_DIR"
tar --create --gzip --directory="$HOME" --file=- .ssh/id_ed25519_ansible .ssh/id_ed25519_ansible.pub |
  gpg --symmetric --cipher-algo AES256 --output "$ARCHIVE_PATH"
install -m 0644 "$PUBLIC_KEY" "$PUBLIC_COPY_PATH"
ARCHIVE_SHA256="$(sha256sum "$ARCHIVE_PATH" | awk '{print $1}')"
PUBLIC_FINGERPRINT="$(ssh-keygen -lf "$PUBLIC_KEY" -E sha256 | awk '{print $2}')"
{
  echo "source_hostname=${SOURCE_HOST}"
  echo "created_utc=${TIMESTAMP}"
  echo "archive_file=${ARCHIVE_NAME}"
  echo "archive_sha256=${ARCHIVE_SHA256}"
  echo "public_key_fingerprint=${PUBLIC_FINGERPRINT}"
} > "$MANIFEST_PATH"
chmod 600 "$ARCHIVE_PATH" "$MANIFEST_PATH"
chmod 644 "$PUBLIC_COPY_PATH"
echo "Encrypted archive: $ARCHIVE_PATH"
echo "Manifest: $MANIFEST_PATH"
echo "Public-key copy: $PUBLIC_COPY_PATH"
cat "$MANIFEST_PATH"
Use a strong GPG passphrase and store it in a password manager separate from the backup files.
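The manifest is a plain `key=value` file, so it can be read back later, for example to fill in the archive name and checksum during a restore or to re-check the archive before it is copied anywhere. The sketch below assumes the manifest sits in the same directory as the archive it describes. The function names `manifest_value` and `verify_archive_against_manifest` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for reading the recovery manifest.

# Print the value stored under key $1 in manifest file $2.
manifest_value() {
  awk -F= -v key="$1" '$1 == key { print substr($0, length(key) + 2); exit }' "$2"
}

# Succeed only when the archive named in manifest $1 exists next to the
# manifest and still hashes to the SHA-256 recorded in it.
verify_archive_against_manifest() (
  manifest="$1"
  dir="$(dirname "$manifest")"
  archive="$dir/$(manifest_value archive_file "$manifest")"
  expected="$(manifest_value archive_sha256 "$manifest")"

  [ -n "$expected" ] || exit 1
  [ -f "$archive" ] || exit 1

  actual="$(sha256sum "$archive" | awk '{print $1}')"
  [ "$actual" = "$expected" ]
)

# Example:
#   verify_archive_against_manifest "$MANIFEST_PATH" && echo "Archive matches manifest"
```

`manifest_value` prints everything after the first `=`, so values that themselves contain `=` are returned intact.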
15.3 Test decryption before destroying the old VM
set -euo pipefail
STAGING_DIR="$HOME/secure-key-backup-staging"
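# Pick the newest archive: for archives from the same host, the UTC timestamp
# in the file name sorts chronologically.
ARCHIVE_PATH="$(find "$STAGING_DIR" -maxdepth 1 -type f -name '*.tar.gz.gpg' -print | sort | tail -n 1)"
VERIFY_DIR="$(mktemp -d)"
trap 'rm -rf "$VERIFY_DIR"' EXIT
test -n "$ARCHIVE_PATH"
gpg --decrypt "$ARCHIVE_PATH" |
  tar --extract --gzip --directory="$VERIFY_DIR" --file=-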
test -f "$VERIFY_DIR/.ssh/id_ed25519_ansible"
test -f "$VERIFY_DIR/.ssh/id_ed25519_ansible.pub"
ssh-keygen -y -P '' -f "$VERIFY_DIR/.ssh/id_ed25519_ansible" > "$VERIFY_DIR/derived.pub"
diff -u <(awk '{print $1, $2}' "$VERIFY_DIR/.ssh/id_ed25519_ansible.pub") <(awk '{print $1, $2}' "$VERIFY_DIR/derived.pub")
echo "Encrypted archive restore test: PASS"
ssh-keygen -lf "$VERIFY_DIR/.ssh/id_ed25519_ansible.pub" -E sha256
15.4 Copy the recovery package off-host with Windows PowerShell 7
$ErrorActionPreference = "Stop"
Set-StrictMode -Version Latest
$RemoteUser = "acllc"
$DeployVmIp = "192.168.8.57"
$RemoteDirectory = "/home/acllc/secure-key-backup-staging"
$Destination = "D:\secure-backups\prod-terraform-deploy-01\ssh-identity"
New-Item -ItemType Directory -Path $Destination -Force | Out-Null
$RemoteSource = '{0}@{1}:{2}/*' -f $RemoteUser, $DeployVmIp, $RemoteDirectory
scp $RemoteSource $Destination
if ($LASTEXITCODE -ne 0) {
    throw "scp failed with exit code $LASTEXITCODE."
}
$Manifest = Get-ChildItem -LiteralPath $Destination -Filter "*.manifest.txt" |
    Sort-Object LastWriteTimeUtc -Descending |
    Select-Object -First 1
if (-not $Manifest) {
    throw "The recovery manifest was not copied."
}
$ManifestLines = Get-Content -LiteralPath $Manifest.FullName
$ArchiveName = (
    $ManifestLines |
    Where-Object { $_ -like "archive_file=*" } |
    Select-Object -First 1
).Substring("archive_file=".Length)
$ExpectedSha256 = (
    $ManifestLines |
    Where-Object { $_ -like "archive_sha256=*" } |
    Select-Object -First 1
).Substring("archive_sha256=".Length)
$ArchivePath = Join-Path $Destination $ArchiveName
$ActualSha256 = (Get-FileHash -LiteralPath $ArchivePath -Algorithm SHA256).Hash.ToLowerInvariant()
if ($ActualSha256 -ne $ExpectedSha256.ToLowerInvariant()) {
    throw "SHA-256 verification failed for $ArchivePath"
}
Write-Host "Off-host archive verification: PASS"
Write-Host "Archive: $ArchivePath"
Write-Host "SHA-256: $ActualSha256"
Get-Content -LiteralPath $Manifest.FullName
15.5 Use only the backed-up public key while rebuilding the base template
Open the copied *-id_ed25519_ansible.pub file from the trusted backup and use that complete single-line value in the base-template preparation page. Do not decrypt or copy the private key into the template.
$BackupDirectory = "D:\secure-backups\prod-terraform-deploy-01\ssh-identity"
Get-ChildItem -LiteralPath $BackupDirectory -Filter "*-id_ed25519_ansible.pub" |
    Sort-Object LastWriteTimeUtc -Descending |
    Select-Object -First 1 |
    Get-Content
15.6 Remove local staging only after the off-host copy is verified
set -euo pipefail
STAGING_DIR="$HOME/secure-key-backup-staging"
test -d "$STAGING_DIR"
find "$STAGING_DIR" -maxdepth 1 -type f -print
rm -rf -- "$STAGING_DIR"
test ! -e "$STAGING_DIR"
echo "Local staging directory removed."
15.7 Restore the key pair on the rebuilt deploy VM
Copy the encrypted archive to /home/acllc/restore on the rebuilt VM. Replace the archive filename and expected SHA-256 with values from the manifest.
set -euo pipefail
ARCHIVE="$HOME/restore/PASTE_ARCHIVE_FILENAME_HERE.tar.gz.gpg"
EXPECTED_SHA256="PASTE_ARCHIVE_SHA256_FROM_MANIFEST"
KEY="$HOME/.ssh/id_ed25519_ansible"
PUBLIC_KEY="${KEY}.pub"
test -f "$ARCHIVE"
ACTUAL_SHA256="$(sha256sum "$ARCHIVE" | awk '{print $1}')"
if [ "$ACTUAL_SHA256" != "$EXPECTED_SHA256" ]; then
  echo "ERROR: Recovery archive SHA-256 mismatch." >&2
  exit 1
fi
if [ -e "$KEY" ] || [ -e "$PUBLIC_KEY" ]; then
  echo "ERROR: Refusing to overwrite an existing automation key." >&2
  exit 1
fi
RESTORE_DIR="$(mktemp -d)"
DERIVED_PUBLIC_KEY="$(mktemp)"
trap 'rm -rf "$RESTORE_DIR"; rm -f "$DERIVED_PUBLIC_KEY"' EXIT
gpg --decrypt "$ARCHIVE" |
  tar --extract --gzip --directory="$RESTORE_DIR" --file=-
install -d -m 700 "$HOME/.ssh"
install -m 600 "$RESTORE_DIR/.ssh/id_ed25519_ansible" "$KEY"
install -m 644 "$RESTORE_DIR/.ssh/id_ed25519_ansible.pub" "$PUBLIC_KEY"
# Use the account's real primary group; it is not guaranteed to share the user name.
chown "$(id -un):$(id -gn)" "$KEY" "$PUBLIC_KEY"
ssh-keygen -y -P '' -f "$KEY" > "$DERIVED_PUBLIC_KEY"
diff -u <(awk '{print $1, $2}' "$PUBLIC_KEY") <(awk '{print $1, $2}' "$DERIVED_PUBLIC_KEY")
echo "Automation key restore: PASS"
ssh-keygen -lf "$PUBLIC_KEY" -E sha256
The restored fingerprint must exactly match the backup manifest.
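The fingerprint comparison can be scripted rather than done by eye. The following sketch assumes the manifest from the backup step has been copied alongside the archive. The function name `fingerprint_matches_manifest` and the manifest path in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only when fingerprint $1 equals the
# public_key_fingerprint value recorded in manifest file $2.
fingerprint_matches_manifest() (
  actual="$1"
  manifest="$2"
  expected="$(sed -n 's/^public_key_fingerprint=//p' "$manifest" | head -n 1)"

  # An empty recorded value is treated as a failure, never as a match.
  [ -n "$expected" ] && [ "$actual" = "$expected" ]
)

# Example (manifest path is illustrative):
#   restored="$(ssh-keygen -lf "$HOME/.ssh/id_ed25519_ansible.pub" -E sha256 | awk '{print $2}')"
#   fingerprint_matches_manifest "$restored" "$HOME/restore/archive.manifest.txt" \
#     && echo "Fingerprint matches manifest"
```

The second field of `ssh-keygen -lf` output is the `SHA256:...` string, which is the same form the backup step writes into the manifest.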
15.8 Verify the restored key against the rebuilt targets
Confirm each target's new SSH host-key fingerprint through the Proxmox console before accepting it on the deploy VM.
set -euo pipefail
KEY="$HOME/.ssh/id_ed25519_ansible"
SSH_USER="acllc"
TARGETS=(
"dev-web-01|192.168.8.120"
"dev-app-01|192.168.8.6"
)
for entry in "${TARGETS[@]}"; do
  expected_hostname="${entry%%|*}"
  target_ip="${entry##*|}"
  ssh-keygen -f "$HOME/.ssh/known_hosts" -R "$target_ip" >/dev/null 2>&1 || true
  ssh -i "$KEY" -o IdentitiesOnly=yes -o BatchMode=yes -o ConnectTimeout=10 -o StrictHostKeyChecking=accept-new "${SSH_USER}@${target_ip}" bash -s -- "$expected_hostname" <<'REMOTE'
set -euo pipefail
expected_hostname="$1"
test "$(hostnamectl --static)" = "$expected_hostname"
sudo -n true
sudo docker info >/dev/null
REMOTE
  echo "PASS: ${expected_hostname} SSH, sudo, and Docker verified."
done
15.9 Republish the approved GitHub Environment secret
Run from a trusted host where GitHub CLI is authenticated and the restored private key exists. Repeat for each approved repository and Environment.
set -euo pipefail
KEY="$HOME/.ssh/id_ed25519_ansible"
GITHUB_REPOSITORY="fp-001-org/fp-gw-srvc-001"
GITHUB_ENVIRONMENT="dev"
SECRET_NAME="DEPLOY_SSH_PRIVATE_KEY"
command -v gh >/dev/null
gh auth status
test -f "$KEY"
chmod 600 "$KEY"
gh secret set "$SECRET_NAME" --repo "$GITHUB_REPOSITORY" --env "$GITHUB_ENVIRONMENT" < "$KEY"
gh secret list --repo "$GITHUB_REPOSITORY" --env "$GITHUB_ENVIRONMENT"
Recreate only the approved DEPLOY_SSH_PRIVATE_KEY secrets.
16. 🔄 Reboot and Perform the Final Verification
sudo reboot
After the VM returns:
hostnamectl --static
ip -brief address
ip link show ens18
ip route
resolvectl status
systemctl is-active qemu-guest-agent ssh docker containerd
docker ps
node --version
npm --version
aws --version
ansible --version
test -f "$HOME/.ssh/id_ed25519_ansible"
ssh-keygen -lf "$HOME/.ssh/id_ed25519_ansible.pub" -E sha256
17. ✅ Finished State
- Hostname: prod-terraform-deploy-01
- Network identity: AA:BB:CC:01:01:02 → 192.168.8.57
- Software: Ubuntu 26.04, QEMU Guest Agent, OpenSSH, Docker Engine, Buildx, Compose, Node.js, AWS CLI v2, and Ansible
- Automation SSH identity: Private key active only on deploy VM; encrypted recovery copy stored off-host; public key installed in the base template
18. 🧾 Source Consistency Status
Document status: MODIFIED
Authoritative previous source: attached prod-terraform-deploy-01-prep MDX
Previous numbered sections: 1 through 17
Current numbered sections: 1 through 18
New recovery section:
- Section 15 backs up id_ed25519_ansible before the deploy VM is destroyed.
- The private key is stored only in an AES-256 GPG-encrypted off-host archive.
- A checksum manifest and public-key fingerprint are generated.
- The archive must pass a decrypt-and-key-match test before teardown.
- Windows PowerShell 7 copies the archive, manifest, and public key off-host.
- Only the public key is used while rebuilding the base template.
- Restore refuses to overwrite an existing key and verifies the restored fingerprint.
- Target SSH, passwordless sudo, and Docker access are revalidated.
- Approved DEPLOY_SSH_PRIVATE_KEY GitHub Environment secrets are recreated with gh.
Numbering changes:
- Original Section 15 moved to Section 16.
- Original Section 16 moved to Section 17.
- Original Section 17 moved to Section 18.
Validation:
- Front matter checked
- JSX/TSX syntax parsed
- Top-level numbering checked as 1 through 18
- Recovery subsection numbering checked as 15.1 through 15.9
- Embedded Bash blocks passed bash -n after rendering dynamic values
- PowerShell 7 blocks inspected for balanced syntax and variable flow
- Full Docusaurus build not run because the complete documentation repository was not supplied