DKIM Rotation Is Not Just a DNS Ticket: Why Mail Security Needs Automation

In the previous blog post, "Can some newsletter tool send email as your CEO? Probably yes", Jens explained a risk that is often underestimated: when an external provider receives DKIM signing rights for a primary domain, that provider can technically sign valid email for that domain.

I want to take this topic one step further. Jens' article already answers the policy question: DKIM keys for primary domains should not be handed over to external providers without control.

In real operations, however, the next question appears immediately:

How do we operate DKIM properly when we do not manage just one domain, but many domains, perhaps even hundreds, across several mail gateways and regular key rotations?

The question matters because DKIM keys are cryptographic keys and should be rotated regularly (according to BSI TR-02102, they must be). Unlike web certificates, however, DKIM keys have no built-in expiration date, so this requirement is often overlooked.

This is where DKIM stops being a simple TXT record and becomes an operational process.

The Problem With Manual DKIM Rotation

On paper, DKIM rotation looks simple:

  1. generate a new private key
  2. publish the new public key in DNS
  3. switch the mail gateway to the new selector
  4. send test messages
  5. remove the old selector after a transition period

 

For a single domain, this is still manageable. In a grown infrastructure, it quickly becomes more complicated.

There may be multiple domains, subdomains, internal and external sending paths, different selectors, several Postfix gateways, different environments, and DNS zones that have grown over many years. Some domains are actively used every day. Others only exist for special applications or legacy processes. Some DKIM records belong to internal gateways, others to applications or older integrations.

When such an environment is rotated manually, the operational pressure becomes visible very quickly. Typical mistakes include:

  • a selector is published in DNS but not deployed on every gateway
  • a private key is temporarily stored unencrypted on an admin workstation
  • one gateway still signs with the old selector
  • a DNS record is removed too early
  • KeyTable and SigningTable no longer match
  • one domain is forgotten during the rotation
  • file permissions on the mail gateway are too open
  • only one test message is checked, while other sending paths remain untested

The problem is usually not DKIM itself. The problem is the manual operation around it.

Why Rotation Matters

Private DKIM keys are signing keys. If such a key is leaked, copied, forgotten, or left unchanged for years, an important part of email authentication is no longer under proper control.

DKIM keys should therefore be treated like other cryptographic keys:

  • they need a clear owner
  • they must not be stored unencrypted in tickets or wiki pages
  • they should not remain on personal workstations
  • they need strict file permissions on the gateways
  • their usage must be traceable
  • their rotation must be planned and testable

A rotation must be not only technically correct but also operationally safe. Email is business-critical for many processes. A broken DKIM rotation can cause legitimate messages to be rated poorly or rejected by large providers. That affects more than a company's credibility: it can also interrupt sales, support, invoicing, and other communication paths where delays quickly become expensive.

What Has to Match Technically

DKIM always has two sides that must fit together.

The public key is published in DNS:

selector2026._domainkey.example.com IN TXT "v=DKIM1; k=rsa; p=<PUBLIC_KEY>"

The private key is stored on the mail gateway. The DKIM configuration then decides which domain is signed with which selector.

In a typical Postfix and DKIM milter environment, the SigningTable entry may look roughly like this:

example.com selector2026._domainkey.example.com

and the matching KeyTable entry like this:

selector2026._domainkey.example.com example.com:selector2026:/etc/dkim/keys/example.com/selector2026.private

(These are examples of how to configure dkimpy-milter, the DKIM signing daemon we have been using for years.)

For a rotation to work, DNS, private key, selector, SigningTable, KeyTable, file permissions, and milter configuration must all match. This is exactly why the process becomes fragile when it is maintained manually per domain and per server.

Why Ansible Fits This Use Case

Ansible is a very good fit for DKIM rotation because it provides the properties this process needs:

  • idempotent execution
  • clear inventories for multiple gateways
  • templates for repeatable configuration
  • controlled rollouts across groups and stages
  • traceability through a Git-based workflow
  • secure handling of sensitive variables with Ansible Vault
  • simple validation and handler logic

The most important point for me is this: with Ansible, we describe the desired state.

Not:

I quickly copy this key to three servers and then edit two tables by hand.

But:

These domains should sign with these selectors. These private keys belong to them. These files must exist with these permissions. These services must be reloaded afterwards. These validation checks must pass.

That is a very different way of operating mail infrastructure.

Describing Multiple Domains Centrally

A simple DKIM data model can look like this:

dkim_domains:
  - domain: example.com
    selector: selector2026
    key_path: /etc/dkim/keys/example.com/selector2026.private
  - domain: news.example.com
    selector: selector2026
    key_path: /etc/dkim/keys/news.example.com/selector2026.private
  - domain: app.example.net
    selector: selector2026
    key_path: /etc/dkim/keys/app.example.net/selector2026.private

From this central definition, Ansible can generate and deploy the required pieces:

  • domain key directories
  • private DKIM keys
  • SigningTable
  • KeyTable
  • TrustedHosts, if needed
  • correct file permissions
  • reload of the DKIM milter

When another domain is added later, it is not manually inserted on every gateway. It is added to the data model, reviewed, and rolled out.
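A small pre-flight check can catch mistakes in this data model before anything reaches a gateway. The following is a minimal sketch, assuming the `dkim_domains` list shown above; the `opendkim` owner and the `0700` directory mode are assumptions and should be adjusted to the account the milter actually runs as.

```yaml
- name: Validate DKIM data model entries
  ansible.builtin.assert:
    that:
      - item.domain is defined
      - item.selector is defined
      - item.key_path is defined
      # An absolute path keeps the generated KeyTable entry unambiguous
      - item.key_path is match('^/')
    fail_msg: "Incomplete DKIM entry for {{ item.domain | default('unknown domain') }}"
    quiet: true
  loop: "{{ dkim_domains }}"
  loop_control:
    label: "{{ item.domain | default('unknown domain') }}"

- name: Ensure every domain is listed only once
  ansible.builtin.assert:
    that:
      - (dkim_domains | map(attribute='domain') | list | unique | list | length) == (dkim_domains | length)
    fail_msg: "Duplicate domain in dkim_domains"

- name: Create per-domain key directories
  ansible.builtin.file:
    path: "{{ item.key_path | dirname }}"
    state: directory
    owner: opendkim   # assumption: account used by the signing daemon
    group: opendkim
    mode: "0700"
  loop: "{{ dkim_domains }}"
  loop_control:
    label: "{{ item.domain }}"
```

Running the assertions first means a typo in a pull request fails the play early, before any file on a gateway is changed.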

Protecting Private DKIM Keys With Ansible Vault

Private DKIM keys must not be stored in plain text in a repository. If they are managed with Ansible, they should be stored in encrypted Vault files or retrieved from a dedicated secret management system.

With Ansible Vault, the logical content can look like this before encryption:

vault_dkim_private_keys:
  example.com:
    selector2026: |
      -----BEGIN PRIVATE KEY-----
      <REDACTED_PRIVATE_KEY>
      -----END PRIVATE KEY-----
  news.example.com:
    selector2026: |
      -----BEGIN PRIVATE KEY-----
      <REDACTED_PRIVATE_KEY>
      -----END PRIVATE KEY-----

In Git, this file is stored encrypted, not as plain text. That is important, but it is not the whole security story.

Even with Ansible Vault, some rules are required:

  • the Vault password must not be stored in the repository
  • access to the Vault password must be restricted
  • CI/CD logs must never expose secrets
  • sensitive tasks should use no_log: true
  • private keys on target systems should have minimal file permissions
  • old keys must be removed in a controlled way after the transition period

A task that deploys a private DKIM key should therefore not only write the content, but also enforce ownership and permissions:

- name: Deploy DKIM private keys
  ansible.builtin.copy:
    content: "{{ vault_dkim_private_keys[item.domain][item.selector] }}"
    dest: "{{ item.key_path }}"
    owner: opendkim
    group: opendkim
    mode: "0600"
  loop: "{{ dkim_domains }}"
  no_log: true
  notify: Reload dkim service

The interesting part is not just the copy module. The interesting part is that the process becomes reproducible. Every key is written to the expected path, with the expected permissions, on the expected group of mail gateways.
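The task above notifies a handler named `Reload dkim service`. A minimal sketch of such a handler could look like this; the file location and the variable `dkim_service_name` are assumptions introduced for illustration, because the actual unit name depends on the milter in use.

```yaml
# handlers/main.yml (hypothetical role layout)
- name: Reload dkim service
  ansible.builtin.service:
    # Assumed variable: set it to the service name of the DKIM milter in use
    name: "{{ dkim_service_name }}"
    state: reloaded
```

Because handlers run once at the end of the play, the milter is reloaded a single time even when several keys change. If the service does not implement a reload action, `state: restarted` would be the alternative.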

Generating Configuration From Templates

SigningTable and KeyTable should not be edited manually either. They can be generated from the same domain list. This is where Jinja templates become useful.

SigningTable:

{% for item in dkim_domains %}
*@{{ item.domain }} {{ item.selector }}._domainkey.{{ item.domain }}
{% endfor %}

KeyTable:

{% for item in dkim_domains %}
{{ item.selector }}._domainkey.{{ item.domain }} {{ item.domain }}:{{ item.selector }}:{{ item.key_path }}
{% endfor %}

This prevents a common operational problem: a domain exists in the SigningTable, but the matching KeyTable entry is missing or points to the wrong key. Both files are generated from the same source of truth.

That sounds simple, but small inconsistencies like this are often the reason for long troubleshooting sessions.
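Rendering both tables in one task keeps them in step. The sketch below assumes the two templates are stored as `SigningTable.j2` and `KeyTable.j2` in the role, and that the milter reads its tables from `/etc/dkim/`; both the file names and the target paths are assumptions, not something the milter prescribes.

```yaml
- name: Render SigningTable and KeyTable from dkim_domains
  ansible.builtin.template:
    src: "{{ item.src }}"      # hypothetical template names
    dest: "{{ item.dest }}"    # hypothetical target paths
    owner: opendkim
    group: opendkim
    mode: "0640"
  loop:
    - { src: SigningTable.j2, dest: /etc/dkim/SigningTable }
    - { src: KeyTable.j2, dest: /etc/dkim/KeyTable }
  notify: Reload dkim service
```

The template module only reports a change when the rendered content differs, so a repeated run without changes to the data model does not trigger a reload.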

A Safe DKIM Rotation Flow

 

A clean DKIM rotation should not start by overwriting the old key. A safer approach is to run a new selector in parallel.

One possible flow:

  1. define a new selector, for example selector2026
  2. generate a new private and public key pair
  3. store the private key encrypted with Ansible Vault
  4. publish the public key in DNS
  5. verify DNS propagation
  6. configure the mail gateways with Ansible to use the new selector
  7. send test messages and inspect the DKIM signature
  8. verify DMARC alignment
  9. monitor logs and bounces
  10. remove the old selector only after a transition period

The parallel phase is important. Receiving mail servers may process messages with delay, queues may still contain older messages, and DNS caches may still hold previous data. If the old selector is removed too early, otherwise valid signatures can become unverifiable.
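Once the transition period is over, the old private key can be removed through the same workflow. This is a minimal sketch, assuming a hypothetical list variable `dkim_retired_keys` that is not part of the data model above; the selector name in the comment is purely illustrative.

```yaml
# Hypothetical variable, filled only after the transition period has ended:
# dkim_retired_keys:
#   - /etc/dkim/keys/example.com/selector2025.private
- name: Remove retired DKIM private keys
  ansible.builtin.file:
    path: "{{ item }}"
    state: absent
  loop: "{{ dkim_retired_keys | default([]) }}"
```

Keeping retirement as an explicit, reviewed list makes the removal a deliberate change rather than a side effect of editing the active selector. The old public key in DNS is withdrawn separately, through whatever mechanism manages the zone.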

DNS Remains Part of the Process

Ansible can manage the mail gateways very well. But the DNS part must be integrated with the same level of care.

Depending on the environment, the public DKIM record can be managed through BlueCat, a DNS API, Terraform, or a controlled change process. The important point is that the DNS record exists and is externally resolvable before the gateway starts signing with the new selector.

Typical checks:

dig TXT selector2026._domainkey.example.com
dig TXT _dmarc.example.com
dig MX example.com
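The first of these checks can also act as a gate in the playbook, so that no gateway switches selectors before the record is visible. The sketch assumes that `dig` is installed on the control node, that a hypothetical variable `dkim_check_resolver` holds the address of an external resolver, and that the published record contains the `v=DKIM1` tag as in the example record earlier in this article.

```yaml
- name: Wait until the new DKIM record is resolvable
  ansible.builtin.command:
    argv:
      - dig
      - "+short"
      - TXT
      - "{{ item.selector }}._domainkey.{{ item.domain }}"
      - "@{{ dkim_check_resolver }}"   # assumed variable: external resolver
  register: dkim_dns_check
  # Retry per domain until the answer contains a DKIM record
  until: "'v=DKIM1' in dkim_dns_check.stdout"
  retries: 10
  delay: 30
  changed_when: false
  delegate_to: localhost
  run_once: true
  loop: "{{ dkim_domains }}"
  loop_control:
    label: "{{ item.selector }}._domainkey.{{ item.domain }}"
```

If a record is still missing after the last retry, the task fails and the play stops before the signing configuration is touched. The retry count and delay are arbitrary starting values.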

After sending a test message, the message headers should be checked:

DKIM-Signature: v=1; a=rsa-sha256; d=example.com; s=selector2026; 
Authentication-Results: ... dkim=pass ... dmarc=pass

Only when DNS, the DKIM signature, and DMARC alignment match is the rotation really complete.

Why This Matters More With Multiple Domains

The real value of automation becomes visible not with one domain, but with many.

If ten or twenty domains must be rotated, a manual process quickly becomes hard to control. Each domain needs a matching DNS record, a private key, a selector, table entries, and tests.

With Ansible, the same role can run for all domains. The differences are expressed in variables. This reduces both effort and risk.

For example, the rollout can first target a test group:

mailgateways_test

then a small production canary group:

mailgateways_canary

and finally all production gateways:

mailgateways_prod
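One way to express this staging is a single play whose target group is passed in at run time. The playbook name, the `target_group` variable, and the role name `dkim` are assumptions made for this sketch; the group names are the ones listed above.

```yaml
# dkim-rotation.yml (hypothetical playbook name)
# Example invocations, one stage at a time:
#   ansible-playbook dkim-rotation.yml -e target_group=mailgateways_test --ask-vault-pass
#   ansible-playbook dkim-rotation.yml -e target_group=mailgateways_canary --ask-vault-pass
#   ansible-playbook dkim-rotation.yml -e target_group=mailgateways_prod --ask-vault-pass
- name: Roll out DKIM configuration to one stage
  hosts: "{{ target_group | default('mailgateways_test') }}"
  become: true
  serial: 1                # change one gateway at a time
  max_fail_percentage: 0   # abort the rollout as soon as a gateway fails
  roles:
    - dkim                 # hypothetical role containing the DKIM tasks
```

Defaulting to the test group means that a run without an explicit target never touches production by accident.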

This turns DKIM rotation from a manual night-time maintenance task into a controlled deployment.

What I Would Always Validate

For DKIM rotation, it is not enough that Ansible finishes without errors. The result must be validated technically.

At minimum, I would check:

  • does the new DKIM TXT record exist in DNS?
  • is the new selector used in outgoing messages?
  • does the d= value match the expected domain?
  • does DKIM pass at external receivers?
  • does DMARC pass?
  • are there errors in the DKIM milter logs?
  • are there bounces or rejects after the change?
  • are old selectors removed after the transition period?
  • are private keys stored with 0600 and the correct owner on the gateway?
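The last item on this list is easy to check from Ansible itself. The following is a minimal sketch, assuming the `dkim_domains` data model from earlier and the `opendkim` owner used in the deployment task; the other checks need DNS queries and real test messages and are not covered here.

```yaml
- name: Read metadata of deployed DKIM keys
  ansible.builtin.stat:
    path: "{{ item.key_path }}"
  register: dkim_key_stats
  loop: "{{ dkim_domains }}"
  loop_control:
    label: "{{ item.domain }}"

- name: Assert key ownership and permissions
  ansible.builtin.assert:
    that:
      - item.stat.exists
      - item.stat.mode == '0600'
      - item.stat.pw_name == 'opendkim'
    fail_msg: "Unexpected state for {{ item.item.key_path }}"
    quiet: true
  loop: "{{ dkim_key_stats.results }}"
  loop_control:
    label: "{{ item.item.key_path }}"
```

Run as a separate validation step, this also detects permissions that were changed by hand after the last deployment.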

Especially in mail infrastructure, "deployment successful" is not the same as "mail is accepted correctly".

Why Ansible Is Better Than a Manual Runbook

A good runbook is important. But for DKIM rotation, a purely manual runbook reaches its limit quickly.

A runbook says what should be done. Ansible makes the process repeatable.

This brings several advantages:

  • fewer copy-and-paste errors
  • consistent configuration across multiple gateways
  • one central definition for all domains
  • encrypted storage of private keys
  • review before rollout
  • reproducible deployments
  • easier rollback to the previous selector
  • better traceability for audits

In my view, Ansible fits this topic very well because DKIM rotation sits exactly between classic system administration and a security process. It involves files, templates, services, permissions, secrets, and controlled changes across multiple Linux systems. That is one of Ansible's strengths.

Conclusion

DKIM is not just a TXT record. DKIM rotation is not a manual copy-and-paste task.

As soon as multiple domains and multiple mail gateways are involved, the process needs structure. The public key must be correct in DNS, the private key must be stored safely, the milter configuration must be consistent, and the change must be validated.

Ansible is a strong tool for this because it turns DKIM rotation into a repeatable operational process. Together with Ansible Vault, private keys can be stored encrypted and rolled out to mail gateways in a controlled way.

The most important point, however, is not the tool itself. The important point is that DKIM keys are treated as real security objects: with ownership, rotation, access control, validation, and clean removal.

Operating DKIM this way not only reduces delivery problems. It also gives an organization more control over who is allowed to sign mail on behalf of a domain.

While writing this, we can already see that large providers are becoming stricter with email authentication. Relying on SPF alone is no longer a comfortable position. In many real-world cases, mail delivery depends on SPF, DKIM, DMARC alignment, reputation, and consistent DNS configuration working together.

This also fits into the broader discussion around resilient infrastructure and digital sovereignty in Europe. If an organization wants to scale its mail infrastructure safely, the answer is not only another DNS record. The answer is a process that can be reviewed, repeated, automated, and validated.

If you are looking at DKIM rotation, mail authentication, DNS automation, or infrastructure automation in general, I am always open to a technical discussion.