Hidden Cost of Ad-Hoc SES Template Edits

Direct SES template edits, whether made through the Console, the CLI, or an SDK, share the same failure modes: no diff, no approver, no rollback. Here is what production-grade looks like instead.

A senior engineer at a Series B SaaS once described their Friday afternoon to me like this: support flagged that the trial-expiry email had a broken link. They opened the SES Console, edited the template, saved it, and went to the standup retro. Forty minutes later a different engineer rolled back what they thought was a stale change and overwrote the fix. Nobody knew until Monday, when a customer complained their second trial-expiry email still had the broken link.

There was no diff. No version history. No notification. No approver. No paper trail. Two engineers, both acting reasonably, both with the same IAM role, silently undid each other's work on a customer-facing email.

This is what ungated SES template editing looks like at the moment it stops scaling. The story above involved Console clicks, but the same dynamics hold whether you reach for aws ses update-template, a one-off boto3 script, or the Console UI. It doesn't fail loudly. It fails by accumulating small, invisible incidents until somebody finally counts them up.

The default workflow most teams start with

Almost every team I talk to begins the same way. Engineering needs to send a welcome email. Somebody opens the SES Console, pastes in HTML, clicks save, wires up SendTemplatedEmail from the application code, and ships it. The team has more important things to do than design a "template management strategy."

At small scale, this feels right. There's one welcome email. One person edits it. Everyone on the team has admin access to the AWS account anyway. The Console UI is right there. Why build process around something this simple?

The unstated assumptions are doing a lot of work. The team is assuming that:

One person will edit any given template at a time.
Whoever edits will not make a typo a customer would notice.
The template that exists in SES today is the template the team expects to exist.
Nobody outside engineering will ever need to change the copy.
An auditor will never ask "who changed this template, when, and why?"
Staging and production templates will stay aligned without active effort.

Each of those assumptions is reasonable in isolation. They fail in combination, and they fail silently.

What's actually missing

When you treat direct SES editing — Console, CLI, or SDK — as a workflow, here's the inventory of what you don't have.

No version history. SES has exactly one current version of a template by name. There is no ses:GetTemplateHistory. The previous version of welcome_email is gone the moment you save the new one.

No diff. Even if you have the previous version stashed somewhere, the Console does not show you what changed. A one-character edit to a Handlebars expression and a complete rewrite look the same: a "save" event in CloudTrail.
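
The first two gaps can be partly worked around by snapshotting a template before every edit and diffing snapshots afterwards. The following is a minimal sketch, not a full versioning system. It assumes `ses` is a boto3 SES client (for example `boto3.client("ses")`) and uses its `get_template` call; the function names, the snapshot directory, and the file naming scheme are hypothetical.

```python
import difflib
import json
from datetime import datetime, timezone
from pathlib import Path


def snapshot_template(ses, name: str, snapshot_dir: Path) -> Path:
    """Save the current SES template to a timestamped JSON file."""
    # get_template returns {"Template": {"TemplateName", "SubjectPart", ...}}
    template = ses.get_template(TemplateName=name)["Template"]
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    snapshot_dir.mkdir(parents=True, exist_ok=True)
    path = snapshot_dir / f"{name}.{stamp}.json"  # hypothetical naming scheme
    path.write_text(json.dumps(template, indent=2, sort_keys=True))
    return path


def diff_templates(old: dict, new: dict, part: str = "HtmlPart") -> str:
    """Return a unified diff of one template part (the HTML body by default)."""
    old_lines = old.get(part, "").splitlines()
    new_lines = new.get(part, "").splitlines()
    return "\n".join(
        difflib.unified_diff(old_lines, new_lines, "before", "after", lineterm="")
    )
```

Calling `snapshot_template` immediately before an update gives you something to diff against and something to restore from. It only helps if every editing path actually calls it, which is exactly the discipline direct edits tend to skip.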

No review. There is no approver between "I edited this" and "this goes to customers." The publish surface and the editing surface are the same surface.

No structural audit. CloudTrail captures the API call. It does not capture, in any usable form, what the content of the template was before and after. If you need to answer "what did the email actually say on April 14?" you are reconstructing history from memory.

No environment separation by default. Templates in SES are scoped per AWS account and region. If you use one AWS account for everything — common at smaller companies — your dev, staging, and production templates share a namespace and an editing surface. The same Console click can target any of them.

No role separation. IAM is binary at the action level. A principal either has ses:UpdateTemplate or it doesn't. There is no native "editor that cannot publish" or "publisher that cannot edit." Most teams give devs both, which is the same as giving everyone the keys.
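
The closest native approximation is splitting read access from write access. The sketch below builds two IAM policy documents as Python dictionaries, one for people who may only look at templates and one for people who may change them. The action names are the SES v1 template actions; the function name and the `Sid` values are hypothetical, and `"Resource": "*"` is a simplifying assumption.

```python
import json

READ_ACTIONS = ["ses:GetTemplate", "ses:ListTemplates"]
WRITE_ACTIONS = ["ses:CreateTemplate", "ses:UpdateTemplate", "ses:DeleteTemplate"]


def template_policy(can_write: bool) -> dict:
    """Build an IAM policy document for template readers or template writers."""
    statements = [
        {"Sid": "ReadTemplates", "Effect": "Allow",
         "Action": READ_ACTIONS, "Resource": "*"},
        {
            "Sid": "WriteTemplates",
            # Readers get an explicit Deny so a broader policy attached
            # elsewhere cannot quietly grant write access.
            "Effect": "Allow" if can_write else "Deny",
            "Action": WRITE_ACTIONS,
            "Resource": "*",
        },
    ]
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    print(json.dumps(template_policy(can_write=False), indent=2))
```

Note what this does not give you: anyone in the writer group can still both edit and ship in a single call, so the edit-versus-publish distinction has to come from a layer above IAM.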

No notification. When a template changes, nobody is told. There is no pull request to subscribe to, no Slack channel to watch.

Each gap is small. Together they describe a system where customer-facing content can change without anyone noticing.

What about the CLI or the SDK?

None of these gaps are Console-specific. aws ses update-template from a developer's laptop has the same shape: no version history, no diff, no approver, no notification. A one-off boto3 script run during an incident is the same failure mode with a faster keyboard. A CI workflow that calls update-template without a review step before merge is the same again.

The Console is the most visible variant — it's the path anyone with AWS access can find their way to without a runbook, and it's the one non-engineers reach for first. But what we're really describing is the cost of direct template edits. The path doesn't matter. The absence of controls does.

If your team mostly edits via the CLI or a deploy script and you read the previous section thinking "we're fine, we don't touch the Console" — you have most of the same gaps. The next section's failure modes apply to your setup too.

The failure modes that actually bite

These are the real incidents I see, in rough order of frequency.

The hotfix that breaks a variable. During an incident, an engineer edits a template directly in the Console to fix a typo. They accidentally change {{customer_name}} to {{customer_nmae}}. SES happily accepts the template, because it never checks placeholders against the data your application sends. The next 4,000 sends either go out reading "Hi ," or fail to render and never arrive, and unless rendering-failure events are wired up, nothing tells you which. This is the most common Console failure I see, and it's catastrophic for password reset and trial-expiry emails specifically, the ones customers actually open.
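
A renamed or misspelled placeholder can often be caught before publish by comparing the placeholders in a template against the variable names the application is known to send. This is a minimal sketch under that assumption. The regular expression only handles simple `{{name}}` placeholders, and the function names and the expected-variable set are hypothetical.

```python
import re

# Matches simple placeholders such as {{name}} or {{ user.name }}.
# Block helpers like {{#if x}} and {{/if}} do not match, so a variable that
# appears only inside a block helper is not seen by this sketch.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")
KEYWORDS = {"else", "this"}  # Handlebars keywords, not variables


def template_variables(*parts: str) -> set:
    """Collect placeholder names from the subject, HTML, and text parts."""
    found = {m.group(1) for part in parts for m in PLACEHOLDER.finditer(part)}
    return found - KEYWORDS


def check_variables(parts, expected: set) -> dict:
    """Report placeholders the app never sends and variables no part uses."""
    found = template_variables(*parts)
    return {
        "unknown": sorted(found - expected),
        "unused": sorted(expected - found),
    }
```

For example, `check_variables(["Hi {{customer_nmae}}"], {"customer_name"})` reports `customer_nmae` as unknown and `customer_name` as unused. Run as a pre-publish check, a non-empty `unknown` list is a reasonable reason to block the change.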

The accidental delete. Someone is cleaning up old test templates and hits a name collision they didn't expect. A production template gets deleted. SES returns a clear error to your application, but only on the next send that references the missing template, which is also the first time anyone notices.
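
One way to notice a missing template before a customer does is a preflight check that compares the template names the application depends on with what SES actually holds. The sketch below assumes `ses` is a boto3 SES client and uses its `list_templates` call with `NextToken` pagination; the function names and the list of required templates are hypothetical.

```python
def list_template_names(ses) -> set:
    """Return the names of all templates in the client's account and region."""
    names = set()
    kwargs = {}
    while True:
        page = ses.list_templates(**kwargs)
        names.update(t["Name"] for t in page.get("TemplatesMetadata", []))
        token = page.get("NextToken")
        if not token:
            return names
        kwargs = {"NextToken": token}  # fetch the next page


def missing_templates(ses, required) -> list:
    """Return the required template names that do not exist in SES."""
    return sorted(set(required) - list_template_names(ses))
```

Run on a schedule or at application start-up, a non-empty result from `missing_templates` can raise an alert before the first failed send does.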

Drift between staging and production. Engineering ships a copy change to staging. QA approves. Three weeks later, somebody finally remembers to update production — or worse, doesn't. The two environments quietly diverge. Now staging is no longer a meaningful test of production behavior, but nobody trusts the fix until they've seen it in staging, so changes get slower and less safe at the same time.

Locale collisions. Teams that send in multiple languages typically use a naming convention like welcome_en / welcome_de / welcome_fr. Someone translates new copy into German and pastes it into welcome_en, thinking they were in welcome_de. English-speaking customers get German copy until somebody complains.

Cross-environment sends from one account. A load test in staging accidentally targets a template name that exists in production. Or — worse — a single AWS account hosts both, and the test fires off real password resets to real customers. Companies have had to pay for breach notifications over this.
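
When environments share an account, a small guard in the send or deploy path can refuse any template whose name does not belong to the environment it is running in. This sketch assumes a naming convention where every template name ends in `_<environment>`; the environment names and the function are hypothetical.

```python
ENVIRONMENTS = ("dev", "staging", "production")  # assumed environment names


def assert_template_env(template_name: str, target_env: str) -> None:
    """Raise ValueError unless the template name belongs to target_env."""
    if target_env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {target_env!r}")
    if not template_name.endswith(f"_{target_env}"):
        raise ValueError(
            f"template {template_name!r} does not belong to {target_env!r}"
        )
```

A staging load test that calls `assert_template_env("password_reset_production", "staging")` fails immediately instead of reaching real customers. The guard is only as good as the naming convention, so it complements separate accounts rather than replacing them.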

The well-intentioned override. Marketing or support asks engineering to "just update the copy." Engineering, busy, hands them Console access. The copy gets updated. The variable that engineering had defensively added two months ago, after a previous bug, gets removed because the marketer didn't know what it did. The bug returns three weeks later in a slightly different form.

The pattern across all of these is that nothing fails loudly. SES does what you told it to do. The application keeps sending. The pager doesn't go off. You discover the problem from a customer email, days later.

What "good" looks like

Production-grade template management has four properties. None of them is exotic. All of them require some structure outside the SES Console.

A source of truth that lives somewhere safer than SES. Whether that's a Git repository, an IaC stack, or a managed system, the canonical version of every template should exist somewhere with history, diffs, and access control. SES becomes a deploy target, not the database.

Review before publish. Changes to customer-facing content should be visible to at least one other human before they ship. The bar isn't "every comma reviewed by a committee." The bar is "no template change goes to production with exactly one set of eyes on it."

Scoped permissions that match how humans actually work. Engineers can edit. A smaller set can publish. A separate set can read. Marketers, lifecycle managers, and support people can edit copy without touching infrastructure. The mapping is to humans and roles, not to IAM principals attached to deploy machines.

A paper trail that an auditor will accept. Who changed it, when, why, what was the previous version, what was the new version. Linkable to a ticket or an incident. Exportable. Immutable.
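
As a rough illustration of what a content-level audit entry might contain, the sketch below records who changed a template, why, and a hash of the content before and after. The field names, the ticket reference, and the JSON Lines file are all assumptions. A local file is not immutable, so a real trail would need write-once storage.

```python
import hashlib
import json
from datetime import datetime, timezone


def content_hash(template: dict) -> str:
    """Hash a template dict in a way that ignores key order."""
    canonical = json.dumps(template, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def audit_entry(name, before, after, actor, reason, ticket=None) -> dict:
    """Build one audit record for a template change."""
    return {
        "template": name,
        "actor": actor,
        "reason": reason,
        "ticket": ticket,  # hypothetical link to a ticket or incident
        "changed_at": datetime.now(timezone.utc).isoformat(),
        "before_sha256": content_hash(before),
        "after_sha256": content_hash(after),
    }


def append_audit(path: str, entry: dict) -> None:
    """Append the record as one JSON line; existing lines are not rewritten."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry, sort_keys=True) + "\n")
```

Storing hashes alongside full snapshots lets you answer both "did it change?" and "what did it say?" without trusting anyone's memory.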

If your current setup gives you all four, the rest of this post is academic. If it doesn't, the question is when to graduate from the Console — not whether.

When to graduate from Console-only management

Most teams overshoot the right moment. They live in the Console well past the point where it's costing them, because each individual incident is small enough to ignore. Here are the signals that say you're past it.

You can no longer tell, without checking, whether the template in production matches the template in staging.

You've had at least one incident where a template change broke a variable, deleted content, or shipped to the wrong audience.

A non-engineer needs to update transactional copy and you've been giving them either Console access (risky) or a Slack message to an engineer (slow).

An auditor or customer has asked for change control evidence on transactional email and you've had to assemble it by hand.

You manage more than ten templates, or you operate in more than two environments, or you send in more than one language. The combinatorics get away from you fast.

A second engineer has joined, and at least once they have accidentally overwritten the first engineer's change.

Any one of these is a soft signal. Two or more is the moment to invest a couple of weeks in structural cleanup.

The options, honestly assessed

There are three serious paths beyond Console-only management. Each has real strengths and real costs. Ignore anyone who tells you one of them is universally correct.

Option A: Git + CI + the SES API

Treat templates as files in a Git repository. A CI workflow on merge calls aws ses update-template (or aws sesv2 update-email-template if you are on the v2 API) for the right environment.

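# .github/workflows/deploy-templates.yml
name: Deploy SES templates

on:
  push:
    branches: [main]
    paths: ["templates/**"]

# Needed when role-to-assume is used with GitHub's OIDC provider
permissions:
  id-token: write
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.SES_DEPLOY_ROLE_ARN }}
          aws-region: eu-west-1
      - name: Deploy templates
        run: |
          set -euo pipefail
          for f in templates/*.json; do
            # update-template fails if the template does not exist yet,
            # so fall back to create-template on the first deploy.
            aws ses update-template --cli-input-json "file://$f" \
              || aws ses create-template --cli-input-json "file://$f"
          done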

What you get: Real version history (Git). Real diffs. Pull request reviews. A clear deploy moment. Works with infrastructure you already have.

What it costs: Non-engineers can't reasonably contribute. Every copy change is a PR. You will build, by hand, the bits that aren't in Git: who can publish vs review, environment promotion, rollback ergonomics, drift detection between Git and what's actually in SES, audit log enrichment beyond commit metadata. None of these are hard individually. All of them together is a small internal product, and someone on your team owns its on-call.
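
To give a sense of the size of one of those hand-built pieces, here is a minimal drift check between the repository and SES. It assumes `ses` is a boto3 SES client, that each file under the template directory uses the same `{"Template": {...}}` layout that `--cli-input-json` expects, and that a template missing from SES should surface as the client's own exception. The function names are hypothetical.

```python
import json
from pathlib import Path

FIELDS = ("SubjectPart", "HtmlPart", "TextPart")


def normalize(template: dict) -> dict:
    """Reduce a template to the comparable fields, treating missing as empty."""
    return {field: template.get(field) or "" for field in FIELDS}


def find_drift(ses, template_dir: Path) -> dict:
    """Map template name to the fields that differ between the repo and SES."""
    drift = {}
    for path in sorted(template_dir.glob("*.json")):
        local = json.loads(path.read_text())["Template"]
        name = local["TemplateName"]
        # Raises the client's error if the template does not exist in SES.
        remote = ses.get_template(TemplateName=name)["Template"]
        want, have = normalize(local), normalize(remote)
        changed = [field for field in FIELDS if want[field] != have[field]]
        if changed:
            drift[name] = changed
    return drift
```

Run nightly, an empty result means SES still matches the main branch. A non-empty result usually means someone edited a template outside the pipeline.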

This is a good fit when your team is small, fully technical, and your template volume is low. It scales worse than people expect once a marketer or PM legitimately needs to edit copy.

Option B: Infrastructure as Code (Terraform, CDK, SAM)

Define templates as IaC resources alongside the rest of your AWS infrastructure.

variable "environment" {
  type = string # e.g. "staging" or "production"
}

resource "aws_ses_template" "welcome" {
  name    = "welcome_${var.environment}"
  subject = "Welcome to Acme"
  html    = file("${path.module}/templates/welcome.html")
  text    = file("${path.module}/templates/welcome.txt")
}

What you get: Templates live next to identities, configuration sets, and IAM. Plan/apply gives you a deploy-time review surface. Drift detection comes for free in some cases.

What it costs: IaC tools assume infrastructure-shaped change cadence. Template copy doesn't move at infrastructure cadence — it moves at marketing cadence. Plan diffs become noisy. Rolling back a template change means a new commit and a new apply. Non-engineers are even further from the editing surface than in Option A. Some IaC tools have rough edges around SES specifically (resource churn on update, awkward handling of HTML files, no first-class versioning).

IaC is a good fit when your templates change rarely, when engineering is the only constituency that touches them, and when you already have a strong IaC culture. It's a poor fit when copy changes weekly or when non-engineers need a seat at the table.

Option C: A dedicated control layer for SES

A managed system that sits outside the email delivery path and adds structured template management, version history, audit logs, environment separation, role-based access, and safer publishing — without replacing SES or proxying email traffic.

What you get: The four properties from earlier (source of truth, review, scoped permissions, paper trail) without building them. Non-engineers can edit copy through a UI without ever touching the AWS Console or your CI. SES remains your sender of record, your reputation, your domain identities. The dedicated layer sits in front of the editing experience, not in front of the delivery path.

What it costs: It's another vendor in your stack. You need to be confident the access pattern is bounded — read templates, write templates, list versions, log activity — and that you can leave cleanly if you ever want to.

This is what Sovy is. Sovy is a control layer for Amazon SES templates: it brings versioning, audit, environment scoping, and role-based access on top of the SES API, while leaving SES itself as the system of record for delivery. It does not proxy email. It does not replace SES. It is not in the data path when your application calls SendTemplatedEmail. Its job is to make sure the template that gets there is the right one, that you know who put it there, and that you can roll it back when you need to.

If Option A or Option B fits your team, use it. If you're building the same set of primitives by hand for the third time, that's the signal that a dedicated layer earns its place.

A side-by-side view

| Capability | Console only | Git + CI | IaC (TF / CDK) | Dedicated layer (e.g. Sovy) |
| --- | --- | --- | --- | --- |
| Version history | No | Yes (Git) | Yes (Git) | Yes, native |
| Content-level diff in review | No | Yes (PR) | Yes (plan) | Yes, native |
| Approval before publish | No | Yes (PR review) | Yes (plan/apply) | Yes, role-based |
| Audit trail beyond CloudTrail | No | Partial (commit) | Partial (commit) | Yes, content-level |
| Environment separation | Manual | Manual in CI | Native | Native |
| Role separation (edit vs publish) | No (IAM only) | Manual | Manual | Native |
| Non-engineer editing | Risky | Hard | Harder | Designed for it |
| Drift detection vs SES | No | Build it | Partial | Yes, native |
| Owns on-call for the system | You | You | You | The vendor |

The reframe

The argument for moving past the Console isn't really an argument about tools. It's an argument about category.

Transactional email is part of your production system. Password resets, trial expirations, billing notices, security alerts, invitations — these are not marketing artifacts. They are application output. They have the same uptime, correctness, auditability, and rollback requirements as any other production surface.

If somebody on your team proposed deploying a code change to production by typing it directly into the AWS Console at 4:55pm on a Friday, with no version control, no review, and no rollback, you'd push back. You'd push back without thinking about it.

That's the same change you're making every time you edit a template in the SES Console. The only difference is that the application code path doesn't fail. The customer experience does.

A pragmatic next step

You don't need to pick a tool today. You need to know whether you have the four properties: source of truth, review before publish, scoped permissions, and a paper trail. Spend twenty minutes this week walking through your last three template changes. For each one, write down who edited it, when, why, what changed, who reviewed it, and how you'd roll it back.

If you can answer all of those for all three changes from existing systems, your setup is fine — keep it.

If you can't, you've found the gap. Whether you close it with Git + CI, with IaC, or with a dedicated layer is the cheap part. Knowing the gap exists is the work.

Sovy is a control layer for Amazon SES templates. It adds version history, audit logs, environment separation, and role-based access on top of SES, while staying entirely outside the email delivery path. If you're building these primitives by hand and the cost is starting to show, we'd like to hear from you.