Refactoring Infrastructure-as-Code (IaC) is inevitable as platforms mature. What starts as a monolithic "everything-in-one" stack eventually needs to be decomposed into logical layers, for instance separating the application layer from the infrastructure layer.
For stateless resources like Lambda functions or SQS queues, this is straightforward: delete and recreate. But stateful resources (DynamoDB tables, RDS databases, S3 buckets) are a different story. One wrong move and terabytes of data vanish.
Manual operations and human errors are common in infrastructure migrations, especially when working under pressure.
This exact scenario happened. During a migration attempt, tables were successfully detached from the source stack using DeletionPolicy: Retain. But instead of importing them into the target stack, new tables were created from scratch. The result: complete loss of historical data, leaving only the delta (new data created by users after the migration). This triggered a data recovery process to restore years of metadata.
This guide walks through the correct "safe path" for migrating DynamoDB tables between CloudFormation stacks using DeletionPolicies and Resource Imports, ensuring data and jobs remain intact.
Why these tables matter: the platform context
In modern data platforms, DynamoDB tables serve as the metadata backbone for managing the entire lakehouse infrastructure. These tables store critical information about:
- Data schemas: table structures, column definitions, and data types for datasets across the platform
- Data flows and jobs: pipeline metadata, job execution history, dependencies, and orchestration configurations
- File lifecycle management: metadata used by automated cleaning processes to determine which files can be safely deleted or archived
These tables act as reference repositories that power cost optimization strategies. When a file cleaning job runs, it queries these DynamoDB tables to understand:
- Which datasets are still active
- What retention policies apply
- Which files are candidates for deletion or transition to cheaper storage tiers (S3 Glacier, for example)
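For illustration only, a single job-metadata item in one of these tables might look something like the sketch below; the attribute names are hypothetical and stand in for whatever schema the platform actually uses:

{
  "job_id": "daily-sales-ingest",
  "dataset": "sales.transactions",
  "schema_version": "v12",
  "retention_days": 365,
  "last_successful_run": "2024-01-15T04:30:00Z",
  "storage_tier_rule": "transition-to-glacier-after-90-days"
}

A single lookup on an item like this tells a cleaning job whether the files behind a dataset are still referenced and how long they must be kept.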
The stakes are high: losing these tables doesn't just mean losing metadata. It means losing the knowledge of what data exists, how it's structured, and what can be safely cleaned up. This can result either in catastrophic data deletion (cleaning too aggressively without the reference data) or in uncontrolled storage costs (cleaning nothing at all because the reference data is gone).
When historical metadata is lost, data teams must reconstruct it manually, scanning terabytes of files to rebuild schema registries, job histories, and retention policies.
The critical mistake: detach and recreate
Here's what went wrong in the incident that inspired this guide:
Phase 1 (correct): tables were properly detached from the source stack using DeletionPolicy: Retain. The physical DynamoDB tables remained in the AWS account, untouched and containing all historical data.
Phase 2 (the mistake): instead of importing these existing tables into the target stack, the target stack was deployed with fresh table definitions. The orphaned tables left behind by the previous stack were deleted (fortunately with a backup taken on deletion), and CloudFormation then created brand-new, empty tables with the same names.
The result:
- All historical metadata: gone
- Schema definitions from the past X years: gone
- Job execution history: gone
- Only new data created after the migration was captured
The recovery (one afternoon):
- Restore tables from backups
- Restore underlying S3 data if it was also deleted
This disaster was entirely preventable with CloudFormation's resource import feature.
DynamoDB backup safety net
Keep in mind that whenever a DynamoDB table is deleted, AWS offers the option to create a backup before deletion. If this scenario happens, the situation can be recovered relatively quickly by:
- Restoring the DynamoDB table from the backup
- Restoring the underlying S3 data (including versioned objects) if file lifecycle jobs deleted data based on missing metadata
- Reconnecting the restored tables to the new stack using the import process described below
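As a sketch, a restore from the on-deletion backup might look like this. The backup ARN below is a placeholder, and the target table name must not already exist, so the empty replacement table has to be deleted first (or the restore done under a temporary name):

# List available backups for the deleted table
aws dynamodb list-backups \
--table-name production-data-jobs-metadata

# Restore from the chosen backup (placeholder ARN)
aws dynamodb restore-table-from-backup \
--target-table-name production-data-jobs-metadata \
--backup-arn arn:aws:dynamodb:eu-west-1:123456789012:table/production-data-jobs-metadata/backup/01700000000000-abcdefgh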
However, relying on backups means downtime, potential data loss between backup and deletion, and significant recovery effort. The better approach is to prevent the deletion in the first place.
Here is the correct solution: CloudFormation resource imports.
CloudFormation resource imports
AWS CloudFormation's resource import feature allows bringing existing resources under stack management without recreating them. Instead of the risky "detach and recreate" approach, it tells the new stack: "See that table? You own it now."
Here's the 3-phase strategy for zero-downtime, zero-data-loss migration.
Phase 1: the safety net (source stack)
The table definition cannot simply be deleted from the old stack. CloudFormation will interpret this as "destroy the resource." CloudFormation must first be instructed to abandon the resource gracefully.
Step 1: apply DeletionPolicy: Retain
Update the source YAML template by adding the DeletionPolicy attribute to every stateful resource to be migrated:
DataJobsMetadataTable:
  Type: AWS::DynamoDB::Table
  DeletionPolicy: Retain  # The most critical line in this guide
  Properties:
    TableName: production-data-jobs-metadata
    # ... rest of configuration
Deploy this change immediately:
aws cloudformation update-stack \
--stack-name application-stack-prod \
--template-body file://application-stack.yaml
What happens: nothing visible changes, but CloudFormation now knows to preserve the physical table even if it's later removed from the template.
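One way to double-check that the policy is actually part of the deployed template is to fetch it back from CloudFormation (a quick sketch using the stack name from above):

# The deployed template should now show DeletionPolicy: Retain on every stateful resource
aws cloudformation get-template \
--stack-name application-stack-prod \
--query 'TemplateBody' \
--output text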
Step 2: orphan the resource
Now remove the table definition entirely from the source template and redeploy:
aws cloudformation update-stack \
--stack-name application-stack-prod \
--template-body file://application-stack-trimmed.yaml
Result: the stack update completes successfully. The resource is removed from CloudFormation's management, but the physical DynamoDB table remains active and unchanged in the AWS account.
Critical checkpoint: verify that the physical table still exists and contains data before proceeding. Do NOT delete these tables manually. Do NOT let FinOps scripts run against them.
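A quick way to run this checkpoint from the CLI, using the table and stack names from the examples above:

# The physical table should still exist and still report its items
aws dynamodb describe-table \
--table-name production-data-jobs-metadata \
--query 'Table.[TableStatus,ItemCount]'

# The table should no longer appear among the stack's resources
aws cloudformation list-stack-resources \
--stack-name application-stack-prod \
--query "StackResourceSummaries[?ResourceType=='AWS::DynamoDB::Table']"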
Phase 2: the adoption (target stack) - the correct way
This is the step that was skipped in the incident. Instead of creating new tables, the existing orphaned tables must be imported.
Step 1: match the configuration exactly
Critical: the new template definition must match the existing physical resource configuration precisely: KeySchema, BillingMode, SSESpecification, StreamSpecification, and all other attributes.
If there's any mismatch (e.g., template specifies PAY_PER_REQUEST but the actual table uses PROVISIONED), the import will fail.
Pro tip: run aws dynamodb describe-table on the existing resource and use the output to ensure the YAML is accurate.
aws dynamodb describe-table \
--table-name production-data-jobs-metadata \
--output json > current-table-config.json
Use this output to build the CloudFormation template that exactly matches the existing table.
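As an illustration, the resulting definition might look like the sketch below. The key schema, billing mode, and attribute names here are placeholders; take the real values from current-table-config.json. Note that CloudFormation requires an explicit DeletionPolicy on every resource being imported:

DataJobsMetadataTable:
  Type: AWS::DynamoDB::Table
  DeletionPolicy: Retain                  # required for imported resources
  Properties:
    TableName: production-data-jobs-metadata
    BillingMode: PAY_PER_REQUEST          # must match the existing table exactly
    AttributeDefinitions:
      - AttributeName: job_id             # placeholder key attribute
        AttributeType: S
    KeySchema:
      - AttributeName: job_id
        KeyType: HASH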
Step 2: create the import mapping file
Create an import-resources.json file that maps logical IDs (from the template) to physical IDs (the actual AWS resource identifiers):
[
  {
    "ResourceType": "AWS::DynamoDB::Table",
    "LogicalResourceId": "DataJobsMetadataTable",
    "ResourceIdentifier": {
      "TableName": "production-data-jobs-metadata"
    }
  },
  {
    "ResourceType": "AWS::DynamoDB::Table",
    "LogicalResourceId": "DataSchemasTable",
    "ResourceIdentifier": {
      "TableName": "production-data-schemas"
    }
  },
  {
    "ResourceType": "AWS::SSM::Parameter",
    "LogicalResourceId": "JobsTableParameter",
    "ResourceIdentifier": {
      "Name": "/platform/prod/dynamodb/jobs-metadata"
    }
  }
]
Step 3: execute the import (NOT a regular stack creation)
This is the critical difference. Do NOT run aws cloudformation create-stack or aws cloudformation update-stack. Use the special IMPORT change set type:
# Create the import change set
aws cloudformation create-change-set \
--stack-name infrastructure-stack-prod \
--change-set-name import-dynamo-metadata-tables \
--change-set-type IMPORT \
--resources-to-import file://import-resources.json \
--template-body file://infrastructure-stack.yaml
# Review the change set carefully in the AWS Console
# Verify it shows "Import" operations, NOT "Add" operations
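# Optional CLI review (same stack and change-set names as above); every Action should read "Import"
aws cloudformation describe-change-set \
--stack-name infrastructure-stack-prod \
--change-set-name import-dynamo-metadata-tables \
--query 'Changes[].ResourceChange.[Action,LogicalResourceId]' \
--output table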
# Then execute
aws cloudformation execute-change-set \
--stack-name infrastructure-stack-prod \
--change-set-name import-dynamo-metadata-tables
What happens: CloudFormation adopts the existing tables without making any changes to them. The resources are now managed by infrastructure-stack-prod, and all historical data remains intact.
Phase 3: drift detection & validation
Once the import completes, validate that everything is correctly configured.
Run drift detection
Execute a drift detection on the new stack:
aws cloudformation detect-stack-drift \
--stack-name infrastructure-stack-prod
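detect-stack-drift runs asynchronously and returns a StackDriftDetectionId; once the detection finishes, the results can be pulled back as follows (the detection ID is a placeholder):

# Check the status of the detection run
aws cloudformation describe-stack-drift-detection-status \
--stack-drift-detection-id 12345678-1234-1234-1234-123456789012

# List per-resource drift results for the stack
aws cloudformation describe-stack-resource-drifts \
--stack-name infrastructure-stack-prod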
If any properties were missed in the YAML (a tag, a throughput setting, an encryption configuration), CloudFormation will report the resource as "Drifted." Update the template to match reality, then redeploy.
Verify data integrity
Most important: verify that the imported tables contain all historical data:
# Check item count
aws dynamodb describe-table \
--table-name production-data-jobs-metadata \
--query 'Table.ItemCount'
# Spot-check historical records (scan is used here because query would require a key condition)
aws dynamodb scan \
--table-name production-data-jobs-metadata \
--limit 10
If the tables are empty or show only recent data, the import was not done correctly or the wrong tables were imported.
Verify tags
Ensure all required tags (Cost Center, Owner, Environment) are present. Once FinOps scripts are re-enabled, improperly tagged resources will be flagged for deletion.
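A quick way to list the current tags on an imported table (the account ID and region in the ARN are placeholders):

aws dynamodb list-tags-of-resource \
--resource-arn arn:aws:dynamodb:eu-west-1:123456789012:table/production-data-jobs-metadata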
Test application connectivity
Verify that data pipelines and cleaning jobs can still read from the tables. Since table names and configurations haven't changed, there should be no disruption, but always verify.
Validate file cleaning jobs
Run a dry-run of file cleaning processes to ensure they can still query the metadata tables correctly.
Conclusion
The difference between success and disaster comes down to one flag: using --change-set-type IMPORT instead of creating new resources.
By properly importing existing tables rather than recreating them, it's possible to migrate terabytes of metadata across CloudFormation stacks with zero downtime and zero data loss, without triggering data recovery processes or disrupting cost optimization pipelines.
The key insight: CloudFormation's resource import feature transforms a terrifying migration into a controlled, auditable process. The DeletionPolicy: Retain acts as a safety net, and the import operation transfers ownership without touching the data.
For data platforms where metadata drives critical operations (schema management, pipeline orchestration, and storage cost optimization), this approach is not just recommended, it's essential.
The lesson learned: detaching tables from a stack is only half the battle. The next stack must import them, not recreate them. Skip this step and face data recovery work.