TIL #2: From SQL-like to ABAC: Elevating Your Data Governance with Lake Formation Tags

Tue Jul 29 2025

Beyond location management, Lake Formation offers a permission system reminiscent of SQL relational databases. You can grant precise privileges:

  • On databases: DESCRIBE, CREATE TABLE, DROP TABLE, ALTER.
  • On tables: DESCRIBE, INSERT, DELETE, ALTER, DROP, SELECT (for all columns).
  • On columns: SELECT for a specific subset of columns (a key feature for sensitive data protection).
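
As a rough illustration of the column-level case, the sketch below builds the arguments for boto3's `lakeformation.grant_permissions` call using a `TableWithColumns` resource. The principal ARN, database, table, and column names are hypothetical placeholders, and the actual call is left commented out so the snippet runs without AWS credentials.

```python
def column_select_grant(principal_arn, database, table, columns):
    """Build keyword arguments for a column-level SELECT grant."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


# Hypothetical names, for illustration only.
column_grant = column_select_grant(
    "arn:aws:iam::111122223333:role/analyst",
    "sales_db",
    "orders",
    ["order_id", "order_date", "amount"],
)

# With credentials configured, the grant could be applied like this:
# import boto3
# boto3.client("lakeformation").grant_permissions(**column_grant)
```

Any sensitive column left out of `ColumnNames` (a customer email, say) stays out of reach of the principal.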

Lake Formation also introduces internal roles like "Lake Formation Admin" (a super-grantor role) and "Database Creators" (users with this role become "owners" of the databases they create, granting them full rights over contained objects).


The Challenge of Resource-Based Access (and Its Solution)

Pure resource-based governance (granting permissions directly on each database or table) presents a major challenge: you cannot grant permissions on a catalog object that doesn't exist yet. In a dynamic production environment with thousands of tables constantly being created, this quickly becomes an operational nightmare. Every new object needs its own permission grants after it is created, which means delays and significant manual effort.

The solution lies in Attribute-Based Access Control (ABAC), implemented via Lake Formation Tags.


Lake Formation Tags: Your Governance Metadata

Lake Formation tags are custom tags, distinct from standard AWS tags. They allow you to associate catalog objects (databases, tables, columns) with business or technical metadata.

They take the form Key : [Value1, Value2, ...]. Examples of relevant tags for governance:

  • confidentiality: ['internal', 'public', 'restricted']
  • data_domains: ['finance', 'marketing', 'logistics']
  • privacy: ['advanced', 'intermediate', 'basic']
  • storage_zone: ['raw', 'formatted', 'curated']

You can associate multiple keys with a catalog object, but only one value per key. For example, confidentiality: restricted and data_domains: finance is valid for the same table.
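
To make the "one value per key" rule concrete, here is a minimal sketch that validates a tag assignment against the example keys above before it is applied. `TAG_ONTOLOGY`, `validate_assignment`, and `to_lf_tags` are hypothetical helper names, not part of any AWS SDK. The output of `to_lf_tags` follows the `LFTags` list shape that boto3's `add_lf_tags_to_resource` accepts, but no AWS call is made here.

```python
# Allowed values per tag key (taken from the examples above).
TAG_ONTOLOGY = {
    "confidentiality": {"internal", "public", "restricted"},
    "data_domains": {"finance", "marketing", "logistics"},
    "privacy": {"advanced", "intermediate", "basic"},
    "storage_zone": {"raw", "formatted", "curated"},
}


def validate_assignment(assignment):
    """Check a {key: value} tag assignment against the ontology.

    Representing the assignment as a dict enforces "one value per key"
    by construction. Returns a list of problems (empty if valid).
    """
    problems = []
    for key, value in assignment.items():
        if key not in TAG_ONTOLOGY:
            problems.append(f"unknown tag key: {key}")
        elif value not in TAG_ONTOLOGY[key]:
            problems.append(f"invalid value {value!r} for key {key}")
    return problems


def to_lf_tags(assignment):
    """Convert an assignment to a list of {TagKey, TagValues} pairs."""
    return [{"TagKey": k, "TagValues": [v]} for k, v in sorted(assignment.items())]


table_tags = {"confidentiality": "restricted", "data_domains": "finance"}
print(validate_assignment(table_tags))  # no problems for a valid assignment
print(to_lf_tags(table_tags))
```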


Tag Policies: Granting Permissions at Scale

With tags, you grant permissions not on specific objects, but on objects that possess a given combination of tags. A tag policy is defined by:

  • The permissions (e.g., SELECT, INSERT).
  • The tag keys and their values.
  • The object type (database, table, column).
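
The three ingredients listed above map fairly directly onto the `LFTagPolicy` resource of boto3's `lakeformation.grant_permissions`. The sketch below only builds the request arguments. The role ARN and the helper name `tag_policy_grant` are assumptions for illustration, and the real call is commented out so the snippet runs offline.

```python
def tag_policy_grant(principal_arn, permissions, required_tags, resource_type):
    """Build keyword arguments for a tag-based grant.

    required_tags maps each tag key to the single value the object must carry.
    """
    expression = [
        {"TagKey": key, "TagValues": [value]}
        for key, value in sorted(required_tags.items())
    ]
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "LFTagPolicy": {
                "ResourceType": resource_type,
                "Expression": expression,
            }
        },
        "Permissions": list(permissions),
    }


# Hypothetical principal; tag keys and values come from the examples in this post.
policy_grant = tag_policy_grant(
    "arn:aws:iam::111122223333:role/finance-analyst",
    ["SELECT", "DESCRIBE"],
    {"confidentiality": "internal", "data_domains": "finance"},
    "TABLE",
)

# With credentials configured:
# import boto3
# boto3.client("lakeformation").grant_permissions(**policy_grant)
```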

Default "AND" Logic: A policy with Tag1:ValueA and Tag2:ValueB applies to objects that have BOTH Tag1:ValueA AND Tag2:ValueB. For an "OR" logic, you'll need to create multiple distinct tag policies for the same principal: one for Tag1:ValueA, another for Tag2:ValueB.
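
The AND/OR behaviour can be sketched with a toy evaluator. This is a simplified model for reasoning about policies, not Lake Formation's actual authorization engine. Each policy is a plain dict of required key/value pairs, and the function names are made up for this example.

```python
def policy_matches(policy, object_tags):
    """AND: the object must carry every key/value pair the policy requires."""
    return all(object_tags.get(key) == value for key, value in policy.items())


def has_access(policies, object_tags):
    """OR: access is granted if at least one of the principal's policies matches."""
    return any(policy_matches(policy, object_tags) for policy in policies)


finance_table = {"confidentiality": "restricted", "data_domains": "finance"}
marketing_table = {"confidentiality": "public", "data_domains": "marketing"}

# One policy with two tags: both must be present on the object.
and_policy = [{"confidentiality": "restricted", "data_domains": "finance"}]

# Two separate policies for the same principal: either one is enough.
or_policies = [{"confidentiality": "public"}, {"data_domains": "finance"}]

print(has_access(and_policy, finance_table))     # True
print(has_access(and_policy, marketing_table))   # False
print(has_access(or_policies, finance_table))    # True
print(has_access(or_policies, marketing_table))  # True
```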


Advantages of Tag Policies:

  • Agility: No need to know objects in advance to grant permissions. Users automatically gain access to new data as soon as it's correctly tagged.
  • Adaptability: Aligns perfectly with your metadata governance.
  • Scalability: Drastically reduces the number of manual "grants" to manage.

However, this comes with a requirement: precise metadata governance and rigorous, automated application of tags. Without this, the system won't be reliable.