Beyond location management, Lake Formation offers a permission system reminiscent of SQL relational databases. You can grant precise privileges:
- On databases: DESCRIBE, CREATE TABLE, DROP TABLE, ALTER.
- On tables: DESCRIBE, INSERT, DELETE, ALTER, DROP, SELECT (for all columns).
- On columns: SELECT for a specific subset of columns (a key feature for sensitive data protection).
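As a rough illustration of the column-level case, the sketch below builds the arguments for a SELECT grant restricted to a few columns, in the shape expected by the boto3 Lake Formation client's grant_permissions call. The principal ARN, database, table, and column names are hypothetical, and the helper functions are not part of any library.

```python
def column_select_grant(principal_arn, database, table, columns):
    """Build keyword arguments for a column-level SELECT grant."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


def apply_grant(grant_kwargs):
    """Send the grant to Lake Formation (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without it

    client = boto3.client("lakeformation")
    return client.grant_permissions(**grant_kwargs)


# Hypothetical names, for illustration only.
grant = column_select_grant(
    "arn:aws:iam::111122223333:role/analyst",
    "sales_db",
    "orders",
    ["order_id", "order_date", "amount"],
)
```

Keeping the request builder separate from the API call makes the grant easy to review or unit test before anything is sent to AWS.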
Lake Formation also introduces internal roles like "Lake Formation Admin" (a super-grantor role) and "Database Creators" (users with this role become "owners" of the databases they create, with full rights over the objects those databases contain).
The Challenge of Resource-Based Access (and Its Solution)
Pure resource-based governance (granting permissions directly on each database or table) presents a major challenge: you cannot grant permissions on a catalog object that doesn't exist yet. In a dynamic production environment with thousands of tables constantly being created, this quickly becomes an operational nightmare. Every new creation would require associated permission grants, leading to delays and significant manual effort.
The solution lies in Attribute-Based Access Control (ABAC), implemented via Lake Formation Tags.
Lake Formation Tags: Your Governance Metadata
Lake Formation tags are custom tags, distinct from standard AWS tags. They allow you to associate catalog objects (databases, tables, columns) with business or technical metadata.
They take the form Key: [Value1, Value2, ...]. Examples of relevant tags for governance:
- confidentiality: ['internal', 'public', 'restricted']
- data_domains: ['finance', 'marketing', 'logistics']
- privacy: ['advanced', 'intermediate', 'basic']
- storage_zone: ['raw', 'formatted', 'curated']
You can associate multiple keys with a catalog object, but only one value per key. For example, confidentiality: restricted and data_domains: finance is valid for the same table.
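The "many keys, one value per key" rule can be checked before tags are applied. The following is a minimal sketch, assuming the tag vocabulary listed above. Both helper names are hypothetical, and to_lf_tags only mirrors the TagKey / TagValues pair shape used by the boto3 add_lf_tags_to_resource call.

```python
# Tag keys and their allowed values, taken from the examples above.
TAG_DEFINITIONS = {
    "confidentiality": {"internal", "public", "restricted"},
    "data_domains": {"finance", "marketing", "logistics"},
    "privacy": {"advanced", "intermediate", "basic"},
    "storage_zone": {"raw", "formatted", "curated"},
}


def validate_assignment(assignment, definitions=TAG_DEFINITIONS):
    """Return a list of problems; an empty list means the assignment is valid."""
    problems = []
    for key, value in assignment.items():
        if key not in definitions:
            problems.append(f"unknown tag key: {key}")
        elif not isinstance(value, str):
            # Only one value per key may be attached to a catalog object.
            problems.append(f"{key}: expected a single value")
        elif value not in definitions[key]:
            problems.append(f"{key}: value {value!r} is not allowed")
    return problems


def to_lf_tags(assignment):
    """Convert a validated assignment into a list of TagKey/TagValues pairs."""
    return [{"TagKey": k, "TagValues": [v]} for k, v in assignment.items()]


ok = validate_assignment({"confidentiality": "restricted", "data_domains": "finance"})
bad = validate_assignment({"confidentiality": ["internal", "public"]})
```

Here ok is an empty list, while bad reports that confidentiality was given more than one value.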
Tag Policies: Granting Permissions at Scale
With tags, you grant permissions not on specific objects, but on objects that possess a given combination of tags. A tag policy is defined by:
- The permissions (e.g., SELECT, INSERT).
- The tag keys and their values.
- The object type (database, table, column).
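The three ingredients of a tag policy map fairly directly onto a grant request. The sketch below assembles one in the LFTagPolicy resource shape used by the boto3 grant_permissions call. The function name and the principal ARN are hypothetical, and the request is only built here, not sent.

```python
def tag_policy_grant(principal_arn, permissions, resource_type, expression):
    """Build keyword arguments for a grant on all objects matching a tag expression.

    expression maps each tag key to the list of values the policy accepts.
    """
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "LFTagPolicy": {
                # The boto3 API accepts "DATABASE" or "TABLE" here.
                "ResourceType": resource_type,
                "Expression": [
                    {"TagKey": key, "TagValues": list(values)}
                    for key, values in expression.items()
                ],
            }
        },
        "Permissions": list(permissions),
    }


# Hypothetical principal; tag keys and values come from the examples above.
policy = tag_policy_grant(
    "arn:aws:iam::111122223333:role/finance-analyst",
    ["SELECT"],
    "TABLE",
    {"data_domains": ["finance"], "confidentiality": ["internal", "public"]},
)
```

The resulting dictionary could then be passed as keyword arguments to a Lake Formation client, for example client.grant_permissions(**policy).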
Default "AND" Logic: A policy with Tag1:ValueA and Tag2:ValueB applies to objects that have BOTH Tag1:ValueA AND Tag2:ValueB. For "OR" logic, you'll need to create multiple distinct tag policies for the same principal: one for Tag1:ValueA, another for Tag2:ValueB.
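To make the AND/OR behavior concrete, here is a small pure-Python model of the matching rule. It is a simplified sketch for reasoning about policies, not how Lake Formation evaluates them internally, and all names are hypothetical.

```python
def policy_matches(policy, object_tags):
    """AND across keys: every key in the policy must be on the object
    with one of the values the policy accepts."""
    return all(object_tags.get(key) in values for key, values in policy.items())


def has_access(policies, object_tags):
    """OR across policies: one matching policy is enough."""
    return any(policy_matches(p, object_tags) for p in policies)


table_a = {"confidentiality": "public", "data_domains": "finance"}
table_b = {"confidentiality": "public", "data_domains": "marketing"}

# One policy with two keys: both must match (AND).
and_policy = {"confidentiality": {"public"}, "data_domains": {"finance"}}

# Two single-key policies for the same principal: either may match (OR).
or_policies = [{"confidentiality": {"public"}}, {"data_domains": {"finance"}}]

and_a = policy_matches(and_policy, table_a)  # True: both tags match
and_b = policy_matches(and_policy, table_b)  # False: data_domains differs
or_b = has_access(or_policies, table_b)      # True: the confidentiality policy matches
```

A table tagged only with confidentiality: public and data_domains: marketing is therefore rejected by the combined policy but accepted by the pair of separate policies.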
Advantages of Tag Policies:
- Agility: No need to know objects in advance to grant permissions. Users automatically gain access to new data as soon as it's correctly tagged.
- Adaptability: Aligns perfectly with your metadata governance.
- Scalability: Drastically reduces the number of manual "grants" to manage.
However, this comes with a requirement: precise metadata governance and rigorous, automated application of tags. Without this, the system won't be reliable.