
Grail: The New Data Lakehouse


What is Grail?

Grail is Dynatrace's unified data lakehouse. In Gen2, data was stored in separate backends: metrics in one place, logs in another, traces in a third. In Gen3, everything goes into Grail and is queryable with DQL (Dynatrace Query Language).

GEN2 Data Storage                       GEN3 Grail
──────────────────────────────────────  ──────────────────────────────────────
Metrics → Timeseries DB                 Grail → timeseries command
Logs → Log storage (Elasticsearch)      Grail → fetch logs
Traces → PurePath storage               Grail → fetch spans
Entities → Topology DB                  Grail → smartscapeNodes
User sessions → USQL storage            Grail → fetch user.events
Events → Event store                    Grail → fetch events
Business events → (didn't exist)        Grail → fetch bizevents
Security events → (limited)             Grail → fetch security.events
Problems → Problem feed                 Grail → fetch dt.davis.problems
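The mapping above can be kept as a small lookup while migrating Gen2 habits to DQL. The sketch below is only an illustration: `GEN3_ENTRY_POINT` and `starter_query` are hypothetical names, not part of any Dynatrace SDK, and the generated strings are starting points to paste into a notebook, not complete queries.

```python
# Illustrative lookup built from the table above; not a Dynatrace API.
GEN3_ENTRY_POINT = {
    "metrics": "timeseries",
    "logs": "fetch logs",
    "traces": "fetch spans",
    "entities": "smartscapeNodes",
    "user sessions": "fetch user.events",
    "events": "fetch events",
    "business events": "fetch bizevents",
    "security events": "fetch security.events",
    "problems": "fetch dt.davis.problems",
}


def starter_query(gen2_data_type: str, limit: int = 10) -> str:
    """Return a starter DQL string for a Gen2 data type."""
    entry = GEN3_ENTRY_POINT[gen2_data_type.lower()]
    if entry.startswith("fetch "):
        # Record tables can be sampled with a plain limit.
        return f"{entry} | limit {limit}"
    # timeseries and smartscapeNodes need arguments (a metric key, a node
    # type) that this sketch does not guess, so only the command is returned.
    return entry


print(starter_query("Logs"))
print(starter_query("Metrics"))
```

For the commands that take arguments, the sketch deliberately stops at the command name rather than inventing a metric key or node type.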

Buckets and Tables

Grail organizes data into buckets and tables:

  • Default bucket: where all standard data lands (logs, metrics, spans, events)
  • Custom buckets: you can create buckets with different retention policies
  • Tables: each data type has its own table within a bucket

Bucket: default
  ├── logs          (application + infrastructure logs)
  ├── spans         (distributed traces)
  ├── events        (Davis events, custom events)
  ├── bizevents     (business events)
  └── metrics       (timeseries data)

Bucket: custom_long_retention
  └── logs          (compliance logs, 5-year retention)
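
To make the bucket/table relationship concrete, the layout above can be written down as plain data. This is a minimal sketch for reasoning about the structure only: the `Bucket` class, `LAYOUT`, and `buckets_holding` are hypothetical and do not correspond to Grail API objects.

```python
from dataclasses import dataclass, field


@dataclass
class Bucket:
    """Hypothetical in-memory model of a bucket; not a Grail API object."""
    name: str
    tables: dict = field(default_factory=dict)  # table name -> description


# The example layout from the diagram above.
LAYOUT = [
    Bucket("default", {
        "logs": "application + infrastructure logs",
        "spans": "distributed traces",
        "events": "Davis events, custom events",
        "bizevents": "business events",
        "metrics": "timeseries data",
    }),
    Bucket("custom_long_retention", {
        "logs": "compliance logs, 5-year retention",
    }),
]


def buckets_holding(table: str) -> list:
    """List the names of buckets that contain the given table."""
    return [b.name for b in LAYOUT if table in b.tables]


print(buckets_holding("logs"))
print(buckets_holding("spans"))
```

The point of the model is that one table name, such as logs, can appear in several buckets, so a query over logs may span buckets with different retention.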

Retention

Gen2 had fixed retention per data type. Gen3 lets you configure retention per bucket:

GEN2 Retention                          GEN3 Retention
──────────────────────────────────────  ──────────────────────────────────────
Logs: 35 days (fixed)                   Logs: configurable per bucket
Metrics: 10 years (fixed)               Metrics: configurable per bucket
Traces: 10 days (fixed)                 Spans: configurable per bucket
User sessions: 35 days (fixed)          User events: configurable per bucket
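
Per-bucket retention means the same record age can be fine in one bucket and expired in another. The sketch below shows that arithmetic under assumed values: the retention numbers in `RETENTION_DAYS` are examples chosen for illustration (35 days mirrors the Gen2 log figure above, 5 × 365 approximates the 5-year compliance bucket), and `is_retained` is a hypothetical helper, not a Dynatrace function.

```python
from datetime import datetime, timedelta, timezone

# Assumed example values, not Dynatrace defaults.
RETENTION_DAYS = {
    "default": 35,
    "custom_long_retention": 5 * 365,
}


def is_retained(bucket: str, record_time: datetime, now: datetime) -> bool:
    """True if a record written at record_time is still within retention."""
    return now - record_time <= timedelta(days=RETENTION_DAYS[bucket])


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
hundred_days_ago = now - timedelta(days=100)

print(is_retained("default", hundred_days_ago, now))
print(is_retained("custom_long_retention", hundred_days_ago, now))
```

A 100-day-old record falls outside the assumed 35-day bucket but well inside the long-retention one, which is why routing compliance data to its own bucket matters.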

OpenPipeline

Gen3 introduces OpenPipeline, a processing engine that sits between data ingestion and storage. It replaces log processing rules, metric extraction, and data routing.

Data Source → OpenPipeline → Routing → Bucket/Table
                  │
                  ├── Parse (extract fields)
                  ├── Filter (drop unwanted data)
                  ├── Transform (enrich, rename)
                  ├── Extract metrics (create metrics from logs)
                  └── Route (send to specific bucket)

In Gen2, you configured log processing rules in Settings. In Gen3, you configure OpenPipeline rules: the same concept, but more powerful and applicable to all data types.
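
The five stages in the diagram can be pictured as functions applied to each incoming record. The following is a toy simulation only, not OpenPipeline configuration: the log line format, the `team` enrichment rule, the `log.errors` metric name, and every function name are assumptions made for the example. Real OpenPipeline rules are configured in Dynatrace, not written in Python.

```python
import re

# Assumed log format for this sketch: "<LEVEL> <service> <message>".
LINE = re.compile(r"^(?P<level>[A-Z]+) (?P<service>\S+) (?P<message>.*)$")


def parse(raw):
    """Parse: extract fields from a raw line; None if it does not match."""
    match = LINE.match(raw)
    return match.groupdict() if match else None


def keep(record):
    """Filter: drop unwanted data (here, DEBUG records)."""
    return record["level"] != "DEBUG"


def transform(record):
    """Transform: rename a field and enrich with a derived attribute."""
    out = dict(record)
    out["loglevel"] = out.pop("level")
    out["team"] = "payments" if out["service"].startswith("pay") else "platform"
    return out


def route(record):
    """Route: pick a target bucket (names from the example layout)."""
    return "custom_long_retention" if record["service"] == "audit" else "default"


def run_pipeline(lines):
    """Run all stages; return (records grouped by bucket, extracted metrics)."""
    buckets = {}
    metrics = {"log.errors": 0}  # Extract metrics: count ERROR records
    for raw in lines:
        record = parse(raw)
        if record is None or not keep(record):
            continue
        record = transform(record)
        if record["loglevel"] == "ERROR":
            metrics["log.errors"] += 1
        buckets.setdefault(route(record), []).append(record)
    return buckets, metrics


stored, extracted = run_pipeline([
    "ERROR pay-api card declined",
    "DEBUG pay-api cache miss",
    "INFO audit user login",
])
print(sorted(stored), extracted)
```

Because filtering happens before storage, the DEBUG line never reaches a bucket, while the audit record is routed to the long-retention bucket.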

Data Access Permissions

Grail introduces access control at four levels, down to individual records and fields:

  • Bucket-level: who can read/write which buckets
  • Table-level: who can query which tables (logs, spans, etc.)
  • Record-level: filter data based on attributes (e.g., only see logs from your team's services)
  • Field-level: hide sensitive fields (e.g., mask PII in log content)

This is far more granular than Gen2, where data access was largely all-or-nothing per environment.
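
One way to picture how the four levels combine is as successive checks, coarsest first. The sketch below is a simplified model under stated assumptions: the `POLICY` dictionary shape, the `team` attribute, and `visible_records` are all hypothetical. Real permissions are configured in Dynatrace, not expressed as Python objects.

```python
# Hypothetical policy shape, for illustration only.
POLICY = {
    "buckets": {"default"},                 # bucket-level
    "tables": {"logs"},                     # table-level
    "record_filter": {"team": "payments"},  # record-level
    "masked_fields": {"content"},           # field-level
}


def visible_records(policy, bucket, table, records):
    """Apply bucket, table, record, and field checks in that order."""
    if bucket not in policy["buckets"] or table not in policy["tables"]:
        return []
    result = []
    for record in records:
        # Record-level: skip records whose attributes do not match.
        if any(record.get(k) != v for k, v in policy["record_filter"].items()):
            continue
        # Field-level: mask sensitive fields instead of dropping the record.
        result.append({
            k: ("***" if k in policy["masked_fields"] else v)
            for k, v in record.items()
        })
    return result


logs = [
    {"team": "payments", "loglevel": "ERROR", "content": "card 4111 declined"},
    {"team": "search", "loglevel": "INFO", "content": "q=shoes"},
]
print(visible_records(POLICY, "default", "logs", logs))
```

With this policy, the user sees only the payments record, and its `content` field is masked; a query against another table or bucket returns nothing.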