Data Reflections course

Mar 18, 2022 17:41


https://university.dremio.com/courses/course-v1:dremio+D102+2019/about

Reflections are optimized data structures that can dramatically accelerate query execution.

Reflections are physically optimized representations of source data. When a reflection contains all the data a query needs, Dremio uses it to satisfy the query in place of the source data.
If the data lake contains row-oriented data formats like CSV or JSON, a reflection will dramatically improve query performance compared with querying the original raw data.
Reflections store pre-computed results for later use. Complicated joins and data transformations can be computed once in a reflection and reused, reducing the work that each query needs to perform.
Reflections can partially or completely satisfy queries in place of source data, meaning they can be combined and reused in conjunction with other query filters, operations, and optimizations.
Since reflections are an optional optimization that is transparent to users, you can add them iteratively when needed and tune your reflection strategy over time.
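
The matching rule above (a reflection is used when it contains all the data a query needs) can be illustrated with a toy model. This is only a sketch and not Dremio's planner or API: the `Reflection` class, the `choose_source` function, and the column names are hypothetical, and real matching also takes filters, joins, and cost into account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reflection:
    """Hypothetical stand-in for a reflection: a name plus the columns it stores."""
    name: str
    columns: frozenset


def covers(reflection, query_columns):
    """True when the reflection holds every column the query reads."""
    return set(query_columns) <= reflection.columns


def choose_source(query_columns, reflections):
    """Return the name of the first covering reflection, else fall back to the source."""
    for reflection in reflections:
        if covers(reflection, query_columns):
            return reflection.name
    return "source"


# Assumed example: one reflection over three columns of a sales dataset.
reflections = [Reflection("raw_sales", frozenset({"city", "product", "amount"}))]

covered = choose_source(["city", "amount"], reflections)
not_covered = choose_source(["city", "customer_email"], reflections)
print(covered, not_covered)
```

The user's query text is identical in both cases; only the data that answers it changes, which is what makes the optimization transparent.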
Dremio provides two key reflection types:
- Raw Reflection - One or more columns from the dataset; preserves row-level fidelity
- Aggregate Reflection - Precomputed measures for all dimension combinations, like a materialized aggregate
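
The difference between the two types can be sketched on a small in-memory table. This is a conceptual illustration under stated assumptions, not how Dremio builds or stores reflections: the sample rows and the `raw_reflection`, `aggregate_reflection`, and `rollup` helpers are hypothetical.

```python
from collections import defaultdict

# Assumed sample data standing in for a source dataset.
ROWS = [
    {"order_id": 1, "city": "Oslo", "product": "tea", "amount": 10, "note": "gift"},
    {"order_id": 2, "city": "Oslo", "product": "tea", "amount": 5, "note": ""},
    {"order_id": 3, "city": "Oslo", "product": "coffee", "amount": 7, "note": ""},
    {"order_id": 4, "city": "Rome", "product": "tea", "amount": 3, "note": "rush"},
]


def raw_reflection(rows, columns):
    """Keep a subset of columns but every row (row-level fidelity)."""
    return [{c: row[c] for c in columns} for row in rows]


def aggregate_reflection(rows, dimensions, measure):
    """Precompute SUM and COUNT of one measure per combination of dimension values."""
    totals = defaultdict(lambda: {"sum": 0, "count": 0})
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key]["sum"] += row[measure]
        totals[key]["count"] += 1
    return dict(totals)


def rollup(agg, dimensions, keep):
    """Answer a coarser GROUP BY from the precomputed aggregate, without the source rows."""
    positions = [dimensions.index(d) for d in keep]
    out = defaultdict(lambda: {"sum": 0, "count": 0})
    for key, measures in agg.items():
        coarse_key = tuple(key[i] for i in positions)
        out[coarse_key]["sum"] += measures["sum"]
        out[coarse_key]["count"] += measures["count"]
    return dict(out)


dims = ["city", "product"]
raw = raw_reflection(ROWS, ["city", "amount"])
agg = aggregate_reflection(ROWS, dims, "amount")
by_city = rollup(agg, dims, ["city"])
print(len(raw), by_city)
```

The raw version still has one entry per source row, while the aggregate version has one entry per (city, product) pair and can serve a per-city total by combining those precomputed entries.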

