Data Reflections course - 7

Mar 22, 2022 11:24

Reflection refresh

Refresh policies
As physical datasets change, reflections should be refreshed.
- Scheduled refresh: You select a refresh interval, typically some number of days or weeks, though some use cases call for multiple refreshes per hour. Dremio recommends setting the expiration interval longer than the refresh interval, so that the existing reflection remains available if a refresh fails.

- Triggered refresh: Appropriate when an ETL job or other external process needs to trigger a refresh. Choose an expiration interval equal to the longest time the data in the reflection could remain valid.

- Combination: When data is updated both on a schedule and occasionally on demand, Dremio supports using a scheduled and a triggered refresh together. A triggered refresh resets the schedule: if a reflection is scheduled to be refreshed every 24 hours and a triggered refresh is initiated, the next scheduled refresh occurs 24 hours after the triggered refresh.
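
The timing rules above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Dremio code: the function names `next_scheduled_refresh` and `expiration_outlasts_refresh` are hypothetical, and the timestamps are made-up sample values.

```python
from datetime import datetime, timedelta


def next_scheduled_refresh(last_refresh: datetime, refresh_interval: timedelta) -> datetime:
    """Return when the next scheduled refresh is due.

    The schedule is counted from the most recent refresh of any kind,
    so a triggered refresh pushes the next scheduled one back.
    (Hypothetical helper, for illustration only.)
    """
    return last_refresh + refresh_interval


def expiration_outlasts_refresh(refresh_interval: timedelta, expiration_interval: timedelta) -> bool:
    """Check the recommendation that expiration be longer than the refresh interval."""
    return expiration_interval > refresh_interval


refresh_interval = timedelta(hours=24)

# Last scheduled refresh ran at midnight; next one would be due 24 hours later.
scheduled_run = datetime(2022, 3, 22, 0, 0)
print(next_scheduled_refresh(scheduled_run, refresh_interval))

# An ETL job triggers a refresh at 09:30; the next scheduled refresh
# is now due 24 hours after that triggered refresh.
triggered_run = datetime(2022, 3, 22, 9, 30)
print(next_scheduled_refresh(triggered_run, refresh_interval))

# An expiration of 48 hours leaves room for one failed 24-hour refresh.
print(expiration_outlasts_refresh(refresh_interval, timedelta(hours=48)))
```

With an expiration interval equal to or shorter than the refresh interval, a single failed refresh could leave the reflection expired before the next attempt.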

Refresh methods
Data Reflections can be updated using one of two methods:
- Full Refresh: The entire reflection is rebuilt.
- Incremental Refresh: The reflection is updated with the data added since the last refresh job. Incremental refreshes are only possible on qualifying datasets. Any downstream reflections that join an incrementally-refreshed reflection need to be fully refreshed, since the join may have removed rows needed for the new data.
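
The difference between the two methods can be pictured with a toy model. This sketch is not how Dremio implements reflections: the `ToyReflection` class is hypothetical, and it assumes an append-only source (new rows are only ever added at the end), which is one plausible reading of "qualifying dataset" rather than Dremio's actual criteria.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyReflection:
    """A toy materialized copy of a source list (illustration only)."""
    rows: List[int] = field(default_factory=list)
    rows_seen: int = 0  # number of source rows already materialized

    def full_refresh(self, source: List[int]) -> int:
        """Rebuild the whole reflection from the source; returns rows processed."""
        self.rows = list(source)
        self.rows_seen = len(source)
        return len(source)

    def incremental_refresh(self, source: List[int]) -> int:
        """Append only rows added since the last refresh; returns rows processed.

        Assumes the source is append-only. If existing rows were changed
        or deleted, this would silently produce a stale reflection.
        """
        new_rows = source[self.rows_seen:]
        self.rows.extend(new_rows)
        self.rows_seen = len(source)
        return len(new_rows)


source = [1, 2, 3]
reflection = ToyReflection()
reflection.full_refresh(source)                     # initial build reads all 3 rows

source.extend([4, 5])                               # new data arrives
processed = reflection.incremental_refresh(source)  # reads only the 2 new rows
print(processed, reflection.rows)
```

In this model the incremental refresh does work proportional to the new data only, which is why it is attractive for large datasets that qualify.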

Dremio recommends that new reflection builds be directed to a separate queue in Dremio Workload Manager. You can adjust the settings of this queue to provide sufficient resources to build refreshed reflections.





