
Replace partitions

Applies to: Kyvos Enterprise, Kyvos Cloud (SaaS on AWS), Kyvos AWS Marketplace, Kyvos Azure Marketplace, Kyvos GCP Marketplace, Kyvos Single Node Installation (Kyvos SNI)


Use Replace Partition to update data that was already processed when you run an incremental process. For example, when user data changes, Kyvos identifies the change and replaces the affected data in the semantic model. This option is available only if the semantic model includes time or date data.

How replacing partitions works

Data for old partitions may arrive in new datasets, or the same dataset may be updated. Files with an updated timestamp are processed only when Replace Partition is selected.

Warning

Don’t use Replace Partition with materialized transformations. When you replace partitions of materialized data, you risk duplicating data because the underlying data may not have the same partitioning and replacement policy that you specify in the dialog box. If you are using a materialized transformation, you will see a warning message.


When the semantic model is updated, Kyvos evaluates which rows are intended for older semantic model partitions and treats those rows as a partition replacement. Kyvos identifies the older partitions using the Time dimension field, checking whether each value already exists in the semantic model. The remaining changed rows are treated as part of the incremental build.

For example, suppose a daily incremental build is scheduled for a semantic model, and each partition holds a month’s worth of data. The semantic model contains data for the last five months. Partitions 1 to 4 each contain a full month’s worth of data. Partition 5 contains data from the 1st to the 15th of the most recent month.

Let’s say you want to provide updated data for partition 2, and you also have additional data for the most recent partition that has not yet been loaded into the semantic model (in this case, data from the 16th to the 21st of the month). If you use an incremental build with the Replace Partition option, any new or updated data for existing partitions is replaced. The data from the 16th to the 21st is added to partition 5.
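The scenario above can be modeled with a short Python sketch. This is only an illustration of the rule described in this section, not Kyvos code: the function names, the use of calendar year 2024, and the per-date check are assumptions made for the example.

```python
from datetime import date, timedelta


def month_key(d):
    """Monthly partitioning: a partition is identified by (year, month)."""
    return (d.year, d.month)


def days(start, end):
    """Return every date from start to end, inclusive."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def classify(loaded_dates, incoming_dates):
    """Split incoming dates into partitions to replace and dates to append.

    A date that already exists in the model marks its partition for
    replacement; any other date is treated as new incremental data.
    """
    replace = set()
    append = []
    for d in incoming_dates:
        if d in loaded_dates:
            replace.add(month_key(d))
        else:
            append.append(d)
    return replace, append


# Hypothetical calendar: partitions 1 to 4 are January to April 2024,
# partition 5 is May 2024 with only the 1st to the 15th loaded.
loaded = set(days(date(2024, 1, 1), date(2024, 4, 30))
             + days(date(2024, 5, 1), date(2024, 5, 15)))

# Updated data for partition 2 (February) plus new data for May 16 to 21.
incoming = (days(date(2024, 2, 1), date(2024, 2, 29))
            + days(date(2024, 5, 16), date(2024, 5, 21)))

to_replace, to_append = classify(loaded, incoming)
print(sorted(to_replace))  # [(2024, 2)]
print(len(to_append))      # 6
```

In this model, only the February partition is flagged for replacement, and the six new May dates are appended to the most recent partition.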

Updated partition information is included in the build summary.

To replace partitions, perform the following steps.

  1. From the Toolbox, choose Semantic Models.

  2. Select a semantic model that includes time data.

  3. Click the Build tab, then click Schedule Job.
    The Add Job dialog is displayed.

  4. In the JobType list, select Incremental Build.

  5. For the Replace Partition option, select Auto.

  6. Click Schedule. 

How sub partitions interact

You can set up multiple sub partitions. For example, partition first by Day, and then set up a sub partition by Continent. If continent data changes, only the sub partitions containing that data are updated. For example, if there is sub partition data for Asia and Africa and only the Asia data is updated, only the Asia data is replaced.
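As a rough illustration of the Day and Continent example, the sketch below keys each sub partition by a (day, continent) pair and selects only the pairs touched by updated rows. The data layout, field names, and function name are hypothetical and do not reflect how Kyvos stores partitions internally.

```python
def sub_partitions_to_replace(existing_keys, updated_rows):
    """Return the existing (day, continent) sub partitions hit by updates."""
    touched = {(row["day"], row["continent"]) for row in updated_rows}
    return sorted(touched & existing_keys)


# Hypothetical model state: one day, sub partitioned by continent.
existing = {("2024-03-01", "Asia"), ("2024-03-01", "Africa")}

# Only the Asia data changed.
updated = [{"day": "2024-03-01", "continent": "Asia", "sales": 120}]

replaced = sub_partitions_to_replace(existing, updated)
untouched = sorted(existing - set(replaced))

print(replaced)   # [('2024-03-01', 'Asia')]
print(untouched)  # [('2024-03-01', 'Africa')]
```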

Scheduling partition updates 

When you schedule processes, you can choose to replace partitions. You can add rules or provide criteria. A partition is always dropped or replaced as a whole. For example, if each partition is a month, then the entire month is dropped or replaced.
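The whole-partition behavior can be pictured with a minimal sketch, assuming a model represented as a dictionary from month to rows. The representation and the replace_partition helper are invented for illustration only.

```python
def replace_partition(model, key, new_rows):
    """Return a copy of the model with partition `key` swapped wholesale."""
    updated = dict(model)
    updated[key] = list(new_rows)
    return updated


# Hypothetical model: month -> list of (date, value) rows.
model = {
    "2024-01": [("2024-01-05", 10), ("2024-01-20", 30)],
    "2024-02": [("2024-02-03", 7)],
}

# New data for January contains a single corrected row.
new_january = [("2024-01-05", 12)]
result = replace_partition(model, "2024-01", new_january)

print(result["2024-01"])  # [('2024-01-05', 12)]
print(result["2024-02"])  # [('2024-02-03', 7)]
```

In this simplified model, the January row for the 20th does not survive because the month is swapped as a unit, so the replacement data should cover the full partition. Confirm the exact behavior against your own Kyvos build summary.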

When you use incremental processes, you can set the Replace Partition option to None or Auto.

  • None doesn't replace any partition. The job runs as a regular incremental process that loads new data as it arrives.

  • Auto automatically identifies the partitions to replace.



Copyright Kyvos, Inc. All rights reserved.