
Applies to: Kyvos Enterprise, Kyvos Cloud (SaaS on AWS), Kyvos AWS Marketplace, Kyvos Azure Marketplace, Kyvos GCP Marketplace, Kyvos Single Node Installation (Kyvos SNI)


Kyvos validates the configuration and shows recommendations specific to Spark JDBC parallelism when a registered dataset is built on a connection that uses JDBC to interact with the data source and Spark JDBC to read data from it. Partitions help optimize queries, particularly for data warehouses.
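
For orientation, the sketch below shows the four standard Spark JDBC options that control a partitioned read. It assumes the partition column, min and max values, and partition count described on this page correspond to Spark's `partitionColumn`, `lowerBound`, `upperBound`, and `numPartitions`. The helper name `jdbc_partition_options` and the column `order_id` are hypothetical, and how Kyvos passes these values to Spark internally is not documented here.

```python
def jdbc_partition_options(column, min_value, max_value, partition_count):
    """Build the Spark JDBC options that control a partitioned read.

    Hypothetical helper for illustration; not part of Kyvos or Spark.
    """
    if partition_count < 1:
        raise ValueError("partition_count must be at least 1")
    if min_value > max_value:
        raise ValueError("min_value must not exceed max_value")
    return {
        "partitionColumn": column,            # numeric, date, or timestamp column
        "lowerBound": str(min_value),         # used to compute the partition stride
        "upperBound": str(max_value),         # used to compute the partition stride
        "numPartitions": str(partition_count),  # also caps concurrent JDBC connections
    }


opts = jdbc_partition_options("order_id", 1, 1_000_000, 8)
```

With a Spark session available, a dictionary like `opts` would be passed to the JDBC reader together with the connection settings (for example the `url` and `dbtable` options).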

To define file partitions, perform the following steps. 

  1. Click the File Partition option.

  2. In the Partition Details dialog box, the system-recommended column is selected by default.

  3. By default, Fetch Mode is set to Automatic. In this mode, Kyvos determines the min and max values and the number of partitions for the selected partition column.
    Only date and integer columns are shown. String columns are not allowed.

  4. To use a different column for partitioning, choose it from the Select Column list. This could be any metadata column. 
    If the selected column is different from the system-recommended column, you will see a message, as shown in the figure below.

    1. In this case, you need to fetch the metadata for the column using the Click here link.

    2. Set the Fetch Mode to Manual, and then provide the number of partitions and the min and max values.

      1. Min Value: Specify the minimum value of the selected column to be considered for parallel processing. You can also select a parameter to specify the min value. Ensure that the data type is the same as the column selected above.

      2. Max Value: Specify the maximum value of the selected column to be considered for parallel processing. You can also select a parameter to specify the max value. Ensure that the data type is the same as the column selected above.

      3. Partition Count: Provide the maximum number of partitions to be used for parallel processing of table data. It also defines the maximum number of concurrent JDBC connections that will be triggered for computation.

  5. Click the Apply button.
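
To make the effect of Min Value, Max Value, and Partition Count concrete, here is a minimal sketch of how a range-partitioned JDBC read splits an integer column into one query predicate per partition. It is a simplified approximation of the stride logic in Spark's JDBC data source, assuming the values entered in the dialog are handed to Spark unchanged. The function name `partition_predicates` and the column `id` are hypothetical.

```python
def partition_predicates(column, min_value, max_value, partition_count):
    """Return one WHERE-clause predicate per partition for an integer column.

    Simplified illustration of stride-based splitting; not Kyvos code.
    """
    span = max_value - min_value
    # Never create more partitions than there are distinct integer steps.
    count = min(partition_count, span)
    if count <= 1:
        return ["1 = 1"]  # single partition: no filtering needed

    stride = span // count
    predicates = []
    bound = min_value
    for i in range(count):
        lower = f"{column} >= {bound}" if i > 0 else None
        bound += stride
        upper = f"{column} < {bound}" if i < count - 1 else None
        if lower and upper:
            predicates.append(f"{lower} AND {upper}")
        elif upper:
            # First partition is open below and also collects NULLs.
            predicates.append(f"{upper} OR {column} IS NULL")
        else:
            # Last partition is open above.
            predicates.append(lower)
    return predicates


preds = partition_predicates("id", 0, 100, 4)
for p in preds:
    print(p)
```

In this sketch the first and last partitions are open-ended, so rows outside the min and max range are still read; the bounds only decide where the range is cut. Min and max values that are far from the actual data range would therefore tend to produce unevenly sized partitions.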
