...

Parameter

Description

Subscription*

Your account subscription.

Resource Group*

Enter the name of your resource group. The resource group is a collection of resources that share the same lifecycle, permissions, and policies.

Region*

Choose the Azure region that's right for you and your customers. Not every resource is available in every region.

VnetAddress

Enter the CIDR notation for the new VPC that will be created in the deployment.

NOTE: This option is displayed only when the CreateVPC option is selected.
If a new VPC is created and you have enabled WebPortal HA (from the Kyvos Manager), then you must perform the post-deployment steps after deploying the cluster.

NetworkSecurityGroupIpWhiteList

Provide the range of IP addresses allowed to access Kyvos Instances. Use 0.0.0.0/0 to allow access from all IP addresses.

NOTE: This parameter is displayed only when a new network security group is created within the deployment. 

Virtual Network Name*

Name of the Virtual Network in which your VMs will run.

VM Subnet Name*

Name of the Subnet in which your VMs will run. This Subnet must be part of the Virtual Network specified above.

ApplicationGatewaySubnetName*

Name of the Subnet in which the Application Gateway will be created. The Subnet must be part of the Virtual Network specified above.

NOTE: This parameter is displayed only if an existing VPC is used for the deployment.

Security Group Name*

Name of the Security Group that controls access to the VMs.

Network Resource Group Name*

Name of the Resource Group in which the Virtual Network and Subnet are deployed.

Security Group Resource Group Name

Name of the Resource Group in which the Security Group is deployed.

Enable Managed Identity Creation

Select True to create a new managed identity for Kyvos.

Select False to use an existing managed identity.

Managed Identity Name*

Enter the name of the user-assigned managed identity to be attached to all Kyvos VMs.

Managed Identity Resource Group Name

Name of the Resource Group in which the Managed Identity is deployed.

Databricks Authentication Type

Select the authentication type for the Databricks cluster from:

  • AAD Token Using Managed Identity: This option is supported only with a Premium workspace.

  • Personal Access Token

Databricks Token*

Specifies the value of the token used to connect to the Databricks cluster.

Kyvos Work Directory

Enter the path for the Kyvos work directory.

SSH Public Key*

Provide an RSA public key in the single-line format (starting with "ssh-rsa") or the multi-line PEM format.

You can generate SSH keys using ssh-keygen on Linux and macOS, or PuTTYgen on Windows.
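For example, a key pair can be generated with ssh-keygen as sketched below. The key file name and comment are placeholders; in practice you would typically write the key to ~/.ssh.

```shell
# Generate a 4096-bit RSA key pair with no passphrase (-N "") in a
# scratch directory; the file name and comment are placeholders.
KEYDIR="$(mktemp -d)"
ssh-keygen -q -t rsa -b 4096 -N "" -f "$KEYDIR/kyvos_key" -C "kyvos-deployment"

# The single-line public key (starts with "ssh-rsa") is the value to
# paste into the SSH Public Key parameter.
cat "$KEYDIR/kyvos_key.pub"
```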

Additional Tags

Enter the additional tags to put on all resources.

Use the syntax: {"Key1": "Value1", "Key2": "Value2"}

Storage Account Name

Enter the name of the Storage Account to be used for Kyvos.

Storage Account Container Name

Enter the name of the container in the Storage Account that will be used for Kyvos.

CustomPrefixVirtualMachines

Enter a custom prefix to prepend to the names of the virtual machines used for Kyvos.

CustomPrefixVPC

Enter the custom prefix to prepend to the name of the VPC if a new VPC is created for use with Kyvos.

CustomPrefixNSG

Enter the custom prefix to prepend to the name of the Network Security Group if a new group is created for use with Kyvos.

CustomPrefixKeyVault

Enter the custom prefix to prepend to the name of the Key Vault if a new Key Vault is created for use with Kyvos.

CustomPrefixScaleSet

Enter the custom prefix to prepend to the name of the Scale Set that will be created for use with Kyvos.

Vault URL*

If you have saved your secrets in the Key Vault, provide its URL.

Vault Resource Group*

Enter the name of the Resource Group in which the Key Vault is deployed.

Boot Diagnostics Storage Account Resource ID

Resource ID of a gen1 storage account used to enable Boot Diagnostics on the VMs. If left blank, a gen1 Storage Account is created.

Storage Account Resource Group

Enter the name of the Resource Group in which the Storage Account is deployed.

Object Id of Service Principal*

The Object ID assigned to the Service principal. This maps to the ID inside the Active Directory.

SSH Private Key*

Provide the RSA private key in a single-line format.

Kyvos Cluster Name

Provide a name for your Kyvos cluster.

Kyvos Installation Path

Enter the installation path to deploy Kyvos.

Databricks URL*

Provide the URL in the https://<account>.cloud.databricks.com format.

Databricks Cluster ID*

Enter the Cluster ID of your Azure cluster.

To obtain this ID, click the Cluster Name on the Clusters page in Databricks.

The page URL is of the form https://<databricks-instance>/#/settings/clusters/<cluster-id>. The cluster ID is the value after the /clusters/ component in this URL.

Databricks Cluster Organization ID*

Enter the Cluster Organization ID of your Azure cluster. To obtain this ID, click the Cluster Name on the Clusters page in Databricks.
The number after o= in the workspace URL is the organization ID. For example, if the workspace URL is https://westus.azuredatabricks.net/?o=7692xxxxxxxx, then the organization ID is 7692xxxxxxxx.
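If you prefer to extract these values in a script, both IDs can be pulled out of the URLs with POSIX parameter expansion. The URLs below are illustrative placeholders matching the formats described above, not real workspaces:

```shell
# Placeholder URLs matching the formats described above.
CLUSTER_URL="https://myworkspace.azuredatabricks.net/#/settings/clusters/0123-456789-abcde123"
WORKSPACE_URL="https://westus.azuredatabricks.net/?o=7692xxxxxxxx"

# Cluster ID: everything after the /clusters/ path component.
CLUSTER_ID="${CLUSTER_URL##*/clusters/}"

# Organization ID: everything after o= in the workspace URL.
ORG_ID="${WORKSPACE_URL##*o=}"

echo "Cluster ID: $CLUSTER_ID"
echo "Organization ID: $ORG_ID"
```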

Postgres Password*

Provide the password to be used for Postgres.

License File Value*

Enter a valid Kyvos license key.

Secret Key For Kyvos Bundle Download*

Enter the secret key used to access the Kyvos bundle.

Enable Public IP

Select True to enable a Public IP for the Kyvos Web Portal.

DNS Label Prefix

Unique DNS Name for the Public IP used to access the Virtual Machine.

Perform Env Validation

Select True to perform environment validation before cluster deployment to ensure all the resources are created correctly.

Host Name Based Deployment  

Select True to use hostnames instead of IP Addresses for instances during cluster deployment.

...

  1. Configure the CDP cluster to access the storage account used by Kyvos from within Spark jobs.

    1. Log in to the Cloudera management console and go to Data Hub Clusters.

    2. Open the Cloudera Manager application by clicking the CM URL link, as shown in the following figure.

    3. On the Compute cluster, open Spark configurations from the Services dropdown.

    4. On the Configuration tab, search for the yarn.access.hadoopFileSystems property, and replace the existing value with the location of the container used in Kyvos. If you do not find the property, add a new property with the container location as its value.
      For example:
      yarn.access.hadoopFileSystems=abfs://data@kyvoscdp.dfs.core.windows.net,abfs://kyvoscontainer92604@kyvos33333.dfs.core.windows.net
      Here, the second URI (the Kyvos container location) is appended to the existing value.

    5. Restart affected services.

  2. Configure IDBroker mapping for Kyvos user on CDP, using the following steps.

    1. On the navigation pane, click Environments, and then click Actions > Manage Access.

    2. Click the IDBroker Mappings tab and add user or group by clicking the Edit option in the Current Mappings section.

    3. Click the plus icon to add a user or group, select a user or group from the dropdown, and enter a role in the Role input.

    4. Click Save and Sync.

  3. Ensure that the CDP DataAssumerRole is assigned the Storage Blob Data Contributor role on the Kyvos storage account.

  4. Copy the cacerts file from /opt to the /data/kyvos/app/kyvos/jre/jre/lib/security folder on all nodes (Kyvos Manager, BI Server, and Query Engines). This file is copied from the CDP master node to /opt at the time of image creation.
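Step 4 is a plain file copy on each node. A minimal sketch using scratch paths so it can run anywhere; on a real node the source is /opt/cacerts and the destination is the JRE security folder named above:

```shell
# Scratch stand-ins for /opt/cacerts and the JRE security folder, so the
# sketch can run without touching a real Kyvos install.
SRC_DIR="$(mktemp -d)"
DEST_DIR="$(mktemp -d)"
touch "$SRC_DIR/cacerts"

# The actual step: copy cacerts into the security folder. Repeat on every
# node (Kyvos Manager, BI Server, and Query Engines).
cp "$SRC_DIR/cacerts" "$DEST_DIR/"
ls "$DEST_DIR"
```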

  5. On the BI Server nodes, add a host entry in /etc/hosts that maps the CDP master hostname to the CDP master private IP address.
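The host entry in step 5 is a single line in /etc/hosts. A sketch against a scratch file; the IP address and hostname below are placeholders, and writing to the real /etc/hosts requires root:

```shell
# Scratch file standing in for /etc/hosts on a BI Server node.
HOSTS_FILE="$(mktemp)"

# Placeholders: replace with the CDP master's private IP and hostname.
CDP_MASTER_IP="10.0.0.5"
CDP_MASTER_HOST="cdp-master.internal.example.com"

# Append the mapping and show the resulting entry.
echo "$CDP_MASTER_IP $CDP_MASTER_HOST" >> "$HOSTS_FILE"
grep "$CDP_MASTER_HOST" "$HOSTS_FILE"
```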

  6. Change vendor from Databricks to Cloudera from Kyvos Manager, using the following steps.

    1. Log on to Kyvos Manager, and navigate to Manage Kyvos > Hadoop Ecosystem Configuration.

    2. Change the value for vendor from DATABRICKS to CLOUDERA.

    3. Change the file system type from ABFSS to ABFS.

    4. In the HDInsights Home field, provide the value as /opt/cloudera/parcels/CDH/meta/.

    5. Provide the Namenode IP.

    6. Change the value for Hadoop version to 3.1.1.

    7. Provide the Cloudera version in the Hadoop parameters as 7.2.2.

    8. Select the Hive version as 3.1.

    9. Provide the Hive JDBC URL. You can copy the Hive JDBC URL from the Cloudera Manager application.

    10. Select the Spark version as 2.4.

    11. Update the Spark Library Path to /opt/cloudera/parcels/CDH/lib/spark/jars,/opt/cloudera/parcels/CDH/jars/spark-hive_2.11-2.4.5.7.2.2.7-6.jar,/opt/cloudera/parcels/CDH/jars/spark-atlas-connector-assembly-0.1.0.7.2.2.7-6.jar

    12. Provide Spark history server URL.

    13. Select both Sync Library and Sync Configurations.

  7. On the Security configuration screen, provide the following details:

    1. Change the Hadoop Security Type from SIMPLE to KERBEROS.

    2. Provide the Keytab User Name and Keytab file details.
      Refer to the Cloudera documentation to download the Keytab file.

    3. Upload the local policy JAR and US export policy JAR files.

  8. Go back to the Hadoop Ecosystem Configuration screen on Kyvos Manager. Select both Sync Library and Sync Configurations, and submit.

  9. Click Apply from the top-right of the screen.

  10. Change the owner of the azcopy executable file from adminuser to kyvos.

    1. Location: /data/kyvos/installs/bin

    2. This must be done on all nodes: Kyvos Manager, BI Server, and Query Engines.

    3. Restart the Kyvos services.
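The ownership change in step 10 can be sketched as below, rehearsed against a scratch file owned by the current user; on the real nodes the target is /data/kyvos/installs/bin/azcopy, the new owner is kyvos, and the command needs root:

```shell
# Scratch stand-ins so the sketch runs without root; on a node you would
# use AZCOPY=/data/kyvos/installs/bin/azcopy and OWNER=kyvos (via sudo).
AZCOPY="$(mktemp)"
OWNER="$(id -un)"

# The actual step: hand ownership of azcopy to the Kyvos service user.
chown "$OWNER" "$AZCOPY"

# Verify the new owner (GNU stat; on BSD/macOS use: stat -f '%Su').
stat -c '%U' "$AZCOPY"
```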

  11. To design a semantic model over data residing on the Snowflake data source, add the following property on the Kyvos connection screen and restart the BI service.

...

This completes the Kyvos installation and deployment in your environment. You can now access Kyvos to start creating your semantic models.