
Applies to: (✗) Kyvos Enterprise   (✗) Kyvos Cloud (SaaS on AWS)   (✗) Kyvos AWS Marketplace

(✓) Kyvos Azure Marketplace   (✗) Kyvos GCP Marketplace   (✗) Kyvos Single Node Installation (Kyvos SNI)


Before you start the automated installation of the Kyvos application on Azure, ensure that you have the following information.

Basic Configurations

To install Kyvos in your Azure environment, you must have an Azure account with an active subscription. 

Permissions 

Kyvos can be deployed from the Azure Marketplace using an existing or new resource group.

Important

  • To deploy Kyvos, you must have the required permissions, as explained below. To obtain these permissions, contact your Azure Administrator.

  • To verify a user's access to Azure resources, refer to Microsoft documentation. A sketch for checking role assignments programmatically follows this note.

  • If you have the Owner, Contributor, or Managed Application Contributor role at the subscription level, you can skip the prerequisites for both new and existing resource groups.

  • Deploying Kyvos in a new Resource Group

  • Deploying Kyvos using an existing Resource Group:

    1. The Owner role must be assigned to the user on the Resource Group in which Kyvos is being created.

    2. A custom role must be assigned to the user at the subscription level. Contact your administrator to create the custom role or to share its name.
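
If you prefer to verify your role assignments programmatically rather than through the portal, the following is a minimal sketch, assuming a recent Azure SDK for Python (azure-identity and azure-mgmt-authorization). The subscription ID, resource group name, and user object ID are placeholders, and the custom role name will vary by environment.

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.authorization import AuthorizationManagementClient

    subscription_id = "<subscription-id>"            # placeholder
    resource_group = "<kyvos-resource-group>"        # placeholder
    user_object_id = "<deployment-user-object-id>"   # placeholder (Azure AD object ID)

    client = AuthorizationManagementClient(DefaultAzureCredential(), subscription_id)
    scope = f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"

    # List role assignments that apply to the deployment user at the resource group scope
    # (inherited subscription-level assignments such as Owner/Contributor are included).
    for assignment in client.role_assignments.list_for_scope(
            scope, filter=f"assignedTo('{user_object_id}')"):
        role = client.role_definitions.get_by_id(assignment.role_definition_id)
        print(f"{role.role_name} at {assignment.scope}")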

Quota

Before deploying the Kyvos application, verify that the required vCPU quotas are available.

The required quota depends on the instance type, the number of Query Engines, and the High Availability configuration (BI Server and Kyvos Web Portal). Refer to the following examples for more details.

  • If you enable High Availability with the Standard_D16s_v4 VM size for the BI Server instance, you must ensure that a total of 40 vCPUs is available in the Standard DSv4 Family vCPUs quota (2 VMs * 16 vCPUs for the BI Servers and 2 VMs * 4 vCPUs for the Web Portal instances).

  • If you select the Standard_E16ds_v4 VM size for the Query Engine instance and set the instance count to 5, you must ensure that a total of 80 vCPUs is available in the Standard EDSv4 Family vCPUs quota (5 VMs * 16 vCPUs).

If you already have the required quota to deploy Kyvos resources, you can skip increasing the quota limit. To learn how to check quotas, refer to the Microsoft view quotas documentation.

If you need to increase the quota limit to deploy Kyvos resources, refer to the Microsoft quota increase documentation to learn how to request a quota increase in the Azure portal.
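
As an optional check, the following is a minimal sketch, assuming the Azure SDK for Python (azure-identity and azure-mgmt-compute), that compares the deployment region's current vCPU usage against the totals from the examples above. The subscription ID and location are placeholders, and the quota family name strings are assumptions; confirm them against usage.name.localized_value in your subscription.

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.compute import ComputeManagementClient

    subscription_id = "<subscription-id>"   # placeholder
    location = "eastus"                     # placeholder deployment region

    # vCPU totals taken from the examples above
    required = {
        "standardDSv4Family": 2 * 16 + 2 * 4,   # HA BI Servers + Web Portal instances = 40
        "standardEDSv4Family": 5 * 16,          # 5 Query Engines * 16 vCPUs = 80
    }

    compute = ComputeManagementClient(DefaultAzureCredential(), subscription_id)
    for usage in compute.usage.list(location):
        family = usage.name.value
        if family in required:
            available = usage.limit - usage.current_value
            status = "OK" if available >= required[family] else "quota increase needed"
            print(f"{usage.name.localized_value}: limit={usage.limit}, "
                  f"in use={usage.current_value}, required={required[family]} -> {status}")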

Register Microsoft Resource Providers at the Subscription Level 

To deploy Kyvos, ensure that the following Microsoft Resource Providers are registered at the subscription level.

To learn how to verify and register a Resource Provider, see the Verifying and Registering Microsoft Providers section. A registration sketch using the Azure SDK follows the list below.

Important

If you are unable to register Microsoft Resource Providers, contact your Azure Account Administrator to do so. 

Microsoft Resource Providers

  • Microsoft.Storage 

  • Microsoft.Compute

  • Microsoft.ManagedIdentity

  • Microsoft.Network

  • Microsoft.KeyVault

  • Microsoft.insights

  • Microsoft.Web

  • Microsoft.Databricks
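
The following is a minimal sketch, assuming the Azure SDK for Python (azure-identity and azure-mgmt-resource), that verifies and registers the providers listed above. The subscription ID is a placeholder; registration is asynchronous and may take a few minutes to complete.

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.resource import ResourceManagementClient

    subscription_id = "<subscription-id>"   # placeholder
    providers = [
        "Microsoft.Storage", "Microsoft.Compute", "Microsoft.ManagedIdentity",
        "Microsoft.Network", "Microsoft.KeyVault", "Microsoft.insights",
        "Microsoft.Web", "Microsoft.Databricks",
    ]

    client = ResourceManagementClient(DefaultAzureCredential(), subscription_id)
    for namespace in providers:
        state = client.providers.get(namespace).registration_state
        if state != "Registered":
            client.providers.register(namespace)   # starts an asynchronous registration
            print(f"{namespace}: registration requested (was {state})")
        else:
            print(f"{namespace}: already registered")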

Network Configurations 

For an existing virtual network, you must have the following permissions on the existing network:

Note

This is not required if you are creating network resources using the Kyvos provided template.

  • Microsoft.Network/virtualNetworks/subnets/read

  • Microsoft.Network/virtualNetworks/read

  • Microsoft.Network/virtualNetworks/subnets/joinViaServiceEndpoint/action

  • Microsoft.Network/virtualNetworks/subnets/write

  • Microsoft.Network/virtualNetworks/subnets/join/action

OR

The Network Contributor role must be assigned to the user. See the Configuring Roles for Deployment User section for details on creating and assigning roles.

Prerequisites

  1. Two subnets must be available for the deployment of Kyvos. These subnets must be within the required CIDR range for deploying Kyvos from the Azure Marketplace:

    1. Subnet for Kyvos Instances: /16 to /26

    2. Subnet for Application Gateway: /16 to /27

  2. No subnet delegations must be attached to either subnet.

  3. The following Service Endpoints are required on the Subnet for Kyvos Instances (see the sketch after this list):

    1. Azure Storage (Microsoft.Storage): This endpoint secures and controls the level of access to your storage accounts so that only applications requesting data over the specified set of networks or through the specified set of Azure resources can access a storage account.

    2. Azure Key Vault (Microsoft.KeyVault): The virtual network service endpoints for Azure Key Vault allow you to restrict access to a specified virtual network and a list of IPv4 (Internet Protocol version 4) address ranges.

    3. Azure App Service (Microsoft.Web): By setting up access restrictions, you can create a priority-ordered allow/deny list to control network access to your application.
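
The following is a minimal sketch, assuming the Azure SDK for Python (azure-identity and azure-mgmt-network), that adds the three required service endpoints to an existing subnet for Kyvos instances. The subscription ID, resource group, virtual network, and subnet names are placeholders; running it requires the subnet write/join permissions or the Network Contributor role described above.

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.network import NetworkManagementClient

    subscription_id = "<subscription-id>"        # placeholders
    resource_group = "<network-resource-group>"
    vnet_name = "<vnet-name>"
    subnet_name = "<kyvos-instances-subnet>"

    network = NetworkManagementClient(DefaultAzureCredential(), subscription_id)

    # Read the current subnet definition and append any missing service endpoints.
    subnet = network.subnets.get(resource_group, vnet_name, subnet_name)
    endpoints = list(subnet.service_endpoints or [])
    present = {ep.service for ep in endpoints}
    for service in ("Microsoft.Storage", "Microsoft.KeyVault", "Microsoft.Web"):
        if service not in present:
            endpoints.append({"service": service})   # the SDK also accepts plain dicts here
    subnet.service_endpoints = endpoints

    # Update the subnet; begin_create_or_update returns a poller for the long-running operation.
    network.subnets.begin_create_or_update(resource_group, vnet_name, subnet_name, subnet).result()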

Databricks Configurations

The Kyvos application requires a Databricks cluster for cube building and data profiling jobs.

Note

  • You must have access to an existing Databricks workspace, or you can create a new one. Refer to Microsoft documentation to create an Azure Databricks workspace.

  • You can either create a new Databricks cluster along with the Kyvos application or configure your own/existing Databricks cluster to deploy Kyvos using the Azure Marketplace wizard. 

Following are the common prerequisites when using the Azure Marketplace wizard to create a new Databricks cluster or to use your own/existing Databricks cluster. 

  1. Workspace URL: See Microsoft documentation to learn how to get a workspace URL.

  2. Azure Databricks Personal Access Token: You can use an existing token or create a new token.
    To create an Azure Databricks personal access token for an Azure Databricks user, perform the following steps. 

    1. Log in to your Azure Databricks workspace by using the URL you obtained in Step 1.
      NOTE: If you are unable to log in, contact your administrator to give you access. Refer to Microsoft documentation to learn how to add a user to your Azure Databricks account using the account console.

    2. In your Azure Databricks workspace, click your Azure Databricks username in the top right bar, and then select User Settings from the list.  

    3. On the Access tokens tab, click Generate new token.
      Optionally, enter a comment that helps you identify this token in the future, and change the token's default lifetime of 90 days. To create a token with no lifetime (not recommended), leave the Lifetime (days) box empty.

    4. Click Generate.

    5. Copy the displayed token, and then click Done.

  3. Service Principal: You need the Service Principal Client ID, Service Principal Client Secret, and Tenant ID. You can either create a new Service Principal or use an existing one.

    • New Service Principal

      1. You must have the required rights to create a new Service Principal. 

      2. Refer to the Provision a service principal in Azure portal section to learn how to create an Azure AD Service Principal and obtain the Service Principal Client ID (also known as the Application (client) ID), the Service Principal Client Secret, and the Tenant ID (also known as the Directory (tenant) ID).

    • Existing Service Principal 

      1. If you do not have permission to view the Service Principal (App registration), contact your administrator.

      2. To get the details from the existing Azure AD Service Principal, see the Getting Azure AD Service Principal Details section. 

  4. Service Principal Object ID:

    1. To obtain this, search for App registrations in the Azure Portal, then search for and select the Service Principal that you created or used in Step 3 above.

    2. Click the value of Managed application in local directory. The overview page displays the Object ID of the Service Principal.

  5. Cluster ID: To use an existing Databricks cluster, refer to the Databricks documentation to learn how to get the Cluster ID, or see the sketch after this list.
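
The following is a minimal sketch, assuming Python with the requests library, that uses the workspace URL (Step 1) and personal access token (Step 2) to list clusters through the Databricks Clusters API 2.0. This is a convenient way to confirm that the token works and to copy the Cluster ID of an existing cluster. The workspace URL and token values are placeholders.

    import requests

    workspace_url = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
    token = "<databricks-personal-access-token>"                          # placeholder
    headers = {"Authorization": f"Bearer {token}"}

    # List clusters in the workspace and print their IDs (Clusters API 2.0).
    response = requests.get(f"{workspace_url}/api/2.0/clusters/list", headers=headers)
    response.raise_for_status()
    for cluster in response.json().get("clusters", []):
        print(cluster["cluster_id"], cluster["cluster_name"], cluster["spark_version"])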

Existing Databricks Cluster

To configure the existing Databricks cluster to deploy the Kyvos application, perform the following steps. 

  1. Log in to Databricks. 

  2. Click Compute.

  3. Select the cluster that you want to configure in Kyvos.

  4. Configure the existing Databricks cluster as follows: 

    Parameter: Databricks Runtime Version
    Description: Kyvos supports the following runtime versions:

      • 7.3 LTS (includes Apache Spark 3.0.1, Scala 2.12)

      • 9.1 LTS (includes Apache Spark 3.1.2, Scala 2.12)

      • 10.4 LTS (includes Apache Spark 3.2.1, Scala 2.12)

    Parameter: Autoscaling Options
    Description:

      • Enable autoscaling: Select this option to enable autoscaling.

      • Terminate after ___ minutes of inactivity: Set the value to 30.

    Parameter: Worker Type
    Description: Recommended type: Standard_E16ds_v4

      • Min Workers: Recommended value 1

      • Max Workers: Recommended value 10

      • To use Databricks with Spot instances, select the corresponding checkbox (not recommended for production use).

    Parameter: Driver Type
    Description: Recommended type: Standard_E8ds_v4

  5. In the Advanced Options, define the Spark Configurations as follows:
    Sample configuration: 

    spark.sql.parquet.int96AsTimestamp true
    spark.databricks.delta.preview.enabled true
    spark.hadoop.spark.sql.parquet.binaryAsString false
    spark.hadoop.fs.azure.account.oauth2.client.secret {Service-Principal-Client-Secret} 
    spark.databricks.preemption.enabled false
    spark.hadoop.fs.azure.account.oauth2.client.endpoint https://login.microsoftonline.com/{Tenant-ID}/oauth2/token
    spark.sql.parquet.binaryAsString false
    spark.databricks.service.server.enabled true
    spark.hadoop.fs.azure.account.oauth2.client.id {Service-Principal-Client-ID}
    spark.hadoop.fs.azure.account.auth.type OAuth
    spark.hadoop.fs.azure.account.oauth.provider.type org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
    spark.hadoop.spark.sql.parquet.int96AsTimestamp true

Important

  • To use Databricks 10.4 LTS, you must add the following properties in Databricks Advanced Options > Spark Configurations.

    • spark.sql.caseSensitive false

    • spark.hadoop.spark.sql.caseSensitive false

  • Replace the Service-Principal-Client-Secret, Tenant-ID, and Service-Principal-Client-ID placeholders with the values you obtained in Step 3, as explained above.
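
As an alternative to editing the cluster through the Databricks UI, the following is a minimal sketch, assuming Python with the requests library and the Databricks Clusters API 2.0, that merges the sample Spark configuration (including the 10.4 LTS properties) into an existing standard (non-pool) cluster. The workspace URL, token, and cluster ID are placeholders, and the {Service-Principal-Client-Secret}, {Tenant-ID}, and {Service-Principal-Client-ID} placeholders must be replaced with the Step 3 values.

    import requests

    workspace_url = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
    token = "<databricks-personal-access-token>"                          # placeholder
    cluster_id = "<cluster-id>"                                           # placeholder
    headers = {"Authorization": f"Bearer {token}"}

    # Spark configuration from the sample above, expressed as a dict; replace the
    # {Service-Principal-*} and {Tenant-ID} placeholders with the Step 3 values.
    spark_conf = {
        "spark.sql.parquet.int96AsTimestamp": "true",
        "spark.databricks.delta.preview.enabled": "true",
        "spark.hadoop.spark.sql.parquet.binaryAsString": "false",
        "spark.hadoop.fs.azure.account.oauth2.client.secret": "{Service-Principal-Client-Secret}",
        "spark.databricks.preemption.enabled": "false",
        "spark.hadoop.fs.azure.account.oauth2.client.endpoint":
            "https://login.microsoftonline.com/{Tenant-ID}/oauth2/token",
        "spark.sql.parquet.binaryAsString": "false",
        "spark.databricks.service.server.enabled": "true",
        "spark.hadoop.fs.azure.account.oauth2.client.id": "{Service-Principal-Client-ID}",
        "spark.hadoop.fs.azure.account.auth.type": "OAuth",
        "spark.hadoop.fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "spark.hadoop.spark.sql.parquet.int96AsTimestamp": "true",
        # Additional properties required for Databricks 10.4 LTS (see the Important note above)
        "spark.sql.caseSensitive": "false",
        "spark.hadoop.spark.sql.caseSensitive": "false",
    }

    # clusters/edit expects the full cluster specification, so read the current one,
    # merge the Spark configuration, and send back the relevant fields.
    spec = requests.get(f"{workspace_url}/api/2.0/clusters/get",
                        headers=headers, params={"cluster_id": cluster_id}).json()
    spec["spark_conf"] = {**spec.get("spark_conf", {}), **spark_conf}
    edit_body = {key: spec[key] for key in (
        "cluster_id", "cluster_name", "spark_version", "node_type_id", "driver_node_type_id",
        "autoscale", "num_workers", "autotermination_minutes", "spark_conf") if key in spec}
    response = requests.post(f"{workspace_url}/api/2.0/clusters/edit", headers=headers, json=edit_body)
    response.raise_for_status()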

