
Before you begin with Kyvos Free on Azure Marketplace

Applies to: Kyvos Enterprise, Kyvos Cloud (SaaS on AWS), Kyvos AWS Marketplace, Kyvos Azure Marketplace, Kyvos GCP Marketplace, and Kyvos Single Node Installation (Kyvos SNI)


Important

By default, the Kyvos SNI is created with a node size of Standard_D16s_v4 (16 vCPUs, 64 GB memory) and a 500 GB disk.

Before you start the automated installation of the Kyvos Free application on Azure Marketplace, ensure that you have the following information.

Basic Configurations

To install Kyvos in your Azure environment, you must have an Azure account with an active subscription. 
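If you have the Azure CLI installed, you can optionally confirm that your subscription is active before starting (this check is not part of the Kyvos installer):

```shell
# Show the subscription the Azure CLI is currently signed in to.
# "state" should report "Enabled" for an active subscription.
az account show --query "{name:name, id:id, state:state}" --output table
```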

Permissions

Kyvos can be deployed from the Azure Marketplace using an existing or new resource group.

Important

  • To deploy Kyvos Free, you must have the required permissions, as explained below. To obtain these permissions, contact your Azure Administrator. 

  • To verify a user's access to Azure resources, refer to Microsoft documentation. 

  • If you have the Owner, Contributor, or Managed Application Contributor role at the subscription level, you can skip the prerequisites for both new and existing resource groups.

  • Deploying Kyvos in a new Resource Group

  • Deploying Kyvos using an existing Resource Group

    • The Owner role must be assigned to the user on the Resource Group in which Kyvos is being created. 

    • A custom role must be assigned to the user at the subscription level. Contact your administrator to create the custom role or to share its name.   
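The Microsoft documentation linked above describes the portal flow for checking access. As an alternative sketch, role assignments can also be listed with the Azure CLI (the user and resource group names below are placeholders):

```shell
# List role assignments for a user at the subscription scope,
# including roles inherited from higher scopes.
az role assignment list \
  --assignee "user@example.com" \
  --include-inherited \
  --output table

# List role assignments scoped to a specific resource group
# (replace "kyvos-rg" with the resource group used for deployment).
az role assignment list \
  --assignee "user@example.com" \
  --resource-group "kyvos-rg" \
  --output table
```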

Register Microsoft Resource Providers at the Subscription Level 

To deploy Kyvos, ensure that the following Microsoft Resource Providers are registered at the subscription level.

To learn how to verify and register a Resource Provider, see the Verifying and Registering Microsoft Providers section. 

Important

If you are unable to register Microsoft Resource Providers, contact your Azure Account Administrator to do so. 

Microsoft Resource Providers

  • Microsoft.Storage 

  • Microsoft.Compute

  • Microsoft.ManagedIdentity

  • Microsoft.Network

  • Microsoft.KeyVault

  • Microsoft.insights

  • Microsoft.Web

  • Microsoft.Databricks
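Besides the portal flow referenced above, the providers can be checked and registered with the Azure CLI; a sketch (requires permission to register providers on the subscription):

```shell
# Check and, if needed, register each resource provider required by Kyvos.
for ns in Microsoft.Storage Microsoft.Compute Microsoft.ManagedIdentity \
          Microsoft.Network Microsoft.KeyVault Microsoft.insights \
          Microsoft.Web Microsoft.Databricks; do
  state=$(az provider show --namespace "$ns" --query registrationState --output tsv)
  echo "$ns: $state"
  if [ "$state" != "Registered" ]; then
    # Registration is asynchronous; re-check later to confirm completion.
    az provider register --namespace "$ns"
  fi
done
```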

Network Configurations 

For an existing virtual network, you must have the following permissions on that network:

Note

This is not required if you are creating network resources using the Kyvos provided template.

  • Microsoft.Network/virtualNetworks/subnets/read

  • Microsoft.Network/virtualNetworks/read

  • Microsoft.Network/virtualNetworks/subnets/joinViaServiceEndpoint/action

  • Microsoft.Network/virtualNetworks/subnets/write

  • Microsoft.Network/virtualNetworks/subnets/join/action

OR

The Network Contributor role must be assigned to the user. See the Configuring Roles for Deployment User section for details on creating and assigning roles.
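For reference, assigning the built-in Network Contributor role scoped to the existing virtual network can look like the following sketch (all names are placeholders; your administrator may instead use the custom-role approach described in the Configuring Roles for Deployment User section):

```shell
# Assign the built-in Network Contributor role to a user,
# scoped to the existing virtual network only.
az role assignment create \
  --assignee "user@example.com" \
  --role "Network Contributor" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<vnet-rg>/providers/Microsoft.Network/virtualNetworks/<vnet-name>"
```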

Prerequisites

  1. Two subnets must be available for the deployment of Kyvos. Each subnet must be within the required CIDR range for Kyvos Azure Marketplace deployment:

    1. Subnet for Kyvos Instances: /16 to /26

    2. Subnet for Application Gateway: /16 to /27

  2. No subnet delegations must be attached to either subnet.

  3. Service Endpoints are required on the Subnet for Kyvos Instances:

    1. Azure Storage (Microsoft.Storage): This model secures and controls the level of access to your storage accounts, so that only applications requesting data over the specified set of networks or through the specified set of Azure resources can access a storage account.

    2. Azure Key Vault (Microsoft.KeyVault): The virtual network service endpoints for Azure Key Vault allow you to restrict access to a specified virtual network and a list of IPv4 (Internet Protocol version 4) address ranges.

    3. Azure App Service (Microsoft.Web): By setting up access restrictions, you can create a priority-ordered allow/deny list to control network access to your application.
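If the subnet for Kyvos instances already exists, the three service endpoints above can be enabled on it with a single Azure CLI call; a sketch with placeholder names:

```shell
# Enable the service endpoints Kyvos requires on the instances subnet.
az network vnet subnet update \
  --resource-group "<vnet-rg>" \
  --vnet-name "<vnet-name>" \
  --name "<kyvos-instances-subnet>" \
  --service-endpoints Microsoft.Storage Microsoft.KeyVault Microsoft.Web

# Verify the endpoints were applied.
az network vnet subnet show \
  --resource-group "<vnet-rg>" \
  --vnet-name "<vnet-name>" \
  --name "<kyvos-instances-subnet>" \
  --query "serviceEndpoints[].service"
```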

Databricks Configurations

The Kyvos application requires a Databricks cluster for processing semantic models and data profiling jobs.

Note

  • You must have access to an existing Databricks workspace, or you can create a new one. Refer to Microsoft documentation to create an Azure Databricks workspace. 

  • You can either create a new Databricks cluster along with the Kyvos application or configure your own/existing Databricks cluster to deploy Kyvos using the Azure Marketplace wizard. 

The following prerequisites apply whether you use the Azure Marketplace wizard to create a new Databricks cluster or configure your own existing Databricks cluster. 

  1. Workspace URL: See Microsoft documentation to learn how to get a workspace URL.

  2. Azure Databricks Personal Access Token: You can use an existing token or create a new token.
    To create an Azure Databricks personal access token for an Azure Databricks user, perform the following steps. 

    1. Log in to your Azure Databricks workspace using the URL you obtained in Step 1. 
      NOTE: If you are unable to log in, contact your administrator to grant you access. Refer to Microsoft documentation to learn how to add a user to your Azure Databricks account using the account console. 

    2. In your Azure Databricks workspace, click your Azure Databricks username in the top right bar, and then select User Settings from the list.  

    3. On the Access tokens tab, click Generate new token.
      Optionally, enter a comment that helps you to identify this token in the future and change the token’s default lifetime of 90 days. To create a token with no lifetime (not recommended), leave the Lifetime (days) box empty (blank).  

    4. Click Generate.  

    5. Copy the displayed token, and then click Done.  
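The UI steps above are the usual path. If you already hold a working token, new tokens can also be minted programmatically through the Databricks Token API; a sketch using curl, with the workspace URL and existing token as placeholders:

```shell
# Create a new personal access token via the Databricks REST API.
# Requires an existing token for authentication (chicken-and-egg:
# the first token must come from the UI steps above).
curl --request POST "https://<databricks-workspace-url>/api/2.0/token/create" \
  --header "Authorization: Bearer <existing-token>" \
  --data '{"lifetime_seconds": 7776000, "comment": "Kyvos deployment token (90 days)"}'

# The JSON response contains "token_value"; copy it immediately,
# as it cannot be retrieved again later.
```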

  3. Service Principal: You need the Service Principal Client ID, Service Principal Client Secret, and Tenant ID. You can either create a new Service Principal or use an existing one.

    • New Service Principal

      1. You must have the required rights to create a new Service Principal. 

      2. Refer to the Provision a service principal in Azure portal section to learn how to create an Azure AD Service Principal and obtain the Service Principal Client ID (also known as the Application (client) ID), the Service Principal Client Secret, and the Tenant ID (also known as the Directory (tenant) ID). 

    • Existing Service Principal 

      1. If you do not have permission to view the Service Principal (App registration), contact your administrator.

      2. To get the details from the existing Azure AD Service Principal, see the Getting Azure AD Service Principal Details section. 
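If you are creating a new Service Principal and have the Azure CLI available, `az ad sp create-for-rbac` returns all three values in one step; a sketch (the display name is a placeholder):

```shell
# Create a Service Principal and capture its credentials.
# In the output, "appId" is the Client ID, "password" is the
# Client Secret, and "tenant" is the Tenant ID.
az ad sp create-for-rbac --name "kyvos-deployment-sp"
```

Record the password immediately; the secret cannot be retrieved again after this command completes.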

  4. Service Principal Object ID:

    1. To obtain this, search for App registrations in the Azure Portal, and select the Service Principal that you created or used in Step 3, as explained above. 

    2. Click the value of Managed application in local directory. The overview page will display the Object ID of the Service Principal.

      To use an existing Databricks cluster, refer to Databricks documentation to learn how to get the Cluster ID. 
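The portal steps above can also be scripted. As a sketch, the Object ID and, for an existing cluster, the Cluster ID can be fetched from the command line (the Client ID is a placeholder, and the Databricks CLI must be configured separately against your workspace):

```shell
# Look up the Service Principal's Object ID from its Client (application) ID.
az ad sp show --id "<service-principal-client-id>" --query id --output tsv

# With the Databricks CLI configured, list clusters to find the Cluster ID.
databricks clusters list
```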

Existing Databricks Cluster 

To configure an existing Databricks cluster for the Kyvos application, perform the following steps. 

  1. Log in to Databricks. 

  2. Click Compute.

  3. Select the cluster that you want to configure in Kyvos.

  4. In the cluster's Advanced Options, define the Spark Configurations as follows.
    Sample configuration: 

    spark.sql.parquet.int96AsTimestamp true
    spark.hadoop.spark.sql.parquet.int96AsTimestamp true
    spark.sql.parquet.binaryAsString false
    spark.hadoop.spark.sql.parquet.binaryAsString false
    spark.databricks.delta.preview.enabled true
    spark.databricks.preemption.enabled false
    spark.databricks.service.server.enabled true
    spark.hadoop.fs.azure.account.auth.type OAuth
    spark.hadoop.fs.azure.account.oauth.provider.type org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
    spark.hadoop.fs.azure.account.oauth2.client.id {Service-Principal-Client-ID}
    spark.hadoop.fs.azure.account.oauth2.client.secret {Service-Principal-Client-Secret}
    spark.hadoop.fs.azure.account.oauth2.client.endpoint https://login.microsoftonline.com/{Tenant-ID}/oauth2/token

Important

  • To use Databricks 10.4 LTS, update the following properties under the cluster's Advanced Options > Spark Configurations.

    • spark.sql.caseSensitive false

    • spark.hadoop.spark.sql.caseSensitive false

  • Replace Service-Principal-Client-Secret, Tenant-ID, and Service-Principal-Client-ID with the values you obtained in Step 3, as explained above. 



Copyright Kyvos, Inc. All rights reserved.