Applies to: Kyvos Enterprise, Kyvos Cloud (SaaS on AWS), Kyvos AWS Marketplace
...
Important: By default, the Kyvos SNS is created with a node size of Standard D16s v4 (16 vCPUs, 64 GB memory) and a disk size of 500 GB.
Before you start the automated installation of the Kyvos application on Azure, ensure that you have the following information.
...
To install Kyvos in your Azure environment, you must have an Azure account with an active subscription.
Permissions
Kyvos can be deployed from the Azure Marketplace using an existing or new resource group.
Deploying Kyvos in a new Resource Group
The Managed Application Contributor Role must be assigned to the user at the subscription level.
Deploying Kyvos using an existing Resource Group
The Owner role must be assigned to the user on the Resource Group in which Kyvos is being created.
The custom role must be assigned to the user at the subscription level. Contact your administrator to create or share the name of the custom role.
Register Microsoft Resource Providers at the Subscription Level
To deploy Kyvos, ensure that the following Microsoft Resource Providers are registered at the subscription level.
To learn how to verify and register Resource Providers, see the Verifying and Registering Microsoft Providers section.
Important: If you are unable to register Microsoft Resource Providers, contact your Azure Account Administrator to do so.
Microsoft Resource Providers
Microsoft.Storage
Microsoft.Compute
Microsoft.ManagedIdentity
Microsoft.Network
Microsoft.KeyVault
Microsoft.insights
Microsoft.Web
Microsoft.Databricks
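The providers above can also be registered from the command line. The following is a minimal sketch using the Azure CLI; it assumes an authenticated `az` session with subscription-level rights, and it skips the calls when the CLI is not installed:

```shell
#!/bin/sh
# Sketch: register the resource providers required by Kyvos via the Azure CLI.
# Assumes you have already run `az login` against the target subscription.
NAMESPACES="Microsoft.Storage Microsoft.Compute Microsoft.ManagedIdentity \
Microsoft.Network Microsoft.KeyVault Microsoft.insights Microsoft.Web Microsoft.Databricks"

if command -v az >/dev/null 2>&1; then
  for ns in $NAMESPACES; do
    az provider register --namespace "$ns"   # idempotent; safe to re-run
    # Print the current state; it moves from "Registering" to "Registered".
    az provider show --namespace "$ns" --query registrationState -o tsv
  done
else
  echo "az CLI not available; register the providers in the Azure portal instead."
fi
```

Registration is asynchronous, so re-run the `az provider show` check until every namespace reports `Registered` before starting the deployment.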
...
The Network Contributor role must be assigned to the user. See the Configuring Roles for Deployment User section for details on creating and assigning roles.
...
Two subnets must be available for the deployment of Kyvos. These subnets must be within the required CIDR Range for the deployment of Kyvos Azure Marketplace:
Subnet for Kyvos Instances: /16 to /26
Subnet for Application Gateway: /16 to /27
No subnet delegations must be attached to either subnet.
The following Service Endpoints are required on the subnet for Kyvos instances:
Azure Storage (Microsoft.Storage): Secures and controls the level of access to your storage accounts so that only applications requesting data over the specified set of networks, or through the specified set of Azure resources, can access a storage account.
Azure Key Vault (Microsoft.KeyVault): Virtual network service endpoints for Azure Key Vault allow you to restrict access to a specified virtual network and a list of IPv4 (Internet Protocol version 4) address ranges.
Azure App Service (Microsoft.Web): By setting up access restrictions, you can create a priority-ordered allow/deny list that controls network access to your application.
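The two subnets and their service endpoints can be prepared with the Azure CLI. The resource group, VNet, subnet names, and CIDR ranges below are placeholder assumptions; adjust them to your network, keeping each range inside the limits listed above:

```shell
#!/bin/sh
# Sketch: create the two subnets Kyvos needs, with the required service
# endpoints on the instances subnet. All names/CIDRs are placeholders.
RG="kyvos-rg"                      # placeholder resource group
VNET="kyvos-vnet"                  # placeholder virtual network
KYVOS_SUBNET_CIDR="10.0.1.0/26"    # must fall within /16 to /26
APPGW_SUBNET_CIDR="10.0.2.0/27"    # must fall within /16 to /27

if command -v az >/dev/null 2>&1; then
  # Subnet for Kyvos instances, with the three required service endpoints.
  az network vnet subnet create -g "$RG" --vnet-name "$VNET" \
    -n kyvos-subnet --address-prefixes "$KYVOS_SUBNET_CIDR" \
    --service-endpoints Microsoft.Storage Microsoft.KeyVault Microsoft.Web
  # Subnet for the Application Gateway; leave it without delegations.
  az network vnet subnet create -g "$RG" --vnet-name "$VNET" \
    -n appgw-subnet --address-prefixes "$APPGW_SUBNET_CIDR"
else
  echo "az CLI not available; create the subnets in the Azure portal instead."
fi
```

Note that neither `az network vnet subnet create` call adds a delegation, which keeps both subnets compliant with the no-delegation requirement above.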
Databricks Configurations
The Kyvos application requires a Databricks cluster for building cubes and data profiling jobs.
The following prerequisites apply whether you use the Azure Marketplace wizard to create a new Databricks cluster or use your own existing Databricks cluster.
Workspace URL: See Microsoft Documentation to learn how to get a workspace URL.
Azure Databricks Personal Access Token: You can use an existing token or create a new token.
To create an Azure Databricks personal access token for an Azure Databricks user, perform the following steps.
1. Log in to your Azure Databricks workspace using the URL you obtained from Step 1. NOTE: If you are unable to log in, contact your administrator to give you access. Refer to Microsoft documentation to learn how to add a user to your Azure Databricks account using the account console.
2. In your Azure Databricks workspace, click your Azure Databricks username in the top right bar, and then select User Settings from the list.
3. On the Access tokens tab, click Generate new token.
4. Optionally, enter a comment that helps you identify this token in the future, and change the token's default lifetime of 90 days. To create a token with no lifetime (not recommended), leave the Lifetime (days) box empty.
5. Click Generate.
6. Copy the displayed token, and then click Done.
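Once generated, the token can be smoke-tested against a read-only workspace REST endpoint. This is a sketch only; `WORKSPACE_URL` and `DATABRICKS_TOKEN` are placeholder environment variables you must export with your own values first:

```shell
#!/bin/sh
# Sketch: verify a Databricks personal access token by calling a read-only
# REST endpoint. Requires WORKSPACE_URL and DATABRICKS_TOKEN to be exported.
ENDPOINT="/api/2.0/clusters/list"   # any read-only endpoint works as a check

if [ -n "$WORKSPACE_URL" ] && [ -n "$DATABRICKS_TOKEN" ]; then
  # HTTP 200 means the token is valid; 403 means it is missing, expired,
  # or revoked.
  curl -s -o /dev/null -w "%{http_code}\n" \
    -H "Authorization: Bearer $DATABRICKS_TOKEN" \
    "$WORKSPACE_URL$ENDPOINT"
else
  echo "Export WORKSPACE_URL and DATABRICKS_TOKEN before running this check."
fi
```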
Service Principal: You need the Service Principal Client ID, Service Principal Client Secret, and Tenant ID. You can either create a new Service Principal or use an existing Service Principal.
New Service Principal
You must have the required rights to create a new Service Principal.
Refer to the Provision a service principal in Azure portal section to learn how to create an Azure AD Service Principal and obtain the Service Principal Client ID (also known as Application (client) ID), the Service Principal Client Secret, and the Tenant ID (also known as Directory (tenant) ID).
Existing Service Principal
If you do not have permission to view the Service Principal (App registration), contact your administrator.
To get the details from the existing Azure AD Service Principal, see the Getting Azure AD Service Principal Details section.
Service Principal Object ID:
To obtain this, go to Azure Portal > Search, search for App registrations, and select the Service Principal that you created or used in Step 3, as explained above.
Click the value of Managed application in local directory. The overview page will display the Object ID of the Service Principal.
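The Object ID can also be looked up from the command line instead of the portal. A minimal sketch with the Azure CLI follows; `APP_ID` is a placeholder for your Application (client) ID:

```shell
#!/bin/sh
# Sketch: resolve a Service Principal Object ID from its Application (client)
# ID with the Azure CLI. APP_ID below is a placeholder value.
APP_ID="00000000-0000-0000-0000-000000000000"   # placeholder client ID

if command -v az >/dev/null 2>&1; then
  # `az ad sp show` looks up the service principal by app ID; on current
  # (Microsoft Graph based) CLI versions the `id` property is the Object ID
  # shown on the portal overview page.
  az ad sp show --id "$APP_ID" --query id -o tsv
else
  echo "az CLI not available; use the portal steps above instead."
fi
```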
To use the existing Databricks cluster, refer to Databricks documentation to learn about how to get the Cluster ID.
...
Log in to Databricks.
Click Compute.
Select the cluster that you want to configure in Kyvos.
Configure the existing Databricks cluster as follows:
Parameter | Description
---|---
Databricks Runtime Version | Kyvos supports: Version 7.3 LTS (includes Apache Spark 3.0.1, Scala 2.12), Version 9.1 LTS (includes Apache Spark 3.1.2, Scala 2.12), and Version 10.4 LTS (includes Apache Spark 3.2.1, Scala 2.12).
Autoscaling Options | Enable autoscaling: Select this to enable autoscaling. Terminate after ___ minutes of inactivity: Set the value to 30.
Worker Type | Recommended type: Standard_E16ds_v4. Min Workers: Recommended value 1. Max Workers: Recommended value 10. To use Databricks with Spot instances, select the corresponding checkbox (not recommended for production use).
Driver Type | Recommended type: Standard_E8ds_v4.
In the Advanced Options, define the Spark Configurations as follows:
Sample configuration:

```
spark.sql.parquet.int96AsTimestamp true
spark.databricks.delta.preview.enabled true
spark.hadoop.spark.sql.parquet.binaryAsString false
spark.hadoop.fs.azure.account.oauth2.client.secret {Service-Principal-Client-Secret}
spark.databricks.preemption.enabled false
spark.hadoop.fs.azure.account.oauth2.client.endpoint https://login.microsoftonline.com/{Tenant-ID}/oauth2/token
spark.sql.parquet.binaryAsString false
spark.databricks.service.server.enabled true
spark.hadoop.fs.azure.account.oauth2.client.id {Service-Principal-Client-ID}
spark.hadoop.fs.azure.account.auth.type OAuth
spark.hadoop.fs.azure.account.oauth.provider.type org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
spark.hadoop.spark.sql.parquet.int96AsTimestamp true
```
...