Disaster Recovery and Backup on Azure

Disaster recovery (DR) is the ability of the product to restore access and functionality to infrastructure after a disaster occurs in a region, whether natural or caused by human action (or error).

Kyvos supports disaster recovery of its components on Azure through the recovery options that Azure provides for each service. Disaster recovery on Azure is supported only when an External Postgres Repository is used for Kyvos.

Important

Only Kyvos data can be recovered. Kyvos Manager data cannot be recovered.

Prerequisites

Before performing disaster recovery, make sure the following settings and resources are in place.

  • RA-GRS should be enabled on the primary region’s storage account.

Note

When you enable RA-GRS, Azure assigns a secondary location for the Storage Account.

Pic3.png
  • The service principal attached to the Databricks cluster must have the Storage Blob Data Contributor role on this Storage Account or its container.

  • To create a read replica of the Postgres Flexible Server, an existing Virtual Network peered with the Primary region’s Virtual Network is required in the DR region.

  • A subnet delegated to Flexible Servers is required in the Virtual Network of the DR region.

  • All other disaster recovery resources must be created in the DR region.

  • Resource Groups and Virtual Networks are required in the DR region.

  • A read replica of the Flexible Server must be created in the DR region.

Configuring Disaster Recovery for Kyvos Services

Storage Account

Migrate Redundancy: To configure DR for the Storage Account, change the redundancy option of the Storage Account to RA-GRS (follow the steps to change the storage redundancy). After the migration, a Secondary Storage Blob Service Endpoint becomes available, to which data is continuously replicated. In a disaster, you use this endpoint to copy the data to another Storage Account in the DR region.

The following image depicts the Primary region as Central US and the DR region as East US 2.

Pic2.png

Note

Azure provides a failover mechanism in which the Storage Account shifts to the secondary region in case of a disaster. However, this feature is currently unavailable for Storage Accounts with hierarchical namespace enabled, and Kyvos requires a Storage Account with hierarchical namespace enabled. This is why the data is copied to a new Storage Account instead.
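
The redundancy migration described above can also be scripted. The following is a minimal sketch, assuming Azure CLI (az) is installed and logged in. The function names are hypothetical and not part of Kyvos. enable_ragrs switches the account SKU to Standard_RAGRS, and secondary_blob_endpoint derives the read-only secondary endpoint, which follows the <account>-secondary naming pattern seen in the azcopy example in this document.

```shell
# Switch an existing Storage Account to read-access geo-redundant storage.
# Arguments: <storage-account-name> <resource-group>
# Assumption: Azure CLI is installed and you are logged in (az login).
enable_ragrs() {
  az storage account update \
    --name "$1" \
    --resource-group "$2" \
    --sku Standard_RAGRS
}

# Print the secondary Blob service endpoint of an RA-GRS account.
# Argument: <storage-account-name>
secondary_blob_endpoint() {
  printf 'https://%s-secondary.blob.core.windows.net\n' "$1"
}
```

For example, with a hypothetical account named kyvosprimary in resource group kyvos-primary-rg, you would run enable_ragrs kyvosprimary kyvos-primary-rg once, and secondary_blob_endpoint kyvosprimary then prints the endpoint to use as the copy source during a disaster.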

Kyvos External Repository (Azure Postgres Flexible Server)

Read Replicas: Cross-region read replicas can be deployed to protect your databases from region-level failures. Read replicas are updated asynchronously using PostgreSQL physical replication technology. Follow the Microsoft documentation to configure replication on an existing Postgres Flexible Server.

Key Vault

Key Vault manages disaster recovery automatically: If you're in a region that automatically replicates your key vault to a secondary region, then in the rare event that an entire Azure region is unavailable, your requests to Azure Key Vault in that region are automatically routed (failed over) to a secondary region. When the primary region is available again, requests are routed back (failed back) to the primary region. You don't need to take any action in either case because this happens automatically. See Microsoft documentation to know more about failover across regions.

Configuring failover if a disaster occurs in the primary region

Storage account

  1. Create a new Storage Account in the DR region using the ARM template provided under Storage Account Template (refer to Fig 1 for the DR region value).

  2. Execute the following command to copy the data from the Secondary Storage Blob Service Endpoint (created when you enabled the RA-GRS redundancy option) to the Storage Account created in the previous step.

    azcopy copy "<source_URL>" "<destination_URL>" --recursive=true

For example,
azcopy copy "https://qakyvosprim15j-secondary.blob.core.windows.net/qakyvosconatiner/?sv=2022-11-02&ss=bfqt&srt=sco&sp=rwdlacupyx&se=2024-01-15T13:50:42Z&st=2024-01-15T05:50:42Z&spr=https&sig=A3cgHxOI8xW3jL1nzZf6mhDKVww584cNMQO5DIhqQfs%3D" "https://drbucket15.blob.core.windows.net/kyvoscontainer/?sv=2022-11-02&ss=bfqt&srt=sco&sp=rwdlacupyx&se=2024-01-15T13:58:25Z&st=2024-01-15T05:58:25Z&spr=https&sig=YX0Pr9cqn8kO5wox4WnX1ap17%2F55ztqxDcOzGWWWEwU%3D" --recursive=true
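
Long SAS URLs are easy to mistype, so the command above can be assembled from its parts. This is a sketch only: the function name build_dr_copy_cmd is hypothetical, and it assumes the SAS tokens are supplied without a leading question mark. It only prints the azcopy command so that you can review it before running it.

```shell
# Build the azcopy command that copies a container from the RA-GRS secondary
# endpoint of the primary account to the new Storage Account in the DR region.
# Arguments: <src-account> <src-container> <src-sas>
#            <dst-account> <dst-container> <dst-sas>
# The body runs in a subshell, so src and dst do not leak into the caller.
build_dr_copy_cmd() (
  src="https://$1-secondary.blob.core.windows.net/$2/?$3"
  dst="https://$4.blob.core.windows.net/$5/?$6"
  # The URLs are passed as printf arguments, so any % characters in the
  # SAS signature are printed literally.
  printf 'azcopy copy "%s" "%s" --recursive=true\n' "$src" "$dst"
)
```

For the example above, the call would be build_dr_copy_cmd qakyvosprim15j qakyvosconatiner "$SRC_SAS" drbucket15 kyvoscontainer "$DST_SAS", where SRC_SAS and DST_SAS are hypothetical environment variables holding the two SAS tokens. Keeping the tokens in variables avoids pasting them into scripts or documents.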

If networking is enabled on the primary region’s Storage Account (that is, Enhanced Security was enabled while creating the template), allow the DR region subnets of the Kyvos instances and Functions under Networking. See Microsoft documentation to configure Azure Storage firewalls and virtual networks.

Kyvos External Repository (Azure Postgres Flexible Server)

Promote replicas to an independent server: In case of a disaster, you must promote the read replica to a standalone Postgres server by following the Azure documentation.

Key Vault

If networking is enabled on the Key Vault used in the DR deployment, allow the subnets of the Kyvos instances and Functions under Networking. For more details, see Microsoft documentation to configure Azure Storage firewalls and virtual networks.

Kyvos Deployment

  1. Create a template using template creation in Kyvos Manager, selecting an Existing Storage Account, Existing Postgres Flexible Server, and Existing Key Vault.

  2. Open the downloaded template in a text editor. Search for EnableDR and change its value from ‘false’ to ‘true’.

  3. While deploying the ARM template, enter the following details:

    1. Storage Account Name: Enter the name of the Storage Account created in the DR region.

    2. Key Vault Name: Enter the DNS name of the existing Key Vault in the primary region.

    3. Kyvos Postgres Server Name: Enter the name of the Postgres Flexible Server promoted from the read replica.

    4. Engine work directory: Provide the same engine work directory as in the primary deployment.

Points to remember

  • After a DR failover, you cannot move back to the original installation. The DR cluster becomes the primary cluster.

  • If you configured additional settings on the primary cluster, you need to apply the following on the DR cluster, which now acts as the primary cluster:

    • Once the deployment is complete, you MUST change the ADLS Gen2 storage name in all the datasets, because the raw data storage also changes due to DR.

    • If the primary deployment was on a private network (tunneling established between the customer and the Kyvos Azure VNet), you must repeat the same procedure after the DR deployment.

    • Once the deployment is complete, you must wait for cuboid replication to finish on all the query engines before executing queries.

    • Once the deployment is complete, you must enable LDAP, SSO, SMTP, TLS, and SSL the same way you did for the primary cluster.

    • If any additional IPs were allowed in the Security group of primary installation, you MUST configure the same in the DR Security Group, too.

    • Once the DR deployment is complete, you must create the custom URL and DNS mapping again.

    • You must manage the Glue tables and source data after the DR deployment.

Storage Account Template

{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "StorageAccountName": {
      "type": "string",
      "defaultValue": "drbucket",
      "metadata": {
        "description": "Name of Storage Account to be used."
      }
    },
    "StorageAccountContainerName": {
      "type": "string",
      "defaultValue": "kyvoscontainer",
      "metadata": {
        "description": "Name of Container in Storage Account."
      }
    },
    "MultiAzStorageAccount": {
      "type": "bool",
      "defaultValue": false,
      "allowedValues": [ true, false ],
      "metadata": {
        "description": "Select true to create the Storage Account with zone-redundant storage (Standard_ZRS); false uses Standard_LRS."
      }
    },
    "AdditionalTags": {
      "type": "object",
      "metadata": {
        "description": "Additional tags to put on all resources. Syntax: {\"Key1\": \"Value1\", \"Key2\" : \"Value2\"}"
      },
      "defaultValue": {
        "UsedBy": "Kyvos"
      }
    }
  },
  "variables": {
    "TagMap": {
      "LayerTag": {
        "WebServer": "Kyvos_WebPortal",
        "OlapEngine": "Service",
        "QueryEngine": "Query",
        "StorageAccount": "Persistent_Storage",
        "KyvosManager": "KM_Service",
        "Function": "Scale_Layer",
        "Vault": "Secrets",
        "ManagedIdentity": "Authentication",
        "AzurePostgresServer": "Metadata_Storage",
        "LogsStorageAccount": "Logs_Storage",
        "CreditInfoPostgres": "CreditInfo_Metadata_Storage",
        "CreditInfoKeyVault": "CreditInfo_Secrets_Storage",
        "Vnet": "Networking",
        "LogWorkspace": "Logging",
        "PrivateEndpoint": "Connection"
      },
      "RoleTag": {
        "WebServer": "WP_CLUSTER",
        "OlapEngine": "BI_CLUSTER",
        "QueryEngine": "QE_CLUSTER",
        "StorageAccount": "STORAGE",
        "KyvosManager": "KM",
        "Function": "KYVOS_FUNCTION",
        "Vault": "SECRETS_MANAGER",
        "ManagedIdentity": "RESOURCES_ACCESS",
        "AzurePostgresServer": "DATABASE",
        "AzurePostgresServerKmRepo": "DATABASE_KM",
        "LogsStorageAccount": "LOGS_DATA",
        "CreditInfoPostgres": "CREDITINFO_DATABASE",
        "CreditInfoKeyVault": "CREDITINFO_PASSWORDS",
        "Vnet": "NETWORK",
        "LogWorkspace": "LOGGING",
        "PrivateEndpoint": "CONNECTION"
      }
    }
  },
  "resources": [
    {
      "type": "Microsoft.Storage/storageAccounts",
      "apiVersion": "2022-09-01",
      "name": "[parameters('StorageAccountName')]",
      "location": "[resourceGroup().location]",
      "sku": {
        "name": "[if(parameters('MultiAzStorageAccount'), 'Standard_ZRS', 'Standard_LRS')]",
        "tier": "Standard"
      },
      "tags": "[union(parameters('AdditionalTags'),json(concat('{\"CLUSTER_ID\": \"kyvos-', deployment().name, '\" , \"CreatedBy\": \"Kyvos\", \"Name\": \"kyvos-storage-', deployment().name, '\" , \"ROLE\": \"', variables('TagMap').RoleTag.StorageAccount, '\" , \"LAYER\": \"', variables('TagMap').LayerTag.StorageAccount, '\"}')))]",
      "kind": "StorageV2",
      "properties": {
        "largeFileSharesState": "Disabled",
        "isHnsEnabled": true,
        "networkAcls": {
          "bypass": "AzureServices",
          "virtualNetworkRules": [],
          "ipRules": [],
          "defaultAction": "Allow"
        },
        "supportsHttpsTrafficOnly": true,
        "encryption": {
          "services": {
            "file": { "keyType": "Account", "enabled": true },
            "blob": { "keyType": "Account", "enabled": true }
          },
          "keySource": "Microsoft.Storage"
        },
        "minimumTlsVersion": "TLS1_2",
        "accessTier": "Hot",
        "allowBlobPublicAccess": false
      }
    },
    {
      "type": "Microsoft.Storage/storageAccounts/blobServices",
      "apiVersion": "2022-09-01",
      "name": "[concat(parameters('StorageAccountName'), '/default')]",
      "dependsOn": [
        "[resourceId('Microsoft.Storage/storageAccounts', parameters('StorageAccountName'))]"
      ],
      "properties": {
        "cors": { "corsRules": [] },
        "deleteRetentionPolicy": { "enabled": false }
      },
      "tags": "[union(parameters('AdditionalTags'),json('{}'))]"
    },
    {
      "type": "Microsoft.Storage/storageAccounts/blobServices/containers",
      "apiVersion": "2022-09-01",
      "name": "[concat(parameters('StorageAccountName'), '/default/',parameters('StorageAccountContainerName'))]",
      "dependsOn": [
        "[resourceId('Microsoft.Storage/storageAccounts/blobServices', parameters('StorageAccountName'), 'default')]",
        "[resourceId('Microsoft.Storage/storageAccounts', parameters('StorageAccountName'))]"
      ],
      "properties": { "publicAccess": "None" },
      "tags": "[union(parameters('AdditionalTags'),json('{}'))]"
    }
  ],
  "outputs": {}
}

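As a sketch of how the template above might be deployed for step 1 of the Storage Account failover, the following assumes Azure CLI (az) is installed and logged in and that the template has been saved to a local file. The function names are hypothetical. The parameter names StorageAccountName and StorageAccountContainerName come from the template itself. Azure requires Storage Account names to be 3 to 24 characters long and to contain only lowercase letters and digits, so the name is checked before the deployment is attempted.

```shell
# Return success if the name satisfies Azure's Storage Account naming rule
# (3-24 characters, lowercase letters and digits only).
is_valid_storage_account_name() {
  printf '%s' "$1" | grep -Eq '^[a-z0-9]{3,24}$'
}

# Deploy the Storage Account template into the DR resource group.
# Arguments: <dr-resource-group> <template-file> <account-name> <container-name>
# Assumption: the resource group already exists in the DR region.
deploy_dr_storage() {
  if ! is_valid_storage_account_name "$3"; then
    echo "Invalid Storage Account name: $3" >&2
    return 1
  fi
  az deployment group create \
    --resource-group "$1" \
    --template-file "$2" \
    --parameters StorageAccountName="$3" StorageAccountContainerName="$4"
}
```

Because the template sets the location to resourceGroup().location, the Storage Account is created in whichever region the target resource group belongs to, so the resource group passed as the first argument must be the one in the DR region.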
 

Copyright Kyvos, Inc. All rights reserved.