Applies to: Kyvos Enterprise Kyvos Cloud (SaaS on AWS) Kyvos AWS Marketplace
...
Create a new node for Kyvos Manager, and ensure the following:
This node should have the same configuration as the original Kyvos Manager node, which no longer exists: roles and their permissions, tags (UsedBy / CreatedBy, CLUSTER_ID, ROLE : KM, LAYER : KM_Service), network access rules and permissions (Virtual Network, Subnet, Security Group, Resource Group), credentials, size and instance type, and disk organization (mount point, disks, and the directories where Kyvos Manager and Kyvos are installed).
For access purposes, either attach the same security group or ensure that the security group you attach has the same set of access rules and permissions.
If Secrets Manager/Key Vault is in use, then ensure that the roles assigned to the new Kyvos Manager node have access to the Secrets Manager/Key Vault.
Ensure that roles assigned to the new Kyvos Manager node have access to the S3 bucket/ABFS account.
If the Kyvos Manager node is created by attaching a disk image of an old Kyvos Manager node, then ensure the following, in the sequence listed:
The Agent service is stopped on that node.
The Agent cron entry is deleted from crontab.
The Kyvos Manager Agent and Kyvos folders are deleted from the node.
The OS commands must be present in the path of a non-interactive login session for the user account used to log in to the nodes.
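The cron cleanup and the PATH requirement above can be checked from a shell. The following is a minimal sketch, not a Kyvos-provided tool: the function names, the cron marker string `kyvos_agent`, and the list of commands to check are all assumptions for illustration, since this guide does not name them.

```shell
# Hypothetical helpers; names and marker strings are illustrative assumptions.

# Read a crontab listing on stdin and print it without the lines that
# contain the given fixed string (the marker identifying the agent entry).
drop_cron_entries() {
  # grep exits non-zero when nothing is left to print; that is not an error here.
  grep -v -F -- "$1" || true
}

# Print each named command that cannot be resolved on the current PATH.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}
```

For example, `crontab -l | drop_cron_entries kyvos_agent` previews the crontab without the matching lines, and piping that result into `crontab -` would install it. `missing_commands tar gzip crontab` prints nothing when all three commands are available. To reflect the non-interactive requirement, run the second check through the same kind of non-interactive session that is used to log in to the nodes, not from an interactive terminal.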
To restore Kyvos Manager on the new node, download a script file named disaster-recovery-kyvosmanager.sh from the DFS at path <engine_work>/setup/scripts/ and execute that script. This will restore the Kyvos Manager server and the Kyvos Manager service will start automatically.
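The download-and-run step above can be sketched as follows. This is a hedged illustration: `<engine_work>` is kept as the placeholder used in this guide, `<bucket>` and the helper name `dr_script_path` are hypothetical, and the guide does not state whether the script takes arguments, so none are shown.

```shell
# Hypothetical helper: build the DFS location of the recovery script from the
# engine work directory URI (for example an s3:// prefix).
dr_script_path() {
  # Strip one trailing slash so the result never contains "//" before "setup".
  base=${1%/}
  printf '%s/setup/scripts/disaster-recovery-kyvosmanager.sh\n' "$base"
}

# Example usage on AWS (commented out; requires credentials and a real path):
#   aws s3 cp "$(dr_script_path 's3://<bucket>/<engine_work>')" .
#   chmod +x disaster-recovery-kyvosmanager.sh
#   ./disaster-recovery-kyvosmanager.sh
```

On Azure, the same path can be passed to `azcopy cp` after `azcopy login --identity`.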
Disaster recovery through the guided flow on Kyvos Manager
...
Click the Uninstall button corresponding to Step 1: Uninstall Zookeeper in the Restore Cluster area.
On the displayed confirmation dialog box, provide your Kyvos Manager password, and click the Uninstall button.
A new browser tab is opened, showing the uninstall Zookeeper operation details and status. You may switch back to the Disaster Recovery browser tab.
Once the operation is completed, you will see the status shown in the following figure. At this point, you will be able to perform the next step for deleting the offline nodes.
Click the Delete button corresponding to Step 2: Delete Offline Nodes.
From the Delete Offline Nodes dialog box, select the nodes you want to delete and provide your Kyvos Manager Password.
Note that you will see only the Offline nodes in this list.
Click the Delete button.
NOTE: Once deleted, nodes cannot be retrieved.
A new browser tab is opened, showing the delete node operation details and status.
You may switch back to the Disaster Recovery browser tab.
Once the operation is completed, you will see the status shown in the following figure. At this point, you will be able to perform the next step for adding new nodes.
Click the Add button corresponding to Step 3: Add Nodes.
On the Add Nodes to Cluster dialog box, provide the Node Name or IP Address, and click the Add to List button.
You can add as many new nodes, with the desired roles, as you need (not all roles are shown in the image).
Once done, provide your Kyvos Manager Password, and click the Add button.
A new browser tab is opened, showing add node operation details and status. You may switch back to the Disaster Recovery browser tab.
Once the operation is completed, you will see the status shown in the following figure. At this point, you will be able to perform the next step for installing Zookeeper.
Click the Install button corresponding to Step 4: Install Zookeeper.
Provide your Kyvos Manager Password on the confirmation box, and click the Install button.
A new browser tab is opened, showing the install Zookeeper operation details and status. You may switch back to the Disaster Recovery browser tab.
Once the operation is completed, you will see the status shown in the following figure. At this point, you will be able to perform the next step for switching the repository.
Click the Switch button corresponding to Step 5: Switch Repository. You will be redirected to the Switch Repository page.
Refer to the Manage Kyvos Repository section to learn more.
...
...
Steps for Manual Recovery of Kyvos Manager Node and Roles on it
Create a new node for Kyvos Manager, and ensure the following:
This node should have the same configuration as the original Kyvos Manager node, which no longer exists: roles and their permissions, tags (UsedBy / CreatedBy, CLUSTER_ID, ROLE : KM, LAYER : KM_Service), network access rules and permissions (Virtual Network, Subnet, Security Group, Resource Group), credentials, size and instance type, and disk organization (mount point, disks, and the directories where Kyvos Manager and Kyvos are installed).
For access purposes, either attach the same security group or ensure that the security group you attach has the same set of access rules and permissions.
If Secrets Manager/Key Vault is in use, then ensure that the roles assigned to the new Kyvos Manager node have access to the Secrets Manager/Key Vault.
Ensure that roles assigned to the new Kyvos Manager node have access to the S3 bucket/ABFS account.
Download the snapshots (KyvosManager, KyvosManager data, and KyvosManager DB) from DFS to the node created above. To find the location of the folder on DFS, refer to the Snapshot bundles table below. On Azure, you can find the URL for downloading individual snapshot bundles in the Azure portal by navigating to that folder within the container of the ABFS account used in the deployment.
Azure: For the commands below to work, ensure that an identity is already attached to the newly created Kyvos Manager node.
azcopy login --identity
azcopy cp ABFS-folder-path local-path
AWS:
aws s3 cp s3-path local-path
Untar these bundles in the order listed above, at the same paths they occupied on the original Kyvos Manager node.
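The ordered extraction can be sketched with a small helper. This is an illustration under assumptions: the function name `extract_bundle` is hypothetical, and the archive paths and target directories in the usage lines are placeholders that must be replaced with the paths used on the original Kyvos Manager node.

```shell
# Hypothetical helper: extract one gzip-compressed tar bundle into a target
# directory, creating the directory first. Fails if the archive is missing.
extract_bundle() {
  archive=$1
  target=$2
  if [ ! -f "$archive" ]; then
    echo "missing bundle: $archive" >&2
    return 1
  fi
  mkdir -p "$target" && tar -xzf "$archive" -C "$target"
}

# Example usage (commented out; all paths are placeholders). Keep the order:
# Kyvos Manager first, then Kyvos Manager data, then Kyvos Manager DB.
#   extract_bundle /tmp/km_snapshot.tar.gz      "$KM_PARENT_DIR"      || exit 1
#   extract_bundle "$KM_DATA_BUNDLE"            "$KM_DATA_PARENT_DIR" || exit 1
#   extract_bundle /tmp/km_db_snapshot.tar.gz   "$KM_DB_PARENT_DIR"   || exit 1
```

Stopping at the first failure keeps a later bundle from being extracted on top of an incomplete earlier one.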
Note: The expected path of the kyvosmanager and kyvosmanagerdata folders can be cross-checked with variables configured in the setenv.sh file at kyvosmanager_war/kyvosmanager/setenv.sh after untar of km_snapshot.tar.gz.
Note: Extract the Kyvos Manager DB snapshot tar after placing it at the same level as (parallel to) the kyvosmanagerdata folder. This ensures that after untar of km_db_snapshot.tar.gz, the ankushdb folder is created at the kyvosmanagerdata/server/db/ location.
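The resulting layout can be verified before starting Kyvos Manager. A minimal sketch, assuming the argument is the directory that contains the kyvosmanagerdata folder; the function name `check_km_db_layout` is hypothetical.

```shell
# Hypothetical check: given the directory that contains kyvosmanagerdata,
# confirm that the DB snapshot landed in kyvosmanagerdata/server/db/ankushdb.
check_km_db_layout() {
  db_dir="${1%/}/kyvosmanagerdata/server/db/ankushdb"
  if [ -d "$db_dir" ]; then
    echo "ok: $db_dir"
  else
    echo "missing: $db_dir" >&2
    return 1
  fi
}
```

A non-zero return here usually means the DB bundle was extracted from the wrong directory level.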
Start Kyvos Manager using the startup.sh script.
In Kyvos Manager, navigate to Kyvos Manager > Settings, and perform the following steps.
In the Kyvos Manager Server Details area, click Reconfigure.
Update the Hostname and Port for Kyvos Manager.
Click the Validate button. You will see a validation error Server accessibility failed from 1 node. This is due to the unavailability of the old Kyvos Manager node.
Click Apply.
Navigate to the Dashboard. The cluster dashboard will show an "Unable to get license info" error (see image below). Ignore it until the new Kyvos Manager node is added to the cluster.
Stop Kyvos component services using the Actions menu for each component.
Click Manage Kyvos > Disaster Recovery on the navigation pane.
Depending on the current state of the system, you may see up to 3 links.
If Kyvos Manager-managed multi-node Zookeeper was deployed, then the first link will appear for Zookeeper removal. For a single-node Kyvos Manager-managed Zookeeper, no such link will appear.
Then, you will see a link for removing the old Kyvos Manager node, which is no longer available.
Thereafter, you will see a link to Add a new (current) Kyvos Manager node.
Warning: You MUST click the links in the order in which they are listed: first remove Zookeeper (if applicable), then remove the unreachable node, and finally add the new node.
Remove Zookeeper using the link.
Remove the old Kyvos Manager node using the Remove Unreachable Node link. This initiates the Remove Node operation for removing the node having a WebPortal role (and Postgres role if bundled Repository is being used) from the cluster. You will be redirected to the Remove Node operation details page.
Go to the Disaster Recovery page, and perform the following steps.
Click the Add Node link to add a new Kyvos Manager node to the cluster. This will initiate the Add Node operation for adding the Web Portal role on the new Kyvos Manager node. You will be redirected to the Add Node operation details page.
In case of any failure, re-perform this operation.
On successful completion of this operation, the Kyvos folder will be added to this node.
If bundled Postgres was in use, then:
Download the Postgres snapshot bundle from binaries, delete the existing Postgres folder on the Kyvos Manager node, and set up the folder extracted from this snapshot as Postgres on the node. Extract the snapshot after copying it to the same level as (parallel to) the kyvos folder.
Download the latest/applicable Postgres dump bundle from DFS (from the data folder) to the new Kyvos Manager node.
Start Postgres service on the Kyvos Manager node.
Import the dump into the Postgres instance (see the Manage Kyvos Repository section).
On the Switch Repository page, configure the bundled repository on the Kyvos Manager node (see the Manage Kyvos Repository section).
If any additional nodes are impacted, then:
Remove those nodes using the Delete Node functionality of Kyvos Manager.
Add the newly created node with the required roles on it.
For cloud-based clusters, add Zookeeper to the cluster depending on how Zookeeper was used earlier.
If non-managed Zookeeper was in use, then configure the new Kyvos Manager node address in the form ip:2181 as the value of the Zookeeper string on the Hadoop Ecosystem configuration page.
If Kyvos Manager-managed Zookeeper was in use, then deploy the Zookeeper component from the Hadoop Ecosystem configuration page.
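For the non-managed case above, the Zookeeper string is the node address followed by port 2181. A small sketch of building that value follows. The function name is hypothetical, and the comma-separated form for several hosts follows the usual ZooKeeper connection-string convention, not anything stated in this guide.

```shell
# Hypothetical helper: join one or more host addresses into a Zookeeper
# connection string, appending the port 2181 to each host.
zk_connect_string() {
  port=2181
  out=""
  for host in "$@"; do
    # Add a comma separator only when a previous host is already present.
    out="${out:+$out,}$host:$port"
  done
  printf '%s\n' "$out"
}
```

For example, `zk_connect_string 10.0.0.5` yields `10.0.0.5:2181`, where `10.0.0.5` stands in for the IP address of the new Kyvos Manager node.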
Start Kyvos Component services from the Dashboard using the Actions menu.
Checkpoints
Some important checkpoints that you must verify after completing the disaster recovery process.
...
...
These approaches mainly apply when Kyvos Manager was originally created using automated deployment. The original template can be referred to for the machine type and image-related details of the original Kyvos Manager machine:
...