
Applies to: ✓ Kyvos Enterprise   ✗ Kyvos Cloud (SaaS on AWS)   ✗ Kyvos AWS Marketplace

...


Important

Download the AWS Installation Files folder and keep all the requisite files handy during installation and deployment. 

  • 2024.3 AWS Installation Files

  • 2024.3.1 AWS Installation Files

  • 2024.3.2 AWS Installation Files

    Common prerequisites

    Regardless of the type of installation, the following prerequisites should be available.

    1. EC2 key pair, consisting of a private key and a public key. You can create the key pair if needed.

    ...

    1. You must have the Access Key and Secret Key to access the Kyvos bundle. Contact Kyvos Support for details.

    2. Valid Kyvos license file.

    3. Databricks cluster with the following parameters (an equivalent API payload is sketched after this list):

      1. Databricks Runtime Version: Select 10.4 LTS (includes Apache Spark 3.2.1, Scala 2.12)

      2. Autopilot Options: Select the following:

        1. Enable autoscaling: Select this checkbox.  

        2. Terminate after ___ minutes of inactivity: Set the value to 30.  

      3. Worker type: Recommended value r5.4xlarge  

        1. Min Workers: Recommended value 1

        2. Max Workers: Recommended value 10

      4. Driver Type: Recommended value r5.xlarge  

      5. Advanced options   

        1. By default, the Spot fall back to On-demand checkbox is selected. Kyvos recommends that you clear this checkbox.

        2. In the Spark Configurations, define the following property for Glue-based deployments (a consolidated reference of all these Spark settings appears at the end of these Advanced options):

          • spark.databricks.hive.metastore.glueCatalog.enabled=true  

        3. If a cross-account Glue catalog is to be used, define the following property to access it: spark.hadoop.hive.metastore.glue.catalogid <GLUE_CATALOG_ID>  

        4. Next, set the following Parquet-specific configuration properties:  

          • spark.hadoop.spark.sql.parquet.int96AsTimestamp true  

          • spark.sql.parquet.binaryAsString false  

          • spark.sql.parquet.int96AsTimestamp true  

          • spark.hadoop.spark.sql.parquet.binaryAsString false

          • spark.databricks.preemption.enabled false

          • spark.sql.caseSensitive false

          • spark.hadoop.spark.sql.caseSensitive false

        5. You must change the Spark configuration to use the managed disk; do not leave Spark's working directory on the default root (/tmp) volume.

          1. In the Spark Configurations, add the spark.local.dir /local_disk0 property, where /local_disk0 is the managed disk.

          2. Optionally, you can execute the df -h command from a notebook for verification.

          3. Add the SPARK_WORKER_DIR=/local_disk0 value in the Environment variables.
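
          For reference, once the settings above are in place, the cluster's Spark Config box would contain entries along these lines. This is a sketch, assuming a Glue-based deployment with a cross-account catalog; Databricks accepts space-separated key value pairs, <GLUE_CATALOG_ID> remains a placeholder, and SPARK_WORKER_DIR=/local_disk0 goes in the Environment variables box as described above.

          Code Block
          spark.databricks.hive.metastore.glueCatalog.enabled true
          spark.hadoop.hive.metastore.glue.catalogid <GLUE_CATALOG_ID>
          spark.hadoop.spark.sql.parquet.int96AsTimestamp true
          spark.sql.parquet.binaryAsString false
          spark.sql.parquet.int96AsTimestamp true
          spark.hadoop.spark.sql.parquet.binaryAsString false
          spark.databricks.preemption.enabled false
          spark.sql.caseSensitive false
          spark.hadoop.spark.sql.caseSensitive false
          spark.local.dir /local_disk0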

      6. Tags: Owner and JIRA tags are required to run the cluster.  

      7. Instance profile: Copy the Instance Profile ARN of the role created earlier (Point 2 of the Permission requirements).  

        1. In the Databricks console, go to Admin Console > Instance Profile and click Add Instance Profile. Paste the Instance Profile ARN in the text box.  

        2. Select the Skip Validation checkbox and then click Add.  

        3. In Cluster settings, go to Advanced Options, and in the Instance Profile field, select the instance profile created above.  
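
      As an optional cross-check, the cluster described above can be expressed as a clusters/create payload for the Databricks CLI. This is a hedged sketch rather than part of the official procedure; values mirror the recommendations above, names in angle brackets are placeholders, and the spark_conf map from the reference block above is omitted for brevity. "availability": "SPOT" reflects clearing the Spot fall back to On-demand checkbox. Save the payload as create-cluster.json, then run: databricks clusters create --json-file create-cluster.json

      Code Block
      {
        "cluster_name": "<kyvos-cluster>",
        "spark_version": "10.4.x-scala2.12",
        "node_type_id": "r5.4xlarge",
        "driver_node_type_id": "r5.xlarge",
        "autoscale": { "min_workers": 1, "max_workers": 10 },
        "autotermination_minutes": 30,
        "aws_attributes": {
          "availability": "SPOT",
          "instance_profile_arn": "<instance-profile-arn>"
        },
        "spark_env_vars": { "SPARK_WORKER_DIR": "/local_disk0" },
        "custom_tags": { "Owner": "<owner>", "JIRA": "<ticket>" }
      }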

    4. Databricks information:

      1. Databricks Cluster Id: To obtain this ID, click the Cluster Name on the Clusters page in Databricks.  
        The page URL is https://<databricks-instance>/#/settings/clusters/<cluster-id>. The cluster ID is the string after the /clusters/ component in the URL of this page.

      2. Databricks Cluster Organization ID: To obtain this ID, click the Cluster Name on the Clusters page in Databricks.  
        The number after o= in the workspace URL is the organization ID. For example, if the workspace URL is https://westus.azuredatabricks.net/?o=7692xxxxxxxx, then the organization ID is 7692xxxxxxxx.

      3. Databricks Role ARN: Use the ARN of the Databricks-instanceprofile-role created earlier (Point 2 of the Permission requirements). 
        The ARN looks like this: arn:aws:iam::45653*******:role/AssumeRoleTest
        This Databricks Role should have the "iam:PassRole" permission on the role you have created for the Databricks workspace, as sketched below.  
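
        A minimal sketch of granting this permission with the AWS CLI, assuming hypothetical role and policy names (replace the account ID, role names, and policy name with your own):

        Code Block
        aws iam put-role-policy \
          --role-name <databricks-workspace-role> \
          --policy-name kyvos-allow-passrole \
          --policy-document '{
            "Version": "2012-10-17",
            "Statement": [{
              "Effect": "Allow",
              "Action": "iam:PassRole",
              "Resource": "arn:aws:iam::<account-id>:role/<Databricks-instanceprofile-role>"
            }]
          }'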

    5. If using an existing Secrets Manager, ensure that the KYVOS-CONNECTION-DATABRICKS-TOKEN key is added to it.
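
    One way to add this key with the AWS CLI is sketched below. Note that put-secret-value replaces the entire secret string, so merge the key into the secret's existing JSON rather than overwriting it; the secret name and token value are placeholders:

    Code Block
    aws secretsmanager put-secret-value \
      --secret-id <existing-secret-name> \
      --secret-string '{"KYVOS-CONNECTION-DATABRICKS-TOKEN": "<databricks-token>"}'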

    Using Kyvos Public AMI

    In addition to the prerequisites mentioned in the Common section, you must have the following:

    ...

    1. The AWS CLI must be installed on all Kyvos instances. To install it, perform the following steps.

      1. Install unzip to extract the AWS CLI setup:

        Code Block
        yum install unzip
      2. Execute the following commands:

        Code Block
        curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
        unzip awscliv2.zip
        sudo ./aws/install
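
        To confirm that the installation succeeded, check that the CLI resolves and reports a version:

        Code Block
        aws --version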
      3. Open /etc/bashrc in a command-line editor, add the following line, and then source the file:

        Code Block
        export PATH=/usr/local/bin/:$PATH
        source /etc/bashrc
    2. Increase the ulimit of the Kyvos user on all nodes using the following commands:

      Code Block
      echo "kyvos hard nofile 10240" >> /etc/security/limits.conf 
      echo "kyvos hard nofile 10240" >> /etc/security/limits.d/20-nproc.conf
      echo "kyvos soft nofile 10240" >> /etc/security/limits.conf 
      echo "kyvos soft nofile 10240" >> /etc/security/limits.d/20-nproc.conf 
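
      The new limits take effect at the next login. As a quick check, assuming the Kyvos OS user is named kyvos:

      Code Block
      su - kyvos -c "ulimit -n"    # expected output: 10240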
    3. Install OpenSSL on the Kyvos Manager node to enable TLS, using the command:

      Code Block
      yum install openssl
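
      To verify the installation:

      Code Block
      openssl version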
    4. Download the ebsnvme-id file and copy it to /sbin/:

      Code Block
      cd /sbin/ && wget https://expanse.kyvosinsights.com/s/dKEtQeQLnszNwL6/download -O ebsnvme-id 
      chmod a+x /sbin/ebsnvme-id
    5. Create the following directories on all nodes with 750 permissions, and make the Kyvos user the owner of these directories (a command sketch follows the list):

      Code Block
      /data/kyvos/app
      /data/kyvos/installs
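
      A minimal command sketch for this step, assuming the Kyvos OS user and group are both named kyvos:

      Code Block
      sudo mkdir -p /data/kyvos/app /data/kyvos/installs
      sudo chmod -R 750 /data/kyvos
      sudo chown -R kyvos:kyvos /data/kyvos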
    6. Create the following directories and assign 777 permissions to them:

      Code Block
      sudo mkdir -p /mnt/s3 
      sudo mkdir /mnt/tmp
      sudo chmod -R 777 /mnt
    7. Ensure that the required ports are available.

    8. Ensure that the required OS commands used by Kyvos Manager are available on all the machines.