
Ranger deployment for Kyvos AWS environment

Applies to: Kyvos Enterprise  Kyvos Cloud (SaaS on AWS) Kyvos AWS Marketplace

Kyvos Azure Marketplace   Kyvos GCP Marketplace Kyvos Single Node Installation (Kyvos SNI)


Version information

Kyvos supports Ranger deployment for:

  • EMR version - EMR-5.32.0

  • Hive version - 1.2

  • Spark version - 2.4

Deploying Ranger on AWS using CloudFormation Templates

To deploy Ranger on AWS, perform the following steps.

  1. Log in to your AWS Console.

  2. Open the CloudFormation console and click the Create Stack > With New Resources (Standard) option.

  3. Create a stack on AWS using the rangerStep1.yaml template.

  4. Once the stack is created, the Outputs tab lists the following:

    1. VPC

    2. Bastion host

    3. Domain controller, AD
      Copy these values; you will need them when deploying Kyvos.

  5. Now, set up the Apache Ranger server, Amazon Relational Database Service (Amazon RDS) instance, and EMR cluster by using the CloudFormation template.

  6. Once this stack is created, the Outputs tab lists the following:

    1. Ranger server that uses the AD created earlier

    2. RDS instance

    3. Kerberized EMR cluster (Shared EMR without Glue)
      Copy these values; you will need them when deploying Kyvos.

  7. Now deploy Kyvos using the outputs received from the above steps.

Caution

While deploying Kyvos, provide the VPC and Kerberized EMR values exactly as shown on the Outputs tabs in the steps above.

  8. As the root user, execute the following command on all Kyvos nodes, that is, the nodes for BI Server, Kyvos Manager, and Query Engines.

    yum install krb5-workstation
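
The stack creation described above can also be scripted. The following is a minimal sketch using the AWS CLI, not the documented Kyvos procedure. The stack name `kyvos-ranger-step1`, the local template path, and the `run` helper are assumptions for illustration, and `--capabilities CAPABILITY_NAMED_IAM` is only needed if the template creates named IAM resources (an assumption about rangerStep1.yaml). Any parameters the template requires would have to be passed with `--parameters`. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/usr/bin/env bash
# Sketch: create the first Ranger stack and read its Outputs with the AWS CLI.
# STACK_NAME and TEMPLATE are hypothetical values; adjust them for your environment.
STACK_NAME="${STACK_NAME:-kyvos-ranger-step1}"
TEMPLATE="${TEMPLATE:-rangerStep1.yaml}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

create_step1_stack() {
  run aws cloudformation create-stack \
    --stack-name "$STACK_NAME" \
    --template-body "file://$TEMPLATE" \
    --capabilities CAPABILITY_NAMED_IAM
  # Block until stack creation has finished.
  run aws cloudformation wait stack-create-complete --stack-name "$STACK_NAME"
}

# List the Outputs (VPC, bastion host, domain controller) to copy for later.
show_stack_outputs() {
  run aws cloudformation describe-stacks \
    --stack-name "$STACK_NAME" \
    --query "Stacks[0].Outputs" --output table
}

create_step1_stack
show_stack_outputs
```

The same pattern would apply to the second stack (Ranger server, RDS instance, and EMR cluster), with its own template file and stack name.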

Post-deployment steps

Once you have deployed Kyvos on AWS, perform the following steps.

  1. On the EMR node, create the /user/kyvos path and set kyvos as its owner.

  2. Create a Ranger policy that grants all permissions to the Kyvos service users on the S3 Kyvos deployment directory (S3 bucket).

  3. After a successful deployment of Kyvos, the Query Engine services will not come up. To start them, manually update the krb5.conf file on all the nodes (BI Server, Kyvos Manager, and Query Engine): copy the content of the krb5.conf file from the Kerberized EMR and paste it into the krb5.conf file on each deployed node. After this, the Query Engine services will come up.

  4. Delete the emrfs-site.xml file to disable secret agent validation.

  5. Manually copy the hive-site.xml file to the /olapengine/connections/DefaultHadoopCluster01/conf/hive_conf location.

  6. Add Kyvos as a value in the hive.metastore.authorized.users property in the hive-site.xml file.

  7. Flush the iptables rules on the core node. If you add any nodes later, you must flush the iptables rules again.

  8. In the cluster.properties file available at the /tmp/rollout (or /tmp/) location, add the QE_MAX_CAPACITY_INSTANCE_TYPE property and set its value to r5.8xlarge:
    QE_MAX_CAPACITY_INSTANCE_TYPE=r5.8xlarge

  9. Upload the cluster.properties file using the DFS utility. To do this, perform the following steps.

    1. Go to the olapengine/bin path.

    2. Run the following command:

      ./dfsutility.sh write /tmp/rollout/cluster.properties <work_dir>/setup/cloud_conf/

      Example: ./dfsutility.sh write /tmp/rollout/cluster.properties /user/engine_work/setup/cloud_conf/

  10. See the Using semantic model properties section to view details on the properties that you need to apply on the semantic model.
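
The command-line parts of the steps above can be sketched as a shell script. This is an illustration under stated assumptions rather than an official Kyvos script: it assumes /user/kyvos is an HDFS path (hence `hdfs dfs`), that `iptables -F` is the intended way to flush the rules, and that /user/engine_work is the work directory, as in the example above. The `run` and `set_property` helpers and the `emr`, `core`, and `kyvos` arguments are hypothetical names introduced here. The script prints the commands by default; set `DRY_RUN=0` to execute them on the appropriate node.

```shell
#!/usr/bin/env bash
# Sketch of the scriptable post-deployment steps. Each function is meant
# to run on a different node; nothing runs unless a node role is passed.
PROPS="${PROPS:-/tmp/rollout/cluster.properties}"
WORK_DIR="${WORK_DIR:-/user/engine_work}"   # example work directory from the text
KYVOS_BIN="${KYVOS_BIN:-olapengine/bin}"    # path of dfsutility.sh, as given in the text
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Set key=value in a properties file, dropping any earlier line for the same key.
set_property() {
  local file="$1" key="$2" value="$3" tmp
  touch "$file"
  tmp="$(mktemp)"
  grep -v "^${key}=" "$file" > "$tmp" || true   # grep exits 1 when no line is kept
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  cat "$tmp" > "$file"                          # keep the original file's permissions
  rm -f "$tmp"
}

# EMR node: create /user/kyvos and make kyvos its owner (assumes an HDFS path).
on_emr_node() {
  run hdfs dfs -mkdir -p /user/kyvos
  run hdfs dfs -chown kyvos /user/kyvos
}

# Core node: flush the iptables rules (requires root privileges).
on_core_node() {
  run sudo iptables -F
}

# Kyvos node: add the instance-type property, then upload the file.
on_kyvos_node() {
  run set_property "$PROPS" QE_MAX_CAPACITY_INSTANCE_TYPE r5.8xlarge
  run cd "$KYVOS_BIN"
  run ./dfsutility.sh write "$PROPS" "$WORK_DIR/setup/cloud_conf/"
}

case "${1:-}" in
  emr)   on_emr_node ;;
  core)  on_core_node ;;
  kyvos) on_kyvos_node ;;
esac
```

Saved under a name of your choice, the script would be invoked with the node role as its only argument, for example `kyvos` on a Kyvos node. Because `set_property` replaces an existing entry instead of appending a duplicate, rerunning it leaves a single QE_MAX_CAPACITY_INSTANCE_TYPE line in cluster.properties.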

Points to remember

  1. You cannot clone the Kerberized EMR.

  2. Auto-scaling is not supported for Kerberized EMR nodes.

Copyright Kyvos, Inc. All rights reserved.