Applies to: (tick) Kyvos Enterprise  (error) Kyvos Cloud (SaaS on AWS)  (error) Kyvos AWS Marketplace  (error) Kyvos Azure Marketplace  (error) Kyvos GCP Marketplace  (error) Kyvos Single Node Installation (Kyvos SNI)

...

To create a Dataproc cluster, perform the following steps.

...

Note: Download the GCP Installation Files folder and keep the files handy before proceeding.

  1. On your Google console, click Dataproc > Create Cluster.

  2. Provide the cluster Name.

  3. Select the same Region and Zone as those selected for the Kyvos instances.

  4. Select Cluster type as Standard (1 master, N workers).

...

  1. Optionally, select an Autoscaling policy. It is not mandatory to select an autoscaling policy while creating the Dataproc cluster; you can also attach one after the cluster is created by following the steps in the Enabling Autoscaling on cluster section.
    Kyvos recommends attaching an autoscaling policy to the Dataproc cluster.

  2. Under Versioning, use the Change button to select the Image Type and Version for the nodes.
    Optionally, select any of the Kyvos-supported versions.
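If you prefer the CLI, an autoscaling policy can also be attached after the cluster exists. The sketch below uses the gcloud CLI; the cluster name, region, and policy ID are hypothetical placeholders, not values from this guide.

```shell
# List the autoscaling policies available in the region (region is a placeholder).
gcloud dataproc autoscaling-policies list --region=us-central1

# Attach an existing policy to an already-created Dataproc cluster.
# Cluster name and policy ID below are placeholders — substitute your own.
gcloud dataproc clusters update my-kyvos-dataproc \
    --region=us-central1 \
    --autoscaling-policy=my-autoscaling-policy
```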

...

  1. Click Configure nodes on the left, and in the Master Node area, define the Machine configuration as:
    Series: N2D
    Machine Type: n2-highmem-4 (4 vCPU and 32 GB)
    Provide the Primary Disk type and Disk size.

...


  1. Configure the Worker Node Machine configuration with the following minimum recommendation:
    Series: N2D
    Machine Type: n2-highmem-8 (8 vCPU and 64 GB)
    Primary Disk type and Disk size: 500 GB, Standard Persistent Disk
    Node Count: as needed
    Local SSD: 0 (default)

...

  1. Click Customize Cluster on the left, and define the Network Configuration as follows.
    NOTE: You must specify the same network details as used for the Kyvos instances.

    1. Under Internal IP Only, select the Configure all instances to have only internal IP addresses checkbox.

    2. In the Dataproc Metastore list, select the Metastore service to use Dataproc Metastore as the Hive metastore.

...

  1. Creating Initialization Actions

    1. For an SSH-disabled Dataproc environment, you can provide the dataproc.sh script (provided in the GCP Installation Files folder in gcp.tar) to ensure that your snapshot bundles are uploaded to the bucket. To do this, go to Initialization actions, click the Add initialization Action button, and use the Browse button to select the dataproc.sh script.
      NOTE: The dataproc.sh script must be available in your bucket.

    2. For a Livy Server-enabled Dataproc environment, you can provide the livyserver.sh script (provided in the GCP Installation Files folder in gcp.tar) to ensure that the Livy Server is deployed with the Dataproc cluster. To do this, go to Initialization actions, click the Add initialization Action button, and use the Browse button to select the livyserver.sh script.
      NOTE: The livyserver.sh script must be available in your bucket.
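Both notes above require the scripts to already be present in a bucket. One way to stage them is with gsutil; the bucket name and local paths below are assumptions for illustration.

```shell
# Hypothetical sketch: copy the initialization scripts from the extracted
# gcp.tar folder into a Cloud Storage bucket so the cluster can read them
# at first boot. Bucket name and local paths are placeholders.
gsutil cp ./gcp/dataproc.sh   gs://my-kyvos-bucket/dataproc.sh
gsutil cp ./gcp/livyserver.sh gs://my-kyvos-bucket/livyserver.sh
```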

...

  1. In the Cloud Storage staging bucket area, select the Cloud Storage bucket that you want to use.

  2. Click the Create button on the left.
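As an alternative to the console walkthrough above, a Dataproc cluster with the same shape can be created in one gcloud call. This is a sketch only: every concrete value (cluster name, project, region, bucket, metastore path, worker count) is a placeholder assumption, not a value from this guide.

```shell
# Sketch of an equivalent CLI cluster creation. All names below are placeholders.
gcloud dataproc clusters create my-kyvos-dataproc \
    --region=us-central1 \
    --zone=us-central1-a \
    --master-machine-type=n2-highmem-4 \
    --worker-machine-type=n2-highmem-8 \
    --num-workers=2 \
    --worker-boot-disk-type=pd-standard \
    --worker-boot-disk-size=500GB \
    --no-address \
    --bucket=my-kyvos-staging-bucket \
    --dataproc-metastore=projects/my-project/locations/us-central1/services/my-metastore \
    --initialization-actions=gs://my-kyvos-bucket/dataproc.sh
```

Here --no-address corresponds to the Internal IP Only checkbox, and --bucket to the Cloud Storage staging bucket selection.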

...

  2. Once the cluster is created, stop the Dataproc VM instances, then click the instance > Edit. Scroll to the SSH Keys section and provide the same public key that was used for the Kyvos instances.
    NOTE: The Service Account attached to the Dataproc cluster must have the Dataproc Worker role assigned to it.

  3. Connect to the Master node and create Kyvos directories on HDFS using the following commands.

    hadoop fs -mkdir -p /user/kyvos/temp
    hadoop fs -chmod -R 777 /user/kyvos
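The SSH-key step above can also be scripted instead of done through the console Edit page, by writing the key into the instance's ssh-keys metadata. This is a hedged sketch: the instance name, zone, Linux user name, and key path are all placeholders.

```shell
# Hypothetical sketch: attach the same public key used for the Kyvos
# instances to a Dataproc VM via instance metadata.
# The ssh-keys metadata format is "USERNAME:PUBLIC_KEY" per line.
echo "kyvosuser:$(cat ~/.ssh/id_rsa.pub)" > /tmp/kyvos-ssh-keys
gcloud compute instances add-metadata my-kyvos-dataproc-m \
    --zone=us-central1-a \
    --metadata-from-file ssh-keys=/tmp/kyvos-ssh-keys
```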

...

Note: Once created, you can validate whether the resources meet the requirements for installing Kyvos on the Google Cloud Platform.

To deploy the Kyvos cluster using password-based authentication for service nodes, ensure that the permissions listed here are available on all the VM instances for the Linux user deploying the cluster.

To deploy the Kyvos cluster using custom hostnames for resources, ensure that the steps listed here are completed on all the resources created for use in the Kyvos cluster.

Next: Deploy the Kyvos GCP cluster through Kyvos Manager