Applies to: Kyvos Enterprise, Kyvos Cloud (Managed Services on AWS), Kyvos Azure Marketplace, Kyvos AWS Marketplace, Kyvos Single Node Installation (Kyvos SNI), Kyvos Free (Limited offering for AWS)


To create a Dataproc cluster, perform the following steps.

Info

Download the GCP Installation Files folder and keep the files handy before proceeding:
Kyvos 2023.2 GCP Installation Files
Kyvos 2023.2.1 GCP Installation Files


  1. On your Google console, click Dataproc > Create Cluster.
  2. Provide a cluster Name.
  3. Select the same Region and Zone as those selected for the Kyvos instances.
  4. Select Cluster type as Standard (1 master, N workers).


  5. Optionally, select an Autoscaling policy. Selecting an autoscaling policy is not mandatory while creating the Dataproc cluster; you can also attach one after the cluster is created.

    To attach an autoscaling policy to your cluster after creation, follow the steps given in the Enabling Autoscaling on cluster section.
    Kyvos recommends attaching an autoscaling policy to the Dataproc cluster.

  6. Under Versioning, use the Change button to select the Image Type and Version for the nodes.
    You can select any of the Kyvos-supported versions.

  7. Click Configure nodes on the left, and in the Master Node area, define the Machine configuration as:
    Series: N2D
    Machine Type: n2-highmem-4 (4 vCPU and 32 GB)
    Primary Disk: provide the Primary Disk type and Disk size



  8. Configure the Worker Node with the following minimum recommended Machine configuration:
    Series: N2D
    Machine Type: n2-highmem-8 (8 vCPU and 64 GB)
    Primary Disk type: Standard Persistent Disk
    Disk size: 500 GB
    Node Count: As needed
    Local SSD: 0 (default)



  9. Click Customize Cluster on the left, and define the Network Configuration as:
    NOTE: You must specify the same network details as used for the Kyvos Instances.
    1. Under Internal IP Only, select the Configure all instances to have only internal IP addresses checkbox.
    2. Under Dataproc Metastore, select the Metastore service from the list to use Dataproc Metastore as the Hive metastore.


  10. Create Initialization Actions:
    1. For an SSH-disabled Dataproc environment, you can provide the dataproc.sh script (provided in gcp.tar in the GCP Installation Files folder) to ensure that your snapshot bundles are uploaded to the bucket. To do this, go to Initialization actions, click the Add initialization Action button, and use the Browse button to select the dataproc.sh script.
      NOTE: The dataproc.sh script must be available in your bucket.
    2. For a Livy Server-enabled Dataproc environment, you can provide the livyserver.sh script (provided in gcp.tar in the GCP Installation Files folder) to ensure that the Livy Server is deployed with the Dataproc cluster. To do this, go to Initialization actions, click the Add initialization Action button, and use the Browse button to select the livyserver.sh script.
      NOTE: The livyserver.sh script must be available in your bucket.
  11. In the Cloud Storage staging bucket area, select the Cloud Storage bucket that you want to use.
  12. Click the Create button on the left.
  13. Once the cluster is created, stop the Dataproc VM instances, then open each instance and click Edit. Scroll to the SSH Keys section and provide the same public key that was used for the Kyvos instances.
    NOTE: The Service Account attached to Dataproc must have the Dataproc Worker role assigned to it.

  14. Connect to the Master node and create Kyvos directories on HDFS using the following commands.

    Code Block
    hadoop fs -mkdir -p /user/kyvos/temp
    hadoop fs -chmod -R 777 /user/kyvos
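
For readers who prefer the command line, the console steps above map roughly onto a single gcloud dataproc clusters create call. The following is a minimal dry-run sketch, not a Kyvos-provided script: the cluster name, region, zone, subnet, and bucket are hypothetical placeholders, the machine types and worker disk settings are copied from steps 7 and 8, and the autoscaling policy and Dataproc Metastore selections are left out. By default it only prints the command so you can review it against your own environment.

```shell
#!/bin/sh
# Dry-run sketch: prints a gcloud command that mirrors the console steps.
# Every default value below is a hypothetical placeholder; override via environment.
set -eu

CLUSTER_NAME="${CLUSTER_NAME:-kyvos-dataproc}"   # step 2
REGION="${REGION:-us-central1}"                  # step 3: same as the Kyvos instances
ZONE="${ZONE:-us-central1-a}"                    # step 3
SUBNET="${SUBNET:-kyvos-subnet}"                 # step 9: same network as the Kyvos instances
BUCKET="${BUCKET:-example-kyvos-bucket}"         # step 11: staging bucket name (no gs:// prefix)
NUM_WORKERS="${NUM_WORKERS:-2}"                  # step 8: as needed
INIT_ACTIONS="${INIT_ACTIONS:-}"                 # step 10: optional gs:// URI(s), comma-separated

# Build the argument list in the positional parameters so quoting stays intact.
set -- gcloud dataproc clusters create "$CLUSTER_NAME" \
  --region "$REGION" \
  --zone "$ZONE" \
  --master-machine-type n2-highmem-4 \
  --worker-machine-type n2-highmem-8 \
  --num-workers "$NUM_WORKERS" \
  --worker-boot-disk-type pd-standard \
  --worker-boot-disk-size 500GB \
  --subnet "$SUBNET" \
  --no-address \
  --bucket "$BUCKET"

# Initialization actions are optional (dataproc.sh and/or livyserver.sh in your bucket).
if [ -n "$INIT_ACTIONS" ]; then
  set -- "$@" --initialization-actions "$INIT_ACTIONS"
fi

CREATE_CMD="$*"
if [ "${EXECUTE:-0}" = "1" ]; then
  "$@"                          # actually create the cluster
else
  printf '%s\n' "$CREATE_CMD"   # default: only show the command
fi
```

Run the script as-is to print the command, or set EXECUTE=1 once the values match your project. The --no-address flag corresponds to the internal-IP-only checkbox in step 9; the SSH key and HDFS directory steps (13 and 14) still need to be done after the cluster exists.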

