Common Prerequisites for Dataproc and Kubernetes

  • You need a valid Google Cloud Platform account. This account will be used to authenticate Terraform to interact with GCP resources.

  • The following permissions must be given to the logged-in user account:

    • Editor

    • Secret Manager Admin

    • Storage Object Admin

    • A custom role that contains the following permissions. Create this role and ensure that it is attached to the logged-in user account.

      • iam.roles.create  

      • iam.serviceAccounts.setIamPolicy

      • resourcemanager.projects.setIamPolicy  
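The custom role described above can be written down as a role definition before it is created. The following is a minimal sketch, assuming the field names of the IAM role resource (title, description, stage, includedPermissions); the role title and description used here are hypothetical and are not taken from the Kyvos documentation.

```python
import json

# Permissions listed above for the custom role of the logged-in user.
REQUIRED_PERMISSIONS = [
    "iam.roles.create",
    "iam.serviceAccounts.setIamPolicy",
    "resourcemanager.projects.setIamPolicy",
]


def build_role_definition(title: str, permissions: list[str]) -> dict:
    """Return a dict shaped like an IAM custom role definition."""
    return {
        "title": title,
        # Hypothetical description text, used for illustration only.
        "description": "Custom role for the user who deploys Kyvos",
        "stage": "GA",
        # Sort and de-duplicate so the definition is stable across runs.
        "includedPermissions": sorted(set(permissions)),
    }


role = build_role_definition("Kyvos Deployment User", REQUIRED_PERMISSIONS)
print(json.dumps(role, indent=2))
```

The printed JSON can be kept next to the Terraform files as a record of what the role is expected to contain.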

  • Kyvos needs a service account to launch the Kyvos instance. Refer to the steps given in the Service Account section to create it.

  • Ensure that the following ports are open in the firewall inbound rules for all internal communication between Kyvos instances:
    2121, 2181, 2888, 3888, 4000, 6602, 6903, 6703, 45450, 45460, 45461, 45462, 45463, 45464, 45465, 6603, 6702, 6803, 7003, 45440, 6605, 45421, 45564, 8080, 8081, 8005, 8009, 8443, 8444, 9443, 22, and 9444.

  • Ports 22, 8080, and 8081 must be accessible from outside the cluster, from the location where you want to access the web application.
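The port requirements above can be turned into a normalized list before they are entered into a firewall rule. The sketch below assumes that all ports use TCP (the protocol is not stated in this document) and formats them as comma-separated PROTOCOL:PORT entries; the helper names are hypothetical.

```python
# Ports required for internal communication between Kyvos instances.
INTERNAL_PORTS_TEXT = """
2121, 2181, 2888, 3888, 4000, 6602, 6903, 6703, 45450, 45460, 45461, 45462,
45463, 45464, 45465, 6603, 6702, 6803, 7003, 45440, 6605, 45421, 45564,
8080, 8081, 8005, 8009, 8443, 8444, 9443, 22, 9444
"""

# Ports that must also be reachable from outside the cluster.
EXTERNAL_PORTS = [22, 8080, 8081]


def parse_ports(text: str) -> list[int]:
    """Parse a comma-separated port list, dropping duplicates and sorting."""
    return sorted({int(part) for part in text.split(",") if part.strip()})


def as_tcp_rules(ports: list[int]) -> str:
    """Format ports as 'tcp:PORT' entries (TCP is an assumption)."""
    return ",".join(f"tcp:{port}" for port in ports)


internal_ports = parse_ports(INTERNAL_PORTS_TEXT)
print("internal:", as_tcp_rules(internal_ports))
print("external:", as_tcp_rules(EXTERNAL_PORTS))
```

Keeping the internal and external lists separate makes it easier to apply a narrower source range to the internal rule than to the externally reachable ports.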

  • Kyvos needs the Storage Legacy Bucket Owner role on the storage bucket to store data (semantic models). Grant this role if you are using an existing bucket in the Kyvos deployment.

  • To access the storage bucket from the Kyvos instances, a NAT Gateway in the VPC or an endpoint between storage and the VPC must be available.

  • To send requests to your VPC network and receive the corresponding responses without using the public internet, you must use the Serverless VPC Access connector.
    Serverless VPC Access uses the Serverless VPC Access Service Agent service account. This service account's email address has the following form:
    service-PROJECT_NUMBER@gcp-sa-vpcaccess.iam.gserviceaccount.com

    Permissions required

    1. By default, the above service account has the Serverless VPC Access Service Agent role (roles/vpcaccess.serviceAgent). Serverless VPC Access operations may fail if you change this account's permissions.

    2. If using a Shared Network, grant the above service account the Serverless VPC Access Service Agent predefined role on the project where the network originally resides.

      NOTE: You can refer to the GCP documentation to create a Serverless VPC Access connector. 
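The service agent address and the role binding described above can be derived from the project number. This is a minimal sketch: the function names and the example project number are hypothetical, and the binding is shown as a plain dict in the shape of an IAM policy binding rather than as an API call.

```python
SERVICE_AGENT_ROLE = "roles/vpcaccess.serviceAgent"


def vpc_access_service_agent(project_number: int) -> str:
    """Build the Serverless VPC Access Service Agent email for a project."""
    return f"service-{project_number}@gcp-sa-vpcaccess.iam.gserviceaccount.com"


def shared_network_binding(project_number: int) -> dict:
    """Binding to add on the project where the shared network resides."""
    email = vpc_access_service_agent(project_number)
    return {
        "role": SERVICE_AGENT_ROLE,
        # IAM identifies service accounts with the 'serviceAccount:' prefix.
        "members": [f"serviceAccount:{email}"],
    }


# 123456789012 is a placeholder project number, not a real project.
print(vpc_access_service_agent(123456789012))
print(shared_network_binding(123456789012))
```

Note that the project number in the address is that of the project running the connector, while the binding itself is applied on the project that owns the network.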

  • If your bucket is in another project, then for cross-project bucket access, you must provide the following permissions on your bucket.

    • storage.objects.list

    • storage.objects.get

  • If using shared VPC, the VPC must be shared with the project that you want to access.

    1. Navigate to the VPC network.

    2. Click the Shared VPC.

    3. Go to the ATTACHED PROJECTS tab and attach the project.

Note

  • This should be performed from the project where the shared VPC network originally resides.

  • The gcloud compute instances remove-metadata command can be used to remove instance metadata entries.

  • When using an existing VPC for Kyvos, the subnet must have a minimum mask range of /22.
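The subnet size requirement above can be checked before deployment. The sketch below assumes that "a minimum mask range of /22" means the subnet must be /22 or larger (a prefix length of 22 or less, that is, at least 1024 addresses); this reading is an assumption, and the example CIDR ranges are hypothetical.

```python
import ipaddress

# Assumption: a "/22 minimum" means a prefix length of 22 or shorter.
MAX_PREFIX_LENGTH = 22


def subnet_is_large_enough(cidr: str) -> bool:
    """Return True if the IPv4/IPv6 subnet is /22 or larger."""
    network = ipaddress.ip_network(cidr, strict=True)
    return network.prefixlen <= MAX_PREFIX_LENGTH


for cidr in ("10.0.0.0/20", "10.0.0.0/22", "10.0.0.0/24"):
    size = ipaddress.ip_network(cidr).num_addresses
    print(cidr, size, subnet_is_large_enough(cidr))
```

With strict=True, a CIDR that has host bits set (for example 10.0.0.5/22) raises ValueError instead of being silently normalized, which helps catch typing mistakes in the subnet range.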

  • Click Roles > Create new role. Provide a name, such as Kyvos-role, for the storage service and assign the following permissions. Attach this role to the Kyvos service account.

    • deploymentmanager.deployments.list

    • deploymentmanager.resources.list

    • deploymentmanager.manifests.list

    • cloudfunctions.functions.get

    • dataproc.clusters.list

    • dataproc.clusters.get

    • compute.disks.setLabels

    • compute.instances.start

    • compute.instances.stop

    • compute.instances.list

    • compute.instances.setLabels

    • storage.buckets.get

    • storage.buckets.list

    • storage.objects.create

    • storage.objects.delete

    • storage.buckets.update

    • compute.disks.get

    • compute.instances.get

    • dataproc.clusters.update

    • storage.objects.get

    • storage.objects.list

    • storage.objects.update

    • cloudfunctions.functions.update

    • compute.subnetworks.get

    • resourcemanager.projects.getIamPolicy

    • compute.firewalls.list

    • iam.roles.get

    • compute.machineTypes.get

    • compute.machineTypes.list

    • compute.instances.setMachineType

    • compute.instances.setMetadata

  • Add the following predefined roles to the service account used by the Kyvos cluster.

    • BigQuery Data Viewer

    • BigQuery User

    • Dataproc Worker

    • Cloud Functions Admin

    • Cloud Scheduler Admin

    • Cloud Scheduler Service Agent

    • Service Account User

    • Logs Writer

  • Permissions for Cross-Project Datasets Access with BigQuery:

    1. Use the same service account that is being used by Kyvos VMs.

    2. Grant this service account the following roles on the BigQuery project.

      • BigQuery Data Viewer

      • BigQuery User

  • Prerequisites for Cross-Project BigQuery setup and Kyvos VMs.

    1. Use the same service account that is being used by Kyvos VMs.

    2. To the service account used by Kyvos VMs, give the following roles on the BigQuery Project:

      • BigQuery Data Viewer

      • BigQuery User
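The cross-project grants above amount to adding two role bindings for the Kyvos service account to the IAM policy of the BigQuery project. The following is a minimal sketch that edits a policy held as a plain dict; it assumes the role IDs roles/bigquery.dataViewer and roles/bigquery.user correspond to the BigQuery Data Viewer and BigQuery User roles, ignores conditional bindings, and uses a hypothetical service account email.

```python
# Role IDs assumed to match "BigQuery Data Viewer" and "BigQuery User".
BIGQUERY_ROLES = ["roles/bigquery.dataViewer", "roles/bigquery.user"]


def add_bindings(policy: dict, service_account_email: str, roles: list[str]) -> dict:
    """Add the service account to each role in the policy, idempotently."""
    member = f"serviceAccount:{service_account_email}"
    bindings = policy.setdefault("bindings", [])
    for role in roles:
        # Reuse an existing binding for the role if there is one.
        binding = next((b for b in bindings if b["role"] == role), None)
        if binding is None:
            bindings.append({"role": role, "members": [member]})
        elif member not in binding["members"]:
            binding["members"].append(member)
    return policy


# Hypothetical service account used by the Kyvos VMs.
kyvos_sa = "kyvos-sa@kyvos-project.iam.gserviceaccount.com"
policy = add_bindings({"bindings": []}, kyvos_sa, BIGQUERY_ROLES)
print(policy)
```

Because the function is idempotent, running it twice against the same policy leaves a single entry per role, which avoids duplicate members when the setup is repeated.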

  • For accessing BigQuery Views, add the following permissions to the Kyvos custom role (created above).

    • bigquery.tables.create

    • bigquery.tables.delete

    • bigquery.tables.update

    • bigquery.tables.updateData

  • The following permissions are required to generate temporary views in a separate dataset when performing the validation/preview operation from Kyvos on Google BigQuery.

    • bigquery.tables.create: permission to create a new table

    • bigquery.tables.updateData: permission to write data to a new table, overwrite a table, or append data to a table

  • In the API and identity management section, for Cloud API access scopes, the Allow full access to all Cloud APIs permission must be set.

Additional permission required to run Node Scaling for GCP Enterprise

Apart from the existing permissions mentioned in the Creating a service account from Google Cloud Console section, you need the following permissions for GCP Enterprise:

Permissions required in GCP

  • compute.instanceGroups.get

  • compute.instances.create

  • compute.disks.create

  • compute.disks.use

  • compute.subnetworks.use

  • compute.instances.setServiceAccount

  • compute.instances.delete

  • compute.instanceGroups.update

  • compute.instances.use

  • compute.instances.detachDisk

  • compute.disks.delete

  • compute.instances.attachDisk

Conditional permission needed if using Shared Network

  • compute.subnetworks.use (on the Kyvos service account in the project where your network resides)
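Before enabling Node Scaling, the permissions already present on the Kyvos custom role can be compared with the list above. This is a minimal sketch using plain sets; the granted list in the example is hypothetical, and the Shared Network permission is not modelled separately because it is the same permission applied on a different project.

```python
# Permissions listed above for Node Scaling on GCP Enterprise.
NODE_SCALING_PERMISSIONS = {
    "compute.instanceGroups.get",
    "compute.instances.create",
    "compute.disks.create",
    "compute.disks.use",
    "compute.subnetworks.use",
    "compute.instances.setServiceAccount",
    "compute.instances.delete",
    "compute.instanceGroups.update",
    "compute.instances.use",
    "compute.instances.detachDisk",
    "compute.disks.delete",
    "compute.instances.attachDisk",
}


def missing_permissions(granted) -> list[str]:
    """Return the required permissions that are absent from 'granted'."""
    return sorted(NODE_SCALING_PERMISSIONS - set(granted))


# Hypothetical example: a role that only has two of the permissions.
granted_example = ["compute.instances.create", "compute.disks.create"]
for permission in missing_permissions(granted_example):
    print("missing:", permission)
```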

Copyright Kyvos, Inc. All rights reserved.