Applies to: Kyvos Enterprise, Kyvos Cloud (SaaS on AWS), Kyvos AWS Marketplace
...
Important
Automated resource creation using the script
Permissions required by Google Console users:
Logged-in users should have the privilege to launch deployment in GCP Deployment Manager.
Logged-in users must have the Viewer predefined role attached.
You must create a custom role. To do this, click Roles > Create new role.
Provide a name like Kyvos-deployment-role, assign the following permissions, and then attach the role to the logged-in user service account:
deploymentmanager.deployments.create
deploymentmanager.deployments.delete
deploymentmanager.deployments.get
deploymentmanager.deployments.list
deploymentmanager.deployments.update
deploymentmanager.manifests.get
deploymentmanager.operations.get
storage.objects.get
compute.subnetworks.use
NOTE: The above permissions are only required to launch deployment. To view the resources after deployment, the user must have permission on the relevant resources.
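As a minimal sketch, the launch permissions listed above can be collected into a role definition that could then be saved to a file and supplied to `gcloud iam roles create` through its `--file` option. The description text and the `build_role_definition` helper are illustrative assumptions, not part of the Kyvos installation files; the field names follow the IAM custom role file layout.

```python
import json

# Permissions listed above for the user who launches the deployment.
LAUNCH_PERMISSIONS = [
    "deploymentmanager.deployments.create",
    "deploymentmanager.deployments.delete",
    "deploymentmanager.deployments.get",
    "deploymentmanager.deployments.list",
    "deploymentmanager.deployments.update",
    "deploymentmanager.manifests.get",
    "deploymentmanager.operations.get",
    "storage.objects.get",
    "compute.subnetworks.use",
]


def build_role_definition(title, permissions):
    """Return a custom role definition as a dict (hypothetical helper)."""
    return {
        "title": title,
        # Assumed description; use whatever wording suits your project.
        "description": "Permissions needed to launch the Kyvos deployment",
        "stage": "GA",
        # Sort and de-duplicate so the output is stable and reviewable.
        "includedPermissions": sorted(set(permissions)),
    }


role = build_role_definition("Kyvos-deployment-role", LAUNCH_PERMISSIONS)

if __name__ == "__main__":
    # Print the definition so it can be reviewed before saving it to a file.
    print(json.dumps(role, indent=2))
```

Keeping the permission list in one place makes it easier to review what the custom role grants before it is created.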
The GCP Deployment Manager template is deployed through the logged-in user, and the resources inside the template are created through the default service account of GCP Deployment Manager.
To create other Google Cloud resources, Deployment Manager uses the credentials of the Google APIs Service Agent to authenticate to other APIs. The Google APIs Service Agent is designed specifically to run internal Google processes on your behalf. This service account is identifiable using the email: [PROJECT_NUMBER]@cloudservices.gserviceaccount.com
The service account must have the Editor predefined role attached.
You must create a custom role. To do this, click Roles > Create new role.
Provide a name like Kyvos-deployment-role, assign the following permissions, and then attach the role to the service account:
cloudfunctions.functions.setIamPolicy
storage.buckets.get
Compute Network User: If using a Shared Network, grant the above service account the 'Compute Network User' predefined role to the project where the network originally resides.
If a service account is created from the template, the following permissions must be assigned to the custom role (Kyvos-deployment-role).
iam.roles.create
iam.serviceAccounts.setIamPolicy
resourcemanager.projects.setIamPolicy
Manual resource creation: Google Console users should have the privilege to launch Google resources like Instances, Dataproc cluster, Google Storage, and Disks in the Project.
Dataproc Service Agent service account: Dataproc creates this service account with the Dataproc Service Agent role in a Dataproc user's Google Cloud project. This service account cannot be replaced by a user-specified service account when you create a cluster. This service agent account is used to perform Dataproc control plane operations, such as creating, updating, and deleting cluster VMs. Please refer to Dataproc Service Agent (Control Plane identity) for details.
By default, Dataproc uses service-[project-number]@dataproc-accounts.iam.gserviceaccount.com as the service agent account. If that service account doesn't exist, Dataproc uses the Google APIs service agent account, [project-number]@cloudservices.gserviceaccount.com, for control plane operations.
Permissions required:
The above service account must have the Dataproc Service Agent predefined role attached.
Compute Network User: If using a Shared Network, grant the above service account the 'Compute Network User' predefined role to the project where the network originally resides.
Kyvos needs a service account to launch the Kyvos instance. Refer to the steps given in the Service Account section to create it.
The logged-in user will need access to VPN, Subnet, Network Interface/Security Group, and Service Account, which will be used by Kyvos to launch compute engines, Dataproc, and Instance Group.
Ensure that the following ports are opened/allowed in the Firewall inbound rules for all internal communication between Kyvos instances.
2121, 2181, 2888, 3888, 4000, 6602, 6903, 6703, 45450, 45460, 45461, 45462, 45463, 45464, 45465, 6603, 6702, 6803, 7003, 45440, 6605, 45421, 45564, 8080, 8081, 8005, 8009, 8443, 8444, 9443, 22, and 9444.
Ensure that the following ports are opened/allowed in the Firewall inbound rules for all internal communication between the Dataproc cluster and Kyvos.
3306, 8030, 8031, 8032, 8033, 8042, 8088, 9083, 8188, 18080, 8050, 8051, 8020, 10020, 19888, 10033, 9870, 10200, 10000, 10002, 22, 45460, 9866, 8998, and 9867.
NOTE: Port 8998 is required for Livy. It is also required when upgrading the Kyvos cluster to version 2023.3.
Ports 22, 8080, and 8081 should be accessible from outside the cluster, from wherever you want to access the Web application.
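The port lists above can be turned into a firewall rule specification programmatically, which avoids typing mistakes in long lists. The following is a sketch only: the rule name, network name, and source range are hypothetical placeholders, and the command is assembled for review rather than executed. It assumes the `gcloud compute firewall-rules create` command with its `--network`, `--direction`, `--allow`, and `--source-ranges` flags.

```python
import shlex

# Ports for internal communication between Kyvos instances (from the list above).
KYVOS_INTERNAL_PORTS = [
    2121, 2181, 2888, 3888, 4000, 6602, 6903, 6703, 45450, 45460, 45461,
    45462, 45463, 45464, 45465, 6603, 6702, 6803, 7003, 45440, 6605, 45421,
    45564, 4000, 8080, 8081, 8005, 8009, 8443, 8444, 9443, 22, 9444,
]

# Ports for communication between the Dataproc cluster and Kyvos.
DATAPROC_PORTS = [
    3306, 8030, 8031, 8032, 8033, 8042, 8088, 9083, 8188, 18080, 8050, 8051,
    8020, 10020, 19888, 10033, 8188, 9870, 10200, 10000, 10002, 22, 45460,
    9866, 8998, 9867,
]


def allow_spec(ports, protocol="tcp"):
    """Format ports as a sorted, de-duplicated PROTOCOL:PORT list."""
    unique_ports = sorted(set(ports))
    return ",".join(f"{protocol}:{port}" for port in unique_ports)


def firewall_rule_command(rule_name, network, source_range, ports):
    """Build (but do not run) a gcloud command for an inbound rule."""
    return [
        "gcloud", "compute", "firewall-rules", "create", rule_name,
        "--network", network,
        "--direction", "INGRESS",
        "--allow", allow_spec(ports),
        "--source-ranges", source_range,
    ]


if __name__ == "__main__":
    # Hypothetical names and CIDR range; replace with your own values.
    command = firewall_rule_command(
        "kyvos-internal", "my-vpc", "10.0.0.0/24", KYVOS_INTERNAL_PORTS
    )
    print(shlex.join(command))
```

Printing the command first lets you check the rule against your network policy before applying it.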
Create a firewall rule with all ports open between Dataproc master and worker nodes using network tags as targets, which will be attached to the Dataproc.
For more information about the required ports between the Dataproc master nodes and the worker nodes, refer to GCP documentation at: Dataproc Cluster Network Configuration
If the Kyvos instances and Dataproc clusters are launched in different VPC networks/subnets, then Network Peering should be created between both networks.
There should be a private and public key for creating the Kyvos instances and the Dataproc cluster.
Kyvos will need the Storage Legacy Bucket Owner role on the storage bucket to store data (semantic models).
To access the storage bucket from the Kyvos instances, a NAT Gateway in VPC or Endpoint between storage and VPC should be available.
To send requests to your VPC network and receive the corresponding responses without using the public internet, you must use the Serverless VPC Access connector.
Serverless VPC Access uses the Serverless VPC Access Service Agent service account. This service account's email address has the following form:
service-PROJECT_NUMBER@gcp-sa-vpcaccess.iam.gserviceaccount.com
Permissions required:
By default, the above service account has the Serverless VPC Access Service Agent role (roles/vpcaccess.serviceAgent). Serverless VPC Access operations may fail if you change this account's permissions.
If using a Shared Network, grant the above service account the Serverless VPC Access Service Agent predefined role to the project where the network originally resides.
NOTE: You can refer to the GCP documentation to create a Serverless VPC Access connector.
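The service agent accounts mentioned in this section all follow fixed email patterns built from the project number. As a small sketch, the helper below derives the three addresses so they can be looked up or used when granting roles. The function name and dictionary keys are illustrative assumptions; the email patterns are the ones given above.

```python
def service_agent_emails(project_number):
    """Return the service agent emails for a project number (hypothetical helper)."""
    number = str(project_number)
    if not number.isdigit():
        raise ValueError("project number must contain digits only")
    return {
        # Google APIs Service Agent, used by Deployment Manager.
        "google_apis": f"{number}@cloudservices.gserviceaccount.com",
        # Dataproc Service Agent, used for control plane operations.
        "dataproc": f"service-{number}@dataproc-accounts.iam.gserviceaccount.com",
        # Serverless VPC Access Service Agent.
        "serverless_vpc_access": (
            f"service-{number}@gcp-sa-vpcaccess.iam.gserviceaccount.com"
        ),
    }


if __name__ == "__main__":
    # Example project number, for illustration only.
    for name, email in service_agent_emails(123456789012).items():
        print(f"{name}: {email}")
```

Note that these addresses use the project number, not the project ID.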
Create an Autoscaling policy using the Kyvos-recommended configuration for Dataproc.
Private Google Access must be enabled for the subnet that you will use for deploying Kyvos and Dataproc clusters.
To enable external Hive metastore, the role attached to the Kyvos Manager node must have the following permissions:
resourcemanager.projects.list
dataproc.clusters.get
compute.instances.get
If your bucket is in another project, then for cross-project bucket access, you must provide the following permissions on your bucket.
storage.objects.list
storage.objects.get
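Before enabling the external Hive metastore or cross-project bucket access, it can help to compare the permissions a role actually contains with the ones listed above. The sketch below does this with a simple set difference. The `missing_permissions` helper and the sample `granted` list are illustrative assumptions, and the bucket permissions are written with their IAM names, `storage.objects.list` and `storage.objects.get`.

```python
# Permissions required on the role attached to the Kyvos Manager node.
METASTORE_PERMISSIONS = {
    "resourcemanager.projects.list",
    "dataproc.clusters.get",
    "compute.instances.get",
}

# Permissions required on a bucket that lives in another project.
CROSS_PROJECT_BUCKET_PERMISSIONS = {
    "storage.objects.list",
    "storage.objects.get",
}


def missing_permissions(granted, required):
    """Return the required permissions that are absent from `granted`, sorted."""
    return sorted(set(required) - set(granted))


if __name__ == "__main__":
    # Hypothetical permissions copied from an existing role, for illustration.
    granted = ["dataproc.clusters.get", "storage.objects.get"]
    print("Metastore:", missing_permissions(granted, METASTORE_PERMISSIONS))
    print("Bucket:", missing_permissions(granted, CROSS_PROJECT_BUCKET_PERMISSIONS))
```

An empty result for both checks means the role already covers the listed permissions.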
For cross-project metastore and Dataproc, assign the following roles on the project having metastore. Refer to the GCP documentation for details.
Dataproc Service Agent
Dataproc Metastore Service Agent
Ensure that the Kyvos deployment and the Dataproc cluster for use with Kyvos run in the same Project and Region.
Kyvos recommended instance configuration:
Machine type for Kyvos Manager, Query Engine, and BI Server:
Kyvos Manager: n2-standard-4
Query Engine: n2-highmem-4
BI Server: n2-standard-8
Master and worker nodes of Dataproc cluster:
Master Node:
Series: N2
Machine Type: n2-highmem-4 (4 vCPU and 32 GB)
Worker Node:
Series: N2
Machine Type: n2-highmem-8 (8 vCPU and 64 GB)
If the Dataproc cluster is in a different region, then under compute metadata VmDnsSetting, set the value as GlobalDefault.
For a non-SSH-based cluster, if you use an existing Dataproc cluster and an existing bucket, you must execute the dataproc.sh script (available in the GCP Installation Files folder) on the master node of Dataproc after changing the values of DEPLOYMENT_BUCKET, WORK_DIR, COPY_LIB, and DATAPROC_VERSION to the name of the existing bucket.
To store repository credentials and other confidential credentials on the Secret Manager, you need to create a Secret.
To deploy the Kyvos cluster using password-based authentication for service nodes, ensure that the permissions listed here are available on all the VM instances for the Linux user deploying the cluster.
To deploy the Kyvos cluster using custom hostnames for resources, ensure that the steps listed here are completed on the resources created for use in the Kyvos cluster.
If using a Shared VPC, the VPC must be shared with the project that you want to access.
Navigate to the VPC network.
Click Shared VPC.
Go to the ATTACHED PROJECTS tab and attach the project.
NOTE: This should be performed from the project where the shared VPC network originally resides.
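The console steps above also have a command-line counterpart. The following sketch assembles, without running them, two gcloud commands: one that attaches a service project to the Shared VPC host project, and one that grants the Compute Network User role (`roles/compute.networkUser`) on the host project to a service account. The project IDs and the service account email are hypothetical placeholders, and the helper names are assumptions made for illustration.

```python
import shlex


def attach_service_project_command(host_project, service_project):
    """Build the command that attaches a service project to a Shared VPC host."""
    return [
        "gcloud", "compute", "shared-vpc", "associated-projects", "add",
        service_project,
        "--host-project", host_project,
    ]


def grant_network_user_command(host_project, service_account_email):
    """Build the command that grants Compute Network User on the host project."""
    return [
        "gcloud", "projects", "add-iam-policy-binding", host_project,
        "--member", f"serviceAccount:{service_account_email}",
        "--role", "roles/compute.networkUser",
    ]


if __name__ == "__main__":
    # Hypothetical project IDs and service account; replace with your own.
    host = "network-host-project"
    service = "kyvos-service-project"
    agent = "123456789012@cloudservices.gserviceaccount.com"
    print(shlex.join(attach_service_project_command(host, service)))
    print(shlex.join(grant_network_user_command(host, agent)))
```

As with the console steps, these commands are meant to be run by a user with access to the project where the Shared VPC network resides.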
...