Deploy Kyvos through Kubernetes and Query Engine
You can deploy Kyvos with no-Spark model processing through Query Engine or Kubernetes on Azure and GCP.
AWS: You can deploy Kyvos with no-Spark model processing through Query Engine only.
Azure: For Kubernetes, Kyvos processes the semantic model using Azure's managed service, AKS (Azure Kubernetes Service). The Azure cluster is deployed via ARM templates. Within the ARM templates, you can create a cluster without Spark or process the semantic model using Spark Mode.
GCP: For Kubernetes, Kyvos processes the semantic model using Google Cloud's managed service, GKE (Google Kubernetes Engine). The GKE cluster is deployed through GCP Installation Files. Using the scripts, you can select a No-Spark-based cluster or process the semantic model using Spark Mode.
To proceed with a No-Spark-based deployment mode, you must use a Dataproc cluster (either new or existing).
Prerequisites
Before deploying the Kubernetes cluster, it is recommended that you refer to the Prerequisites for deploying Kyvos in a GCP environment section for the complete set of permissions required for deploying Kyvos.
Additionally, for creating a GKE cluster, you must complete the following prerequisites.
Create a GKE cluster
Case 1: If using an existing Virtual Network, creating a GKE cluster requires two secondary IPv4 address ranges in the subnet. Additionally, if using a shared Virtual Network, the following roles and permissions are required for the default Kubernetes service account (service-PROJECT_NUMBER@container-engine-robot.iam.gserviceaccount.com) on the project of the shared Virtual Network:
Compute Network User
kubernetes_role (create a custom role)
compute.firewalls.create
compute.firewalls.delete
compute.firewalls.get
compute.firewalls.list
compute.firewalls.update
compute.networks.updatePolicy
compute.subnetworks.get
container.hostServiceAgent.use
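The role and permission grants above can be sketched with the gcloud CLI as follows. This is a hedged sketch, not a verified procedure: SHARED_VPC_PROJECT_ID and PROJECT_NUMBER are placeholders you must replace with your own values.

```shell
# Sketch: create the custom kubernetes_role in the Shared VPC host project
# with the permissions listed above (placeholder project IDs).
gcloud iam roles create kubernetes_role \
  --project=SHARED_VPC_PROJECT_ID \
  --title="kubernetes_role" \
  --permissions=compute.firewalls.create,compute.firewalls.delete,compute.firewalls.get,compute.firewalls.list,compute.firewalls.update,compute.networks.updatePolicy,compute.subnetworks.get,container.hostServiceAgent.use

# Grant Compute Network User and the custom role to the default GKE service agent.
gcloud projects add-iam-policy-binding SHARED_VPC_PROJECT_ID \
  --member="serviceAccount:service-PROJECT_NUMBER@container-engine-robot.iam.gserviceaccount.com" \
  --role="roles/compute.networkUser"
gcloud projects add-iam-policy-binding SHARED_VPC_PROJECT_ID \
  --member="serviceAccount:service-PROJECT_NUMBER@container-engine-robot.iam.gserviceaccount.com" \
  --role="projects/SHARED_VPC_PROJECT_ID/roles/kubernetes_role"
```

These commands require cloud credentials, so run them from an environment where the gcloud CLI is authenticated against the Shared VPC host project.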
Case 2: Existing IAM service account
Add the following roles to the existing IAM service account:
roles/iam.serviceAccountTokenCreator (Service Account Token Creator)
roles/container.developer (Kubernetes Engine Developer)
roles/container.clusterAdmin (Kubernetes Engine Cluster Admin)
Add the following permissions to Kyvos role:
compute.instanceGroupManagers.update
compute.instanceGroupManagers.get
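As a sketch, the roles and permissions above might be granted with gcloud as follows. IAM_SA_NAME, PROJECT_ID, and KYVOS_ROLE_ID are placeholders; adapt them to your environment.

```shell
# Sketch: grant the three predefined roles to the existing IAM service account.
for role in roles/iam.serviceAccountTokenCreator \
            roles/container.developer \
            roles/container.clusterAdmin; do
  gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:IAM_SA_NAME@PROJECT_ID.iam.gserviceaccount.com" \
    --role="$role"
done

# Add the instance-group permissions to the custom Kyvos role
# (KYVOS_ROLE_ID is a placeholder for your role's ID).
gcloud iam roles update KYVOS_ROLE_ID --project=PROJECT_ID \
  --add-permissions=compute.instanceGroupManagers.update,compute.instanceGroupManagers.get
```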
Kyvos Deployment in a GCP environment
...
gkeSubnetName
secondaryRangeName1
secondaryRangeName2
dataprocMetastoreURI
createGKE
gkeWorkerInitialNodeCount
gkeWorkerInstancetype
minWorkerNodeCount
maxWorkerNodeCount
Note: Refer to the Kyvos Deployment document for the other parameters that must be updated to create other resources and deploy Kyvos.
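As an illustration only, the GKE-related parameters above might look like this in the deployment configuration. All values here are hypothetical; use the subnet, range, and metastore names from your own environment.

```
# Hypothetical values for the GKE parameters (illustrative only)
gkeSubnetName=kyvos-gke-subnet
secondaryRangeName1=gke-pods-range
secondaryRangeName2=gke-services-range
dataprocMetastoreURI=thrift://10.0.0.5:9083
createGKE=true
gkeWorkerInitialNodeCount=2
gkeWorkerInstancetype=n2-highmem-16
minWorkerNodeCount=1
maxWorkerNodeCount=4
```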
If using an existing Service Account:
Once resources are created, execute the following commands using the gcloud CLI to link the Kubernetes Service account to the IAM Service account.
gcloud iam service-accounts add-iam-policy-binding IAM_SA_NAME@IAM_SA_PROJECT_ID.iam.gserviceaccount.com --role roles/iam.workloadIdentityUser --member "serviceAccount:PROJECT_ID.svc.id.goog[kyvos-monitoring/default]"
gcloud iam service-accounts add-iam-policy-binding IAM_SA_NAME@IAM_SA_PROJECT_ID.iam.gserviceaccount.com --role roles/iam.workloadIdentityUser --member "serviceAccount:PROJECT_ID.svc.id.goog[kyvos-compute/default]"
Replace the following:
IAM_SA_NAME: The name of your new IAM service account.
IAM_SA_PROJECT_ID: The project ID of your IAM service account.
PROJECT_ID: The project ID of your Google Cloud.
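For example, with hypothetical values (kyvos-gke-sa as the IAM service-account name and my-kyvos-project as the project ID), the first command would look like this:

```shell
# Illustrative example with made-up names; substitute your own values.
# Allows pods in the kyvos-monitoring namespace to impersonate the IAM
# service account via Workload Identity.
gcloud iam service-accounts add-iam-policy-binding \
  kyvos-gke-sa@my-kyvos-project.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:my-kyvos-project.svc.id.goog[kyvos-monitoring/default]"
```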
Post deployment, the following steps need to be taken:
Set property in connections: Users must add the following property from the Kyvos connections page:
...
For more details about how to deploy Kyvos on GCP using the no-Spark model, see the following section:
Post deployment steps for all clouds (AWS, Azure and GCP)
After deploying Kyvos using the no-Spark processing model, perform the following post-deployment steps.
Add the kyvos.connection.readUsingCustomFS.jobs.internal=NONE property.
...
Set properties at the semantic model level: Modify the values of the following properties in the advanced properties of the semantic model job:
...
kyvos.process.compute.type=KYVOS_COMPUTE
...
kyvos.build.aggregate.type=TABULAR
Set the below property in the Hadoop connection properties and restart Kyvos services:
...
Property - kyvos.process.datastore.properties
Value - SET disabled_optimizers = 'join_order';SET memory_limit='40GB';SET threads TO 1;
...
Debugging
The GKE cluster consists of two namespaces:
a. kyvos-compute: This namespace hosts all Kyvos computation workers.
b. kyvos-monitoring: The Kyvos monitoring server, responsible for creating and scaling Kyvos computation workers, operates within the kyvos-monitoring namespace.
Kubectl commands list:
View the list of running pods across all namespaces:
kubectl get pods --all-namespaces
View the Google Kubernetes Engine (GKE) worker nodes:
kubectl get nodes
View the monitoring pods in the kyvos-monitoring namespace:
kubectl get pods -n kyvos-monitoring
View the Kyvos compute worker pods in the kyvos-compute namespace:
kubectl get pods -n kyvos-compute
View the bootup logs of Kyvos compute worker pods:
kubectl logs kyvos-compute-pod-name -n kyvos-compute -c kyvos-compute-worker
View the logs of the Kyvos monitoring pod:
kubectl logs kyvos-monitoring-server-pod-name -n kyvos-monitoring -c kyvos-monitoring-server
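To scan all compute workers at once, a convenience loop over the pods can be sketched as follows (assumes kubectl is already configured against the GKE cluster):

```shell
# Sketch: print the last 50 log lines of every compute worker pod.
for pod in $(kubectl get pods -n kyvos-compute -o name); do
  echo "=== $pod ==="
  kubectl logs "$pod" -n kyvos-compute -c kyvos-compute-worker --tail=50
done
```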
If the pods are not coming up and are in a pending state:
For Kyvos compute:
kubectl describe pods kyvos-compute-pod-name -n kyvos-compute
For Kyvos monitoring:
kubectl describe pods kyvos-monitoring-server-pod-name -n kyvos-monitoring
Open a shell inside a pod:
kubectl exec -it kyvos-compute-pod-name -n kyvos-compute -c kyvos-compute-worker -- bash
Replace:
`kyvos-compute-pod-name` with the actual name of the pod obtained using `kubectl get pods -n kyvos-compute`.
...
Restart Kyvos services.
Post deployment steps for Azure
These post-deployment steps must be executed because the Data Lake connection is blank on the Azure no-Spark cluster.
SSH into the Kyvos Manager machine from a terminal.
Navigate to /data/kyvos/app/kyvos/olapengine/conf/
In the providerInfo.xml file, add the Hadoop cluster to the Azure provider list, including all necessary details.
<DEPLOYMENT_PROVIDER_MAPPING PROVIDERLIST="5">
<TYPE NAME="AZURE"></TYPE>
</DEPLOYMENT_PROVIDER_MAPPING>
<PROVIDERLIST NAME="5">
<ALLOWED_PROVIDERS><![CDATA[HADOOP_CLUSTER,LOCAL_PROCESS_CLUSTER,DATABRICKS,PRESTO,ATHENA,POSTGRESQL,AZURESQLDB,GENERIC,SNOWFLAKE]]></ALLOWED_PROVIDERS>
</PROVIDERLIST>
Update the snapshot Bundle for Kyvos BI server configurations.
Restart the services.
Important points to know
To process the semantic model without Spark, you must do the following:
From the Kyvos Connection page, do the following:
For ROLAP queries, set the kyvos.connection.defaultsqlengine property's value to True.
Add the kyvos.connection.isRead property and set its value to True.
Modify the value of the following semantic model advanced properties on Kyvos Web Portal:
kyvos.sqlwarehouse.catalog = your catalog name
kyvos.sqlwarehouse.tempdb = your database name
kyvos.build.aggregate.type = TABULAR
kyvos.process.compute.type = KYVOS_COMPUTE
You can also select no-Spark model processing for the semantic model by setting its subtype via a property.
To do this, navigate to the Kyvos Properties page, update the KYVOS_PROCESS_COMPUTE_SUBTYPE property, and restart Kyvos services.