Sanity suite

Applies to: Kyvos Enterprise Kyvos Cloud (SaaS on AWS) Kyvos AWS Marketplace

Kyvos Azure Marketplace Kyvos GCP Marketplace Kyvos Single Node Installation (Kyvos SNI)

Important points

From Kyvos 2024.3 onwards, to execute the sanity suite for no-spark and Kubernetes (K8S) supported deployments, refer to the Prerequisites for no-spark deployments section.
From Kyvos 2023.5 onwards, AWS Managed Services will no longer require Kyvos user credentials to execute the sanity suite from Kyvos Manager. Previously, after each deployment or upgrade, our support team needed Kyvos user credentials to run a sanity suite and ensure system functionality. This was challenging in managed services deployment. The latest update eliminates the need for user credentials, streamlining the verification process for successful deployment/upgrades and ensuring proper system functionality.
Starting version 2023.1, Sanity Suite deploys a single folder rather than three to execute operations on all of the Entities that are deployed during its execution. Users should remove any folders with their names included in the below-mentioned list from previous executions before running Sanity Suite if present on the cluster. List of Folder that must be deleted if present: "Sanity_Cube_On_HCatalog, Sanity_Cube_On_HDFS, Sanity_Cube_ON_MultiFact, Sanity_DRD_ON_HCatalog, Sanity_DRD_ON_HDFS, Sanity_DRD_ON_MultiFact, Sanity_RF_ON_HCatalog, Sanity_RF_ON_HDFS, Sanity_RF_ON_MultiFact, Sanity_RF_ON_SQL"
From Kyvos 2023.2 onwards, at the start of Sanity Suite, a 'NOW' type schedule (set as default) is set for one hour using the REST API to keep the Kyvos cluster online with full (100%) Query Engine capacity. However, you can disable the 'NOW' type schedule by modifying the 'Is Cluster Scaling Enabled' property in the sanity.properties file and setting it to false.

The sanity suite is a small set of integration tests to check the activities after receiving the software build and to ensure that all the changes introduced are working as expected.

After cluster deployment, you can run the sanity Suite to check if all the activities (like creating a dataset and semantic model processing) for browsing semantic models in the Kyvos application are working fine.

Prerequisites

Create database kyvosrepo_$clustername if the KYVOS_QUERY_ANALYZER_DATABASE is not present in the olapengine.properties file.
Create database $value of KYVOS_QUERY_ANALYZER_DATABASE if KYVOS_QUERY_ANALYZER_DATABASE is present in the olapengine.properties file.
Ensure that the Sanity_table table must exist when you are creating a database.
NOTE: If AWS Glue is enabled, the Sanity_table table is not created for AWS cluster deployment. Similarly, if Dataproc Metastore is enabled, this table is not created for GCP cluster deployment.

For MSSQL

Step 1: Create the following Database and table.

CREATE DATABASE sanitysuite;
CREATE TABLE dbo.sanity_table (
product_id INT,
product_category VARCHAR(255),
manufacturer VARCHAR(255),
product_type VARCHAR(255),
product_name VARCHAR(255),
order_priority VARCHAR(255),
sale_amount FLOAT,
order_quantity INT,
discount FLOAT,
continent VARCHAR(255),
country VARCHAR(255),
region VARCHAR(255),
state VARCHAR(255),
city VARCHAR(255),
profit FLOAT,
customer_name VARCHAR(255),
customer_segment VARCHAR(255),
shipping_cost FLOAT,
date_1 VARCHAR(255),
quarter VARCHAR(255),
month VARCHAR(255),
week VARCHAR(255),
day VARCHAR(255));

Step 2: Insert the following data in the sanity table you have created in Step 1.

BULK INSERT databaseName.tableName
FROM 'pathOfCsvFile'
WITH (
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
);

Step 3: To create MSSQL Warehouse connection for sanity suite, see the Working with MSSQL warehouse connection section.

Snowflake Warehouse

Step 1: Create the following Database and table.

CREATE OR REPLACE TABLE SANITYSUITE.DBO.SANITYTABLE (
PRODUCT_ID NUMBER(38,0),
PRODUCT_CATEGORY VARCHAR(16777216),
MANUFACTURER VARCHAR(16777216),
PRODUCT_TYPE VARCHAR(16777216),
PRODUCT_NAME VARCHAR(16777216),
ORDER_PRIORITY VARCHAR(16777216),
SALE_AMOUNT FLOAT,
ORDER_QUANTITY NUMBER(38,0),
DISCOUNT FLOAT,
CONTINENT VARCHAR(16777216),
COUNTRY VARCHAR(16777216),
REGION VARCHAR(16777216),
STATE VARCHAR(16777216),
CITY VARCHAR(16777216),
PROFIT FLOAT,
CUSTOMER_NAME VARCHAR(16777216),
CUSTOMER_SEGMENT VARCHAR(16777216),
SHIPPING_COST FLOAT,
DATE_1 DATE,
QUARTER VARCHAR(16777216),
MONTH VARCHAR(16777216),
WEEK VARCHAR(16777216),
DAY VARCHAR(16777216)
);

Step 2: Insert the following data in the sanity table you have created in Step 1.

In the Stages tab upload, the sanity data from the file.
Download the sanity data file.
Click the Load Data button to upload the data file.
From the File format list, select CSV.
On header, select skip first line.
Select Field delimiter as Vertical Bar.
Load the data by clicking the Load Button.

Step 3: To create Snowflake Warehouse connection for sanity suite, see the Working with Snowflake warehouse connection section.

Azure SQL Warehouse

Step 1: Create the following Database and table.

CREATE DATABASE sanitysuite;
CREATE TABLE dbo.sanity_table (
product_id INT,
product_category VARCHAR(255),
manufacturer VARCHAR(255),
product_type VARCHAR(255),
product_name VARCHAR(255),
order_priority VARCHAR(255),
sale_amount FLOAT,
order_quantity INT,
discount FLOAT,
continent VARCHAR(255),
country VARCHAR(255),
region VARCHAR(255),
state VARCHAR(255),
city VARCHAR(255),
profit FLOAT,
customer_name VARCHAR(255),
customer_segment VARCHAR(255),
shipping_cost FLOAT,
date_1 VARCHAR(255),
quarter VARCHAR(255),
month VARCHAR(255),
week VARCHAR(255),
day VARCHAR(255));

Step 2: Insert the following data in the sanity table you have created in Step 1.

BULK INSERT databaseName.tableName
FROM 'pathOfCsvFile'
WITH (
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
);

Step 3: To create Snowflake Warehouse connection for sanity suite, see the Working with Azure SQL warehouse connection section.

Table Syntax

Table syntax to be used in the database for cluster deployment if AWS Glue is enabled.

CREATE EXTERNAL TABLE `sanity_table`(
  `product_id` int,
  `product_category` string,
  `manufacturer` string,
  `product_type` string,
  `product_name` string,
  `order_priority` string,
  `sale_amount` double,
  `order_quantity` int,
  `discount` double,
  `continent` string,
  `country` string,
  `region` string,
  `state` string,
  `city` string,
  `profit` double,
  `customer_name` string,
  `customer_segment` string,
  `shipping_cost` double,
  `date` string,
  `quarter` string,
  `month` string,
  `week` string,
  `day` string)
PARTITIONED BY (
  `year` int)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
  's3://cftautomationdeployment/user/engine_work_cftAutomationCluster/data/Reference/parquet'
TBLPROPERTIES (
  'bucketing_version'='2',
  'transient_lastDdlTime'='1694802287')
Time taken: 0.426 seconds, Fetched: 37 row(s)CREATE EXTERNAL TABLE `sanity_table`(
  `product_id` int,
  `product_category` string,
  `manufacturer` string,
  `product_type` string,
  `product_name` string,
  `order_priority` string,
  `sale_amount` double,
  `order_quantity` int,
  `discount` double,
  `continent` string,
  `country` string,
  `region` string,
  `state` string,
  `city` string,
  `profit` double,
  `customer_name` string,
  `customer_segment` string,
  `shipping_cost` double,
  `date` string,
  `quarter` string,
  `month` string,
  `week` string,
  `day` string)
PARTITIONED BY (
  `year` int)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
  's3://cftautomationdeployment/user/engine_work_cftAutomationCluster/data/Reference/parquet'
TBLPROPERTIES (
  'bucketing_version'='2',
  'transient_lastDdlTime'='1694802287')
Time taken: 0.426 seconds, Fetched: 37 row(s)

Using Sanity Suite

After cluster deployment, you can run the Sanity Suite to check if all the activities (like creating a Datasets, Data Relationship Diagram, and Semantic model processing) for browsing semantic models in the Kyvos application are working fine.

For this, click cluster name > Utilities > Run Sanity Suite on the navigation pane.
The Run Sanity Suite page is displayed.
Click Start. The Start Sanity Suite dialog is displayed.
On the Start Sanity Suite dialog, select any one of the following:
1. System User: Use this option to start the sanity suite without user credentials.
2. Named User: Use this option to start the sanity suite with Kyvos credentials.
Click Start.

On the page, you can do the following:
- Provide your Kyvos application (Web portal) login credentials to run the sanity test.
- In the summary section, click the Download Report button to download the latest available report.
- Click the Cleanup button to clean the entities created while executing the Sanity Suite.
- Click the Load More button to run the remaining entities.