Applies to: Kyvos Enterprise, Kyvos Azure Marketplace, Kyvos AWS Marketplace, Kyvos Free (
Because Kyvos supports on-demand EMR, you can configure Spot Instances for use with EMR to optimize resource utilization during semantic model processing. Running EMR on Spot Instances can substantially reduce cost and allows for significantly higher semantic model processing capacity.
You can configure Spot Instances using a pre-created AWS CloudFormation template or manually from the AWS Console, as explained in the sections below.
AWS provides Spot Instances at lower prices, which Kyvos can leverage to reduce semantic model processing costs. However, AWS can forcibly reclaim Spot Instances when capacity is insufficient, which is quite common when using Spot Instances. A task that was running on a reclaimed Spot node fails and must be reattempted, which can increase semantic model processing time.
You can use the same configuration steps for shared EMR also.
Configuring Spot instances using AWS CloudFormation Template
To configure Spot Instances using the CloudFormation template, perform the following steps:
- Download the emr_spot.json file from the AWS Installation Files folder.
- Log in to your AWS CloudFormation Console.
- From the top-right, click Create Stack > With new resources.
- Click Upload a template file and select the downloaded emr_spot.json file.
- Click Next.
Here, enter the following details:
Parameter | Description |
---|---|
Stack Name | Provide a stack name of your choice. |
VPC | Select the VPC in which the EMR instances will be launched. |
Subnet | Select the subnet that is attached to the Kyvos instances. |
Security group | Select the security group to be attached to EMR. Note: Select the same security group that is attached to the EC2 instances for the BI Server or Query Engine of the cluster. |
Key Pair | Select the name of the key pair to be used with the EMR instances. |
EnableSSHFlag | Select true to enable SSH to the EMR cluster. |
S3 bucket | Enter the name of the S3 bucket used for storing the Kyvos semantic model. |
Core EC2 Instances | Enter the number of Core EC2 instances to be launched with EMR. |
Minimum number of Core EC2 instances | Enter the minimum number of Core EC2 instances that should be kept running. |
Maximum number of Core EC2 instances | Enter the maximum number of Core EC2 instances that can be running. |
EMR Version | Select the EMR version to launch. |
Use Graviton | Set the value to true to use Graviton instances for the EMR cluster. |
Enable TLS Encryption | Select true to enable TLS encryption for the EMR cluster. |
S3 object ARN | Enter the S3 object ARN of the TLS certificate. |
- Click Next.
- Add tags, if needed.
- Click Next.
- Click Create Stack.
- Navigate to Kyvos Manager and open the EMR Configuration page to configure the private IP address of the EMR master node.
- Enter the private IP address of the EMR master node in the EMR Master Node IP/Host Name field.
  Click outside the text box. The system automatically populates the EMR configuration.
- Select Sync Configuration and click Apply.
- Submit a build request to start EMR on demand.
The number and types of nodes vary by use case. In the Kyvos test labs, we used 3 task groups (r5.2xlarge, r4.2xlarge, and r3.2xlarge) with 10 Spot nodes in each task group while testing various cubes.
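The console steps above could also be scripted. The following is a minimal sketch, assuming the template accepts parameters under the keys shown; these keys and all example values are hypothetical placeholders, so check the downloaded template for the actual parameter names. The sketch only builds the parameter list in the shape CloudFormation expects, and the actual stack creation call is shown as a comment because it needs boto3 and AWS credentials.

```python
import json


def build_stack_parameters(values):
    """Convert a plain dict into the Parameters list used by CloudFormation.

    CloudFormation parameter values are strings, so booleans are rendered
    as lowercase "true"/"false" and everything else is passed through str().
    """
    params = []
    for key, value in values.items():
        if isinstance(value, bool):
            text = "true" if value else "false"
        else:
            text = str(value)
        params.append({"ParameterKey": key, "ParameterValue": text})
    return params


# Hypothetical parameter keys and values, for illustration only.
example_values = {
    "VpcId": "vpc-0123456789abcdef0",
    "SubnetId": "subnet-0123456789abcdef0",
    "SecurityGroupId": "sg-0123456789abcdef0",
    "KeyPairName": "my-key-pair",
    "EnableSSHFlag": True,
    "S3Bucket": "my-kyvos-bucket",
    "CoreInstanceCount": 2,
    "UseGraviton": False,
}

stack_parameters = build_stack_parameters(example_values)
print(json.dumps(stack_parameters[:2], indent=2))

# With boto3 installed and credentials configured, the stack could be created
# along these lines (stack name is a placeholder):
#
#   import boto3
#   with open("emr_spot.json") as template:
#       boto3.client("cloudformation").create_stack(
#           StackName="kyvos-emr-spot",
#           TemplateBody=template.read(),
#           Parameters=stack_parameters,
#       )
```

After the stack is created, the Kyvos Manager steps (entering the master node IP, Sync Configuration, and Apply) still apply as described above.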
Manually Configuring Spot instances for EMR
To configure Spot Instances manually, perform the following steps:
- Log in to your AWS console.
- Go to EMR and click the cluster on which you want to configure spot instances.
- Click the Hardware tab to configure Spot instances with the task node.
- Click Add task instance group.
- Provide the required details for the task nodes:
  - Name
  - EC2 Instance type: The task node type should be similar to the core node type. For example, if the core node type is r5.2xlarge, we recommend configuring at least 3 task groups with a similar configuration.
    - Non-Graviton instance-based:
      - r5.2xlarge
      - r3.2xlarge
      - r4.2xlarge
    - Graviton instance-based (supported from Kyvos 2023.2 onwards):
      - r6g.2xlarge
      - c6g.2xlarge
      - m6g.2xlarge
  - Instance Count: 10
  - Select Request Spot with Use on-demand as max price.
- Repeat the above steps for each additional task group that you want to add.
Example node groups (non-Graviton instances):
Node Group Type | Node Type |
---|---|
Core Group | r5.2xlarge |
Task-1 (Spot) | r4.2xlarge |
Task-2 (Spot) | r5.2xlarge |
Task-3 (Spot) | r3.2xlarge |
Example node groups (Graviton instances):
Node Group Type | Node Type |
---|---|
Core Group | r6g.2xlarge |
Task-1 (Spot) | r6g.2xlarge |
Task-2 (Spot) | c6g.2xlarge |
Task-3 (Spot) | m6g.2xlarge |
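For reference, the task groups above can also be described programmatically. This is a minimal sketch, assuming the boto3 EMR `add_instance_groups` request shape; the group names and cluster ID are hypothetical placeholders. Setting `BidPrice` to `OnDemandPrice` corresponds to the "use on-demand as max price" option in the console.

```python
def spot_task_group(name, instance_type, count=10):
    """Describe one Spot task instance group in the EMR API request shape."""
    return {
        "Name": name,
        "InstanceRole": "TASK",
        "Market": "SPOT",
        # Cap the Spot price at the On-Demand price for this instance type.
        "BidPrice": "OnDemandPrice",
        "InstanceType": instance_type,
        "InstanceCount": count,
    }


# Instance types taken from the non-Graviton example table above.
NON_GRAVITON_TYPES = ["r4.2xlarge", "r5.2xlarge", "r3.2xlarge"]

task_groups = [
    spot_task_group(f"Task-{index}-Spot", instance_type)
    for index, instance_type in enumerate(NON_GRAVITON_TYPES, start=1)
]

for group in task_groups:
    print(group["Name"], group["InstanceType"], group["InstanceCount"])

# With boto3 installed and credentials configured, the groups could be added
# to a running cluster along these lines (the cluster ID is a placeholder):
#
#   import boto3
#   boto3.client("emr").add_instance_groups(
#       JobFlowId="j-XXXXXXXXXXXXX",
#       InstanceGroups=task_groups,
#   )
```

For a Graviton-based cluster, the same helper could be called with the r6g.2xlarge, c6g.2xlarge, and m6g.2xlarge types from the second table.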
Once you have completed the above steps, enable autoscaling for the Core and Task groups as described below.
The number and types of nodes vary by use case, so two examples are given.
Example for: 35 nodes
Click the Edit button in Cluster Scaling Policy.
Click the Edit icon corresponding to each Task and Core group to create the autoscaling policy, as follows.
For CORE group
Scale-out
Rule1
Add 2 instances if YARNMemoryAvailablePercentage is less than 25 for 1 five-minute period with a cooldown of 180 seconds.
Rule2
Add 2 instances if AppsPending is greater than or equal to 1 for 1 five-minute period with a cooldown of 180 seconds.
Scale-in
Terminate 4 instances if AppsRunning is less than or equal to 0 for 5 five-minute periods with a cooldown of 300 seconds.
For all TASK groups
Scale-out
Add 3* instances if YARNMemoryAvailablePercentage is less than 25 for 1 five-minute period with a cooldown of 180 seconds.
Scale-in
Terminate 9 instances if AppsRunning is less than or equal to 0 for 5 five-minute periods with a cooldown of 300 seconds.
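The rules above map onto the EMR automatic scaling policy structure. The following is a minimal sketch for the CORE group of the 35-node example, assuming the boto3 `put_auto_scaling_policy` request shape. The rule names, the AVERAGE statistic, and the minimum and maximum capacity values are assumptions that are not stated in this document; adjust them to your cluster.

```python
def scaling_rule(name, metric, operator, threshold, adjustment,
                 periods, cooldown, unit="COUNT"):
    """Build one scaling rule; a negative adjustment terminates instances."""
    return {
        "Name": name,
        "Action": {
            "SimpleScalingPolicyConfiguration": {
                "AdjustmentType": "CHANGE_IN_CAPACITY",
                "ScalingAdjustment": adjustment,
                "CoolDown": cooldown,  # seconds
            }
        },
        "Trigger": {
            "CloudWatchAlarmDefinition": {
                "ComparisonOperator": operator,
                "EvaluationPeriods": periods,
                "MetricName": metric,
                "Namespace": "AWS/ElasticMapReduce",
                "Period": 300,  # one five-minute period, in seconds
                "Statistic": "AVERAGE",  # assumption: not stated in the text
                "Threshold": float(threshold),
                "Unit": unit,
            }
        },
    }


core_rules = [
    scaling_rule("ScaleOutOnMemory", "YARNMemoryAvailablePercentage",
                 "LESS_THAN", 25, 2, periods=1, cooldown=180, unit="PERCENT"),
    scaling_rule("ScaleOutOnPendingApps", "AppsPending",
                 "GREATER_THAN_OR_EQUAL", 1, 2, periods=1, cooldown=180),
    scaling_rule("ScaleInWhenIdle", "AppsRunning",
                 "LESS_THAN_OR_EQUAL", 0, -4, periods=5, cooldown=300),
]

core_policy = {
    # Hypothetical capacity limits; the document does not specify them.
    "Constraints": {"MinCapacity": 1, "MaxCapacity": 5},
    "Rules": core_rules,
}

# With boto3 installed and credentials configured, the policy could be
# attached along these lines (both IDs are placeholders):
#
#   import boto3
#   boto3.client("emr").put_auto_scaling_policy(
#       ClusterId="j-XXXXXXXXXXXXX",
#       InstanceGroupId="ig-XXXXXXXXXXXXX",
#       AutoScalingPolicy=core_policy,
#   )
```

A TASK group policy could be built the same way by changing the adjustment values to those listed for the TASK groups.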
To persist configuration changes (for example, adding or removing task nodes) for upcoming EMR clusters, click Apply on the EMR Configuration page in Kyvos Manager.
In the case of shared EMR, follow steps 2 and 3 above.
Example for: 100 nodes
Click the Edit button in Cluster Scaling Policy.
Click the Edit icon corresponding to each Task and Core group to create the autoscaling policy, as follows.
For CORE group
Scale-out
Rule1
Add 5 instances if YARNMemoryAvailablePercentage is less than 25 for 1 five-minute period with a cooldown of 180 seconds.
Rule2
Add 5 instances if AppsPending is greater than or equal to 1 for 1 five-minute period with a cooldown of 180 seconds.
Scale-in
Terminate 5 instances if AppsRunning is less than or equal to 0 for 5 five-minute periods with a cooldown of 300 seconds.
For all TASK groups
Scale-out
Add 9 instances if YARNMemoryAvailablePercentage is less than 25 for 1 five-minute period with a cooldown of 180 seconds.
Scale-in
Terminate 29 instances if AppsRunning is less than or equal to 0 for 5 five-minute periods with a cooldown of 300 seconds.
Important
AWS provides Spot Instances at discounted prices, which Kyvos can leverage to reduce semantic model processing costs. However, AWS can forcibly reclaim Spot Instances when capacity is insufficient, which is quite common when using Spot Instances. A task that was running on the reclaimed Spot node fails and is reattempted, which can increase semantic model processing time.
Points to remember
- Because the availability of Spot Instances is not guaranteed, using them may impact semantic model processing SLAs, and the probability of job failure is high for long-running jobs.
- If your tasks fail multiple times because Spot nodes are reclaimed, increase the retry count using the spark.task.maxFailures property.
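As an illustration of the last point, the sketch below renders Spark property overrides as `--conf` arguments in the form accepted by `spark-submit`. Where this property is set in a Kyvos deployment is not covered here, and the value 8 is only an example; Spark's default for spark.task.maxFailures is 4.

```python
def spark_conf_args(properties):
    """Render Spark properties as a flat list of --conf key=value arguments."""
    args = []
    for key, value in sorted(properties.items()):
        args += ["--conf", f"{key}={value}"]
    return args


# Illustrative value only: allow more task retries than Spark's default of 4.
overrides = {"spark.task.maxFailures": 8}

print(" ".join(spark_conf_args(overrides)))
# prints: --conf spark.task.maxFailures=8
```

A higher retry count makes a job more tolerant of reclaimed Spot nodes, at the cost of taking longer to fail when a task has a genuine, non-transient error.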