Cross-Account Glue and Iceberg access

Applies to: Kyvos Enterprise, Kyvos Cloud (SaaS on AWS), Kyvos AWS Marketplace, Kyvos Azure Marketplace, Kyvos GCP Marketplace, Kyvos Single Node Installation (Kyvos SNI)


For cross-account AWS Glue access

To access the AWS Glue account, perform the following steps.

  1. Log in to your AWS Glue account using admin credentials.

  1. Grant the Kyvos (BI server) IAM role and the EMR role access to the database in the Glue account. To do this, attach the following resource policy under Data Catalog settings on the AWS Glue console, in the account where your Glue tables are present. Replace us-east-1 with the Region of your Data Catalog if it differs. Each role ARN must be a separate entry in the Principal list.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": [
              "arn:aws:iam::Account_Kyvos:role/<Kyvos IAM role>",
              "arn:aws:iam::Account_Kyvos:role/<EMR IAM role>"
            ]
          },
          "Action": "glue:Get*",
          "Resource": [
            "arn:aws:glue:us-east-1:Account_Glue:database/<your database name or * for all databases>",
            "arn:aws:glue:us-east-1:Account_Glue:table/<your database name>/<your table name or * for all tables>",
            "arn:aws:glue:us-east-1:Account_Glue:database/default",
            "arn:aws:glue:us-east-1:Account_Glue:database/global_temp",
            "arn:aws:glue:us-east-1:Account_Glue:catalog"
          ]
        }
      ]
    }
  1. Add the following bucket policy to the S3 bucket in the Glue account (Account_Glue) that stores the table data, so that the Kyvos (BI server) IAM role and the EMR role can read it.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "AddCannedAcl",
          "Effect": "Allow",
          "Principal": {
            "AWS": [
              "arn:aws:iam::Account_Kyvos:role/<Kyvos IAM role>",
              "arn:aws:iam::Account_Kyvos:role/<EMR IAM role>"
            ]
          },
          "Action": [
            "s3:GetObject",
            "s3:GetObjectAcl",
            "s3:ListBucket"
          ],
          "Resource": [
            "arn:aws:s3:::Account_Data_Bucket",
            "arn:aws:s3:::Account_Data_Bucket/*"
          ]
        }
      ]
    }
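
The two policies above can also be generated rather than edited by hand, which helps avoid quoting mistakes. The following Python is a minimal sketch, not a Kyvos-provided tool: the function names, account IDs, role names, and bucket name are hypothetical placeholders. It assumes that each role ARN is listed as a separate entry in the Principal list, and that table ARNs take the form table/<database>/<table>.

```python
import json


def role_arns(kyvos_account_id, role_names):
    """Return one IAM role ARN per role name (hypothetical helper)."""
    return [f"arn:aws:iam::{kyvos_account_id}:role/{name}" for name in role_names]


def glue_resource_policy(glue_account_id, region, principals, database="*", table="*"):
    """Build the Data Catalog resource policy that grants read access to Glue metadata."""
    prefix = f"arn:aws:glue:{region}:{glue_account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": principals},
            "Action": "glue:Get*",
            "Resource": [
                f"{prefix}:database/{database}",
                f"{prefix}:table/{database}/{table}",
                f"{prefix}:database/default",
                f"{prefix}:database/global_temp",
                f"{prefix}:catalog",
            ],
        }],
    }


def data_bucket_policy(bucket, principals):
    """Build the S3 bucket policy that lets the same roles read the table data."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AddCannedAcl",
            "Effect": "Allow",
            "Principal": {"AWS": principals},
            "Action": ["s3:GetObject", "s3:GetObjectAcl", "s3:ListBucket"],
            # ListBucket applies to the bucket ARN, the object actions to bucket/*.
            "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
        }],
    }


# Placeholder values for illustration only.
principals = role_arns("111111111111", ["KyvosRole", "KyvosEmrRole"])
glue_policy = glue_resource_policy("222222222222", "us-east-1", principals, database="sales")
s3_policy = data_bucket_policy("example-data-bucket", principals)

print(json.dumps(glue_policy, indent=2))
print(json.dumps(s3_policy, indent=2))
```

The printed JSON can be pasted into the AWS Glue console and the S3 bucket policy editor. If you prefer to apply it programmatically, boto3 provides put_resource_policy on the Glue client and put_bucket_policy on the S3 client. Note that put_bucket_policy replaces the existing bucket policy, so merge any statements that are already in place first.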

For cross-account AWS Iceberg tables access

To access the Iceberg tables, perform the following steps.

  1. Go to the EMR Console.

  1. Clone your existing EMR cluster, and then go to the advanced options.

  1. Scroll to Edit software settings and add the following configuration.

    [
      {
        "Classification": "spark-defaults",
        "Properties": {
          "spark.jars": "/usr/share/aws/iceberg/lib/iceberg-spark3-runtime.jar",
          "spark.sql.catalog.<catalog name>": "org.apache.iceberg.spark.SparkCatalog",
          "spark.sql.catalog.<catalog name>.catalog-impl": "org.apache.iceberg.aws.glue.GlueCatalog",
          "spark.sql.catalog.<catalog name>.glue.id": "<glue account id>",
          "spark.sql.catalog.<catalog name>.io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
          "spark.sql.catalog.<catalog name>.warehouse": "s3://my-bucket/warehouse",
          "spark.sql.defaultCatalog": "<catalog name>",
          "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions"
        }
      },
      {
        "Classification": "iceberg-defaults",
        "Properties": {
          "iceberg.enabled": "true"
        }
      }
    ]
    In this configuration, glue.id is the ID of the account where the tables are present, and warehouse is the directory where the table metadata is stored.
  1. Log in to the Kyvos portal, navigate to Connections → Datalake, and add the following property.
    "spark.sql.catalog.<catalog name>.glue.id": "<glue account id>"
    Here, the Glue ID should be the Account ID where the tables are present.
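
Because the catalog name and the Glue account ID appear in several property keys, and must match between the EMR configuration and the Kyvos connection property, it can help to generate both from one set of values. The following Python is a minimal sketch under that assumption. The function names, catalog name, account ID, and warehouse path are hypothetical placeholders, and the 12-digit check reflects the standard AWS account ID format.

```python
import json


def check_account_id(account_id):
    """AWS account IDs are 12-digit strings; reject anything else early."""
    if not (isinstance(account_id, str) and len(account_id) == 12 and account_id.isdigit()):
        raise ValueError(f"expected a 12-digit AWS account ID, got {account_id!r}")
    return account_id


def iceberg_emr_configuration(catalog_name, glue_account_id, warehouse):
    """Build the EMR software settings list for a cross-account Iceberg catalog."""
    check_account_id(glue_account_id)
    key = f"spark.sql.catalog.{catalog_name}"
    return [
        {
            "Classification": "spark-defaults",
            "Properties": {
                "spark.jars": "/usr/share/aws/iceberg/lib/iceberg-spark3-runtime.jar",
                key: "org.apache.iceberg.spark.SparkCatalog",
                f"{key}.catalog-impl": "org.apache.iceberg.aws.glue.GlueCatalog",
                # Account where the Glue tables are present.
                f"{key}.glue.id": glue_account_id,
                f"{key}.io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
                # Directory where the table metadata is stored.
                f"{key}.warehouse": warehouse,
                "spark.sql.defaultCatalog": catalog_name,
                "spark.sql.extensions":
                    "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
            },
        },
        {"Classification": "iceberg-defaults", "Properties": {"iceberg.enabled": "true"}},
    ]


def kyvos_connection_property(catalog_name, glue_account_id):
    """Build the single property to add on the Kyvos Datalake connection."""
    check_account_id(glue_account_id)
    return {f"spark.sql.catalog.{catalog_name}.glue.id": glue_account_id}


# Placeholder values for illustration only.
emr_config = iceberg_emr_configuration("glue_cat", "222222222222", "s3://my-bucket/warehouse")
kyvos_prop = kyvos_connection_property("glue_cat", "222222222222")

print(json.dumps(emr_config, indent=2))
print(json.dumps(kyvos_prop, indent=2))
```

The first printed block is what you would paste into Edit software settings, and the second is the property for the Kyvos connection.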

Copyright Kyvos, Inc. All rights reserved.