
Working with data connections

Applies to: Kyvos Enterprise, Kyvos Cloud (SaaS on AWS), Kyvos AWS Marketplace, Kyvos Azure Marketplace, Kyvos GCP Marketplace, and Kyvos Single Node Installation (Kyvos SNI)


Data connections are used to connect to the Kyvos cluster, for example to run computations, and to connect to your data.

Kyvos supports these types of data connections:

  • Computing connections such as Hadoop, Local Process Connection (for AWS), or Dataproc (for GCP). A computing connection represents the physical computation cluster used for submitting jobs to process semantic models.

  • Data warehouse connections such as Snowflake, Teradata, Redshift, or BigQuery.

  • Raw data querying connections such as Presto, Spark, and Hive. You can also enable Snowflake and Athena for raw data querying. 

  • Repository for metadata, such as PostgreSQL or Amazon RDS for PostgreSQL. Sample connection-string formats for several of these providers are shown after this list.
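
For reference, the following are typical JDBC connection-string formats for a few of the providers named above. These are generic, illustrative formats rather than Kyvos-specific values; the hosts, ports, databases, and warehouse names are placeholders, and the exact fields Kyvos prompts for depend on the provider you select (see the Provider parameters table).

    PostgreSQL / Amazon RDS for PostgreSQL: jdbc:postgresql://<host>:5432/<database>
    Amazon Redshift: jdbc:redshift://<cluster-endpoint>:5439/<database>
    Snowflake: jdbc:snowflake://<account>.snowflakecomputing.com/?db=<database>&warehouse=<warehouse>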

You can view details about a connection from Kyvos to a data source, or about the connection to the repository used for storing metadata. Some connections include an option to test the connection and show its status, such as Valid.

If your permissions allow, you can add connections or modify settings. For example, if your instance of Kyvos has been configured through the portal.properties file to support it, you can select Presto to create a registered file with a Presto connection.

Note

Hive connections do not have a Properties link.

To view information about a data connection, perform the following steps. 

  1. From the Toolbox, click Setup, then choose Connections.

  2. Click a connection from the Connections column.

  3. Click Properties to view existing properties or add new ones.


    The Edit Properties dialog box shows a list of properties available for the connection. You can view and modify properties from here.

To add a data connection, perform the following steps. 

  1. From the Toolbox, click Setup, then choose Connections.

  2. From the Actions menu (⋮), click Add Connection.

  3. Enter a Name for the connection.

  4. Choose the Category from the drop-down list.

  5. Select the Provider from the drop-down list.
    The other options change based on your selection.

  6. See the Provider parameters table for details.

  7. To test the connection, click the Test button at the top of the dialog box. If the status is Invalid, click Invalid to learn more. 

  8.  Click Save when you are finished. 

To modify a data connection, perform the following steps. 

  1. From the Toolbox, click Setup, then Connections.

  2. Select a connection from the Connection List.

  3. Change the Category by selecting an option from the drop-down list.

  4. Change the Provider by selecting an option from the drop-down list.
    The other options change based on your selection.

  5. Make changes as desired. See the Provider parameters table for details.

  6. For a Hadoop connection, click Properties to view or edit existing properties.

    1. Click Search to quickly find a specific property.

    2. Use the filter options to view a subset of the properties.
      For example, click Hadoop > Hive to view those properties.

  7. If needed, click Add Property to add a property.

    1. Enter the property name and value.

    2. Then click Add.

  8. Click Apply when you are finished. 

To set up or view a Presto connection, perform the following steps.

  1. From the Toolbox, click Setup, then Connections.

  2. Select Presto from the Connection List.

  3. If you are setting it up, enter provider details. See the Provider parameters table for details; a sample Presto connection string is shown after these steps.
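
If you are entering connection details by hand, a typical Presto JDBC connection string has the generic form shown below. This is an illustrative format only; the host, port, catalog, and schema are placeholders, and the exact fields Kyvos asks for are listed in the Provider parameters table.

    jdbc:presto://<host>:8080/<catalog>/<schema>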

Sharing a data connection

Kyvos allows you to share and grant access to data connections. To allow users or groups to access a data connection and view its connection information, refer to Sharing an object.

Separate read and process connections

On non-SSH and Livy-enabled Azure and GCP clusters, Kyvos allows you to create separate process and read connections for launching process jobs and reading data.

Kyvos requires a process cluster to launch process jobs and create aggregates; a single process connection can be configured in Kyvos during deployment. With this release, users can create a separate read connection for the source that holds the data on which the semantic model is to be created, such as Snowflake or BigQuery.

Users need to create a base process connection during deployment and can subsequently add a new process connection through the Kyvos web portal. The new connection must have the same configuration as the base connection in terms of cluster version and Hadoop, Hive, and Spark versions.

When you define a new connection on non-SSH and Livy-enabled Azure and GCP clusters, you will see an option to choose whether it is a Process connection.

By default, all Warehouse connections are Read connections, as they can only be used to read data for registering files.

Refer to the Multiple processes and read connections section for more information.

Raw data queries

If a Snowflake connection type supports raw data querying, you can enable it by selecting the Is Default SQL Engine checkbox. See the Provider parameters table for details.

To use Spark or Hive for raw data querying, the Is Default SQL Engine checkbox must be disabled on the other raw data connections (for example, Presto and Snowflake) so that the Hadoop connection is used.

For a Spark connection, you need to provide the Spark Thrift Server connection string (JDBC URL).
For example: jdbc:hive2://10.260.431.111:10001/default;transportMode=http;httpPath=cliservice;principal=hive/intelli-i0056.kyvostest.com@KYVOSTEST.COM
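
As a quick sanity check outside of Kyvos, a Thrift Server connection string like the one above can be exercised with the standard Hive JDBC driver. The following is a minimal sketch, assuming the Hive JDBC driver is on the classpath and that the URL, user, and password placeholders are replaced with values valid for your environment; it is not part of Kyvos itself.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class ThriftServerCheck {
        public static void main(String[] args) throws Exception {
            // JDBC URL of the Spark Thrift Server (placeholder host and port; add Kerberos or
            // HTTP-path parameters as required by your cluster, as in the example URL above)
            String url = "jdbc:hive2://<host>:10001/default;transportMode=http;httpPath=cliservice";

            // Load the Hive JDBC driver (the hive-jdbc artifact must be on the classpath)
            Class.forName("org.apache.hive.jdbc.HiveDriver");

            // Open a connection and run a trivial query to confirm the Thrift Server is reachable
            try (Connection conn = DriverManager.getConnection(url, "<user>", "<password>");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT 1")) {
                if (rs.next()) {
                    System.out.println("Thrift Server reachable; SELECT 1 returned " + rs.getInt(1));
                }
            }
        }
    }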

Copyright Kyvos, Inc. All rights reserved.