Kyvos on Azure - SQL Queries from Python Code in Databricks Notebook
Applies to: Kyvos Enterprise Kyvos Cloud (SaaS on AWS) Kyvos AWS Marketplace
Kyvos Azure Marketplace  Kyvos GCP Marketplace Kyvos Single Node Installation (Kyvos SNI)
This section explains the steps and procedures required to install and configure Azure Databricks machine, connect to Kyvos semantic model using Python and execute SQL queries on Kyvos semantic model. Although these steps are performed on Databricks, you can use the same steps on any Linux machine running Ubuntu OS.
Notes
The steps given in this section are for the Ubuntu machine and Debian ODBC package. However, for other flavors of Linux, the package would change to rpm.
These steps are executed in Python notebook on Databricks and all the progress is lost when the Databricks cluster is terminated. To automate these steps, you can write init scripts which are shell scripts that run at the startup of each cluster.
Step 1: Install pyodbc
1. Create a Databricks Notebook in Python and run the following command to install the pyodbc library.
%sh
apt-get update
sudo apt-get install python3-pip -y
pip3 install --upgrade pyodbc
2. Execute the following command from Databricks notebook, to verify if the library has been installed successfully. Select the Language as python for the notebook.
import pyodbc
Step 2: Install unixODBC and other dependencies
%sh
apt-get -y install unixodbc-dev
sudo apt install libsasl2-modules-gssapi-mit
Step 3: Download and Install Simba Spark ODBC
1. Download the Simba_Spark_ODBC_Debian.zip file using the following command.
2. Unzip and install the package.
Note
The steps listed here are recommended in Simba spark connector documentation. If you installed the driver in a different directory specify the path accordingly.
Step 4: Configure Environment Variable
1. Specify the locations of the Driver Configuration files.
2. Specify the path for the ODBC Driver Managers library.
Step 5: Configure DSN
1. Configure the DSN parameters in the ODBC.ini file.
  a. View the template that comes with the connector, using the following command.
   b. Overwrite the ODBC.ini file, using the following command.
Here,
Kyvos supports ONLY the following configurations
SparkServerType=3
Here, 3 stands for SparkThrist Server
ThriftTransport=2
Here, 2 stands for HTTP protocol
AuthMech=1 or AuthMech=3
Here, 1 stands for Kerberos, and 3 stands for Username and Password authentication mechanism.
Important
Port 8081 is used for HTTP and 443 is used for HTTPS (SSL enabled).
To configure SSL, set the value of SSL=1
If you have configured SSL, you will need to add your certificate (PEM format) to the following fileÂ
%sh echo "Your certificate content here" >> /opt/simba/spark/lib/64/cacerts.pem
2. Check the ODBC.ini and other configuration files are present in /etc/.
3.  Install the ODBC driver and provide the template.
The below command installs an instance of the ODBC driver based on the parameters provided in the template file.
Step 6: Python Code to create connection and execute SQL query
Optional Step:
Configure simba.sparkodbc.ini, overwrite LogLevel = 5 (debug), and specify a location for the LogPath to generate connection logs to debug any error in the connection.
This will create 2 log files in the /ODBC_logs/Â folder:
log – created one time for each instance of the driver installed
log – created for each connection
Copyright Kyvos, Inc. All rights reserved.