Load Step
Load step under Query Object loads the incoming data (after extraction and transformation) into a target system (File, Relational database, HDFS, No SQL / Columnar system etc.).
You need to select a target Intellicus connection (already created under Configure > Databases) on which data should be loaded. One of the important configurations is the deselecting of the ‘Read Only’ setting of the target system. If not de-selected, this will not show in the list of target system where the extracted data can be loaded.
There are target connection-specific properties required to load data into the system.
On Opening the Target Connection, you will see the option “Default.” When you select the default connection, it will correspond to the default connection that the user selected on the configure page. In such case, the Load function will work for the default connection.
You can fetch list of existing Tables/Files from the connection.
Loading into File/RDBMS
Figure 1: Loading into RDBMS
The properties of Load step (in case target system is a file system) are:
PropertyValuesCommentsTarget ConnectionSelect from listSelect a target connection on which data should be loadedTable/File Properties
Target Table/File NameSelect from listSelect target table/file name using the selected Target Connection.
Type in the table/file name if it does not exist. The table/file gets created under the selected target connection
If Table/File ExistsSelect from list:
Append Data
Delete Existing Data
Skip Load
Drop and Create New Table/Delete and Create New File
Truncate Existing Data
Action to be performed in case table/file exists:
To append data in the end
To delete data in existing table
To not perform any action on table/file and skip load step
To drop the table and create new table /file with same name and new schema.
To delete data in existing table, but Admin user can roll back this data
EncodingSelect from listSelect the file encoding typeCompressionSelect from listSelect the compression type of network dataSuffix TimeStampCheck/UncheckCheck= suffix timestamp in the chosen format to table/file
Select from the available list
Data PropertiesIf Record ExistsSelect from list:
Insert Only
Update Only
Update or Insert
Action to be performed in case record exists:
Insert the new record
Update the existing record with the new
In case of an existing record, update it with the new record
Else insert the new record
Ignore Empty RowsCheck/UncheckCheck= ignores empty rows in table while loadingRow PropertiesLine SeparatorType YourselfSpecify the separator that is used to separate rows in the fileSplit After N RowsType YourselfSpecify the maximum number of rows to split the fileInclude HeaderCheck/UncheckCheck to include headers specified in the fileIgnore Empty RowsCheck/UncheckCheck to ignore empty rows specified in the fileColumn PropertiesFetch ColumnsClick the optionFetches updated columns of the existing table
You can see the field details of the data to be loaded under Column Properties section
Remove All ColumnsClick the optionRemoves all columns in the target tableEnclosureType YourselfSpecify the enclosure if the file contains enclosure within the fieldsSeparatorType YourselfSpecify the field separator for the fileError HandlingError ThresholdType YourselfSpecify count of error after which loading process should stop.
Any positive number>0 means stop processing after that many +1 errors i.e. if its value is 2, stop processing upon 3rd error.
Loading into NOSQL (HBASE)
The properties of Load step (in case target system is an HBASE system) are:
PropertyValuesCommentsTarget ConnectionSelect from listSelect a target connection on which data should be loadedTable PropertiesTarget Table NameSelect from listSelect target table name using the selected Target ConnectionIf Table ExistsSelect from list:
Append Data
Skip Load
Drop and Create New Table
Action to be performed in case table exists:
To append data in the end
To not perform any action on table and skip load step
To drop the table and create new table with same name and new schema
Suffix TimeStampCheck/UncheckCheck= suffix timestamp in the chosen format to table.
Select from the available list
Row PropertiesRow KeySelect from list of available fieldsUnique identifier/field for each rowDataTypeSelect from listData Type of the Row KeyError HandlingError ThresholdType YourselfSpecify count of error after which loading process should stop.
Any positive number>0 means stop processing after that many +1 errors i.e. if its value is 2, stop processing upon 3rd error
Column PropertiesFetch Column FamiliesClick the optionFetches updated column families of the existing table.
You can see the field details of the data to be loaded under Column Properties section
Remove All ColumnsClick the optionRemoves all columns in the target tableLoading into HDFS
The properties of Load step (in case target system is HDFS) are:
PropertyValuesCommentsTarget ConnectionSelect from listSelect a target connection on which data should be loadedFile Properties
Target File NameSelect from listSelect target file name using the selected Target Connection.
You should have Read/Write/Execute access permissions on the target directory to select target file from
If File ExistsSelect from list:
Append Data
Skip Load
Delete and Create New File
Action to be performed in case file exists:
To append data in the end
To not perform any action on file and skip load step
To delete existing file and create a new file with the same name
EncodingSelect from listOption to encode the file in available formats (ASCII, ISO-8859-1, UTF-8)CompressionSelect from listOption to create the file in compressed format (zip & gzip)Suffix TimeStampCheck/UncheckCheck= suffix timestamp in the chosen format to file.
Select from the available list
Row PropertiesLine SeparatorType YourselfSpecify a separator to distinguish rows/recordsSplit After N RowsType YourselfOption to split the output into multiple files.
Splitting will be performed on the basis of number of records. Blank/0 means no splitting. Any positive number means split the output after that many records
Include HeaderCheck/UncheckCheck= includes header from file/table.
This option will be applicable only for new file
Ignore Empty RowsCheck/UncheckCheck= ignores empty rows in table while loadingColumn PropertiesEnclosureType YourselfSpecify enclosure for column valuesSeparatorType YourselfSpecify the column separatorError HandlingError ThresholdType YourselfSpecify count of error after which loading process should stop.
Any positive number>0 means stop processing after that many +1 errors i.e. if its value is 2, stop processing upon 3rd error
You can run and schedule Query Objects which are of Load type from under Explorer.
Running a Query Object will execute the transformation and load the data into target system. You can also schedule the query object in once or recurring mode.
Load step can also be followed by the Data Source step further followed by Format step to be used for generating reports.
Note: Every time the Load step fails, a CSV file (<TargetTableName>_failed_records.csv) gets created with the failed records at <intellicus installation path>/Intellicus/reportengine/bin
The location of CSV file is governed by report server property:”LOAD_STEP_FAILS_DUMP” in ReportEngine_Default.properties (under <intellicus installation path>/Intellicus/reportengine/config) .
Related content
Copyright Kyvos, Inc. All rights reserved.