Skip to main content

Data Source Designer Guide

Overview

The Data Source Designer provides a free form query interface for those skills with authoring database queries. It complements other popular query tools such as MongoDB Compass by providing a centralized catalog to maintain and share queries. The Qarbine Catalog also supports folder visibility and component permissions for even better team collaboration. The overall execution flow for an analysis, including the optional prompt component, is shown below.

  

Component Examples

The standard Qarbine installation includes many types of example components including data sources. Each Data Source references a Data Service which is defined by the Qarbine administrator. The examples may refer to yet undefined data services. An example of running an AWS DynamoDB example may show.

  

This likely indicates that there is no Qarbine Data Service yet defined by the Qarbine administrator named “AWS DynamoDB”. To run this example component, such a data service must first be defined by the Qarbine administrator. Refer to the associated Qarbine configuration and tutorial documentation for the given data driver for more information. Qarbine has preset access to a set of commonly used sample MongoDB data through the “Sample Data Service” definition.

For some data drivers there is a sample data file and optionally sample code to load that data provided. For example, the Qarbine Administrator can SSH into the Qarbine host and review the contents of the ˜/qarbine.service/sample folder. The Qarbine documentation for the driver will refer to any support sample data and data loader.

Prerequisites

Data service configured to access your data. In this example we will access the MongoDB sample store sales data.

Opening the Tool

You can access the Data Source Designer in several ways. Tools can be opened from the signOn page, the home page, and from various tools using the hamburger    menu.


From the Home tool.

  

Alternatively you can open the tool from the hamburger    menu on each tool.

  

. . .

  

Pressing the control key during the menu click opens the tool in another tab.

The layout of this tool is depicted below. .
  

Specifying the Data to Query

Select your data service.

  

The databases associated with that data service will populate the database dropdown.

  

Select the database “sample_airbnb” database as shown below.

  

On the left hand side the collections drop down will be populated. This area is informational in nature and independent of what you type into the query text field. Select a collection from the drop down.

  

Its general structure will be shown.

  

The large text area is where you enter your query. Here is an example MongoDB query.

db.listingsAndReviews.find( { "address.country_code": 'AU'} ).

project( {name: 1, summary: 1, address: 1} )

Note there is a dot before the “project” clause.

  

Clicking the run icon    performs the query and the results are shown in the bottom area.
  

Selecting one of the rows populates the right hand area as shown below.

  

Date Display and Maximum Element Retrieval Options

Note that the sample data above has sale dates stored as UTC dates. By default dates are shown in the local timezone. To change this presentation click the   button to go to the property dialog.

Check the “ISO UTC” option as shown below.

  

This dialog is also one way you can change the maximum number of elements in the answer set you want processed.

NOTE- The “Max Objects” value is an action performed by Qarbine. Specifying zero indicates no limitation on Qarbine’s part. It is best to limit the size of the result set within the query itself which will then be optimized by the target database. While you are formulating your query this may be the best strategy.

Defining Queries

Overview

The syntax of a Data Source queries are based on the underlying database or endpoint. It may range from MongoDB query language, to SQL, to even a JSON object structure. Review the documentation matching the database or endpoint of interest for specific details on query interaction and syntax.

The query text entered can include non-backend native lines of text. The simplest example is starting a line off with “//”. This indicates a comment or other text that is not sent to the backend as part of the native query. It is a convenient way to document your query independently of the data source’s description property. You can also easily enable and disable query snippets.

Data Types

The data types returned vary based on the Qarbine integration driver being used. In general numbers, strings, and booleans are returned as their primitive counterparts. Some drivers encode dates for example as objects with specific fields. Their automatic conversion into JavaScript dates is usually done by the Qarbine integration driver.

Using Pragmas to Manipulate Answer Sets

The queries to your data may be defined using multiple lines of text for formatting and other purposes. Like the ‘//’ comment lines above, there are a few recognized lines that are preprocessed out by Qarbine and not sent to the backend data.

Starts With Description
//Used to comment as noted above.
#pragmaThere are many general pragma keywords recognized as described below and a few for specific data drivers.
Pragma Keyword Description
addArraysThe first 2 arguments are fields that each have array values. The last argument is the name of the field to get them combined. Given this data.   The line “#pragma addArrays events_properties_names, events_properties_values, events” results in   
convertToDateProvide a CSV list of fields to convert from values to dates. The values can be strings or numerics to “new Date(...)”. The convert keywords accept field paths of up to 2 levels. The first level can be a document or an array of documents. This is done in-place.
convertToNumberProvide a CSV list of fields to convert from values to dates. The values can be strings or numerics to “Number(...)”. This is done in-place.
convertToObjectProvide a CSV list of fields to convert strings to JSON objects via JSON.parse(someString). This is done in-place.
convertToObjectsProvide a CSV list of ARRAY fields to convert their element strings to JSON objects via JSON.parse(someString). This is done in-place.
deleteFieldsProvide a CSV list of fields to delete. The arguments may have field paths of up to 2 levels. The first level can be a document or an array of documents. The deleting is done in-place. This is convenient when the result row/document has many fields and you want all just a few of them. Rather than explicitly list the 20 say fields of the 23, just ‘delete’ those 3 from the answer set.
mergeFieldsThere are 3 arguments for each action: key, value, and destination fields. You can have as many triplets per line as you would like and you can have several lines with this pragma. The key and value fields of the row are arrays. They are used to create a new object. That object is then saved into the target field of the result row. If the destination field is ‘row’, then the fields are assigned to the row with no intermediate field.The line #pragma mergeFields keys, values, namewith this record "keys" : ["first", "last"], "values": ["tom", "jones"]results in the objectname { first: "tom", last” : "jones"}
mergeArraysProvide a CSV list of field names with the last being the field to receive the merged field values. Each of the source fields is an array and they all have the same number of values. The line#pragma mergeArrays solist_names, soloist_role, soloist_instruments, soloistswith this record{ "solist_names": [ "Otto, Antoinette", "Timm, Henry C." ], "soloist_role": [ "S", "A" ], "soloist_instruments": [ "Soprano", "Piano" ],}results in "soloists": [ { "solist_name": "Otto, Antoinette", "soloist_role": "S", "soloist_instrument": "Soprano" }, { "solist_name": "Timm, Henry C.", "soloist_role": "A", "soloist_instrument": "Piano" } ]
runPostQueryApply either a SQL statement or a MongoDB query to the result data. Use a table name or collection name of ‘data’. The query must be defined on a single line as part of the pragma line.
sortResultByApply a SQL oriented ORDER BY operation to the answer set.#pragma sortResultBy age desc
pullFieldsUpProvide a CSV list of object fields to pull their contents up to the first level. The original field is deleted. This can be useful when there are many inner fields that can be part of the first level. Instead of several template formulas like #container.first and #container.last you can simply use #first and #last via the line#pragma pullFieldsUp container

Interacting with Query Results

Once you have a result you can then do other actions such as:

  copy it to the clipboard in JSON format.

   view the JSON in another window.

   send it in various formats to a particular destination such as a download folder.