Using MongoDB Atlas SQL
Overview
MongoDB Atlas provides an SQL oriented interface as part of its DBaaS Atlas offering. The Atlas SQL Interface only supports read operations which is appropriate anyway for Qarbine analysis. For more information see the page at https://www.mongodb.com/docs/atlas/data-federation/query/query-with-sql/.
Data Source Integration
An important distinction of Qarbine vs. legacy SQL-centric tools is that the data returned to Qarbine from the SQL queries does not have to be flat and homogenous.
Start by openig a Data Source Designer and selecting a MongoDB Atlas data service as shown below.
View the available collections using the left hand drop down.
Below is a basic query.
select * from bios
The results are shown below.
Note that, unlike with legacy SQL tools, the MongoDB document model characteristics are maintained in Qarbine. There is no manual unflattening or other data gymnastics to use the data in its natural business object model structure. In legacy SQL tools the documents would be just simple JSON strings for the developer to manually deal with. Selecting one of the rows
shows its details on the right hand side. Notice the name embedded document and the contribs and awards embedded arrays as just as they are stored in MongoDB.
We can save this query as a Data Source in the catalog. It can be stored in our private folder or in a folder that may be visible to others. The Qarbine administrator sets up folders, their permissions and their visibility. The catalog promotes organization and the managing of many components across individuals and teams.
Qarbine provides a set of query ‘pragmas’ to further manipulate answers sets. More details can be found in the Data Source Designer document’s “Using Pragmas” section. In the JSON above notice the birth and death values are just strings. In our analysis we would like real Date objects though. This can be easily accomplished using the query text below.
#pragma convertToDate birth, death
select * from bios
Our sample row now has real dates to interact with whether for better presentation or perhaps calculating the life span.
Sample Template
With Qarbine we can easily create a report using the data source query results above complete with the embedded document and multiple embedded arrays. Most other approaches would require manually unflattening and painfully rebuilding the perfectly good MongoDB documents as they were originally stored. Below is an example snippet of the analysis result.
The example is found in the Qarbine catalog folder “example/MongoDB/Atlas SQL” and named “Bios sorted by last name using SQL”. The data source used has the same name. When you open that template the Template Designer provides information on the right hand side structural information about the main data via a drop down.
The output is shown below.
When templates have multiple groups and are iterating over several different arrays the following drop down option is very useful.
The output is shown below.
The main data retrieval uses the data source to fetch an answer set using SQL. As each answer set row is fetched it is assigned to the biosElement variable. Group 1.1 iterates over each main document’s awards list. It obtains the list by extracting the awards field value from the current biosElement variable. Likewise, group 1.2 then iterates over each main document’s contribs list.
Let’s look at the template structure. The page header and report header are shown below.
The “=pageXofY” is a special variable that gets adjusted at the conclusion of the report. Page headers and footers are optional and you can choose to skip the notion of pages altogether as well depending on your requirements. We bolded and increased the font height for the title of the report.
The first group header is shown below.
It pulls out the name information from the embedded ‘name’ document and uses the concat() function to append the first and last names. Qarbine includes over 450 built-in macro functions to manipulate and interact with data. They have the look and feel of Microsoft Excel macros.
The “Awards” cell is used as a heading for the individual awards elements. The birth and death dates have been formatted using a smart date template.
Each award’s year, award, and by values are extracted by the 2 body lines. The ‘#’ prefix is a shortcut used to indicate a field of the current element.
In the summary we included a count() macro to display how many awards there are. Qarbine includes other Excel-like aggregation functions including sun, avg,max, and min.
Group 1.2 handles iterating over the embedded contributions array and also includes a count() on its summary.
The overall template design is shown below.