
MongoDB Full Text Search (FTS) Integration

Overview

Qarbine supports MongoDB full text search (FTS) features in a variety of ways. The standard use case for FTS is to find documents of interest and process the fields of the matching documents. Since Qarbine supports the native MongoDB query language, this is naturally supported. A less frequent use case is to not only find documents of interest but also present the results of the underlying FTS search itself. For example, consider searching stock market reports or insurance claims for keywords. In these cases you would likely want the keywords highlighted in the search results. Legacy BI and other reporting tools are simply not built to support this basic MongoDB FTS scenario. It's kryptonite for them, even beyond the whole NoSQL JSON document model thing! In contrast, Qarbine supports this with a simple macro function!

Qarbine includes many MongoDB examples, including full text search ones. Here we will do a tear down of a keyword search example which includes 3 Qarbine components:

  • a data source (how to get the data),
  • a prompt (optional dialog to prompt the user for runtime values) and
  • a template (how to traverse, evaluate, and format the data).

The interaction of these components is depicted below.

  

All Qarbine components are stored within the Qarbine catalog. The catalog is similar to Windows File Explorer in presentation but is backed by an internal Qarbine database. This database is MongoDB based, so it leverages FTS features that let users find catalog components by text searching their name and description fields.

Sample FTS Report Tear Down

This example can be found at “example/MongoDB/Full text search/FTS- Fruits using keywords prompt”.

In our simple FTS example, to retrieve data the Template references a Data Source component as shown below.

  

  

A single Data Source may be referenced by any number of Template components. The example Data Source component contains the following MongoDB FTS aware query.

  

Note the use of a placeholder for the user supplied keywords rather than a hard coded query.

 [! unquote(format(@keywords, "csv") ) !] 

This is one of many Qarbine features to increase reusability and capture tribal knowledge. It also prevents redundant query sprawl which is so common in BI and reporting tools.
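For readers less familiar with MongoDB FTS queries, the sketch below shows the general shape of an Atlas Search aggregation pipeline that returns a relevance score and highlight data for user-supplied keywords. It is not the query stored in the example Data Source. The helper name build_fts_pipeline and the searched field name "description" are assumptions made for illustration, and the pipeline assumes a default Atlas Search index on the collection.

```python
from typing import Any, Dict, List


def build_fts_pipeline(keywords: List[str], path: str = "description") -> List[Dict[str, Any]]:
    """Build an Atlas Search pipeline for the given keywords.

    Hypothetical helper: the field name in `path` is an assumption,
    not taken from the Qarbine example.
    """
    # Drop empty entries and join the keywords into one query string.
    query = " ".join(k.strip() for k in keywords if k.strip())
    return [
        {
            "$search": {
                # Full text match of the keywords against the chosen field.
                "text": {"query": query, "path": path},
                # Ask the server to return highlight details for that field.
                "highlight": {"path": path},
            }
        },
        {
            "$project": {
                "_id": 0,
                "type": 1,
                # Relevance score and highlight array computed by $search.
                "score": {"$meta": "searchScore"},
                "highlights": {"$meta": "searchHighlights"},
            }
        },
    ]


pipeline = build_fts_pipeline(["apple", "banana"])
```

With a driver such as PyMongo, a pipeline like this would be passed to the collection's aggregate() method. In Qarbine the keyword values come from the placeholder instead of a function argument.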

The keywords are obtained using a Qarbine Prompt component. Prompt components are dialogs used to obtain runtime values from the user. To maximize reuse and flexibility, a single prompt can be referenced by many other components. The template references a Prompt component as shown below.

  

The Prompt component contains 2 elements.

  

When the template is run a dialog will be presented to the user as shown below.

  

This Prompt component has a MongoDB color scheme. Clicking OK populates the ‘keywords’ variable which is referenced by the Data Source’s query placeholder snippet below.

 [! unquote(format(@keywords, "csv") ) !] 

The resulting MongoDB query is sent to the MongoDB server. Running the Data Source independently within the Data Source Designer tool results in a single answer set element shown below.

  

The full details of that can be seen by selecting the row.

  

There are a lot of nitty gritty details there! Fortunately Qarbine has macro function support to make using these details really, really easy.

A Qarbine Analysis Template is similar to an outline in structure, with each section (e.g., 1.1.1 Body) consisting of one or more lines. The first body line contains 3 cells, and you can think of those cells as being similar to Microsoft Excel formulas. The sample template content is illustrated below.

  

The report header just has our heading text in 18 point, bold Arial. Qarbine’s formatting features are very similar to Microsoft Word and its cell layout interactions are similar to PowerPoint. Qarbine also provides numerous conditional formatting features, which are important given the data diversity supported by MongoDB.

On the first body line we display the root #type value (e.g., banana) of the answer set document along with the #score. The remaining cell on that line is a percent-of-total cell whose output you will soon see below. We use it as a visual quality indicator. Percent-of-total calculations are fundamental in many types of analyses. Qarbine has a two pass reporting engine to support this and other core requirements. The second body line uses the MongoDB FTS formatHighlights() macro function. This is one of Qarbine’s over 450 macro language functions, which are similar to Excel functions in nature.
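To see why a percent-of-total cell needs two passes, consider the minimal sketch below. It is not Qarbine's reporting engine. The function name percent_of_total, the output key percentOfTotal, and the sample rows are hypothetical. The first pass computes the total score over all rows, and only then can the second pass compute each row's share.

```python
from typing import Any, Dict, List


def percent_of_total(rows: List[Dict[str, Any]], field: str = "score") -> List[Dict[str, Any]]:
    """Return copies of the rows with a percent-of-total value added."""
    # Pass 1: the total must be known before any row's share can be computed.
    total = sum(row[field] for row in rows)

    # Pass 2: compute each row's share of the total.
    result = []
    for row in rows:
        pct = 100.0 * row[field] / total if total else 0.0
        result.append({**row, "percentOfTotal": round(pct, 1)})
    return result


# Hypothetical answer set rows, not taken from the example report.
rows = [
    {"type": "banana", "score": 3.0},
    {"type": "apple", "score": 1.0},
]
for row in percent_of_total(rows):
    print(row["type"], row["score"], row["percentOfTotal"])
```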

Running the Template

Let’s run the template. A prompt is first presented to the user.

  

The user can enter the keywords and then click OK. The query is run and the reporting engine generates the result as shown below.

  

Notice that the formatHighlights(#highlights) macro function was able to take the MongoDB FTS embedded highlights array structure, which itself includes other embedded structures, and format a paragraph complete with keyword highlighting. The result can also be exported in a variety of formats including PDF as shown below.
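To give a feel for the work that formatHighlights() saves, here is a rough Python sketch of flattening a highlights array by hand. It is an illustrative stand-in, not Qarbine's implementation. It assumes the Atlas Search highlight shape, in which each highlight carries a "texts" array whose entries have a "value" and a "type" of either "hit" or "text". The choice of HTML mark tags and the sample data are also assumptions.

```python
from html import escape
from typing import Any, Dict, List


def format_highlights(highlights: List[Dict[str, Any]]) -> str:
    """Flatten a highlights array into one HTML string with hits marked."""
    snippets = []
    for highlight in highlights:
        pieces = []
        for text in highlight.get("texts", []):
            # Escape the raw text so it is safe to embed in HTML.
            value = escape(text.get("value", ""))
            # Wrap matched terms; pass surrounding text through unchanged.
            pieces.append(f"<mark>{value}</mark>" if text.get("type") == "hit" else value)
        snippets.append("".join(pieces))
    # Separate the snippets that came from different highlight entries.
    return " ... ".join(snippets)


# Hypothetical highlight data in the assumed shape.
sample = [
    {
        "path": "description",
        "score": 1.2,
        "texts": [
            {"value": "A ripe ", "type": "text"},
            {"value": "banana", "type": "hit"},
            {"value": " is sweet & soft.", "type": "text"},
        ],
    }
]
print(format_highlights(sample))
```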

  

Other Possibilities

Qarbine can be embedded into enterprise applications which leverage full text search (FTS) and other MongoDB features. In some cases enterprise applications use FTS to obtain a list of MongoDB documents to present to the end user. Using Qarbine’s embedding SDK, these documents can be passed directly to a Qarbine template for its complementary analysis and presentation. There is no need to refetch the data with a redundant query.

The Qarbine HTML results can also be interactive. This allows “callbacks” from a Qarbine page to the enterprise application based on user interactions. The result is a smooth and fluid user workflow between the 2 worlds. Many other variations are possible such as passing a document identifier for a variable value within a Qarbine Data Source.