Data exploration is a critical part of making business decisions. There are many tools for data exploration, ranging from open-source tools such as R and Python's pandas to coding libraries such as Deedle and LINQ. However, they all miss a critically important business need: data exploration carried out in the absence of business decisions, business knowledge, and business rules deepens the friction between technology and business value.

In this series of articles, we will show how true decision management can empower business and operations teams to start data exploration with an integrated approach, in a unified platform that is aware of business decisions, knowledge, and rules.

But before going too deep, let’s cover some basic terminology first…

What is a Monad?

A monad is a small function that takes its source as input and transforms it into an output. A monad function may or may not take parameters.

MonadQuery

Monads are unique because users can quickly chain several of them together to express a more complex relation (or behaviour) between inputs and outcomes. Monadic operators are highly composable, yet they remain easy to understand because they express complex relations concisely.
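To make the idea of chaining concrete, here is a minimal sketch in Python rather than in DIMIQ itself. The helper names (`pipe`, `where`, `select`) and the sample records are hypothetical and are not part of FlexRule; the sketch only illustrates how small single-purpose functions compose into a larger query.

```python
from functools import reduce

def pipe(source, *steps):
    """Feed `source` through each step in order; every step maps one input to one output."""
    return reduce(lambda value, step: step(value), steps, source)

def where(predicate):
    """Build a step that keeps only the records matching `predicate`."""
    return lambda rows: [r for r in rows if predicate(r)]

def select(field):
    """Build a step that extracts a single field from every record."""
    return lambda rows: [r[field] for r in rows]

# Hypothetical sample data, for illustration only.
accounts = [
    {"Name": "Ann", "Family": "Miller"},
    {"Name": "Bob", "Family": "Brehmer"},
    {"Name": "Cid", "Family": "Miller"},
]

# Three small steps chained into one query: filter, project, sort.
names = pipe(
    accounts,
    where(lambda r: r["Family"] == "Miller"),
    select("Name"),
    sorted,
)
print(names)
```

Each step knows nothing about the others, which is what keeps a long chain readable: you can add, remove, or reorder steps without rewriting the rest.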

Decision-Integrated Monadic Query Language

Decision-Integrated Monadic Query Language, or DIMIQ, is a data query language that enables users to query data from one or multiple data sources, regardless of where the data is located and how it is stored. The data can be in a database, in files, in the cloud, on-premises, and so on.

The purpose of DIMIQ is to put data query capability close to business knowledge, business decisions, and rules, rather than leaving it in the database or inside applications. Many business rules and decisions rely heavily on data; in other words, they are data-driven. DIMIQ therefore aims to close the gap between exploring data and applying the findings to business decisions.

DIMIQ uses monadic operators and syntax to allow users to build decision-integrated data queries. If you are not familiar with monadic operators, do not worry; we will explain them here very quickly.

Monadic data operators are special types of monads whose input is a dataset. A dataset is a collection of data loaded into memory.

monadExample

Monadic data operators apply a specific function or predicate to the records of a dataset one by one. They therefore need to reference each individual record in the dataset by an alias. The predicate expresses the condition under which a record in the dataset is affected by the operator (i.e., the monad).

Let’s look at an example of filtering accounts based on the Family value:

formulaE1

In the above example, |filter is a monad that filters a dataset called accounts. So, what is the basis of the filter? The predicate, written in terms of the alias r, defines the criterion: keep every record whose Family column is ‘Miller’.
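For readers who think in code, the roles of the dataset, the alias, and the predicate can be sketched in Python. This is an analogy under stated assumptions, not DIMIQ syntax: the `Monad` wrapper and `filter_` helper are hypothetical, and Python's `|` operator is overloaded here only to mimic the look of a piped operator.

```python
class Monad:
    """Wraps a function so it can be applied with the | operator: data | Monad(fn)."""

    def __init__(self, fn):
        self.fn = fn

    def __ror__(self, source):
        # Called for `source | self` when `source` does not handle | itself.
        return self.fn(source)

def filter_(predicate):
    """Hypothetical filter operator: keeps the records for which predicate(r) is true."""
    return Monad(lambda rows: [r for r in rows if predicate(r)])

# Hypothetical sample data, for illustration only.
accounts = [
    {"Name": "Ann", "Family": "Miller"},
    {"Name": "Bob", "Family": "Brehmer"},
    {"Name": "Cid", "Family": "Miller"},
]

# `r` is the alias for the record currently being tested by the predicate.
millers = accounts | filter_(lambda r: r["Family"] == "Miller")
print(millers)
```

The alias `r` never refers to the whole dataset; it stands for one record at a time, which is why the predicate can be written as a simple condition on a single row.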

Now that we have covered the basics, let’s load two CSV files into our authoring environment and start playing with them.

Load Data

There are many different ways to load any type of data into the FlexRule platform. You can load data using low-code models as part of an orchestration, with drag and drop, but for fun, let's use our expression language to load the files directly into the Interactive Shell.

FormulaE2

As you can see in the above example, accounts.csv is loaded as a CSV file by passing it to the |toCsv() monad function, and the result is stored in a parameter variable (or, in short, a variable) called accounts. We did the same for the other CSV file, invoices.csv, to load it into the invoices variable.
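As a rough point of comparison, the same load step can be sketched with Python's standard `csv` module. Everything here is an assumption made for illustration: the `load_csv` helper is hypothetical, the CSV text is inlined so the sketch runs without any files, and the column names and values (other than Family and DOB, which the article mentions) are invented.

```python
import csv
import io

# Hypothetical file contents; the real accounts.csv and invoices.csv are not shown in the article.
ACCOUNTS_CSV = """Id,Name,Family,DOB
1,Ann,Miller,1980-05-17
2,Bob,Brehmer,1975-11-02
3,Cid,Miller,1992-11-30
"""

INVOICES_CSV = """InvoiceId,AccountId,Amount
100,1,250.00
101,2,80.50
"""

def load_csv(text):
    """Parse CSV text into a list of dictionaries, one per row, keyed by the header names."""
    return list(csv.DictReader(io.StringIO(text)))

accounts = load_csv(ACCOUNTS_CSV)
invoices = load_csv(INVOICES_CSV)

print(len(accounts), "accounts;", len(invoices), "invoices")
```

Note that `csv.DictReader` returns every value as a string, so dates and amounts still need to be converted before they can be used in calculations.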

So the data is now loaded, and we are ready to play. This was not hard, was it? 😊

Looking into Data

There are three options for looking at any type of data. The first is to simply type print(accounts), and the data will be displayed in the Interactive Shell in a table format:

FormulaE3

Alternatively, print can also be used as a monadic operator, as |print().

Or you can look at the data in a more advanced form in the Watch Window:

watch-window

Or use a Data Viewer that allows you to:

  • build filters
  • create rules
  • apply conditional highlighting
  • and much more…

viewerE1

Filter Data

There are rules around filtering in every business operation, such as filtering criteria based on state, time, family, amount of debt, and so on. Therefore, a very important monadic operator is |where, also available as |filter, which allows you to filter your data based on some conditions. They both do the same thing and differ only in name.

Let’s filter the accounts that have the Family name ‘Miller’ and print them:

FormulaE4

Group Data

Grouping data allows you to split data into multiple groups based on some criteria, which can be simple or composite values.

FormulaE5

And as you can see below, we have groups of accounts based on their Family:

viewerE2

The group operation splits the accounts into a set of new lists based on the value of Family. For instance, the group with the Key ‘Miller’ holds a list of two accounts, while the group with the Key ‘Brehmer’ has only one account.
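The shape of that result, a key mapped to the list of records sharing it, can be sketched in Python. The `group_by` helper and the sample records are hypothetical; the counts are chosen only to match the example above (two accounts for ‘Miller’, one for ‘Brehmer’).

```python
from collections import defaultdict

def group_by(rows, key):
    """Split `rows` into lists keyed by key(r), preserving the original order in each group."""
    groups = defaultdict(list)
    for r in rows:
        groups[key(r)].append(r)
    return dict(groups)

# Hypothetical sample data, for illustration only.
accounts = [
    {"Name": "Ann", "Family": "Miller", "State": "VIC"},
    {"Name": "Bob", "Family": "Brehmer", "State": "NSW"},
    {"Name": "Cid", "Family": "Miller", "State": "NSW"},
]

# Simple key: one group per Family value.
by_family = group_by(accounts, lambda r: r["Family"])
for family, members in by_family.items():
    print(family, len(members))

# Composite key: one group per (Family, State) pair.
by_family_and_state = group_by(accounts, lambda r: (r["Family"], r["State"]))
```

A composite key is just a tuple of values, so grouping by several columns needs no extra machinery.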

Transformation

There are many built-in functions in FlexRule that you can apply on top of a dataset to transform the values of one column into another. To do this, we can use the |apply monadic operator.

Let’s say we want to add a new column called Age that holds the age of each account holder. But the DOB column is a string. So we need to:

  1. Convert DOB to Date format
  2. Calculate the Age based on DOB and today's date

So we can write an expression like the one below that creates the ‘Age’ column:

FormulaE6

And as you can see below, we created an Age column that is calculated from the DOB value:

viewerE3

Let’s unpack the above expression:

FormulaE7

The above expression calculates the Age as the number of years between now and the person's date of birth. The monad |asInt() converts the decimal value returned by the dateDiff function to a simple integer value with no decimal points. Then, in the apply monad, we assign the value to an attribute named Age, as below:

FormulaE8
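The two steps (parse the DOB string, then compute whole years) can be sketched in Python as follows. This is an illustration under assumptions, not the FlexRule expression: the `age_in_years` and `apply` helpers are hypothetical, the `YYYY-MM-DD` date format is assumed because the article does not show the CSV contents, and "today" is pinned to a fixed date so the result is reproducible. Unlike the expression above, the sketch counts whole years directly instead of truncating a decimal difference.

```python
from datetime import date, datetime

def age_in_years(dob_text, today, fmt="%Y-%m-%d"):
    """Step 1: parse the DOB string. Step 2: count the whole years elapsed up to `today`."""
    dob = datetime.strptime(dob_text, fmt).date()
    years = today.year - dob.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (dob.month, dob.day):
        years -= 1
    return years

def apply(rows, **new_columns):
    """Return copies of `rows` with extra columns, each computed from the record itself."""
    return [
        {**r, **{name: fn(r) for name, fn in new_columns.items()}}
        for r in rows
    ]

# Hypothetical sample data and a fixed "today", for illustration only.
today = date(2022, 10, 28)
accounts = [
    {"Name": "Ann", "DOB": "1980-05-17"},
    {"Name": "Cid", "DOB": "1992-11-30"},
]

with_age = apply(accounts, Age=lambda r: age_in_years(r["DOB"], today))
for r in with_age:
    print(r["Name"], r["Age"])
```

Because `apply` builds new records rather than changing the originals, the source dataset stays untouched and the transformation can be re-run safely.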

Conclusion

On their own, data exploration platforms such as Python's pandas, R, and others deepen the friction between technology and the business rules, decisions, and knowledge that operations teams rely on. These teams need a technique that allows them to look at the data, understand it, and apply business rules on top of it.

In the next article, we will discuss some more interesting decision-integrated data query expressions that join multiple data sources and calculate the sum of overdue amounts for each month. Finally, we will pass the data with overdue amounts to a decision that calculates the late payment for each account based on some business rules.

Last updated October 31st, 2022 at 03:42 pm, Published October 28th, 2022 at 03:42 pm