In the area of Predictive Analytics, there are many algorithms that let you build a prediction model from data. Earlier, we discussed other algorithms for calculating probability. Another commonly used algorithm in predictive analytics and AI is a Decision Tree algorithm called C4.5.
The term “Decision Tree” is overused across the internet, and we’ve even seen it used in a “Business Rules” context. In this article, by “Decision Tree” we mean a model that predicts the outcome of new cases after being trained on historical data.
A Decision Tree is a fast and effective classifier. It builds a tree model that classifies new cases based on historical information from similar cases. The historical data must be relevant, and the tree should not be overfitted to that data, otherwise it will classify new cases poorly.
Once the predictive model is created, new cases can be classified accordingly.
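To make the classification step concrete, here is a minimal sketch of how a new case could be run through an already-built tree. The weather tree, the tuple layout and the names TREE, OPS and classify are all hypothetical and chosen for illustration; the "Humidity <= 70" branch in particular is an assumption, and none of this is FlexRule's implementation.

```python
import operator

# Comparison operators a branch test may use.
OPS = {"==": operator.eq, "<=": operator.le, ">": operator.gt}

# Hypothetical tree: an internal node is (attribute, [(op, value, subtree), ...]);
# a leaf is simply the class label as a string.
TREE = ("Outlook", [
    # The "<= 70 -> yes" branch is assumed for illustration.
    ("==", "sunny", ("Humidity", [("<=", 70, "yes"), (">", 70, "no")])),
    ("==", "overcast", "yes"),
    ("==", "rain", ("Windy", [("==", False, "yes"), ("==", True, "no")])),
])


def classify(tree, case):
    """Follow the first branch whose test matches until a leaf is reached."""
    while not isinstance(tree, str):
        attribute, branches = tree
        for op, value, subtree in branches:
            if OPS[op](case[attribute], value):
                tree = subtree
                break
        else:
            return None  # no branch matches this case
    return tree


new_case = {"Outlook": "rain", "Humidity": 80, "Windy": True}
print(classify(TREE, new_case))
```

Each case only visits the nodes on one path from the root to a leaf, which is part of why classification with a tree is quick.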
Business Rules Extraction
One of the use cases for a Decision Tree is to create a set of rules from data. Rules extraction writes out the decision rules that the tree uses to classify a new case, for example:
if (Outlook == 'sunny') and (Humidity > 70) then Play = 'no'
if (Outlook == 'overcast') then Play = 'yes'
if (Outlook == 'rain') and (Windy == false) then Play = 'yes'
if (Outlook == 'rain') and (Windy == true) then Play = 'no'
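Rules of this kind correspond to the root-to-leaf paths of the tree, so extraction can be sketched as a walk over those paths. The snippet below is an illustrative sketch only: the tuple layout and the names RULE_TREE, fmt and extract_rules are hypothetical, the tree encodes only the branches that appear in the rules above, and the humidity test is assumed to be "greater than 70".

```python
# Hypothetical tree: an internal node is (attribute, [(op, value, subtree), ...]);
# a leaf is the class label. Only branches named in the rules are included,
# and the humidity comparison is assumed to be "> 70".
RULE_TREE = ("Outlook", [
    ("==", "sunny", ("Humidity", [(">", 70, "no")])),
    ("==", "overcast", "yes"),
    ("==", "rain", ("Windy", [("==", False, "yes"), ("==", True, "no")])),
])


def fmt(value):
    """Render booleans as true/false and everything else as a Python literal."""
    return str(value).lower() if isinstance(value, bool) else repr(value)


def extract_rules(tree, conditions=()):
    """Yield one 'if ... then ...' rule for every root-to-leaf path."""
    if isinstance(tree, str):  # reached a leaf: emit the accumulated tests
        yield f"if {' and '.join(conditions)} then Play = {tree!r}"
        return
    attribute, branches = tree
    for op, value, subtree in branches:
        test = f"({attribute} {op} {fmt(value)})"
        yield from extract_rules(subtree, conditions + (test,))


for rule in extract_rules(RULE_TREE):
    print(rule)
```

Because every rule is just the list of tests along one path, the rule set can be read and audited without running the model.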
The advantage of C4.5 over some other methods in AI is that it enables you to understand why a decision has been taken. Explainability is very important in the context of audit: you need to be able to explain exactly why and how a decision has been made.
In the health care industry, data mining algorithms, including C4.5, can be used for effective analysis and diagnosis of many diseases. Because C4.5 is a very fast algorithm, it can be used to build models from large amounts of data, such as health care and clinical data.
Other interesting use cases include selecting a plant for a particular area and finding seasonal patterns in traffic offences.
FlexRule’s Decision Tree algorithm, which is part of our Predictive Analytics module, is based on C4.5, an algorithm developed by Ross Quinlan for generating decision trees. C4.5 is an extension of Quinlan’s earlier ID3 algorithm. The trees generated by C4.5 can be used for classification, and for this reason C4.5 is often referred to as a statistical classifier.
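One of the better-known differences from ID3 is the split criterion: where ID3 picks the attribute with the highest information gain, C4.5 normalises that gain by the split information and uses the resulting gain ratio. The sketch below shows the calculation on a small illustrative data set in the style of the classic weather example. The CASES table and the function names are assumptions made for this illustration, and the code shows the general technique rather than FlexRule's implementation.

```python
from collections import Counter
from math import log2

# Illustrative cases (assumed data): (outlook, windy, play).
CASES = [
    ("sunny", False, "no"), ("sunny", True, "no"), ("overcast", False, "yes"),
    ("rain", False, "yes"), ("rain", False, "yes"), ("rain", True, "no"),
    ("overcast", True, "yes"), ("sunny", False, "no"), ("sunny", False, "yes"),
    ("rain", False, "yes"), ("sunny", True, "yes"), ("overcast", True, "yes"),
    ("overcast", False, "yes"), ("rain", True, "no"),
]


def entropy(values):
    """Shannon entropy, in bits, of a list of discrete values."""
    total = len(values)
    return -sum(n / total * log2(n / total) for n in Counter(values).values())


def gain_ratio(cases, column):
    """Information gain of splitting on `column`, divided by split information."""
    labels = [case[-1] for case in cases]
    groups = {}
    for case in cases:
        groups.setdefault(case[column], []).append(case[-1])
    # Weighted entropy of the class labels after the split.
    remainder = sum(len(g) / len(cases) * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy([case[column] for case in cases])
    return gain / split_info if split_info else 0.0


for name, column in (("Outlook", 0), ("Windy", 1)):
    print(name, round(gain_ratio(CASES, column), 3))
```

On this data the outlook attribute scores higher than the wind attribute, so a tree builder using this criterion would split on outlook first.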
Using predictive algorithms like Decision Tree does more than allow prediction in certain situations. It also enables you to extract business rules from data, and it gives you a better idea of how effective your operation is under the current criteria and conditions that form the boundaries of your existing rules.