Purpose: A careful examination of a Ladder Logic program reveals its hierarchical
nature and components, which make it interpretable. Data mining techniques can
therefore be used to interpret the top-level and sub-level components, which is
naturally framed as a classification problem. Applying machine learning algorithms
to features extracted from ladder logic can give insights into the whole ladder
logic program. The components and their interactions are intuitive, which increases
the likelihood of good results. Learning PLC programming can be eased by converting
existing PLC code to commonly used formats such as JSON or XML. The goal is to
consume a ladder logic sample, break it down into minor components, identify the
components and the interactions between them, and write the result to JSON. As
Ladder Logic is the most commonly used programming language for PLCs, we decided
to start the experiment with Ladder Logic program samples. Feature engineering
combined with machine learning techniques should provide accurate results for the
Ladder Logic data.
Methods: The data set contains 6623 records for top-level classification (1421
ALARM, 150 STEP SEQUENCE, 96 SOLENOID, and 5304 UNCLASSIFIED samples) and 84472
records for sub-level classification, comprising the sub-level components of all
the ALARM and STEP SEQUENCE samples from the top-level data set. We extract the
initial top-level and sub-level features from GX Works. Advanced features such as
Sequence, LATCH, and comments are extracted by parsing the GX Works output. The
final feature set for top-level classification consists of basic features, advanced
features, and comments. The data set for sub-level classification has a few more
features in addition to those from the top level: previous-instruction/next-instruction
features (3-window), bi-gram features of the instructions, and the top-level class
(the result of top-level classification). The results of top-level and sub-level
classification are filled into a JSON object, which is later written to a JSON file.
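The window and bi-gram features described above can be sketched as follows. The instruction names and the feature layout here are illustrative assumptions, not the actual extraction code:

```python
# Sketch of the sub-level feature construction: 3-window previous/next-instruction
# features and bi-grams over one sample's instruction sequence.

def window_features(instructions, i):
    """Previous, current, and next instruction for position i (3-window)."""
    prev_instr = instructions[i - 1] if i > 0 else "<START>"
    next_instr = instructions[i + 1] if i < len(instructions) - 1 else "<END>"
    return {"prev": prev_instr, "curr": instructions[i], "next": next_instr}

def bigram_features(instructions):
    """Bi-grams of adjacent instructions in the sequence."""
    return [f"{a}_{b}" for a, b in zip(instructions, instructions[1:])]

sample = ["LD", "AND", "OUT"]       # hypothetical instruction sequence
print(window_features(sample, 1))   # {'prev': 'LD', 'curr': 'AND', 'next': 'OUT'}
print(bigram_features(sample))      # ['LD_AND', 'AND_OUT']
```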
Results: We report classification results for both the top level and the sub level.
Since the features are discrete, we tried Decision Trees, Naive Bayes, and Support
Vector Machines; Decision Trees work best for both classification tasks. For
top-level classification: Decision Tree: accuracy 0.91, F1-macro 0.90, F1-micro
0.90; Naive Bayes: accuracy 0.85, F1-macro 0.80, F1-micro 0.85; LinearSVC:
accuracy 0.88, F1-macro 0.88, F1-micro 0.88. For sub-level classification,
Decision Trees outperform the other classifiers: Decision Tree: accuracy 0.90,
F1-macro 0.91, F1-micro 0.90; Naive Bayes: accuracy 0.80, F1-macro 0.80, F1-micro
0.81; LinearSVC: accuracy 0.75, F1-macro 0.78, F1-micro 0.79.
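A minimal sketch of the evaluation loop over the three classifier families, assuming a scikit-learn workflow; the feature and label arrays below are toy placeholders, not the actual ladder logic data set:

```python
# Compare the three classifier families on discrete features, reporting
# accuracy, F1-macro, and F1-micro as in the experiments above.
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score, f1_score

X_train = [[0, 1, 0], [1, 0, 1], [0, 0, 1], [1, 1, 0]]  # toy discrete features
y_train = ["ALARM", "STEP", "ALARM", "STEP"]
X_test, y_test = [[0, 1, 1], [1, 0, 0]], ["ALARM", "STEP"]

for clf in (DecisionTreeClassifier(), MultinomialNB(), LinearSVC()):
    pred = clf.fit(X_train, y_train).predict(X_test)
    print(type(clf).__name__,
          accuracy_score(y_test, pred),
          f1_score(y_test, pred, average="macro"),
          f1_score(y_test, pred, average="micro"))
```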
Conclusions: A decision tree is the most suitable classifier for this purpose.
Tuning the Decision Tree classifier played an important role in improving its
performance. Using entropy as the splitting criterion and restricting the depth
of the tree to 6 improved the performance of the classifier by 6
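The tuned configuration described above corresponds roughly to the following scikit-learn setup; the data here is a toy placeholder standing in for the extracted ladder logic features:

```python
# Decision tree tuned as described: entropy splitting criterion, depth capped at 6.
from sklearn.tree import DecisionTreeClassifier

clf = DecisionTreeClassifier(criterion="entropy", max_depth=6, random_state=0)

# Toy discrete features and labels in place of the real data set.
X = [[0, 1], [1, 0], [1, 1], [0, 0]]
y = ["ALARM", "STEP", "STEP", "ALARM"]
clf.fit(X, y)
print(clf.get_depth() <= 6)  # the fitted tree never exceeds max_depth
```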