Language: English

Conflict of Interest: In relation to this article, we declare that there is no conflict of interest.

Publication history: Received May 22, 2023
Revised June 11, 2023
Accepted June 22, 2023

Acknowledgements: This research was supported by the Chung-Ang University Graduate Research Scholarship in 2021 and by H2KOREA, funded by the Ministry of Education. It was also supported by the Human Resources Development (No. 20214000000280) of the Korea Institute of Energy Technology Evaluation and Planning (KETEP) grant funded by the Korea government Ministry of Trade, Industry and Energy.

This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/bync/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Automatic anomaly detection in engineering diagrams using machine learning

Ho-Jin Shin¹, Ga-Young Lee¹, Chul-Jin Lee¹,²,†

¹School of Chemical Engineering and Materials Science, Chung-Ang University, 84, Heukseok-ro, Dongjak-gu, Seoul 06974, Korea
²Department of Intelligent Energy and Industry, Chung-Ang University, 84, Heukseok-ro, Dongjak-gu, Seoul 06974, Korea

cjlee@cau.ac.kr

Korean Journal of Chemical Engineering, November 2023, 40(11), 2612-2623(12)
https://doi.org/10.1007/s11814-023-1518-8

Abstract

This study implements a method for automating anomaly detection in engineering diagrams by recognizing graphs from a piping and instrumentation diagram (P&ID) and then extracting patterns within those graphs. The framework consists of three parts: graph generation, subgraph extraction, and graph classification. Graphs are generated through symbol recognition and line recognition, and subgraphs are extracted using a frequent subgraph mining algorithm. The graph classification targets are divided into two categories according to the frequency of the main equipment of the extracted subgraph. If the frequency is low, the subgraph is classified according to whether it includes a user-defined subgraph; if the frequency is high, the subgraph is vector-embedded and used to train a support vector machine (SVM) that serves as the classification model. K-fold cross-validation is also applied to increase classification accuracy. The proposed framework shows 85% accuracy for a given test drawing through cross-validation. These outcomes contribute to the field of engineering diagram analysis and have potential applications in plant industries.
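The two-branch classification step and the K-fold split described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The subgraph representation (a main-equipment label plus a set of labeled edges), the `min_count` threshold, the function names, and the example patterns are all hypothetical. The sketch also assumes that a subgraph containing a user-defined pattern counts as normal, which the abstract does not state, and it replaces true subgraph matching with a simple edge-set containment test. The embedding and SVM training of the high-frequency branch are left out.

```python
from collections import Counter


def contains_pattern(subgraph_edges, pattern_edges):
    """Simplified inclusion test (assumption): every labeled edge of the
    pattern appears in the subgraph. Not a full subgraph-isomorphism check."""
    return pattern_edges <= subgraph_edges


def route_subgraphs(subgraphs, user_patterns, min_count=3):
    """Split subgraphs by how often their main equipment occurs.

    Low-frequency equipment is labeled by a rule: "normal" if the subgraph
    contains any user-defined pattern for that equipment (assumed polarity),
    otherwise "anomaly". High-frequency subgraphs are returned untouched so
    they can be embedded and passed to a learned classifier such as an SVM.
    """
    counts = Counter(sg["main"] for sg in subgraphs)
    rule_results, to_train = [], []
    for sg in subgraphs:
        if counts[sg["main"]] < min_count:
            patterns = user_patterns.get(sg["main"], [])
            ok = any(contains_pattern(sg["edges"], p) for p in patterns)
            rule_results.append((sg["id"], "normal" if ok else "anomaly"))
        else:
            to_train.append(sg)
    return rule_results, to_train


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical toy data: three pump subgraphs (frequent), one reactor and
# one column subgraph (rare).
subgraphs = [
    {"id": 0, "main": "pump", "edges": {("pump", "valve"), ("valve", "tank")}},
    {"id": 1, "main": "pump", "edges": {("pump", "valve")}},
    {"id": 2, "main": "pump", "edges": {("pump", "tank")}},
    {"id": 3, "main": "reactor", "edges": {("reactor", "relief_valve")}},
    {"id": 4, "main": "column", "edges": {("column", "condenser")}},
]
user_patterns = {
    "reactor": [{("reactor", "relief_valve")}],
    "column": [{("column", "reboiler")}],
}

rule_results, to_train = route_subgraphs(subgraphs, user_patterns, min_count=3)
folds = list(k_fold_indices(len(subgraphs), k=2))
```

In this toy run the pump subgraphs go to the training branch, while the reactor and column subgraphs are decided by the rule. In practice the folds would be drawn over the embedded high-frequency subgraphs only, and shuffling before the split is usually advisable.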

Keywords

Engineering Diagram, Objective Detection, Graph Pattern Mining, Support Vector Machine, Piping and Instrumentation Diagram
