ISSN: 0304-128X ISSN: 2233-9558
Copyright © 2024 KICHE. All rights reserved

Language
Korean
Conflict of Interest
In relation to this article, we declare that there is no conflict of interest.
Publication history
Received May 27, 2024
Revised June 11, 2024
Accepted June 11, 2024
This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Convolution Neural Network for Prediction of DNA Length and Number of Species

Department of Chemical Engineering, Jeju National University
fluid@jejunu.ac.kr
Korean Chemical Engineering Research, August 2024, 62(3), 274-280(7), https://doi.org/10.9713/kcer.2024.62.3.274 Epub 1 August 2024

Abstract

Data analysis with neural networks, a branch of machine learning, is used in fields such as disease gene discovery and diagnosis, drug development, and prediction of drug-induced liver injury. Disease features can be identified from the molecular information of DNA. In this study, we developed a neural network that predicts two such pieces of molecular information: the length of DNA and the number of DNA species of different lengths in a mixed solution. To overcome the long analysis time of gel electrophoresis, the conventional method, we analyzed dynamic data from a microfluidic concentrating device. The dynamic data were reconstructed into a spatiotemporal map, which reduced the computational cost of training and prediction, and a convolutional neural network was used to improve the accuracy of analyzing the map. As a result, we successfully performed prediction of a single DNA length as single-variable regression, simultaneous prediction of multiple DNA lengths as multivariable regression, and prediction of the number of DNA species in a mixture as binary classification. In addition, we proposed a way to resolve the prediction bias that can arise during prediction through the composition of the training data. This work could enable effective analysis of medical data obtained by optical measurement, such as liquid-biopsy-based analysis of cell-free DNA and cancer diagnosis.
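
The reconstruction of dynamic data into a spatiotemporal map can be illustrated with a minimal sketch. The abstract does not state how the map was built, so the reduction used here (averaging each image frame across the channel width) is an assumption, and the function name `spatiotemporal_map` and the toy frames are hypothetical.

```python
from typing import List

# One grayscale frame: rows span the channel width, columns span the channel axis.
Frame = List[List[float]]


def spatiotemporal_map(frames: List[Frame]) -> List[List[float]]:
    """Collapse each frame to a 1D intensity profile and stack the profiles over time.

    Assumption: averaging across the channel width (image rows) is an acceptable
    reduction; the abstract does not say how the authors built their map.
    """
    st_map = []
    for frame in frames:
        n_rows = len(frame)
        n_cols = len(frame[0])
        # Mean intensity at each position along the channel axis.
        profile = [
            sum(frame[r][c] for r in range(n_rows)) / n_rows
            for c in range(n_cols)
        ]
        st_map.append(profile)
    return st_map  # rows: time, columns: position


# Toy input: 3 frames of 2 x 4 pixels with a bright plug moving one pixel per frame.
frames = [
    [[0.0, 1.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],
    [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 1.0, 0.0]],
    [[0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 0.0, 1.0]],
]
st_map = spatiotemporal_map(frames)
```

Under this assumption, a recording of T frames of H x W pixels becomes a single T x W image, so the input handed to a network shrinks by a factor of H. That is one plausible reading of how the map lowers the computational cost of training and prediction.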


