Thesis information

Title:

 Simulation and Implementation of a DNA Storage Communication System Based on RRNS Codes

Name:

 Zhong Yonglin (钟泳林)

Thesis language:

 chi    

Discipline code:

 085202    

Degree category:

 Professional degree

First-level discipline:

 Engineering

Major:

 Optical Engineering

Degree level:

 Master's

Degree type:

 Professional degree

Author's country:

 China

Degree-granting institution:

 South China Normal University

Department:

 015 School of Physics and Telecommunication Engineering

First supervisor:

 Tang Zhilie (唐志列)

First supervisor's affiliation:

 School of Physics and Telecommunication Engineering

Second supervisor:

 Mu Liwei (穆丽伟)

Second supervisor's affiliation:

 School of Physics and Telecommunication Engineering

Thesis submission date:

 2022-06-03    

Thesis defense date:

 2022-05-25    

Degree conferral date:

 2022-06-24    

Foreign-language title:

 SIMULATION AND IMPLEMENTATION OF DNA STORAGE COMMUNICATION SYSTEM BASED ON RRNS CODE    

Keywords:

 DNA storage communication system; RRNS code; marker code; multiple sequence alignment

Foreign-language keywords:

 DNA storage communication system; RRNS code; marker code; multiple sequence alignment    

Abstract:

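With the rapid development of computer technology, digital information storage is changing our lives. Information is being generated at an ever-increasing rate, which raises a series of problems, above all how to store data effectively. Many traditional storage media, such as magnetic disks, hard drives, and flash memory, are increasingly unable to meet worldwide data storage needs. Deoxyribonucleic acid (DNA), which carries genetic information, offers high density, low maintenance cost, and long retention time. Against the backdrop of exponentially growing demand for data storage, DNA is a promising candidate to replace traditional storage devices, and DNA data storage has therefore gradually become a research hotspot.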

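DNA encoding is a key technology in DNA storage; its quality directly affects storage performance and the integrity of data reads and writes. DNA encoding aims to store data without errors using as few bases as possible, and comprises three parts: compression (occupying as little space as possible), error correction (error-free storage), and conversion (turning digital information into base sequences). Because of current limitations in biotechnology, errors are inevitably introduced during DNA storage, so error correction is an important problem that DNA storage must address.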

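This thesis studies the application and simulated performance of RRNS codes in a DNA storage communication system. It first introduces the research background and the state of research in China and abroad, then describes in detail the problems and challenges facing DNA storage. It next derives corresponding solutions from these problems and introduces the coding techniques used in this thesis, including RRNS codes, marker codes, BCH codes, and multiple sequence alignment. It then analyzes the performance of the DNA storage communication channels modeled in this thesis, and finally presents simulation experiments and an analysis of the results. The main contributions are: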

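(1) A decoding algorithm for RRNS codes with erasure-correction capability is proposed. Exploiting the parallelism and independence inherent to RRNS codes, the algorithm recovers missing DNA storage information accurately and reliably and lowers the bit error rate of the stored result, while also reducing RRNS decoding complexity and improving decoding efficiency. The algorithm overcomes the insertion and deletion errors present in DNA storage communication systems and ensures reliable information storage.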

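(2) Mathematical models of the DNA storage communication channel are established. Based on the characteristics of current DNA sequencing technology, this thesis builds a DNA storage communication channel for nanopore sequencers and one for ILLUMINA sequencers to simulate the real-world DNA storage process, and analyzes the performance and channel capacity of each model.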

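(3) Simulation experiments on the DNA storage communication channel system are implemented. Based on the two channel models, two schemes are programmed: RRNS codes combined with multiple sequence alignment, and RRNS codes combined with marker codes (the marker code serves as the inner code and detects the loss or gain of bits from shifts in the positions of the inserted marker bits, after which the RRNS code performs error correction). A series of corresponding simulations yields bit error rate performance curves, which are then analyzed. The results show that, in both channels, the scheme combining RRNS codes with multiple sequence alignment recovers information with high reliability by synthesizing a large number of DNA fragments. The scheme using marker codes and RRNS codes as inner and outer codes benefits from the erasure-correcting RRNS decoding algorithm and outperforms the multiple-sequence-alignment scheme in simulation: it reliably recovers missing DNA fragments without requiring a large number of synthesized fragments, so the stored information can be recovered.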
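As an illustration of the erasure-correction idea in item (1), the following is a minimal sketch and not the thesis's algorithm. It assumes a hypothetical set of pairwise-coprime moduli in ascending order, of which the first K carry information; the names `MODULI`, `K`, `rrns_encode`, and `rrns_decode_erasures` are invented for this example. Because each residue is computed independently, any K surviving residues are enough to rebuild the value with the Chinese remainder theorem when the positions of the lost residues are known.

```python
from math import prod

# Hypothetical moduli (assumption, not from the thesis): pairwise coprime, ascending.
MODULI = (7, 9, 11, 13, 16)
K = 3                               # the first K moduli are the information moduli
LEGIT_RANGE = prod(MODULI[:K])      # valid messages satisfy 0 <= x < 693


def rrns_encode(x):
    """Encode an integer as one residue per modulus."""
    if not 0 <= x < LEGIT_RANGE:
        raise ValueError("message outside the legitimate range")
    return [x % m for m in MODULI]


def crt(residues, moduli):
    """Chinese remainder theorem for pairwise-coprime moduli."""
    total = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        partial = total // m
        # pow(partial, -1, m) is the modular inverse (Python 3.8+).
        x += r * partial * pow(partial, -1, m)
    return x % total


def rrns_decode_erasures(received):
    """Recover the message when erased residues are marked as None."""
    survivors = [(r, m) for r, m in zip(received, MODULI) if r is not None]
    if len(survivors) < K:
        raise ValueError("too many erasures to recover the message")
    residues, moduli = zip(*survivors)
    x = crt(residues, moduli)
    if x >= LEGIT_RANGE:
        raise ValueError("surviving residues are inconsistent")
    return x
```

With these assumed moduli, `rrns_encode(500)` gives `[3, 5, 5, 6, 4]`, and `rrns_decode_erasures([None, 5, None, 6, 4])` returns `500` even though two residues are missing. The sketch handles erasures only; correcting residues that are wrong at unknown positions needs additional consistency checks.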

Foreign-language abstract:

With the rapid development of computer technology, digital information storage is changing our lives. Information is being produced at an ever faster rate, and this has given rise to a series of problems, especially the problem of how to store data effectively. Many traditional storage media, such as magnetic disks, hard disks, and flash memory, are gradually failing to meet worldwide data storage needs. DNA, which carries genetic information, has the characteristics of high density, low maintenance cost, and long storage time. Against the background of exponentially growing demand for data storage, DNA is a promising storage medium to replace traditional storage devices, and as a result DNA data storage technology has gradually become a research hotspot.

DNA encoding is a key technology in DNA storage, and its quality directly affects the performance of DNA storage and the integrity of data reading and writing. DNA encoding aims to store data without errors using as few bases as possible, and includes compression (taking up as little space as possible), error correction (error-free storage), and conversion (converting digital information into base sequences). Due to the limitations of current biotechnology, errors are inevitably introduced in the process of DNA storage, and error correction is an important problem that DNA storage currently faces.
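The conversion step mentioned above can be illustrated with a minimal sketch. The abstract does not state which mapping the thesis uses, so the fixed two-bits-per-base table below (00 to A, 01 to C, 10 to G, 11 to T) and the function names `bytes_to_dna` and `dna_to_bytes` are assumptions made only for illustration.

```python
BASES = "ACGT"  # assumed mapping: index 0..3 corresponds to bit pairs 00, 01, 10, 11


def bytes_to_dna(data: bytes) -> str:
    """Map each byte to four bases, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(strand: str) -> bytes:
    """Invert bytes_to_dna; the strand length must be a multiple of four."""
    if len(strand) % 4:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)
```

Under this assumed table, the byte `0x1b` (bits 00 01 10 11) becomes the strand `ACGT`. Practical schemes usually add constraints that this sketch ignores, such as limits on homopolymer runs and GC content.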

This thesis mainly studies the application and simulation performance of RRNS codes in a DNA storage communication system. First, the research background of the field and the state of research at home and abroad are introduced, followed by a detailed account of the problems and challenges faced by DNA storage. Then, corresponding solutions are proposed for the current problems in DNA storage, and the coding techniques used in this thesis are introduced, including RRNS codes, marker codes, BCH codes, and multiple sequence alignment. Next, the performance of the DNA storage communication channels modeled in this thesis is analyzed, and finally the simulation experiments and the analysis of their results are presented. The main work includes:

(1) A decoding algorithm for RRNS codes with erasure-correction capability is proposed. Based on the parallelism and independence inherent to RRNS codes, this thesis proposes a decoding algorithm with erasure-correction capability, which allows accurate and reliable recovery of missing DNA storage information and reduces the bit error rate of the stored result. At the same time, the algorithm reduces the complexity of RRNS decoding and increases decoding efficiency. The algorithm overcomes the insertion and deletion errors in DNA storage communication systems and ensures the reliability of information storage.

(2) Mathematical models of the DNA storage communication channel are developed. Based on the characteristics of current DNA sequencing technology, a nanopore sequencer DNA storage communication channel and an ILLUMINA sequencer DNA storage communication channel are established to simulate the real-world DNA storage process, and the performance and channel capacity of each of the two models are analyzed.

(3) Simulation experiments on the DNA storage communication channel system are implemented. Based on the mathematical models of the two DNA storage communication channels, two methods are programmed: RRNS codes with multiple sequence alignment, and RRNS codes with marker codes (the marker code serves as the inner code and determines the loss or gain of bits from shifts in the positions of the inserted marker bits, after which the RRNS code performs error correction). A series of corresponding simulation experiments is then conducted to obtain the bit error rate performance curves, and the experimental results are analyzed. The results show that, in both DNA storage communication channels, the method of RRNS codes with multiple sequence alignment restores information with high reliability by synthesizing a large number of DNA fragments. The method using marker codes and RRNS codes as inner and outer codes benefits from the erasure-correcting RRNS decoding algorithm and achieves better simulation performance than the method of RRNS codes with multiple sequence alignment: it reliably recovers the missing DNA fragments without synthesizing a large number of DNA fragments, so the stored information can be recovered.
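To make the kind of channel described in items (2) and (3) concrete, the following is a minimal sketch of an insertion, deletion, and substitution channel acting on a DNA strand. It is not the thesis's nanopore or ILLUMINA model: the function name `ids_channel`, the independent per-base error events, and the probability parameters `p_ins`, `p_del`, and `p_sub` are all assumptions chosen for illustration.

```python
import random

ALPHABET = "ACGT"


def ids_channel(strand, p_ins, p_del, p_sub, rng):
    """Pass a strand through a simple insertion/deletion/substitution channel.

    Assumed model: before each base a random base is inserted with
    probability p_ins; the base itself is then deleted with probability
    p_del, substituted with probability p_sub, or passed through unchanged.
    """
    out = []
    for base in strand:
        if rng.random() < p_ins:
            out.append(rng.choice(ALPHABET))      # spurious inserted base
        u = rng.random()
        if u < p_del:
            continue                              # base is lost
        if u < p_del + p_sub:
            # replace with one of the three other bases
            out.append(rng.choice([b for b in ALPHABET if b != base]))
        else:
            out.append(base)                      # base read correctly
    return "".join(out)


if __name__ == "__main__":
    rng = random.Random(2022)                     # fixed seed for repeatability
    sent = "ACGTTGCAACGTTGCA"
    for _ in range(3):
        print(ids_channel(sent, p_ins=0.05, p_del=0.05, p_sub=0.02, rng=rng))
```

Running the channel several times on the same strand produces reads of different lengths. That length variation is why a plain substitution-correcting code is not sufficient, and why the text pairs the RRNS code with either multiple sequence alignment across many reads or a marker inner code.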

Total pages:

 64    

Total references:

 77    

Resource type:

 Thesis

Open-access date:

 2022-06-07    
